2008-03-20

Hooray for OCLC Pica customer response !

In my post about our Google Book Search implementation, I mentioned that we could only do this for records containing an ISBN. Google also accepts other identifiers, such as OCLC numbers and Library of Congress numbers.
We don't have that data recorded, but all of our records end up in WorldCat. Time to get in touch with OCLC Pica, the European branch of OCLC, which manages our Dutch Union Catalog and makes sure these records are also uploaded to WorldCat.

I had a short but fruitful email exchange. Their first reply helped me understand that I can locate the corresponding record in WorldCat using a URL containing the PPN (Pica Production Number), which we do record, since it is the identifier for the Dutch Union Catalog. That's neat. From the catalog record I can now point to WorldCat's 'Find in a Library close to you' page, for books that are out on loan or for non Wageningen UR users. The OCLC number is present on the resulting page. I could do some page scraping (which is pretty easy, since we only use XML tools and WorldCat returns XHTML pages, bravo!), but it is not elegant and pretty slow as well.
I mentioned this to OCLC Pica and also pointed out that the link from WorldCat to our local catalog should always be based upon the PPN. (WorldCat only does this for records without an ISBN, and mistakenly uses the OCLC number instead of the PPN.) OCLC Pica responded quickly that it would indeed be better to have a small service that returns the OCLC number for a given PPN, and that they would make this available to me before the end of March. I was astonished. Isn't that great? When I thanked them for this immediate response, I took the liberty of asking whether they would add the Library of Congress number to the response as well. Thanks Martin.
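For the curious, the page scraping mentioned above could look roughly like the sketch below. It only illustrates the idea that a well-formed XHTML page can be read with ordinary XML tools. The label text "OCLC Number:", the sample page and the function name are all hypothetical; I have not checked them against the real WorldCat markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical label format; the real WorldCat page may word this differently.
OCLC_PATTERN = re.compile(r"OCLC Number:\s*(\d+)")


def extract_oclc_number(xhtml_text):
    """Return the first OCLC number found in a well-formed XHTML page, or None."""
    root = ET.fromstring(xhtml_text)
    # itertext() walks every text node, so the XHTML namespace on the
    # element names does not get in the way.
    page_text = " ".join(root.itertext())
    match = OCLC_PATTERN.search(page_text)
    return match.group(1) if match else None


# Made-up sample page, only here to show the function at work.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table>
      <tr><th>OCLC Number:</th><td>12345678</td></tr>
    </table>
  </body>
</html>"""

print(extract_oclc_number(SAMPLE_PAGE))
```

This works only as long as the page really is well-formed XML and the label stays the same, which is exactly why a small dedicated service is the nicer solution.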

2008-03-19

Hooray for Google customer response !

It was not easy to find an appropriate form on Google's web site to report my problems with the Google Books API. I found a form that was supposedly meant for authors and publishers wanting to promote their books on Google Books, and used it. Google responded today:
Thank you for notifying us of this problem regarding our API. I have forwarded these issues on to our specialists, who will look into the matter. Please feel free to reply to this email if you have any further details about the difficulties you are experiencing.

Sincerely,
Greg
The Google Book Search Team


I noticed earlier (when Google introduced URL resolving in Google Scholar) that they can be quite responsive. Of course I haven't got a solution yet. But have you ever had a response to your problems from Microsoft, even though we pay them for their products?

2008-03-18

Google books API. Do they really want you to use it ......

Last week there was a lot of discussion about the Google Books API, which lets you check whether Google has a description of a book, can provide a cover, and has scanned the book completely or partly. Example scripts appeared on the Google Books site, Tim Spalding gave examples on the LibraryThing Thingology blog, and Godmar Back responded with some alternative scripts on the code4lib discussion list.
Ex Libris proudly announced that they had built the 'About this book' feature into their products and that it only took a week to get the link in place. On Sunday evening at 11:00 pm I decided to see whether it would be difficult to implement this in our OPAC. Just after midnight I had implemented it, and it has been running since.
As Wouter Gerritsma explains in his blog, we can only check Google for a book when we have an ISBN. Now we want to be able to do it for books that do not have an ISBN, using the OCLC number, which we have not registered in our records. However, we do have a PPN (Pica Production Number) and OCLC Pica makes sure our titles end up in WorldCat, so we should be able to get hold of the OCLC number.
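To make the identifier question concrete, here is a minimal sketch of how a lookup key could be chosen: prefer the ISBN, and fall back to the OCLC number or the Library of Congress number when there is none. The function names are made up for this illustration. The `ISBN:`, `OCLC:` and `LCCN:` prefixes and the `bibkeys`, `jscmd=viewapi` and `callback` parameters are how I understand the Google API at the time of writing, so treat them as assumptions rather than as documentation.

```python
from urllib.parse import urlencode


def choose_bibkey(isbn=None, oclc=None, lccn=None):
    """Pick the best available identifier, preferring the ISBN."""
    if isbn:
        # Strip hyphens and spaces so the key is a bare number.
        return "ISBN:" + isbn.replace("-", "").replace(" ", "")
    if oclc:
        return "OCLC:" + str(oclc)
    if lccn:
        return "LCCN:" + str(lccn)
    return None  # nothing to look up with


def build_lookup_url(bibkey, callback="handleBook"):
    """Build the request URL (endpoint and parameter names are assumptions)."""
    params = {"bibkeys": bibkey, "jscmd": "viewapi", "callback": callback}
    return "http://books.google.com/books?" + urlencode(params)


# A record without an ISBN falls back to the OCLC number.
key = choose_bibkey(oclc=12345678)
print(key)
print(build_lookup_url(key))
```

A record that has neither an ISBN nor one of the other numbers simply yields no key, and the catalog would then show no Google link at all.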

So far so good.
But Google has some policies that obstruct the use of their API. A product like SFX may suffer severely from this, depending on how they implement it. It certainly affected our implementation severely, and now I am trying to find a way around it.

I don't know whether you have ever ended up at Google's 'We're sorry ...' message, telling you that you are probably infected with spyware or some virus. (Some people are really shocked when they see this warning!)
Google sends this message when it detects 'anomalous queries' from a single IP address. We occasionally see this error in Wageningen, and I am not sure whether some computer on the university network is hitting Google extremely hard or whether there are simply a lot of people searching Google. Because of the network address translation on the firewall, all requests from the network appear to Google to come from one or just a few computers. Anyway, Google Books seems to suffer much more from this problem than other Google services. Just a few hours after implementation, the API no longer responded with a JSON object (containing the requested information for this service) but with an ordinary HTML page, the 'We're sorry' page, which breaks the service completely.
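One way to keep the catalog page from breaking when this happens is to check what actually came back before using it. The sketch below assumes the API answers either with plain JSON or with JSON wrapped in a callback, and treats anything else (such as the HTML 'We're sorry' page) as no answer. The function name and the sample responses are made up for illustration.

```python
import json
import re

# Matches a JSON payload wrapped in a callback, e.g. handleBook({...});
JSONP_PATTERN = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_api_response(body):
    """Return the decoded JSON object, or None if the body is not JSON."""
    text = body.strip()
    match = JSONP_PATTERN.match(text)
    if match:
        # Unwrap the callback and keep only the payload.
        text = match.group(1)
    try:
        data = json.loads(text)
    except ValueError:
        # An HTML error page (or any other non-JSON body) ends up here.
        return None
    return data if isinstance(data, dict) else None


# Made-up sample responses.
good = 'handleBook({"ISBN:0451526538": {"preview": "partial"}});'
sorry = "<html><body><h1>We're sorry...</h1></body></html>"

print(parse_api_response(good))
print(parse_api_response(sorry))
```

When the function returns None, the catalog can simply leave out the Google link instead of showing a broken page.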
I can hardly believe this is caused just by implementing this service, so I have now defined a ProxyPass directive on the web server, so that requests to Google for the API go via our library web server. Google will now see all API requests coming from this server. This way Google no longer sees these requests coming from the firewall gateway, and we will not suffer from all the other Wageningen UR PCs searching Google. If this does not solve the problem, I will be sure that Google regards normal usage of the API as unwanted traffic. If so, what kind of API are they offering us? For the Google Maps API and the Google Custom Search API they have a so-called access key to use the service; I guess that would be the way to go for this API as well, to prevent unwanted use.