On OpenSearch and libraries

2009 February 8
by Karen

It is interesting how what I’m working on sometimes serendipitously intersects with discussions that other library folks are having. Take, for example, a relatively recent discussion on Bibliographic Wilderness about OpenSearch and SRU/W. I’ve been working on an OpenSearch interface to the data in our library website and have also been doing development with the WorldCat Search API for 9 months. As a result, I’ve been getting a much better sense of OpenSearch and its capabilities. The WorldCat Search API offers both SRU/W and OpenSearch. The major differences between the two? Output formats, search syntax, and indexes. If you want MARCXML or Dublin Core, then you need to search via SRU/W. But this doesn’t have to be the case. OpenSearch can be set up so that results can be returned in any format. Evergreen does just this. Don’t believe me? Check out the UPEI or Georgia PINES Evergreen OpenSearch description file. It defines the syntax for retrieving OpenSearch results in a variety of formats (MARCXML, MODS, RSS, Atom, and HTML).
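To make the multiple-format idea concrete, here is a minimal sketch of how a client might read an OpenSearch description and pick a URL template by MIME type. The description document, host name, and paths below are made up for illustration; they are not copied from Evergreen, UPEI, or Georgia PINES.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenSearch 1.1 description format.
OS_NS = "{http://a9.com/-/spec/opensearch/1.1/}"

# Hypothetical description document: catalog.example.org and its paths are
# placeholders, not a real service.
DESCRIPTION = """
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Catalog</ShortName>
  <Description>Search the example library catalog</Description>
  <Url type="text/html" template="https://catalog.example.org/search?q={searchTerms}"/>
  <Url type="application/atom+xml" template="https://catalog.example.org/opensearch/atom?q={searchTerms}"/>
  <Url type="application/rss+xml" template="https://catalog.example.org/opensearch/rss?q={searchTerms}"/>
  <Url type="application/marcxml+xml" template="https://catalog.example.org/opensearch/marcxml?q={searchTerms}"/>
  <Url type="application/mods+xml" template="https://catalog.example.org/opensearch/mods?q={searchTerms}"/>
</OpenSearchDescription>
""".strip()


def templates_by_type(description_xml):
    """Return a dict mapping each advertised MIME type to its URL template."""
    root = ET.fromstring(description_xml)
    return {url.get("type"): url.get("template") for url in root.iter(OS_NS + "Url")}


templates = templates_by_type(DESCRIPTION)
print(templates["application/mods+xml"])
```

The point is only that each Url element pairs a response type with a template, so one description file can advertise as many record formats as the server is willing to produce.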

The more annoying rub is that most implementations of OpenSearch only allow for simplistic searching. But as Ross points out, this also doesn’t have to be the case. CQL could be used with OpenSearch. I’m not sure if the OpenSearch interface to Evergreen would take CQL but boy wouldn’t it be nice.
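As a rough sketch of what CQL with OpenSearch could look like from the client side: the CQL query is simply percent-encoded and dropped into the {searchTerms} slot of the template. The template and host are hypothetical, and whether a given server actually parses the value as CQL is an assumption; nothing here reflects how Evergreen or the WorldCat Search API behaves.

```python
import re
from urllib.parse import quote


def fill_template(template, values):
    """Substitute OpenSearch template parameters such as {searchTerms} or {startIndex?}."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            # Percent-encode everything so quotes, spaces and "=" survive in the URL.
            return quote(str(values[name]), safe="")
        if optional:
            # Optional parameters with no value are replaced by an empty string.
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z0-9_:.\-]+)(\??)\}", substitute, template)


# Hypothetical template; catalog.example.org is a placeholder host.
TEMPLATE = "https://catalog.example.org/opensearch/atom?q={searchTerms}&start={startIndex?}"

# A CQL query using Dublin Core index names (assumes the server supports them).
cql = 'dc.creator = "Shakespeare, William" and dc.title = hamlet'

url = fill_template(TEMPLATE, {"searchTerms": cql})
print(url)
```

A client that knows nothing about CQL can still put plain keywords into the same slot, which is what makes this approach attractive.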

Why bother adding these features to OpenSearch instead of just using SRU/W? Well, for one, it makes your content more findable. OpenSearch is a standard which has been adopted by the web at large. By using it, libraries expose their data to a wider audience and allow it to be incorporated into mashups. It would also be interesting to consider embedding MODS or MARCXML in Atom. This is possible, and there is an article in a 2007 issue of the Journal of Web Librarianship which discusses one library doing this. Yet again, the advantage is putting the data in a format which the rest of the web world understands. At the same time, embedding MODS or MARCXML allows the original format of the data to be retained. I like this idea a lot because I don’t think all bibliographic data elements map well to the standard Atom or RSS elements, so you are forced to make less than desirable choices in crosswalking the data. As a result, data can be lost because fields either aren’t mapped at all or are mapped in a way that obscures their original meaning.
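Here is a small sketch of the embedding idea: an Atom entry carries the usual Atom elements for generic feed readers, plus a MODS record in its own namespace for clients that understand it. The function name, the sample record, and the ISBN are invented for illustration; this is not the approach described in the article mentioned above.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
MODS = "http://www.loc.gov/mods/v3"
ET.register_namespace("atom", ATOM)
ET.register_namespace("mods", MODS)


def atom_entry_with_mods(entry_id, title, author, isbn, updated):
    """Build an Atom entry that also carries the record as embedded MODS."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    # Plain Atom elements: all that a generic feed reader will look at.
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    author_el = ET.SubElement(entry, f"{{{ATOM}}}author")
    ET.SubElement(author_el, f"{{{ATOM}}}name").text = author

    # Embedded MODS: keeps fields that have no natural home in Atom.
    mods = ET.SubElement(entry, f"{{{MODS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = title
    name = ET.SubElement(mods, f"{{{MODS}}}name", {"type": "personal"})
    ET.SubElement(name, f"{{{MODS}}}namePart").text = author
    ET.SubElement(mods, f"{{{MODS}}}identifier", {"type": "isbn"}).text = isbn
    return entry


# Sample values are placeholders, including the ISBN.
entry = atom_entry_with_mods(
    "https://catalog.example.org/record/1",
    "Hamlet",
    "Shakespeare, William",
    "9780000000002",
    "2009-02-08T00:00:00Z",
)
print(ET.tostring(entry, encoding="unicode"))
```

A feed reader that only knows Atom sees a title and an author, while a library client can pull the identifier and name type straight out of the MODS block without any crosswalking.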

The upshot: I think OCLC would be well advised to expand the way in which their OpenSearch interface works, perhaps in a way that allows CQL to be used in the search terms. It would make the WorldCat Search API easier for libraries to adopt, particularly when the library wants a very limited set of fields in the result set but wants to perform something better than a keyword search (like an ISSN or ISBN search). Libraries also need to think about building an OpenSearch interface to their collections.

5 Responses
  1. 2009 April 15

    I am not sure adding the complexity of CQL to OpenSearch is going to serve your purposes. Most OpenSearch developers aren’t going to be familiar with CQL, and it is at cross purposes with the goal of the OpenSearch standard itself, namely that searches should be simple.

  2. 2009 April 15

    How would you suggest one do fielded searching in OpenSearch? It’s pretty difficult for libraries to get by without some sort of fielded searching. The notions of “about” and “by” get merged without fielded searching. There are also issues of limits, whether by format or by libraries with holdings. IMHO, these are pretty important. I’ve spent quite a bit of time looking at the OpenSearch specification and can’t figure out how to do these things with OpenSearch. It seems like searching a specific field might be possible, but how to search for books by William Shakespeare in the University of Houston Libraries isn’t clear to me.

    If you have any good examples or documentation of this I’d appreciate you sending them my way because it isn’t evident to me how to do this from what I’ve read thus far.

  3. 2010 March 15

    I think we can set up an OpenSearch extension for advertising CQL-capable fields, such that clients that don’t know about the extension can treat it as plain old ordinary OpenSearch with a plain old ordinary search, and still have it work right — but clients that DO know about CQL (such as one written by someone that asks OCLC, hey, how can I do a more complex search, and is told “CQL”) can use it.

    Tony Hammond has started working out an SRU extension to OpenSearch, but I don’t think he’s _quite_ hit the mark yet — what he’s spec’ed out doesn’t seem, to me, to degrade quite properly for clients that don’t know about the extension. I also think a bit more needs to be done about the “right” way to put a sufficient amount of an SRU “explain” document in an OpenSearch desc so a client can discover what fields and operators are supported.

    I agree with your post. To me, the benefit of using OpenSearch is not mainly that “software already knows OpenSearch” (although in some contexts that will be a benefit, and can’t hurt), but that OpenSearch is SO MUCH easier to work with as a developer than the very complicated SRU. I think OpenSearch plus a few fairly simple well thought out extensions to OpenSearch based on SRU will be able to do pretty much everything (or maybe even everything) that SRU can do, but be so very much simpler to work with.

    But as a first step, I think you’re absolutely right that there’s NO good reason that the OCLC OpenSearch and SRU interfaces support different return formats. The OpenSearch interface can and should support every single return type that SRU does. And the SRU interface should likewise support every return format that OpenSearch does (if it can; SRU may be less flexible; if SRU isn’t capable of returning certain formats, fine, leave em out).

    That’s a fine first step, and doesn’t require thinking about OpenSearch extensions or how to properly advertise CQL support in an OpenSearch description. If you could make that happen with the OCLC services, that would be great.

  4. 2010 March 15

    Ha, I just realized this post was from 2009 not 2010, and written before you worked for OCLC! I read it in a different context, happened to see it in my own blog’s referer logs, didn’t realize it was a year old. My comments still stand though!

  5. 2010 March 16

    Jonathan,

    What is interesting is that VIAF actually does OpenSearch with embedded CQL, which is pretty cool. The OpenSearch is built on top of VIAF’s SRU interface. Also, from what I’ve seen interacting with OCLC’s Web Services, SRU can respond with just about any XML format. Identities has its own schema, Terminologies uses several XML formats (SKOS, MARCXML), and VIAF has several output formats.

    RE: all formats in all types of searches in the WorldCat API. This is definitely getting discussed at work. The question is one of priorities because there are so many things that one could do to enhance the WorldCat Search API.

    It amazes me how much I’ve learned in the last year since writing this post, working with the different web services. My knowledge of SRU has certainly grown as has my knowledge of different XML Schemas and what one can do with them.
