Breakdown of the Journal info enhancement script
A while back I posted code which added peer reviewed indicators to a Serial Solutions E-Journal list. Never being quite satisfied with how stuff works and wanting to make things better, I’ve rewritten and expanded the script. Now it adds Peer Reviewed indicators to Serial Solutions and to an Innovative catalog full record display screen. It also adds links to display the most current table of contents for a given journal if it exists (in both the Serial Solutions and Innovative UI).
Adding Peer Review indicators
- Grab the ISSN from the page (Innovative, Serial Solutions)
- Send ISSN to xISSN service and retrieve whether or not the journal is peer reviewed
- Add Peer Reviewed indicator to the page
The hardest part of this script involves obtaining the ISSN. Serial Solutions luckily tags this in a span. Innovative puts it in a table structure, so using jQuery I can use the following:
$("#fullSection td.bibInfoLabel:contains('ISSN')").next().text()
This finds the td with the text ISSN in it and then gets the text in the next tag.
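The text that comes back from that cell may carry extra whitespace or other notes, so it can help to pull out just the ISSN before sending it on. Here is a minimal sketch, assuming the ISSN appears in the usual NNNN-NNNX form. The function name extractIssn is hypothetical and not part of the original script.

```javascript
// Hypothetical helper: pull the first ISSN-shaped token out of a string.
function extractIssn(text) {
  // Four digits, an optional hyphen, three digits, then a digit or X check character.
  var match = /\b(\d{4})-?(\d{3}[\dXx])\b/.exec(text || "");
  if (!match) {
    return null; // nothing that looks like an ISSN was found
  }
  // Normalize to the hyphenated, upper-case form.
  return match[1] + "-" + match[2].toUpperCase();
}
```

On the Innovative page this could wrap the selector above, for example by passing the result of .text() straight into extractIssn.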
Adding the Peer Reviewed indicator is a matter of finding the place in the HTML structure where you want to add the new code and appending it. For simplicity’s sake, in Innovative I’m just adding a new row to the table which contains the bibliographic data.
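As a rough illustration of appending such a row, here is a sketch rather than the code from the actual script. The helper name buildPeerReviewRow, the bibInfoData class, and the "Yes" wording are all assumptions. Only the bibInfoLabel class comes from the selector above.

```javascript
// Hypothetical helper: build the HTML for a new bibliographic table row.
// bibInfoLabel mirrors the selector used to find the ISSN; bibInfoData is assumed.
function buildPeerReviewRow(isPeerReviewed) {
  if (!isPeerReviewed) {
    return ""; // add nothing when the journal is not peer reviewed
  }
  return "<tr>" +
    '<td class="bibInfoLabel">Peer Reviewed</td>' +
    '<td class="bibInfoData">Yes</td>' +
    "</tr>";
}

// In the browser, with jQuery loaded, the row could be appended to the table
// that holds the ISSN label. The guard lets the sketch load outside a page.
if (typeof jQuery !== "undefined") {
  jQuery(function ($) {
    var table = $("#fullSection td.bibInfoLabel:contains('ISSN')").closest("table");
    // true stands in for the answer that would come back from xISSN.
    table.append(buildPeerReviewRow(true));
  });
}
```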
Adding a link to the table of contents
- Grab the ISSN from the page (Innovative, Serial Solutions)
- Send ISSN to xISSN service and retrieve whether or not the journal has a table of contents RSS feed available
- If ISSN has an RSS feed available, add a link which says See Latest Table of Contents and executes the TOC script
This script builds on what the Peer Review section of the script does: in addition to requesting the peer review field, it also gets the rssurl field from xISSN. If there is an rssurl field then a link is created and added to the page.
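The decision logic described here can be sketched as two small functions. This is only an illustration under stated assumptions: the group/list nesting of the response, the field name peerreview, and the "Y" flag value are my guesses at the shape of the xISSN JSON. The function names, the toc-link class, and the data-feed attribute are hypothetical. Only the rssurl field name is taken from the description above.

```javascript
// Escape text for safe use inside HTML attributes and element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: read the two fields the script cares about.
// ASSUMPTION: the record sits at response.group[0].list[0].
function readJournalInfo(response) {
  var group = response && response.group && response.group[0];
  var record = group && group.list && group.list[0];
  if (!record) {
    return { peerReviewed: false, rssUrl: null };
  }
  return {
    peerReviewed: record.peerreview === "Y", // ASSUMPTION: "Y" means peer reviewed
    rssUrl: record.rssurl || null
  };
}

// Build the table of contents link only when a feed URL is present.
function buildTocLink(info) {
  if (!info.rssUrl) {
    return "";
  }
  return '<a href="#" class="toc-link" data-feed="' +
    escapeHtml(info.rssUrl) + '">See Latest Table of Contents</a>';
}
```

Keeping the response parsing separate from the HTML building means only readJournalInfo would need to change if the real response is shaped differently.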
The tricky part of this script is the portion which brings up the Table of Contents in a popup window. It is tricky because the RSS feed exists on a different server and because it’s XML that needs to be manipulated. The XML itself isn’t what creates the difficulty, since jQuery is capable of handling XML. However, we don’t know in advance which form the feed takes (RSS 1.0, RSS 2.0 or Atom), which makes parsing it much more difficult. Additionally, because the data being retrieved isn’t JSON, we can’t get it without creating a cross-site scripting issue. To resolve both these issues, I’ve created a PHP script which retrieves the feed and parses it into JSON which I can access. I’m using the SimplePie library to parse the feed, which saves me lots of time because it takes care of the multiple types of feeds issue.
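On the browser side, once the PHP script has turned the feed into JSON, building the popup content is mostly a matter of looping over the entries. The sketch below assumes the JSON looks like { items: [ { title, link } ] }. That shape, the function names, and the default of five entries are all assumptions for illustration, not the format the actual PHP script produces.

```javascript
// Escape text for safe use inside HTML attributes and element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: turn the parsed feed JSON into a short list of links.
// ASSUMPTION: feed.items is an array of objects with title and link strings.
function buildTocHtml(feed, maxItems) {
  var items = (feed && feed.items) || [];
  var limit = maxItems || 5; // show only the first few entries by default
  var html = "<ul>";
  for (var i = 0; i < items.length && i < limit; i++) {
    html += '<li><a href="' + escapeHtml(items[i].link) + '">' +
      escapeHtml(items[i].title) + "</a></li>";
  }
  return html + "</ul>";
}
```

Because the JSON is served by a script on the same server as the page, it could be fetched with an ordinary jQuery JSON request and the result passed to buildTocHtml before showing the popup.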
This is my 2.0 solution to the problem. My initial solution used a PHP script that just built the popup HTML content, with Apache configured to proxy the PHP script to avoid the cross-site scripting issue. I gave up on this solution because it is predicated on the person installing the JavaScript being able to configure Apache, on the server hosting the JavaScript, to act as a proxy. This makes the solution more complicated to configure, which was unacceptable. If you want to explore the code in more depth feel free to view the full JavaScript and the PHP code.
This post is a holdover from before I started working for OCLC which I didn’t get published until now. I’m posting it here so that folks who saw the original content can follow up. Future posts on OCLC Web Services will be at the OCLC DevNet Blog.
I’m interested in more about your ToC script. You’re getting it from the feed, and just figuring the entries in the feed comprise the latest ToC? (I hadn’t realized xISSN had feed info now; is it harvested from the ticToc project, so it’s the same data available there? Or is it data OCLC has harvested in another way?)
The feed theoretically includes _links_ to each of the articles, yeah? But are you just ignoring the links and not including them in your display, or are you including them? Ignoring them might make sense, since the links may or may not lead to an “appropriate copy” your users are licensed to see.
That appropriate copy/licensing problem is what, in the thought experiments in my head, kept me from using those RSS feeds at all. But I hadn’t thought of using them as a “latest table of contents”, without worrying about the links to maybe full text, which may or may not work for my users.
This project would, in my opinion, be a pretty good topic for a relatively concise short article in the Code4Lib Journal, if you’re interested.
Jonathan,
OCLC harvests the data from TicTocs for the rssurl field. Since TicTocs makes the assumption that the feed is for the most recent articles, I make the same assumption in my script. The JavaScript right now just returns the first 5 entries but it can easily be modified to return more. The linking issue is a problem. I have it set up to go to the link for each entry, which typically takes you back to the publisher’s site. Many of the feeds contain the article title, link, and a summary of the article. Sometimes the summary has source information. What would be really nice is if the feeds all had metadata in them to create an OpenURL. COinS maybe, or something else. That way one could build an OpenURL to an appropriate resolver. Having the feed built with OpenURLs that pass through the OCLC OpenURL gateway might be another solution.
If you want to see a mockup in action go to http://www.librarywebchic.net/mashups/journal_enhancements/jal.html
For me this script was a baby step towards thinking about how to get users connected with the latest information from a given journal. I’d like to take it further. If you’d like to collaborate, drop me an email coombsk [at] oclc.org
Also I’ll be demo-ing and talking about the script at code4lib as part of my presentation “7 Ways to Enhance Library Interfaces with OCLC Web Services”. So you’re welcome to ask me questions there if you’re going.
Thanks Karen. I agree sending the user through an appropriate link resolver instead of to the publisher would be best (that’s what a link resolver is for).
Sadly, there’s no standardized way for an RSS/Atom feed to include structured citation information (via OpenURL or otherwise). And even if there were such a recognized way, it would be unlikely that most of these publishers’ feeds would support it.
But I’m thinking that it might be worthwhile to present the latest articles (with summaries if available) even without any links at all, just as, well, table of contents, to be evaluative information on the journal itself, for the patron to see what sorts of stuff an unknown journal carries. Not sure if it’s better to leave the links out, or to include links that may lead the user to a pay wall. I guess testing could be done!
Publishers might not find it worthwhile to add this information. Still, even though Atom and RSS don’t have fields for it, both are extensible, so a schema that supports citation information could be added and used to provide information for OpenURL building.