Use it or lose it
I’ve been working on code to include newsfeeds in a web page using ColdFusion for our library’s new content management system. The first issue with creating this kind of code is that there are various types of news feeds available: RSS 0.9x, RSS 1.0, RSS 2.0, and Atom feeds. I want my tool to be able to handle all these different kinds of feeds, and I don’t want the person inserting the feed into the page to need to know what kind of feed it is.
So the first thing I needed to do was to create XSLTs that did the transformations. Unfortunately, it has been more than a year since I used my XSLT writing skills. The upshot is that I’ve forgotten some major concepts in building XSLT. Hence “Use it or lose it”: every tech skill that doesn’t get used rusts to the point of being useless. This isn’t just true for me; it is true for almost every techy I’ve met. It’s something I struggle with as a techy but also as a manager, because I have to make sure I give my staff variety in their work to keep various skills sharp. Since I’m the XML expert, I need to make sure that I do enough XML that I can do it easily when it needs doing. Plus I need to get my staff some training in this area so that they can help with this work as well.
As far as the particular thorns with my XML project, I’m struggling with XPath and selecting nodes to display. I remember this being my bane the first time around but after several weeks of work I seemed to get the knack of it and was able to construct transformations with ease. The hard drive that is my brain has erased all the data relating to that experience and I feel like I’m back at square one again.
Trying to comprehend the difference between <apply-templates> and <call-template>, as well as the match versus select attributes, has given me a massive headache and acid reflux. So tomorrow, rather than the trial and error method, I’m going to return to contemplate my Wrox XSLT book to see if it might impart some drop of wisdom so that I can make actual progress on this project!
“Use it or lose it”… true, but not the whole story. Whenever I hear complaints like this, I am reminded of one of my most memorable math teachers in Berkeley. On the first day of class, he announced:
“Two months after this semester is over, you will have forgotten 95% of what I will teach you in this class. So, why do I bother to teach you in the first place?”
Basically, his answer was that if we ever had to re-learn anything taught in the class, we would learn much faster the second time around. Something happens in the brain the first time that you learn something: neurological connections are made that are much easier to recall than to forge in the first place. Here is where the brain as a “hard drive” is a misleading metaphor …
Also, the more techie things that you learn and forget, the easier it is to pick up new and related things because you acquire certain tendencies and habits of thought.
Of course, this doesn’t fix your project right now, but it is not as bad as you suppose.
P.S. Very interesting blog. I’ll have to pay more attention now that I am working in a library.
The O’Reilly XSLT Cookbook is your friend. It gets me through a lot of duh moments.
And if it helps, I have the same problems with XSLT that you do! I find that it helps to think of apply-templates as “do whatever else you’re going to do” whereas call-template is “do this specific thing”. Match versus select is just weird.
I agree that the “push” vs. “pull” issue is key. The result of processing is dramatically different between those two, so picking the right processing model is essential. For example, if you decide to use “push” processing and therefore use <apply-templates>, when you encounter a new element for which you have not provided a template, then it simply gets dumped onto the output tree. This means you could start seeing spurious data appearing if your input stream throws you a new element without you being prepared for it. There are of course other issues regarding this decision, but that’s certainly one consideration.
My goal is to have it check which type of feed is being processed before I apply any transformation whatsoever. If it can’t tell what kind of feed it is, it won’t process it and will instead send an error message in the background to let my staff know there is a problem. As the specifications for feeds change over time, this will hopefully enable us to update our stylesheets.
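For what it’s worth, here is a minimal sketch of that check. It is written in Python with only the standard library rather than in ColdFusion, so treat it as an illustration of the idea and not as the actual module. The function name detect_feed_type, the exception UnknownFeedError, and the returned labels are all made up for this example. It only looks at the root element and, for RSS, the version attribute.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RDF-based RSS and by Atom 1.0.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_NS = "http://www.w3.org/2005/Atom"


class UnknownFeedError(ValueError):
    """Hypothetical error raised when the root element matches no known feed."""


def detect_feed_type(xml_text):
    """Return 'rss-0.9x', 'rss-1.0', 'rss-2.0' or 'atom' for a feed document."""
    root = ET.fromstring(xml_text)
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"
    if root.tag == "{%s}RDF" % RDF_NS:
        # Simplification: RSS 1.0 and the older 0.90 both use an rdf:RDF root.
        return "rss-1.0"
    if root.tag == "rss":
        version = root.get("version", "")
        if version.startswith("0.9"):
            return "rss-0.9x"
        if version.startswith("2."):
            return "rss-2.0"
    # Anything else is refused so the caller can notify staff.
    raise UnknownFeedError("unrecognised root element: %s" % root.tag)
```

The caller would catch UnknownFeedError, skip the transformation, and send the background error message. Otherwise the returned label picks which stylesheet to run.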
You may well have considered this already, but have you thought of using feed2js to include feeds on your page? Could save some painful XSLT moments.
http://library.cshl.edu/feed/build.php
The project is larger in scope than just getting feeds on the page. I’m working on an XML module for our content management system that will deal with EADs, feeds, and a couple other types of XML. I’m just stuck on dealing with the feeds right now.
First, you have a good idea with studying XPath a bit before you do XSLT. I had a heck of a time learning the two at once.
The whole apply/call is also easier if you’re used to a language like Lisp. It’s easier to remember once you’ve really grasped the model that XSLT uses in the background (at least it was for me).
First read up on the default template rules. If you forget about the default templates you start thinking of apply-templates as being far more magical than it is.
Then remember that calling apply-templates is essentially saying, “let’s see what templates can match what we selected.” By default the children of the currently selected element are selected. If you want to select something else, do so.
The templates do the matching, but you can use select to choose which nodes the whole set of templates is applied to.
Call-template is much like a function call. Instead of looking to see what templates match what nodes and executing the most specific match, you just say “execute this”.
These two things combined with modes can make XSLT quite powerful for going through XML documents.
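If it helps to see that model outside of XSLT, here is a rough Python analogy. It is a toy and not how any real XSLT processor works. The names TEMPLATES, template, apply_templates, builtin_rule and format_headline are invented for the sketch, matching is by element name only, and the built-in rule is simplified to “copy the text and keep recursing”.

```python
import xml.etree.ElementTree as ET

# Maps an element name to a handler, loosely like <xsl:template match="name">.
TEMPLATES = {}


def template(match):
    """Register a handler for elements named `match`."""
    def register(func):
        TEMPLATES[match] = func
        return func
    return register


def apply_templates(node, select=None):
    """Toy apply-templates: run the matching handler for each selected node.

    With no `select`, the element children of `node` are selected.
    """
    nodes = list(node) if select is None else node.findall(select)
    out = []
    for child in nodes:
        # No matching template means the built-in rule takes over.
        handler = TEMPLATES.get(child.tag, builtin_rule)
        out.append(handler(child))
    return "".join(out)


def builtin_rule(node):
    """Toy default rule: pass the element's text through and recurse."""
    return (node.text or "") + apply_templates(node)


def format_headline(text):
    """Toy named template: invoked directly, like call-template."""
    return "* %s\n" % text.strip()


@template("item")
def item_template(node):
    # No matching happens here; we simply call the named routine.
    return format_headline(node.findtext("title", ""))


DOC = ET.fromstring(
    "<channel><item><title>First post</title></item>"
    "<surprise>unexpected text</surprise></channel>"
)
result = apply_templates(DOC)
only_items = apply_templates(DOC, select="item")
```

In this sketch, result ends with the text of the surprise element, because nothing matched it and the default rule let its text through. Passing select="item" restricts which nodes the templates are applied to, so only_items contains just the headline.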
(I’ve always thought that for mostly flat feeds XSLT is a little overkill; SAX seems better suited.)
The naming of some of the attributes is annoying. (As well as the preceding/following language in XPath; I always forget that.)
(Trying to give helpful advice right after work may not be the best idea; hopefully this clarifies it a little more. I’ve played around with XSLT quite a bit and enjoy it.)
Hey Karen – XSLT for handling feed type variants is a black hole you might best avoid. The python-based “universal feed parser” is generally accepted as the best tool for dealing with all the variants. Something you could do is set up a cronjob and a simple db of feeds you want to “normalize”, and have a script run regularly to scrape all those random feeds into one standard format. Then you can write one single XSLT handler for the format you prefer – or not even have to use XSLT at all.
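To make “normalize” a bit more concrete, here is a small sketch of the idea. It deliberately does not use the universal feed parser, only the Python standard library, and it handles just RSS 2.0 and Atom 1.0 in the simplest possible way. The function name normalize and the title/link dictionary layout are assumptions for illustration. A real script would lean on a proper parser to cope with the messier variants.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def normalize(xml_text):
    """Return a list of {'title', 'link'} dicts for an RSS 2.0 or Atom feed."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == ATOM + "feed":
        for entry in root.findall(ATOM + "entry"):
            # Atom keeps the URL in the href attribute of <link>.
            link = entry.find(ATOM + "link")
            entries.append({
                "title": entry.findtext(ATOM + "title", ""),
                "link": link.get("href", "") if link is not None else "",
            })
    elif root.tag == "rss":
        for item in root.iter("item"):
            # RSS keeps the URL as the text of <link>.
            entries.append({
                "title": item.findtext("title", ""),
                "link": item.findtext("link", ""),
            })
    else:
        raise ValueError("unsupported feed root: %s" % root.tag)
    return entries
```

Whatever runs on the schedule would store these dictionaries (or write them back out as one chosen feed format), so that the display side only ever sees a single shape.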
Another option, should you want to pull all the disparate feeds into one big feed, is to set up a private “planet” of all your feeds, and then just pull the feed from the planet into your public page. That would be even less work. :)
I don’t know if either of these fits your needs, but maybe something will trigger a thought. Anything to avoid dealing with all this in XSLT alone!
-dc
Ah, so not just feeds.
The only thing I’ve seen that parses EAD nicely comes in commercial systems like Ex Libris’ Digitool.
All the RSS formats (9?) and EAD can be hairy, hairy beasts.
See also:
The Myth of RSS Compatibility
http://diveintomark.org/archives/2004/02/04/incompatible-rss
Another vote for O’Reilly, their book Learning XML is awesome. I keep a copy on my desk within easy reach. It’s got a good section on XPath and XSLT.
Another tip: create a simple stylesheet that you can use to test XPath queries on documents. Some XML editors will even allow you to test arbitrary XPath expressions.
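The same trick works outside of a stylesheet. Below is a small Python sketch of an XPath scratchpad. The helper name try_xpath is made up, and note that the standard library’s ElementTree only understands a limited subset of XPath, with expressions evaluated relative to the root element, so it is no substitute for a full XPath engine.

```python
import xml.etree.ElementTree as ET


def try_xpath(xml_text, expression):
    """Return a printable summary of every element matched by `expression`.

    The expression is evaluated relative to the document's root element.
    """
    root = ET.fromstring(xml_text)
    results = []
    for node in root.findall(expression):
        text = (node.text or "").strip()
        results.append("<%s> %s" % (node.tag, text))
    return results


SAMPLE = (
    "<rss><channel>"
    "<item><title>One</title></item>"
    "<item><title>Two</title></item>"
    "</channel></rss>"
)

# Print what each candidate expression selects, one match per line.
for expr in ("channel/item/title", ".//item[2]/title"):
    print(expr)
    for line in try_xpath(SAMPLE, expr):
        print("   ", line)
```

Swapping in a saved copy of a real feed for SAMPLE gives a quick way to see what an expression selects before it goes into a template.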
When you’ve gotten XPath down, and you really want to hurt your head, check out the Muenchian Method.