MyResearch Portal – Andrew Nagy
ILS-agnostic web portal for students and faculty to perform research activities
Create a single interface for all library resources to minimize the interface learning curve
Develop an in-house “framework” to combine all of our resources
- Most resources are in XML
- Digital Library: METS
- MetaLib XServer: XML
- Catalog: MARCXML
- Library Website: XHTML
Data Store
- Native XML stores allow for easy storage of complex data
- No need to develop a complete relational database and convert data – too messy
- No need to normalize data
- Just import!
Native XML Database – Could it be that simple?
eXist – Open Source
- Still in infancy-ish stages
- Platform independent
- Java Backend
- API: REST, SOAP
Berkeley DB XML – Open Source
- Proven capabilities
- Support for a wide range of platforms
- Good performance
- Decent help support
- Commercial backing
- No full-text extensions
- No inherent directories
Commercial Options
- MarkLogic
- Enticing Discounts for .edu and non-profits
- Commercial Support
- Much more complex to administer
- Speed
Scalability Testing
- eXist not meant for searching, more for browse and fetch
- DBXML (Sleepycat) – reworked queries and modified indexes so that queries respond in 30-60 seconds
Converted MARCXML to a custom format because MARCXML was not helpful (its elements all have the same names)
- dbxml – 1.6-1.7 second response
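To illustrate the point about element names: in MARCXML every field is a `datafield` or `subfield`, distinguished only by its `tag` and `code` attributes, so a query cannot address a title or an author by element name. The following is a minimal sketch of that kind of conversion, not the format actually used in the talk; the mapping table and the output element names (`title`, `author`, `publisher`) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical mapping from (MARC tag, subfield code) to a descriptive
# element name. The real custom format from the talk is not documented here.
FIELD_NAMES = {
    ("100", "a"): "author",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
}

def marcxml_to_custom(record_xml: str) -> ET.Element:
    """Turn one MARCXML record into a record with named child elements."""
    record = ET.fromstring(record_xml)
    out = ET.Element("record")
    for datafield in record.iter(f"{{{MARC_NS}}}datafield"):
        tag = datafield.get("tag")
        for subfield in datafield.findall(f"{{{MARC_NS}}}subfield"):
            name = FIELD_NAMES.get((tag, subfield.get("code")))
            # Unmapped or empty subfields are skipped in this sketch.
            if name and subfield.text:
                ET.SubElement(out, name).text = subfield.text.strip()
    return out

SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" "><subfield code="a">Doe, Jane</subfield></datafield>
  <datafield tag="245" ind1="1" ind2="0"><subfield code="a">An example title</subfield></datafield>
</record>"""

custom = marcxml_to_custom(SAMPLE)
print(ET.tostring(custom, encoding="unicode"))
```

With named elements, an index or query can target `title` directly instead of filtering every `datafield` by attribute values.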
Query Optimization
- This is an important step since we are dealing with infant technology
- dbxml has a query plan generator
- eXist will soon have a query plan generator and a new query optimizer
Implementation
Create a web portal using a Native XML Database
Performance
- The good
- .9 seconds
- More advanced queries can get as high as 12-15 seconds
- What happens when 10-50 simultaneous users search with advanced queries?
Need to develop lots of search query translation algorithms to compensate for the missing full-text extension
So the answer is NOT YET!
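As a rough idea of what such query translation involves, the sketch below turns a keyword search into an XQuery expression built from the standard `contains()` and `lower-case()` functions. It is an assumption-laden illustration, not the portal's code: the container name `catalog.dbxml`, the `record` and `title` element names, and both function names are hypothetical.

```python
def xquery_string(value: str) -> str:
    """Quote a Python string as an XQuery string literal."""
    # In XQuery literals, '&' starts an entity reference and a double
    # quote is escaped by doubling it.
    escaped = value.replace("&", "&amp;").replace('"', '""')
    return f'"{escaped}"'

def keywords_to_xquery(keywords: str, element: str = "title",
                       collection: str = "catalog.dbxml") -> str:
    """Translate whitespace-separated keywords into an XQuery expression.

    Every keyword must appear (case-insensitively) in some `element`
    child of a record. `element` is assumed to be trusted configuration,
    not user input.
    """
    words = keywords.split()
    if not words:
        raise ValueError("at least one keyword is required")
    predicates = [
        f"{element}[contains(lower-case(.), {xquery_string(word.lower())})]"
        for word in words
    ]
    condition = " and ".join(predicates)
    return f"collection({xquery_string(collection)})/record[{condition}]"

query = keywords_to_xquery("Native XML")
print(query)
```

Substring matching like this cannot use a full-text index, which is consistent with the slow advanced queries noted above; stemming, phrase search, and relevance ranking would each need further translation logic.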
It’s a Sun Shiny Day
- Apache Solr to the rescue!
- Solr builds a Lucene index on XML documents
- Solr is platform independent
- Runs as a Java web app
- Interface via REST
- XML databases use XQuery
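Because Solr is queried over HTTP, a search is just a URL with parameters. The sketch below builds such a URL using Solr's standard `q`, `rows`, and `start` parameters. The host, port, and path in `SOLR_SELECT_URL`, the `title` field, and the helper name are assumptions for illustration, and the sketch does not escape Lucene query-syntax characters in the search terms.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed location of a local Solr select handler; adjust for a real install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_solr_query_url(terms: str, field: str = "", rows: int = 10,
                         start: int = 0) -> str:
    """Build a Solr select URL that requires every term to match."""
    words = terms.split()
    if not words:
        raise ValueError("at least one search term is required")
    if field:
        words = [f"{field}:{word}" for word in words]
    params = {"q": " AND ".join(words), "rows": rows, "start": start}
    # urlencode percent-encodes the query so it is safe to place in a URL.
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"

url = build_solr_query_url("native xml", field="title")
print(url)
# Decode the URL again to show the query string that Solr would receive.
print(parse_qs(urlsplit(url).query)["q"][0])
```

Fetching the URL (for example with `urllib.request.urlopen`) would return the matching documents, assuming a Solr instance is actually running at that address.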
Easy Implementation
- XSL stylesheet to convert MARCXML to Solr XML
- Converted 492,000 records in 2.5 hours
- 3 hours
- Andrew showed the final product which uses Solr in the Lightning Talks
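The talk used an XSL stylesheet for the MARCXML-to-Solr conversion. As a rough equivalent, the sketch below performs the same kind of transformation with Python's ElementTree instead of XSLT, producing Solr's `<add><doc><field name="...">` update format. The field mapping and the Solr field names (`id`, `title`, `author`, `topic`) are assumptions; a real schema would define its own.

```python
import xml.etree.ElementTree as ET

MARC = "{http://www.loc.gov/MARC21/slim}"

# Hypothetical Solr field names keyed by MARC tag and subfield code.
SOLR_FIELDS = {
    ("100", "a"): "author",
    ("245", "a"): "title",
    ("650", "a"): "topic",
}

def marc_record_to_solr_doc(record: ET.Element) -> ET.Element:
    """Build one Solr <doc> element from one MARCXML <record>."""
    doc = ET.Element("doc")
    # Control field 001 holds the record's control number; use it as the id.
    control = record.find(f"{MARC}controlfield[@tag='001']")
    if control is not None and control.text:
        ET.SubElement(doc, "field", name="id").text = control.text.strip()
    for datafield in record.findall(f"{MARC}datafield"):
        for subfield in datafield.findall(f"{MARC}subfield"):
            key = (datafield.get("tag"), subfield.get("code"))
            name = SOLR_FIELDS.get(key)
            # Repeated MARC fields become repeated (multi-valued) Solr fields.
            if name and subfield.text:
                ET.SubElement(doc, "field", name=name).text = subfield.text.strip()
    return doc

def marc_collection_to_solr_add(marcxml: str) -> ET.Element:
    """Wrap every record of a MARCXML document in a Solr <add> element."""
    root = ET.fromstring(marcxml)
    # Accept either a single <record> or a <collection> of records.
    records = [root] if root.tag == f"{MARC}record" else root.findall(f"{MARC}record")
    add = ET.Element("add")
    for record in records:
        add.append(marc_record_to_solr_doc(record))
    return add

SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <controlfield tag="001">rec1</controlfield>
    <datafield tag="245" ind1="1" ind2="0"><subfield code="a">An example title</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">XML</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Databases</subfield></datafield>
  </record>
</collection>"""

add_xml = marc_collection_to_solr_add(SAMPLE)
print(ET.tostring(add_xml, encoding="unicode"))
```

The resulting XML is what would be posted to Solr's update handler for indexing.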
Other options