The MetaStore project team have had a paper accepted at the XTech 2006 conference in Amsterdam.
XTech (formerly XML Europe) is all about XML technologies; topics include Semantic Web and RDF, Tagging, Annotation, Mashups, Web Services - all the newfangled "Web 2.0" stuff...
It is apparently "the premier European conference for developers, information designers and managers working with web and standards-based technologies." (yikes!) The keynote speakers are from Amazon and Yahoo. (double yikes!)
This is the abstract:
The aim of the Ingenta MetaStore project is to build a flexible and scalable repository for the storage of bibliographic metadata spanning 17 million articles and 20,000 publications.
The repository replaces several existing data stores and will act as a focal point for integration of a number of existing applications and future projects. Scalability, replication and robustness were important considerations in the repository design.
After introducing the benefits of using RDF as the data model for this repository, the paper will focus on the practical challenges involved in creating and managing a very large triple store.
The repository currently contains over 200 million triples from a range of vocabularies including FOAF, Dublin Core and PRISM.
The challenges faced range from schema design and data loading to SPARQL query performance. Load testing of the repository provided some insights into the tuning of SPARQL queries.
The paper will introduce the solutions developed to meet these challenges with the goal of helping others seeking to deploy a large triple store in a production environment. The paper will also suggest some avenues for further research and development.
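For anyone wondering what a "triple" actually looks like, here is a rough sketch in plain Python. It is purely illustrative and is not how MetaStore is built: the article and author URIs, the title, and the `match` helper are all made up for this example, and a real triple store would answer this kind of pattern with a SPARQL query rather than a list comprehension.

```python
# Namespace prefixes for the vocabularies named in the abstract.
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
PRISM = "http://prismstandard.org/namespaces/1.2/basic/"

# Hypothetical identifiers, invented for illustration only.
article = "http://example.org/article/1"
author = "http://example.org/person/1"

# Each triple is (subject, predicate, object).
triples = [
    (article, DC + "title", "An Example Article"),
    (article, DC + "creator", author),
    (article, PRISM + "volume", "12"),
    (author, FOAF + "name", "A. N. Author"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly the idea behind a one-pattern query: "what is this article's title?"
titles = [o for _, _, o in match(triples, s=article, p=DC + "title")]
```

Multiply those four statements by 17 million articles and the 200 million figure starts to make sense.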
Now we just need to write and deliver a presentation... eep! In case it all goes horribly wrong, I'm now recruiting for a volunteer member of the audience to faint at my signal...