Friday, December 12, 2008


Sean Gillies has written an important memo, "Concordia, Vocabularies, and CIDOC CRM," on Concordia's current approach to using the Comité International pour la Documentation des Musées - Conceptual Reference Model (CIDOC-CRM). It should be widely read by people interested in the digital publication of resources for the ancient Mediterranean and beyond. In it he gives a preliminary indication that RDFa, a standard for embedding the Resource Description Framework in HTML pages, provides a better route forward for the time being. But don't take my word for this; read his whole text.

RDFa has appeared on this blog before: "PRAP, xhtml 2.0 and Archaeological Databases" was early thinking, and "RDFa at Ilion" is more recent and also makes use of RDFa. So Sean's memo is welcome here because his reasoning is similar to mine.

But what of CIDOC-CRM? The main CIDOC-CRM website opens with:
"The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation."
It also notes that the CRM is an ISO standard (ISO 21127:2006). That's a good thing.

In general, the main CIDOC-CRM website doesn't do a good job of introducing itself. If you want a quick feel for how the CRM organizes concepts, try the relevant section of Princeton's QED site. You'll see that the CRM provides a well-thought-out vocabulary of concepts for describing cultural heritage. Apart from the odd use of gendered language, it's useful that the CRM defines the concept E24 Physical Man-Made Thing. It will be cool when I can search the Internet for E24s within the Aegean that date to the Late Roman period. I'm guessing the CRM will play a role in enabling such functionality.
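To make that imagined search a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the records, the `ex:` identifiers, and the `ex:region` and `ex:period` properties are all hypothetical, and the `crm:` names are shorthand for CRM classes rather than the exact identifiers any published RDF version of the CRM uses. It treats descriptions as simple subject-predicate-object triples, which is roughly what RDFa markup would yield once extracted from a page.

```python
# Hypothetical triples of the kind that might be extracted from RDFa-annotated pages.
# All identifiers below are made up for illustration; "ex:region" and "ex:period"
# are assumed properties, not terms defined by the CRM.
RDF_TYPE = "rdf:type"
E24 = "crm:E24_Physical_Man-Made_Thing"  # assumed shorthand for the CRM class

TRIPLES = [
    ("ex:lamp-1", RDF_TYPE, E24),
    ("ex:lamp-1", "ex:region", "Aegean"),
    ("ex:lamp-1", "ex:period", "Late Roman"),
    ("ex:amphora-7", RDF_TYPE, E24),
    ("ex:amphora-7", "ex:region", "Aegean"),
    ("ex:amphora-7", "ex:period", "Classical"),
    ("ex:survey-2", RDF_TYPE, "crm:E7_Activity"),
    ("ex:survey-2", "ex:region", "Aegean"),
]


def find_subjects(triples, required):
    """Return, sorted, every subject that carries all (predicate, object) pairs in `required`."""
    by_subject = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, set()).add((predicate, obj))
    return sorted(s for s, pairs in by_subject.items() if required <= pairs)


# "E24s within the Aegean that date to the Late Roman period"
query = {
    (RDF_TYPE, E24),
    ("ex:region", "Aegean"),
    ("ex:period", "Late Roman"),
}
matches = find_subjects(TRIPLES, query)
print(matches)
```

The point of the sketch is only that a shared class such as E24 gives a search something stable to match on across otherwise unrelated datasets; a real system would more likely use an RDF store and a query language than a Python list.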

Of the resources linked from the main CIDOC-CRM website, I've paid the most attention to the "mappings" page. I take heart in the work being done in this domain because it implies that my use of the CRM can be indirect. This is encouraging because CIDOC's current self-representation seems to obscure notions of "best practice" in an overabundance of detail. See this paper for an example. It is to the CRM's credit that it can represent all the concepts used there, but in many cases one neither has nor needs this level of detail. I will be happy to use VRA, Dublin Core, and any other vocabularies and ontologies that gain traction in the Semantic Web world, and trust that these will be mapped to the CRM.

From my perspective, the lack of a large amount of CRM-encoded original archaeological data easily available on the Internet indicates that the standard has not seen a high degree of real-world uptake. I understand that there is acceptance of the CRM and that many initiatives are discussing how it can be used (here), but I would very much like to see actual use with large datasets. I'm also interested in seeing projects that adopt the CRM as the original format for "born digital" data. Will that really happen?

This post represents thinking that I hope will change as we see real-world adoption of standards in cultural heritage. I'm agnostic as to what the future holds. For the present, I'm all for exploring vocabularies and ontologies that are moving towards RDFa representations.

