Tuesday, January 26, 2010

RDFa Patterns for Ancient World References

I am continuing to experiment with semantic links within digital publications relevant to the Ancient World. Here's a snippet from the same article I drew from in the last post.
In 124, Polemon had spoken before Hadrian and persuaded him to make a gift of money and grant a series of honors to Smyrna, not least of which was a second temple to the imperial cult (IvS 697; Burrell 2004: 42-48).
The "things" I want to identify are:
  • The year 124 as an event.
  • The sophist Polemon
  • The emperor Hadrian
  • The imperial cult
  • And the two citations
And I want to do this in a standards-based way that is automatically recognizable by third parties (or at least their software agents).

As before, I'm using RDFa. In a future post, I'll explain this choice and talk about what RDFa and RDF are, but for now I'm diving right in.

The relevant namespaces that I'm using are:
  • xmlns:dbpedia="http://dbpedia.org/resource/"
  • xmlns:cito="http://purl.org/net/cito/"
  • xmlns:ev="http://purl.org/rss/1.0/modules/event/"
  • xmlns:ex="http://example.org/"
  • xmlns:foaf="http://xmlns.com/foaf/0.1/"
  • xmlns:frbr="http://purl.org/vocab/frbr/core#"
  • xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
  • xmlns:owl="http://www.w3.org/2002/07/owl#"
  • xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  • xmlns:skos="http://www.w3.org/2008/05/skos#"
  • xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
All the markup that follows is experimental and comments are welcome, of course.
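To make the role of these prefixes concrete, here is a minimal sketch of how a processor might expand a CURIE such as 'foaf:Person' into a full URI. The table holds only a few of the prefixes listed above, and the helper name 'expand_curie' is my own invention for illustration, not part of any RDFa library.

```python
# A few of the prefixes declared above (subset chosen for illustration).
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "cito": "http://purl.org/net/cito/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:local' by plain concatenation.

    Hypothetical helper: raises KeyError when the prefix is not declared.
    """
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


person = expand_curie("foaf:Person")
polemon = expand_curie("dbpedia:Polemon_of_Laodicea")
print(person)
print(polemon)
```

Because expansion is simple concatenation, a namespace URI has to end in the right separator ('/' or '#') for the resulting URI to be the intended one.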

Polemon
The reference to Polemon now looks like:
<span id="id2209"
about="#id2209"
typeof="skos:Concept foaf:Person"
resource="[dbpedia:Polemon_of_Laodicea]"
rel="owl:sameAs cite"
property="rdfs:label">Polemon</span>


With the '<head>' of the document including '<base href="http://example.org/ajn2006-smyrna.html"/>', that RDFa gives the following RDF/Turtle:

<http://example.org/ajn2006-smyrna.html#id2209>
owl:sameAs dbpedia:Polemon_of_Laodicea ;
a skos:Concept, foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> dbpedia:Polemon_of_Laodicea ;
rdfs:label "Polemon"@en .
Some observations:
The pairing of 'id' and 'about' attributes means that I can identify a span of text and then say things about it.

I then give that span a type. Here I say that it's a skos:Concept and a foaf:Person. Which concept and which person? http://dbpedia.org/resource/Polemon_of_Laodicea. 'skos:Concept' will be used on all named-entities, and their nature will be further qualified when it's useful.

Why 'owl:sameAs'? Here I follow the usage of dbpedia.org. If you look at the Polemon page, you'll see the same construct used to make the link to Freebase. 'owl:sameAs' also underlies sameas.org (see the N3 for Hadrian).

The metaphor here is that I am instantiating Polemon as a concept and person present in the text. That should be recognizable and actionable. There is some redundancy in how I go about doing it, but that is in the spirit of convenience for future processors of this data.
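For readers who want to see how the attributes on the Polemon span turn into triples, here is a rough sketch using only the Python standard library. It is not a conforming RDFa processor: it assumes flat, non-nested spans that each carry an 'about' attribute, it ignores language tags and datatypes, and the class name 'SpanTriples' and the helpers 'expand' and 'resolve' are hypothetical names of my own.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "http://example.org/ajn2006-smyrna.html"
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2008/05/skos#",
}


def expand(term):
    """Expand a CURIE; a bare word such as 'cite' is read as an XHTML vocab term."""
    if ":" not in term:
        return XHTML_VOCAB + term
    prefix, local = term.split(":", 1)
    return PREFIXES[prefix] + local


def resolve(value):
    """Resolve @about/@resource: '[p:x]' is a safe CURIE, anything else a URI reference."""
    if value.startswith("[") and value.endswith("]"):
        return expand(value[1:-1])
    return urljoin(BASE, value)


class SpanTriples(HTMLParser):
    """Collect (subject, predicate, object) tuples from flat spans with @about."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._pending = None  # (subject, property) waiting for the span's text

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "about" not in a:
            return
        subject = resolve(a["about"])
        for t in a.get("typeof", "").split():
            self.triples.append((subject, RDF_TYPE, expand(t)))
        target = a.get("resource")
        if target:
            for r in a.get("rel", "").split():
                self.triples.append((subject, expand(r), resolve(target)))
        if "property" in a:
            self._pending = (subject, expand(a["property"]))

    def handle_data(self, data):
        # The first text after the start tag becomes the literal value.
        if self._pending:
            subject, prop = self._pending
            self.triples.append((subject, prop, data))
            self._pending = None


markup = (
    '<span id="id2209" about="#id2209" typeof="skos:Concept foaf:Person" '
    'resource="[dbpedia:Polemon_of_Laodicea]" rel="owl:sameAs cite" '
    'property="rdfs:label">Polemon</span>'
)
parser = SpanTriples()
parser.feed(markup)
parser.close()
for triple in parser.triples:
    print(triple)
```

The five tuples printed correspond to the five statements in the Turtle above: two types, two links to the dbpedia resource, and one label.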

"In 124"
This looks like:
<span id="id3724"
about="#id3724"
typeof="skos:Concept frbr:Event"
rel="owl:sameAs"
resource="dbpedia:124"
property="ev:startdate"
datatype="xsd:year"
content="124">In 124</span>
Same basic process. I isolate some text as individually addressable. I say what it is, in this case a FRBR Event. Here I also embed a machine-readable property, the start date, into the document, but retain the inline text as the label.

But I am probably on less firm ground here. I use FRBR because it's an LOC-approved standard. I annotate the event with an RSS Event property, and that's a little weak. And it might seem odd to equate the event with the dbpedia representation of the year 124. If you follow through to the Wikipedia version, that does refer to Hadrian's trip east, which is the setting for Polemon's speech. In the case of a better known event, I think I'd prefer to link to a representation of that, for example http://dbpedia.org/page/Sack_of_Rome_(455). The 'owl:sameAs' on that page will eventually redirect you to the right Wikipedia page.

Here's the RDF/Turtle produced by the above RDFa:
<http://example.org/ajn2006-smyrna.html#id3724>
owl:sameAs <dbpedia:124> ;
ev:startdate "124"^^xsd:year ;
a frbr:Event, skos:Concept .
As above, the goal is for this to be usable in a number of contexts.
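Two details of this example are easy to miss, so here is a small sketch of each. As I read RDFa 1.0, a 'resource' value is treated as a CURIE only when it is wrapped in square brackets; an unbracketed 'dbpedia:124' is taken as a URI as written, which would explain the '<dbpedia:124>' in the Turtle above. Second, a 'content' attribute, when present, supplies the literal value in place of the visible text. The function names below are my own, and the prefix table is a two-entry assumption for illustration.

```python
import re
from urllib.parse import urljoin

BASE = "http://example.org/ajn2006-smyrna.html"
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}
# Anything that starts with 'scheme:' is treated as an absolute URI.
HAS_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def resolve_resource(value):
    """Resolve a @resource value: '[p:x]' is a safe CURIE, otherwise a URI."""
    if value.startswith("[") and value.endswith("]"):
        prefix, local = value[1:-1].split(":", 1)
        return PREFIXES[prefix] + local
    if HAS_SCHEME.match(value):
        return value  # already absolute, kept exactly as written
    return urljoin(BASE, value)


def literal(text, content=None, datatype=None):
    """Return (value, datatype); @content overrides the element text."""
    value = text if content is None else content
    return (value, datatype)


unbracketed = resolve_resource("dbpedia:124")
bracketed = resolve_resource("[dbpedia:124]")
start = literal("In 124", content="124", datatype="xsd:year")
print(unbracketed)
print(bracketed)
print(start)
```

If the intent is to point at the dbpedia resource for the year, the bracketed form is the one that expands to a URI under http://dbpedia.org/resource/.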

References
There are two inline references at the end of the sentence. The first is to a primary source, an inscription at Smyrna as published in Petzl, G. (1982). Die Inschriften von Smyrna. Bonn: Habelt. The second is to Burrell, B. (2004). Neokoroi: Greek cities and Roman emperors. Cincinnati classical studies, new ser., v. 9. Leiden: Brill.

Here's the RDFa for the second:
<span id="id4616"
about="#id4616"
typeof="ex:Citation"
rel="cito:citesAsAuthority cite"
resource="http://www.worldcat.org/oclc/53013513"
property="rdfs:label">Burrell 2004: 42-48</span>
This markup is similar to the earlier examples, except that I'm not instantiating it as a 'skos:Concept'. I am using the CITO ontology to indicate the relationship between the works, but note that I'm currently making up the type 'ex:Citation'. Perhaps I could use 'cito:Document' but that doesn't seem quite right. I really want to mark this span of text as being a citation but haven't found just the right RDF vocabulary. I looked at BIBO but, like CITO, it doesn't have the exact class I want. BIBO is linked with Zotero so I'd like to use it. For now, CITO has a more detailed set of relationships between citing and cited documents so I'm going with that. WorldCat also isn't great because there's confusion about the 'terms of use' but it will do for this experimental phase.

Here's the RDF/Turtle:
<http://example.org/ajn2006-smyrna.html#id4616>
cito:citesAsAuthority <http://www.worldcat.org/oclc/53013513> ;
a ex:Citation ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://www.worldcat.org/oclc/53013513> ;
rdfs:label "Burrell 2004: 42-48"@en .

The RDFa for the epigraphic reference looks like:
<span id="id9773"
about="#id9773"
typeof="ex:Citation"
rel="cito:citesAsAuthority ex:citesAsPrimarySource"
resource="http://www.worldcat.org/oclc/8935414"
property="rdfs:label"><i>IvS</i> 697</span>
The main difference here is that I'm also making up the 'ex:citesAsPrimarySource' value for the rel attribute. The concept of "Primary Source" and references thereto is important for the Humanities and we need a way of indicating its usage.

It's also important that I'm referring to the publication of the inscription, not the inscription itself. When a digital surrogate becomes available, I can point to that. In the meantime, a way of standardizing references to parts of a work would be useful. But I don't think you can just tag on a fragment identifier, as in http://www.worldcat.org/oclc/8935414#no.%20697, since the implication there is that such an ID actually exists. And it might be rude to put the same after a '?'. Something to ponder...


Instead of continuing on with each named entity, here's the whole sentence with RDFa visible:
<span id="id3724" about="#id3724" typeof="skos:Concept frbr:Event" rel="owl:sameAs" resource="dbpedia:124" property="ev:startdate" datatype="xsd:year" content="124">In 124</span>, <span id="id2209" about="#id2209" typeof="skos:Concept foaf:Person" resource="[dbpedia:Polemon_of_Laodicea]" rel="owl:sameAs cite" property="rdfs:label">Polemon</span> had spoken before <span id="id5130" about="#id5130" typeof="skos:Concept foaf:Person" rel="owl:sameAs cite" resource="[dbpedia:Hadrian]" property="rdfs:label">Hadrian</span> and persuaded him to make a gift of money and grant a series of honors to <span id="id39156" about="#id39156" typeof="skos:Concept geo:SpatialThing" rel="owl:sameAs cite" resource="http://pleiades.stoa.org/places/550771" property="rdfs:label">Smyrna</span>, not least of which was a second temple to the <span id="id4168" about="#id4168" typeof="skos:Concept dbpedia:Religion" rel="owl:sameAs cite" resource="[dbpedia:Imperial_cult_(ancient_Rome)]" property="rdfs:label">imperial cult</span> (<span id="id9773" about="#id9773" typeof="ex:Citation" rel="cito:citesAsAuthority ex:citesAsPrimarySource" resource="http://www.worldcat.org/oclc/8935414" property="rdfs:label"><i>IvS</i> 697</span>; <span id="id4616" about="#id4616" typeof="ex:Citation" rel="cito:citesAsAuthority cite" resource="http://www.worldcat.org/oclc/53013513" property="rdfs:label">Burrell 2004: 42-48</span>).
And here's the RDF/Turtle:

<http://example.org/ajn2006-smyrna.html#id3724>
owl:sameAs <dbpedia:124> ;
ev:startdate "124"^^xsd:year ;
a frbr:Event, skos:Concept .

<http://example.org/ajn2006-smyrna.html#id2209>
owl:sameAs dbpedia:Polemon_of_Laodicea ;
a skos:Concept, foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> dbpedia:Polemon_of_Laodicea ;
rdfs:label "Polemon"@en .

<http://example.org/ajn2006-smyrna.html#id5130>
owl:sameAs dbpedia:Hadrian ;
a skos:Concept, foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> dbpedia:Hadrian ;
rdfs:label "Hadrian"@en .

<http://example.org/ajn2006-smyrna.html#id39156>
owl:sameAs <http://pleiades.stoa.org/places/550771> ;
a geo:SpatialThing, skos:Concept ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://pleiades.stoa.org/places/550771> ;
rdfs:label "Smyrna"@en .

<http://example.org/ajn2006-smyrna.html#id4168>
owl:sameAs <http://dbpedia.org/resource/Imperial_cult_(ancient_Rome)> ;
a dbpedia:Religion, skos:Concept ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://dbpedia.org/resource/Imperial_cult_(ancient_Rome)> ;
rdfs:label "imperial cult"@en .

<http://example.org/ajn2006-smyrna.html#id9773>
ex:citesAsPrimarySource <http://www.worldcat.org/oclc/8935414> ;
cito:citesAsAuthority <http://www.worldcat.org/oclc/8935414> ;
a ex:Citation ;
rdfs:label "<i>IvS</i> 697"^^rdf:XMLLiteral .

<http://example.org/ajn2006-smyrna.html#id4616>
cito:citesAsAuthority <http://www.worldcat.org/oclc/53013513> ;
a ex:Citation ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://www.worldcat.org/oclc/53013513> ;
rdfs:label "Burrell 2004: 42-48"@en .


Some of these constructs deserve more comment but this post is getting long. The only thing to add is that fairly soon I will publish a javascript toolset that starts making use of these patterns.

Friday, January 22, 2010

Referring to People and Places

Another title for this post could be "How can I achieve something by doing nothing?"

Back in 2006 I published the article 'A Box Mirror Made from Two Antinous Medallions of Smyrna.' American Journal of Numismatics Second Series 18 (2006), 63-74. It contains the following sentences:
The reverse type on this piece is one of four images — showing either the female panther on this piece, a bull, a sheep, or a ship’s prow — that appear on a series of medallions struck at Smyrna in honor of Antinous and naming Polemon as issuer. These two individuals are both historical figures and their biographical information provides the framework for dating the issue. Antinous was the companion of the emperor Hadrian who drowned in the Nile in late AD 130.
I am currently thinking about how to represent links from the "named entities" embedded within texts such as this to well-known identifiers for those concepts. That's what I want to achieve. The "doing nothing" part of my alternate title is an off-hand way of indicating that I want to make as few choices as possible. To rephrase again, the bottom line is that I'm hoping to use pre-existing standards.

BTW, pictures of the mirror are at http://numismatics.org/collection/2005.19.1.

In terms of well-known identifiers, here's the "low hanging fruit" that I see in the sample text: Smyrna, Antinous, Polemon, Hadrian, and the Nile. We could get into dates and abstract concepts such as "emperor" but I'll save that for later.

You'll note that I'm using the English Wikipedia for most of my identifiers and Pleiades for Smyrna. There is a Wikipedia article for that ancient site, but I do want to situate myself within the discipline of ancient geography. I think using the Pleiades reference meets that goal. On a slightly different topic, I was tempted to use dbpedia references – as in http://dbpedia.org/resource/Polemon_of_Laodicea – but think it's probably better practice to give the Wikipedia URI and let harvesters, etc. derive the dbpedia URI if they want to. Is it a disadvantage to tie the URI to a particular language?
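Deriving the dbpedia URI from the Wikipedia one is a small job for a harvester. Here is a minimal sketch, assuming that dbpedia resource names mirror English Wikipedia article titles; the function name 'wikipedia_to_dbpedia' is my own and does not come from any existing tool.

```python
from urllib.parse import urlsplit

WIKI_HOST = "en.wikipedia.org"
WIKI_PATH = "/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to the matching dbpedia resource URI.

    Hypothetical helper: rejects anything that is not an en.wikipedia.org
    article URL rather than guessing.
    """
    parts = urlsplit(url)
    if parts.netloc != WIKI_HOST or not parts.path.startswith(WIKI_PATH):
        raise ValueError("not an English Wikipedia article URL: " + url)
    title = parts.path[len(WIKI_PATH):]
    return DBPEDIA_RESOURCE + title


derived = wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Polemon_of_Laodicea")
print(derived)
```

The check on the host name is where the language question shows up in practice: a URI from another language edition would need its own mapping.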

Moving along... how to embed these references in the text? That does require an initial choice: RDFa embedded in xhtml. Here's a possible snippet that links an implicit identity with the relevant unambiguous identifier:
<span id="id7474" about="#id7474" typeof="foaf:Person" rel="owl:sameAs cite" resource="http://en.wikipedia.org/wiki/Polemon_of_Laodicea">Polemon</span>
With this markup I am trying to say, "the characters 'Polemon' refer to a person and that person is the same as the person represented by the URI 'http://en.wikipedia.org/wiki/Polemon_of_Laodicea'."

Why do I think I've achieved that? If I point an RDF parser – I use rapper – at this text, I get the following triples:
<http://example.org/AJN2006-Heath.html#id7474>
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
<http://xmlns.com/foaf/0.1/Person> .

<http://example.org/AJN2006-Heath.html#id7474>
<http://www.w3.org/1999/xhtml/vocab#cite>
<http://en.wikipedia.org/wiki/Polemon_of_Laodicea> .

<http://example.org/AJN2006-Heath.html#id7474>
<http://www.w3.org/2002/07/owl#sameAs>
<http://en.wikipedia.org/wiki/Polemon_of_Laodicea> .


I think this represents progress towards using a well-known standard that allows a third-party tool to extract the semantic meaning in my text. Expanding the markup I'm using, here's the whole sample text with embedded RDF:
The reverse type on this piece is one of four images — showing either the female panther on this piece, a bull, a sheep, or a ship’s prow — that appear on a series of medallions struck at <span id="id128979" about="#id128979" typeof="geonames:Feature nm:mint" rel="skos:sameAs cite" resource="http://pleiades.stoa.org/places/550771">Smyrna</span> in honor of <span id="id49178" about="#id49178" typeof="foaf:Person" rel="skos:sameAs cite" resource="http://en.wikipedia.org/wiki/Antinous">Antinous</span> and naming <span id="id7474" about="#id7474" typeof="foaf:Person" rel="cite skos:sameAs" resource="http://en.wikipedia.org/wiki/Polemon_of_Laodicea">Polemon</span> as issuer. These two individuals are both historical figures and their biographical information provides the framework for dating the issue. Antinous was the companion of the emperor <span id="id876873" about="#id876873" typeof="foaf:Person" rel="skos:sameAs cite" resource="http://en.wikipedia.org/wiki/Hadrian">Hadrian</span> who drowned in the <span id="id5726" about="#id5726" typeof="geonames:Feature" rel="skos:sameAs cite" resource="http://en.wikipedia.org/wiki/Nile">Nile</span> in late AD 130.


Which produces the following RDF:
<http://example.org/AJN2006-Heath.html#id128979>
a nm:mint, geonames:Feature ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://pleiades.stoa.org/places/550771> ;
skos:sameAs <http://pleiades.stoa.org/places/550771> .

<http://example.org/AJN2006-Heath.html#id49178>
a foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://en.wikipedia.org/wiki/Antinous> ;
skos:sameAs <http://en.wikipedia.org/wiki/Antinous> .

<http://example.org/AJN2006-Heath.html#id7474>
a foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://en.wikipedia.org/wiki/Polemon_of_Laodicea> ;
skos:sameAs <http://en.wikipedia.org/wiki/Polemon_of_Laodicea> .

<http://example.org/AJN2006-Heath.html#id876873>
a foaf:Person ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://en.wikipedia.org/wiki/Hadrian> ;
skos:sameAs <http://en.wikipedia.org/wiki/Hadrian> .

<http://example.org/AJN2006-Heath.html#id5726>
a geonames:Feature ;
<http://www.w3.org/1999/xhtml/vocab#cite> <http://en.wikipedia.org/wiki/Nile> ;
skos:sameAs <http://en.wikipedia.org/wiki/Nile> .


By way of a few observations, note that I type "Smyrna" (here id128979) as a mint using the URI http://nomisma.org/id/mint, which is a reference to an incipient numismatic vocabulary. I don't type Hadrian as a Roman emperor. 'Smyrna' can be used in many ways so I want to be clear that I'm referring to it as a mint (in the broad numismatic sense). Hadrian's role as emperor is explicitly stated in the Wikipedia article and in its dbpedia equivalent. I don't think I need to repeat that here.

I'm also adding 'cite' to the rel attributes. 'cite' is one of the W3C-sponsored relationships, and I like how generic it is, but I also want to use the more specific 'skos:sameAs'.
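One practical upshot of pairing a generic relationship with a specific one is that a consumer who only knows 'cite' can still follow the links. The sketch below illustrates that with two of the subjects from the output above, written as plain tuples; subjects are shortened to their fragment, 'skos:sameAs' is left unexpanded because no namespace is declared for it in this post, and the helper name 'targets' is hypothetical.

```python
CITE = "http://www.w3.org/1999/xhtml/vocab#cite"

# (subject, predicate, object) tuples for Antinous and the Nile.
TRIPLES = [
    ("#id49178", CITE, "http://en.wikipedia.org/wiki/Antinous"),
    ("#id49178", "skos:sameAs", "http://en.wikipedia.org/wiki/Antinous"),
    ("#id5726", CITE, "http://en.wikipedia.org/wiki/Nile"),
    ("#id5726", "skos:sameAs", "http://en.wikipedia.org/wiki/Nile"),
]


def targets(triples, predicate):
    """Return a dict of subject -> object for a single predicate."""
    return {s: o for s, p, o in triples if p == predicate}


cited = targets(TRIPLES, CITE)
for subject, target in sorted(cited.items()):
    print(subject, "->", target)
```

A tool that does understand the more specific relationship can call the same function with that predicate and get the stronger statement.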

This post is not a finished product and I don't mean to suggest that the above is the best way to achieve my goal. I welcome comments along the lines of "You should be using pre-existing standard http://...." or "What you suggest is sort of (barely?) OK but here's an improvement...". Is there a better RDFa pattern?