Monday, February 21, 2011

Quick poll: Worldcat, Library of Congress, or Both

There are lots of ways of encoding bibliographic data on the web, but this post isn't about that problem. Instead, I'm wondering what is "the community's" preference between Worldcat and the Library of Congress when creating Semantic Web/Linked Open Data references.

As an example, the URIs http://www.worldcat.org/oclc/829279 and http://lccn.loc.gov/74155758 each lead to information about John Hayes' Late Roman Pottery published in 1972.

Which one of these is preferable as the long-term description of this volume: Worldcat or LOC? The use case is a digital publication whose bibliography ideally includes a link to one or the other, or both, for all printed volumes or other appropriate entities.
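
To make the use case concrete, here is a minimal sketch, assuming a bibliography entry simply stores an OCLC number and an LCCN and builds its links from the two URI patterns shown above. The `BibEntry` class and its field names are hypothetical, for illustration only; nothing here queries either service.

```python
from dataclasses import dataclass
from typing import List, Optional

# URI prefixes taken from the two example links in this post.
WORLDCAT_BASE = "http://www.worldcat.org/oclc/"
LCCN_BASE = "http://lccn.loc.gov/"


@dataclass
class BibEntry:
    """Hypothetical bibliography entry carrying optional identifiers."""
    citation: str
    oclc: Optional[str] = None   # OCLC number, if known
    lccn: Optional[str] = None   # Library of Congress Control Number, if known

    def uris(self) -> List[str]:
        """Return a link for each identifier that is present."""
        links = []
        if self.oclc:
            links.append(WORLDCAT_BASE + self.oclc)
        if self.lccn:
            links.append(LCCN_BASE + self.lccn)
        return links


hayes = BibEntry(
    "John Hayes, Late Roman Pottery (1972)",
    oclc="829279",
    lccn="74155758",
)

for uri in hayes.uris():
    print(uri)
```

Keeping both identifiers optional means an entry can still carry a single link when a volume appears in only one of the two catalogs.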

Perhaps a discussion will ensue in the comments but here are some quick issues:
  • There are multiple URIs for that one volume in Worldcat. http://www.worldcat.org/oclc/462730938 gets you to the Danish Union Catalog.
  • There are still concerns about the licensing of Worldcat data.
  • The LOC record describes a physical volume in a single national library and may not be intended as a description of the abstract concept (e.g. a FRBR Work). I don't know that Worldcat URIs solve this problem, but they do imply a higher level of abstraction.


Votes and/or comments are appreciated.

7 comments:

The Other Athens said...

Also note that some works may not be owned by the Library of Congress, and thus may not have entries there, but are in WorldCat.

From a use standpoint, I like WorldCat because of the ease of linking to one's own institutional holdings - you can instantly see if your library has the work. When I link people to things, I always make them WorldCat links, but then, I am generally not thinking of the long-term.

From a single-record-for-one-work standpoint, I prefer LOC. I get so frustrated with the multiple redundant records in WorldCat. And LOC gets my vote on the stability/public availability axis, as you note.

otacon said...

I don't see the reasons for restricting the choice to those two options, so I propose a third.

I like openlibrary. They give an id to each work (for instance OL860107W) and each book (OL3068980M). Their license is CC0, which is as liberal as possible AFAICT.
Also, they link back to LOC, Worldcat, LibraryThing and others.
As the guys behind it are the same ones behind the archive, I guess they are here to stay, and it's probably safe to assume they will still be there in decades.

Since it's a wiki you can find weird data (look at the cover of the book you mentioned) but the wiki philosophy has proven really good at the end of the day, so things like that don't worry me at all.

Ben said...

I said both, but for my purposes I prefer LCCN; for anything beyond a couple books, being locked into the worldcat interface and having to worry about their data restrictions is too big a problem. Plus, the duplicates thing...

What we really need is something like LCCN, but universal. OCLC should have been that, but the licensing keeps it from being a good standard. Silviot's right that Open Library is possibly preferable to either if the project succeeds; if LC volumes could be acceded into HathiTrust, Hathi identifier numbers would be as good.

John Muccigrosso said...

I'm with Silviot (though I rarely use openlibrary). I never use LoC anymore.

Ed Summers said...

I reckon the respondents who said you need both hit the nail on the head. OCLC has linkages to lots of institutions around the world, which LC knows nothing about. LC (and other National Libraries) brings some simplicity and perhaps non-commercial-longevity (at least in theory) to the table, which is also appealing.

It's an interesting thought experiment to consider what the role of OCLC would be in a world where libraries published their data and linkages to records at National Libraries on the Web. I am certain that there would still be a need for organizations (like OCLC) to aggregate, and reconcile the data, to build value added services on top of the distributed data. But having the mechanics open to view on the web, with clear licensing attached would help foster a lot more innovation and entrepreneurship around libraries.

More than anything, I think we need to work towards a place where we all take our institutional identifiers more seriously by making them world resolvable (as URLs), and then linking them to other resources elsewhere. So for example, how important are NYU's Catalog URLs to the equation? Perhaps hubs around data in New York City could prove more important to folks locally?

My hope is that we have yet to see libraries really engage with the Web...and that the humble start with Web OPACS, and subsequent efforts like the OpenLibrary, VIAF and worldcat.org are just the beginning.

Sebastian Heath said...

I agree that "both" is a good approach for now. The combination is likely to be a robust indicator of which work is being referred to, one that will be parsable for many years into the future. Of course, there is the further issue of how to specify links to individual parts of volumes: chapters in monographs, contributions to edited volumes, page ranges, etc. There's ongoing work in all of these areas, none of which makes me think, "ok, problem solved".

Silviot raised the possibility of linking to http://openlibrary.org . I should have included that as an option. But I am slightly embarrassed to say that I'm willing to give the library community time to get things "right" before relying on a user-generated content solution. Embarrassed because my heart usually lies with the crowd. In this instance, the presence of long-term institutions who can fill the role of stable references to scholarly content by means of recasting existing resources and policies suggests the possibility of a robust and well-funded solution. This is a gut-feeling call about the shortest route to the future that we all want.

[As always, my virtual avatar has flame-proof clothing on so feel free to respond to the above.]

Also, I had not looked at http://viaf.org in a while. It's very cool and can be an important nexus in a linked open data world. I'm guessing there's work or preliminary thinking on a similar tool for subjects: "visf.org". That would be an even more important nexus of links.

John Muccigrosso said...

Sebs,

In principle I agree that giving libraries time to sort things out is a good idea.

But (you knew that was coming), libraries, in my experience, are not exactly cutting edge in these areas. (Contact me "off-line" for my latest sad story.)