LOD-LAM: International Linked Open Data in Libraries, Archives, and Museums Summit

I’m really pleased to announce that I have been asked to join the organising committee for the International Linked Open Data in Libraries, Archives, and Museums Summit, which will take place on June 2-3, 2011 in San Francisco, California, USA. Applications are open until February 28th, and funding is available to help cover travel costs.

The International Linked Open Data in Libraries, Archives, and Museums Summit (“LOD-LAM”) will convene leaders in their respective areas of expertise from the humanities and sciences to catalyze practical, actionable approaches to publishing Linked Open Data, specifically:

  • Identify the tools and techniques for publishing and working with Linked Open Data.
  • Draft precedents and policy for licensing and copyright considerations regarding the publishing of library, archive, and museum metadata.
  • Publish definitions and promote use cases that will give LAM staff the tools they need to advocate for Linked Open Data in their institutions.

For more information see http://lod-lam.net/summit/about/.

The principal organiser/facilitator is Jon Voss (@LookBackMaps), Founder of LookBackMaps, along with Kris Carpenter Negulescu, Director of Web Group, Internet Archive, who is project managing.

I’m very chuffed to be part of the illustrious Organising Committee:

Lisa Goddard (@lisagoddard), Acting Associate University Librarian for Information Technology, Memorial University Libraries.
Martin Kalfatovic (@UDCMRK), Assistant Director, Digital Services Division at Smithsonian Institution Libraries and the Deputy Project Director of the Biodiversity Heritage Library.
Mark Matienzo (@anarchivist), Digital Archivist in Manuscripts and Archives at the Yale University Library.
Mia Ridge (@mia_out), Lead Web Developer & Technical Architect, Science Museum/NMSI (UK).
Tim Sherratt (@wragge), National Museum of Australia & University of Canberra.
MacKenzie Smith, Research Director, MIT Libraries.
Adrian Stevenson (@adrianstevenson), UKOLN; Project Manager, LOCAH Linked Data Project.
John Wilbanks (@wilbanks), VP of Science, Director of Science Commons, Creative Commons.

It’ll be a great event, I’m sure, so get your application in ASAP.

Two changes to the model and some definitions

Over the last few weeks we’ve been testing our initial cut at an EAD-to-RDF transform against a wider range of data: not only EAD documents prepared using the Hub data entry template, but also documents created using other tools, which vary somewhat in the markup conventions they use.

In the course of that, I’ve been pondering some of the choices we made in the model I described here and here, and we decided to make a couple of changes (one very minor and the second still relatively so, I think):

  • Archival Resource: We’ve changed the name of the class we were calling “Unit of Description” to “Archival Resource”. I think “Unit of Description” was problematic for two reasons. First, it was ambiguous, because it could be interpreted either as the unit (of archival material) being described (which is what was intended) or as a unit/part of the archival description (which is not what was intended). Second, I adopted it from the ISAD(G) standard, where the context is one in which the archival resources are considered to be the primary things being described. I’m less sure the label works in the “linked data” context, where we’re providing statements, and sets of statements (descriptions), “about” not just the archival materials but many other things. In this context, everything that is described (people, concepts, places, etc.) might be seen as, in some sense, a “unit of description”, and so using that label for one subset of them seems inappropriate. That left us needing to find a suitable alternative: a generic term that covers archival material in general, at any level of description (fonds, collection, item, etc.), and “archival resource” seemed like a reasonable fit.
  • Origination as Concept: When I first sketched out the model, I raised some questions, including (as “question 3” in that post) whether it was useful/necessary to model the origination of the archival resource as a pair of concept and agent, following the pattern used for the <controlaccess> terms. Having experimented with that approach, we’ve decided it introduces unnecessary complexity and we’ve fallen back on treating <origination> as a simple relation between archival resource and agent. The use of concept and agent is retained for the <controlaccess> case, where names are typically drawn from an “authority file”, as it allows us to maintain the distinction between a conceptualisation of the agent (as reflected by the authority record/entry) and the agent itself (a distinction which is also made in the model underpinning datasets such as VIAF, which we will be making links to).

The revised model is summarised in the following diagram (an amended version of Figure 3 from the earlier post):

Amended data model for EAD

i.e. an Archival Resource and a Biographical History are now related directly to an Agent.
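As a rough illustration of the difference between the two patterns, here is a minimal sketch in Python that builds the statements as plain (subject, predicate, object) tuples. The `ex:` identifiers and property names are made up for the example and are not the project's vocabulary. I also use `foaf:focus` as one plausible way of linking a concept to the agent it stands for; treat that as an assumption rather than a statement of what the model actually uses.

```python
# Sketch only. Every "ex:" name below is a hypothetical placeholder, and the
# use of foaf:focus is an assumption, not taken from the model itself.

def origination_triples(resource, agent):
    """Revised model: <origination> is one direct relation, resource -> agent."""
    return [(resource, "ex:origination", agent)]


def controlaccess_triples(resource, concept, agent, scheme):
    """<controlaccess> keeps the concept/agent pair.

    The concept stands for the authority-file entry; the agent is the
    person, family or organisation that the entry is about.
    """
    return [
        (resource, "ex:associatedWith", concept),
        (concept, "rdf:type", "skos:Concept"),
        (concept, "skos:inScheme", scheme),
        (concept, "foaf:focus", agent),
    ]


if __name__ == "__main__":
    triples = origination_triples("ex:resource1", "ex:agent1")
    triples += controlaccess_triples(
        "ex:resource1", "ex:concept1", "ex:agent1", "ex:authorityFile"
    )
    for s, p, o in triples:
        print(s, p, o, ".")
```

The point of the sketch is simply that the origination case now needs one statement, while the access-point case still goes through an intermediate concept so that the authority entry and the agent stay distinct.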

Below is a draft list of human-readable definitions for the classes in the model. Some are simply references to classes provided by existing vocabularies such as Dublin Core, FOAF and the event vocabularies:

Finding Aid
A document describing an archival resource.
Subclass of: bibo:Document, foaf:Document
A document conforming to the Encoded Archival Description standard.
Subclass of: bibo:Document, foaf:Document
Biographical History
A narrative or chronology that places the archival materials in context by providing information about their creator(s). A finding aid may contain several such narratives or chronologies pertaining to different archival materials and their creators.
Subclass of: bibo:DocumentPart, (bibo:Document), foaf:Document
An institution or agency responsible for providing access to archival materials.
Subclass of: foaf:Organization, (foaf:Agent), dcterms:Agent
= wgs84_pos:SpatialThing
Postcode Unit
= ospc:PostcodeUnit
Archival Resource
Recorded information in any form or medium, created or received and maintained, by an organization or person(s) in the transaction of business or the conduct of affairs, and maintained for its long-term research value. An archival resource may be an individual item, such as a letter or photograph, or (more commonly) some aggregation of such items managed and described as a unit.
An indicator of the part of an archival collection constituted by an archival resource, whether it is the whole collection or a sub-section of it.
Subclass of: skos:Concept
= lvont:Language
The size of an archival resource.
Subclass of: dcterms:SizeOrDuration
Temporal Entity
= time:TemporalEntity
An event that resulted in the creation or accumulation of an archival resource.
Subclass of: event:Event, lode:Event
= skos:Concept
Concept Scheme
= skos:ConceptScheme
= foaf:Agent, dcterms:Agent
= foaf:Person, (foaf:Agent), dcterms:Agent
A group of people affiliated by consanguinity, affinity, or co-residence.
Subclass of: foaf:Group, (foaf:Agent), dcterms:Agent
= foaf:Organization, (foaf:Agent), dcterms:Agent
Genre or Form
A category of archival material, defined either by style or technique of intellectual content, order of information or object function, or physical characteristics.
Subclass of: skos:Concept
A sphere of activity or process.
Subclass of: skos:Concept
= bio:Birth, (bio:IndividualEvent), (bio:Event), (event:Event), (lode:Event)
= bio:Death, (bio:IndividualEvent), (bio:Event), (event:Event), (lode:Event)
= foaf:Document, bibo:Document
= bibo:Book, (bibo:Document), (foaf:Document)
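To show how the “Subclass of” lines in this list can be read, here is a small Python sketch that records a few of the direct superclasses and works out everything a class inherits from. The local class names (`ex:FindingAid` and so on) are hypothetical. I am also assuming that a name in parentheses is a superclass implied by the class listed before it, so `bibo:DocumentPart` is recorded as a subclass of `bibo:Document`.

```python
# Sketch only. The "ex:" class names are hypothetical, and the reading of the
# parenthesised names as implied superclasses is an assumption.

SUBCLASS_OF = {
    "ex:FindingAid": ["bibo:Document", "foaf:Document"],
    "ex:BiographicalHistory": ["bibo:DocumentPart", "foaf:Document"],
    "ex:GenreForm": ["skos:Concept"],
    "bibo:DocumentPart": ["bibo:Document"],
}


def all_superclasses(cls, direct=SUBCLASS_OF):
    """Return every superclass of cls, following subclass links transitively."""
    seen = []
    queue = list(direct.get(cls, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(direct.get(current, []))
    return seen


if __name__ == "__main__":
    for name in ("ex:FindingAid", "ex:BiographicalHistory", "ex:GenreForm"):
        print(name, "is a kind of", ", ".join(all_superclasses(name)))
```

With this reading, a Biographical History picks up `bibo:Document` by way of `bibo:DocumentPart`, which matches the parenthesised entry in the list.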

Locah Lightning at Dev8d

This is just a quick post to say that I’ll be giving a “lightning talk” on the Locah project at 2.45pm this Wednesday 16th February at the Dev8d developer event in London. If you’ve got any questions or would like to know more about the project, then please come along to the session. I should be at Dev8d for the full two days, so grab me any time if you can’t make the session.

I’ll also be participating in a panel session on Linked Data, but I’m not sure yet when this is scheduled.

Abstract for the talk:

“The Locah project is making records from the Archives Hub service and Copac service available as Linked Data. The Archives Hub is an aggregation of archival metadata from repositories across the UK; Copac provides access to the merged library catalogues of libraries throughout the UK, including all national libraries. In each case the aim is to provide Linked Data according to the principles set out by Tim Berners-Lee, so that we make our data interconnected with other data and contribute to the growth of the Semantic Web. The talk will touch on data modelling, the selection of vocabularies and the design of URI patterns. It will look at the practical realities of how we are turning the Archives Hub EAD data and Copac MODS data into RDF/XML, and then loading it into triple stores. The talk will conclude with a look at some of the main opportunities and barriers to the creation and use of Linked Data. There will be a panel session on Linked Data where delegates can ask further questions.”

I’ve also added my tune to the Dev8d playlist, the sublime ‘French Disko’ by Stereolab.

Postscript 22nd February 2011:

I’ve now uploaded my slides from this talk to SlideShare and embedded them below. The talk was primarily aimed at developers, with the assumption that they knew a bit about RDF and Linked Data, so it doesn’t discuss these except in passing. I was mainly trying to give some specifics on the technicalities involved, and on the platforms and tools we’re using, so that people can follow the same path if they want to. Please comment below with any questions.

It was another great #dev8d this year, and especially useful for me in terms of learning more about Linked Data-related technologies. Top job to organiser Mahendra and the rest of the UKOLN team involved.

[slideshare id=7000641&doc=dev8d2011-110221081440-phpapp02]