Lifting the Lid on Linked Data at ELAG 2011

Jane and I gave our ‘Lifting the Lid on Linked Data’ presentation at the ELAG (European Library Automation Group) Conference 2011 in Prague today. It seemed to go pretty well. There were a few comments on the #elag2011 Twitter stream about the licensing situation for the Copac data, which is something we’re still working on.

[slideshare id=8082967&doc=elag2011-locah-110524105057-phpapp02]

LOCAH Project – Wider Benefits to Sector & Achievements for Host Institution

Meeting a need

High-quality research and teaching rely partly on access to a broad range of resources. Archive and library materials inform and enhance knowledge and are central to the JISC strategy. JISC invests in bibliographic and archival metadata services to enable discovery of, and access to, those materials, and we know the research, teaching and learning communities value those services.

As articulated in the Resource Discovery Taskforce Vision, that value could be increased if the data can be made to “work harder”, to be used in different ways and repurposed in different contexts.

Providing bibliographic and archive data as Linked Data creates links with other data sources, and allows the development of new channels into the data. Researchers are more likely to discover sources that may materially affect their research outcomes, and the ‘hidden’ collections of archives and special collections are more likely to be exposed and used.

Archive data is by its nature incomplete and often sources are hidden and little known. User studies and log analyses indicate that Archives Hub users frequently search laterally through the descriptions; this gives them a way to make serendipitous discoveries. Linked data is a way of vastly expanding the benefits of lateral search, helping users discover contextually related materials. Creating links between archival collections and other sources is crucial – archives relating to the same people, organisations, places and subjects are often widely dispersed. By bringing these together intellectually, new discoveries can be made about the life and work of an individual or the circumstances surrounding important historical events. New connections, new relationships, new ideas about our history and society. Put this together with other data sources, such as special collections, multimedia repositories and geographic information systems, and the opportunities for discovery are significantly increased.

Similarly, by making Copac bibliographic data available as Linked Data we can increase the opportunities for developers to provide contextual links to primary and secondary source material held within the UK’s research libraries and an increasing number of specialist libraries, including the British Museum, the National Trust, and the Royal Society. The provision of library and special collections content as Linked Data will allow developers to build interfaces that link contextually related historical sources, even where those sources have been curated and described using differing methodologies. The differences in these methodologies and the emerging standards for description and access have made meaningful cross-searching and interlinking of this related content a distinct challenge – a Linked Data approach offers the potential to overcome that significant hurdle.

Researchers and teachers will have the ability to repurpose data for their own specific use. Linked Data provides flexibility for people to create their own pathways through Archives Hub and Copac data alongside other data sources. Developers will be able to provide applications and visualisations tailored to the needs of researchers, learning environments, institutional and project goals.

Innovation

Archives are described hierarchically, and this presents challenges for the output of Linked Data. In addition, descriptions are a combination of structured data and semi-structured data. As part of this project, we will explore the challenges in working with semi-structured data, which can potentially provide a very rich source of information. The biographical histories for creators of archives may provide unique information that has been based on the archival source. Extracting event-based data from this can really open up the potential of the archival description to be so much more than the representation of an archive collection. It becomes a much more multi-faceted resource, providing data about people, organisations, places and events.
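To make the hierarchy challenge a little more concrete, here is a minimal sketch of how a multi-level description might be flattened into simple subject-property-value statements. The identifiers, the property names (`title`, `memberOf`) and the `flatten` helper are all hypothetical, invented for illustration; this is not the data model the project uses, just one way of picturing the problem.

```python
# A hypothetical three-level archival description: collection > series > item.
# All identifiers and property names here are invented for illustration.
collection = {
    "id": "coll1",
    "title": "Papers of an Example Society",
    "children": [
        {
            "id": "coll1/series1",
            "title": "Minutes",
            "children": [
                {"id": "coll1/series1/item1", "title": "Minute book, 1901", "children": []},
            ],
        },
    ],
}


def flatten(unit, parent=None):
    """Walk the hierarchy and return (subject, property, value) statements."""
    triples = [(unit["id"], "title", unit["title"])]
    if parent is not None:
        # Record the link to the parent level so the context is not lost.
        triples.append((unit["id"], "memberOf", parent))
    for child in unit.get("children", []):
        triples.extend(flatten(child, parent=unit["id"]))
    return triples


triples = flatten(collection)
for triple in triples:
    print(triple)
```

Each level becomes a resource in its own right, with an explicit statement pointing back to its parent. That explicit link matters because a lower-level description often only makes sense when read in the context of the levels above it.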

The library community is beginning to explore the potential of Linked Data. The Swedish and Hungarian National Libraries have exposed their catalogues as Linked Data, the Library of Congress has exposed subject authority data (LCSH), and OCLC is now involved in making the Virtual International Authority File (VIAF) available in this way.

By treating the entities (people, places, concepts etc.) referred to in bibliographic data as resources in their own right, we can make links to other data referring to those same resources. Those other sources can be used to enrich the presentation of bibliographic data, and the bibliographic data can be used in conjunction with other data sources to create new applications.
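As a rough illustration of that idea, the sketch below gives a person a single identifier that is shared by a bibliographic record and an archival description, so that the two can be found together. The URIs, the property names and the `resources_about` helper are assumptions made up for this example; they are not real Copac or Archives Hub identifiers.

```python
# Hypothetical identifier for a person, treated as a resource in its own right.
PERSON = "http://example.org/id/person/jane-doe"

# Two separate datasets, written as (subject, property, value) statements.
# All URIs and property names are invented for illustration.
bibliographic = [
    ("http://example.org/id/book/1", "creator", PERSON),
    ("http://example.org/id/book/1", "title", "Collected Letters"),
]
archival = [
    ("http://example.org/id/archive/42", "origination", PERSON),
    ("http://example.org/id/archive/42", "title", "Papers of Jane Doe"),
]


def resources_about(entity, *datasets):
    """Return every subject, in any dataset, that points at the given entity."""
    return sorted(
        {subject for triples in datasets for (subject, _prop, value) in triples if value == entity}
    )


linked = resources_about(PERSON, bibliographic, archival)
print(linked)
```

Because both datasets refer to the person by the same identifier rather than by a name string, the book and the archive can be connected without any matching on spelling or form of name.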

Copac is the largest union catalogue of bibliographic data in the UK, and one of the largest in the world, and its exposure as Linked Data can provide a rich data source, of particular value to the research, learning and teaching communities.

In answering the call, we will be able to report on the challenges of the project, and how we have approached them. This will be of benefit to all institutions with bibliographic and archival data looking to maximise its potential. We are very well placed within the research and teaching communities to share our experiences and findings.