The power of connections: unlocking the Web of data

Welcome to our project blog, just set up today. Lots more to come, but here’s a news item Jane Stevenson from Mimas has written to get us going:

Mimas and UKOLN are working together on an exciting JISC-funded project to make our Archives Hub and Copac data available as structured Linked Data, for the benefit of education and research. We will also be working in partnership with Eduserv, Talis and OCLC, leading experts within their fields. We want to put archival and bibliographic data at the heart of the Linked Data Web, enabling new links to be made between diverse content sources and enabling the free and flexible exploration of data so that researchers can make new connections between subjects, people, organisations and places to reveal more about our history and society.

Linked Data uses the RDF data model to identify concepts and to describe relationships between those concepts. It promotes the idea of a Web of data rather than a Web of documents. The more document-centric approach, based on Web pages, does not readily expose data within the text in a way that applications can process, so the wealth of information within a page is of limited value. Both the Archives Hub and Copac have so much rich data within them, and with Linked Data it can be brought to the fore by structuring concepts within the data in a way that identifies them and facilitates linking to them. Data can be combined in a way that results in new correlations, new perspectives and new discoveries.
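To make the idea of identified concepts and described relationships a little more concrete, here is a minimal sketch in Python. It does not use a real RDF library; it simply models each statement as a (subject, predicate, object) triple, which is the basic shape of the RDF data model. All of the example.org URIs, the property names and the sample title are made up for illustration and are not identifiers or data the project will publish.

```python
# Illustrative only: RDF-style statements modelled as (subject, predicate, object) tuples.
# Every URI and value below is a hypothetical placeholder, not real Archives Hub or Copac data.
EX = "http://example.org/"

triples = {
    (EX + "archive/sample-papers", EX + "prop/heldBy", EX + "org/john-rylands-library"),
    (EX + "archive/sample-papers", EX + "prop/title", "Sample collection of papers"),
    (EX + "org/john-rylands-library", EX + "prop/locatedIn", EX + "place/manchester"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Ask which statements say that something is held by something else.
held = match(triples, p=EX + "prop/heldBy")
```

Because the library in this sketch is identified by a URI rather than by a string buried in a page, an application can pick it out and follow it to further statements, such as where it is located.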

http://www.flickr.com/photos/reedsturtevant/4288406572/


Mimas is keen to explore new ways to open up data for the benefit of our users. Providing bibliographic and archive data as Linked Data enables links with other data sources and creates new channels into the data. Researchers are more likely to discover sources that may materially affect their research outcomes. It means that we can give researchers the potential to combine data sources for themselves, so that we do not need to predict the use of the data.

We know that researchers using the Hub or Copac are sometimes looking for a particular piece of information, such as a photograph of a library, or the birth date of a writer, or the location of an event. Linked Data can be valuable here because it helps to pin down concepts. If a researcher is looking for a photograph of John Rylands Library in Manchester, for example, Linked Data can clarify the concepts – a photograph, the library, ‘John Rylands’ as the name of a library, ‘John Rylands’ as a Victorian philanthropist, ‘Manchester’ as a place in England. It enables us to link across to other sources that can provide further information about these concepts. If a researcher is gathering information around a subject area, they can benefit from these linked concepts and explore the Web much more fully because the data is no longer held within silos.
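The John Rylands example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two small "datasets", the example.org URIs and the property names are all invented here, and plain tuples stand in for real RDF. The point is that the same label can belong to two different identified things, and that data from separate sources can be combined once both use the same kind of statement.

```python
# Illustrative only: two tiny "datasets" as sets of (subject, predicate, object) tuples.
# All URIs and property names are hypothetical placeholders invented for this sketch.
EX = "http://example.org/"
LABEL = EX + "prop/label"
TYPE = EX + "prop/type"

hub_data = {
    (EX + "org/john-rylands-library", LABEL, "John Rylands"),
    (EX + "org/john-rylands-library", TYPE, EX + "type/Library"),
    (EX + "org/john-rylands-library", EX + "prop/locatedIn", EX + "place/manchester"),
}
other_data = {
    (EX + "person/john-rylands", LABEL, "John Rylands"),
    (EX + "person/john-rylands", TYPE, EX + "type/Person"),
    (EX + "place/manchester", LABEL, "Manchester"),
    (EX + "place/manchester", EX + "prop/partOf", EX + "place/england"),
}

# Both sources use the same statement shape, so combining them is a set union.
combined = hub_data | other_data


def things_labelled(graph, label):
    """Map each URI that carries the given label to the type recorded for it."""
    uris = {s for (s, p, o) in graph if p == LABEL and o == label}
    return {s: o for (s, p, o) in graph if p == TYPE and s in uris}


# One label, two distinct concepts: a library and a person.
rylands = things_labelled(combined, "John Rylands")
```

Here the label 'John Rylands' resolves to two separate URIs with different types, and the library's location can be followed into the second source to learn that Manchester is part of England, which neither source states about the library on its own.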

Archive data is by its nature incomplete, and potentially valuable sources are often difficult to identify. Bibliographic data is vast and it can be difficult to make useful connections. Researchers frequently search laterally through the descriptions, which gives them a way to make serendipitous discoveries. Linked Data could vastly expand the benefits of lateral search, helping users discover contextually related materials. Creating links just between cultural heritage collections can bring great benefits – archives, artifacts and published works relating to the same people, organisations, places and subjects are often widely dispersed. By bringing these together intellectually, new discoveries can be made about the life and work of an individual or the circumstances surrounding important historical events. New connections, new relationships, new ideas about our history and society. Put this together with other data sources, such as special collections, multimedia repositories and geographic information systems, and the opportunities for discovery are significantly increased. A Linked Data approach offers the potential to overcome differences in methodologies and standards for description and access, which can hinder meaningful cross-searching and interlinking of related content.

Linked Data can enable researchers and teachers to repurpose data for their own specific use. It provides flexibility for people to create their own pathways through Archives Hub and Copac data alongside other data sources. Developers will be able to provide applications and visualisations tailored to the needs of researchers, learning environments, institutional and project goals.

This project, named LOCAH (Linked Open Copac and Archives Hub), is exploratory, and real-world applications of Linked Data are still in the early stages. Whilst the benefits could be extensive, we know that there are challenges, in particular concerns about the resources required to create Linked Data and the availability of tools to make use of it. A number of key data sources are now available as Linked Data, such as BBC data, Wikipedia and Government datasets. In addition, developers are busy creating tools to make the data easy to query and process. By getting involved in creating Linked Data, we can explore the benefits and pitfalls of exposing archival and bibliographic data in this way. This is a project that enables us to contribute to a global effort to unlock the enormous potential within our data for the benefit of researchers and society as a whole.