Extending Domain-Specific Resources to Enable Semantic Access to Cultural Heritage Data


  • Paul D Clough, University of Sheffield
  • Neil Ireson, University of Sheffield
  • Jennifer Marlow, Carnegie Mellon University


Cultural heritage material often contains rich semantic information, which can be utilised for alternative forms of information access beyond keyword searching and browsing by subject categories. To provide such functionality, it is desirable to annotate all the material in a collection with named entities and their relationships, so that the entire collection is available for semantic search. In this paper, we examine issues involved in the automatic semantic annotation of information about artists from Tate Online using a pre-existing domain-specific structured resource (ULAN). In particular, we focus on extending ULAN's coverage of artists and their associated semantic properties (e.g. birth/death date, birth/death location) by applying focused crawling and automatic information extraction techniques to exploit semi-structured sources of information. This enables the cross-referencing of collections against a range of information sources, thereby improving visibility and end-user information access.

Author Biographies

Paul D Clough, University of Sheffield

Department of Information Studies, Lecturer in Information Systems

Neil Ireson, University of Sheffield

Department of Computer Science, Research Associate