This project has created a federated search facility, ‘Connected Histories’, which brings together a critical mass of quality content drawn from a wide range of electronic sources on the subject of early modern and nineteenth-century British history. More than simply creating a portal for accessing these historical resources, this project combines web crawling with Natural Language Processing techniques in order to remotely ‘tag’ previously unstructured texts and allow consistent, structured searching of names, places and dates. In so doing, the project has added a new level of precision and intellectual rigour to the search process.
The Connected Histories search engine, which sits as an ‘umbrella’ over all the sources in the cluster, was developed by the Humanities Research Institute (HRI) and is hosted by the Institute of Historical Research (IHR) within the University of London. Testing was carried out by historians at Sheffield, Hertfordshire, and the Institute of Historical Research. Evaluation was conducted by the Centre for Computing in the Humanities, King’s College London.
In the first instance, ‘Connected Histories’ incorporated the following distributed historical sources:
- British History Online, including the forthcoming People in Place Database
- Old Bailey Proceedings Online, 1674-1913
- Plebeian Lives and the Making of Modern London (incorporating several databases deposited with the former AHDS; website launch March 2011)
- 17th-19th Century Burney newspaper collection from the British Library
- Origins Network
- Parliamentary Papers
- Clergy of the Church of England Database 1540-1835
- Strype’s Survey of London
- Charles Booth Online Archive
In total, Connected Histories provided access to fourteen major databases of primary source texts, containing more than 412 million words, plus 469,000 publications, 3.1 million further pages of text, 87,000 maps and images, 254,000 individuals in databases, and over 100 million name instances.
Connected Histories has since gone on to incorporate many more datasets and continues to grow, providing access to in excess of 10 billion words.
Duration: 1st October 2009 – 31st March 2011
- Prof. Robert Shoemaker (University of Sheffield)
- Prof. Tim Hitchcock (University of Hertfordshire)
- Dr Jane Winters (Institute of Historical Research, University of London)
- Dr Sharon Howard (Project Manager – The Digital Humanities Institute)
- Katherine Rogers (Developer – The Digital Humanities Institute)