As part of the Jisc project Spotlight on the Digital, The Digital Humanities Institute developed a technical specification for a tool to address the problem of digital orphans. Following the successful completion of that project, The Digital Humanities Institute was commissioned to undertake the technical design, build, documentation and testing of the proposed tool (internal name “Dewdrop”). The technical work ran alongside a process of demand investigation, business planning, ‘testing in the wild’ and communication planning, which Jisc undertook with the support of The Digital Humanities Institute.
Digital orphans are online assets (in this case, research resources) that are deemed to be undiscoverable, unused, unknown or forgotten by the wider research community because they are invisible or inaccessible to the normal mechanisms of discovery, such as search engines, subject catalogues, aggregation sites and other subject-specific websites. The invisibility of online resources can be due to a combination of factors such as poor technical design, poor presentation of content, poor marketing and an absence of individual and/or institutional support.
The tool proposed in the specification is intended to address these problems by generating a discovery-friendly version of a resource’s textual content at the record level. This version, presented as a set of optimised data records, will then mediate between the resource and discovery services. The tool works in two stages: a Crawler retrieves a copy of the resource’s textual content, including data held in databases; and an Analyser generates discovery-friendly records from that content using Natural Language Processing techniques.
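The two-stage idea can be illustrated with a minimal Python sketch. It is not Dewdrop's code: the function names, the record fields and the stopword list are assumptions made for illustration, the page is passed in as a string rather than fetched, and a simple term-frequency count stands in for the Natural Language Processing techniques mentioned above.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "by"}


class TextExtractor(HTMLParser):
    """Crawler stand-in: collects the title and visible text of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip = 0          # depth inside <script>/<style>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._in_title:
            self.title = text
        else:
            self.chunks.append(text)


def extract_text(html):
    """Return (title, visible text) for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, " ".join(parser.chunks)


def build_record(url, html, max_keywords=5):
    """Analyser stand-in: derive a simple record from the page text.

    Keywords are the most frequent non-stopwords, not real NLP output.
    """
    title, text = extract_text(html)
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOPWORDS and len(w) > 2]
    keywords = [w for w, _ in Counter(words).most_common(max_keywords)]
    return {"url": url, "title": title,
            "description": text[:160], "keywords": keywords}


# Made-up sample page, used only to exercise the sketch.
SAMPLE = """<html><head><title>Parish Registers 1650</title>
<style>p { color: red; }</style></head>
<body><p>Parish registers record baptisms.</p>
<p>Registers also record burials in the parish.</p>
<script>var x = 1;</script></body></html>"""

if __name__ == "__main__":
    print(build_record("https://example.org/record/1", SAMPLE, max_keywords=3))
```

A real crawler would also need to follow links and reach content held in databases, and a real analyser would draw on richer language processing; the sketch only shows how a per-record summary might sit between a resource and a discovery service.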
Dewdrop Code and Documentation
- The Digital Humanities Institute’s Specification for Dewdrop
- Spotlight on the Digital (The Digital Humanities Institute website)
- Spotlight on the Digital (Jisc website)
Duration: November 2015 – July 2016
Image Credits: Partial map of the internet developed by opte.org
- Michael Pidd (Principal Investigator – The Digital Humanities Institute)
- Ryan Bloor (Developer – The Digital Humanities Institute)