Exploring Contagion and Migration in European Cultural Memory via Text Mining

This paper presents an overview of the Contagion, Biopolitics and Migration in European Cultural Memory project, which aims to combine data analytics and cultural analytics to investigate themes of disease, health, and migration in the British Library Digital Labs corpus. The project plans to study the long-term effects and influence of cultural representations on public understanding of infectious diseases and their prevention. Specifically, in this work we describe the use of methods from text mining and machine learning to study a corpus of over 47,000 texts, covering fiction and non-fiction, ranging from the 18th century to the early 20th century. The natural language processing techniques involved include word embedding and topic modelling. For instance, language pertaining to disease, health, and migration is explored through lexicons generated using word embedding methods trained on the library corpus. We show how the outputs of these methods can be used to characterise and better understand the discourse around disease and migration during this time period.
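
As an illustration of how a lexicon might be derived from word embeddings, the following minimal sketch ranks vocabulary words by cosine similarity to the centroid of a few seed terms. It is not the project's actual pipeline: the seed-expansion strategy, the function names `cosine` and `expand_lexicon`, and the three-dimensional vectors in `TOY_VECTORS` are all assumptions made for illustration. In practice the vectors would come from an embedding model trained on the library corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_lexicon(seeds, vectors, top_n=3):
    """Return the top_n non-seed words closest to the centroid of the seeds.

    Hypothetical helper: the centroid-based ranking is an assumed strategy,
    not one described in the abstract.
    """
    known = [s for s in seeds if s in vectors]
    if not known:
        return []
    dim = len(vectors[known[0]])
    # Average the seed vectors component by component.
    centroid = [sum(vectors[s][i] for s in known) / len(known) for i in range(dim)]
    candidates = [
        (word, cosine(centroid, vec))
        for word, vec in vectors.items()
        if word not in seeds
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


# Made-up toy vectors for illustration only; not derived from any corpus.
TOY_VECTORS = {
    "cholera": [0.9, 0.1, 0.0],
    "fever": [0.8, 0.2, 0.1],
    "contagion": [0.85, 0.15, 0.05],
    "plague": [0.95, 0.05, 0.0],
    "emigrant": [0.1, 0.9, 0.1],
    "voyage": [0.05, 0.8, 0.3],
    "garden": [0.0, 0.1, 0.9],
}

lexicon = expand_lexicon(["cholera", "fever"], TOY_VECTORS, top_n=2)
for word, score in lexicon:
    print(f"{word}\t{score:.3f}")
```

With these toy vectors, the disease seeds pull in the other disease-related words ahead of the migration-related ones. The same ranking could be repeated with migration seeds to build a second lexicon.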