Natural Language Generation: Negotiating Text Production in our Digital Humanity

Natural language generation (NLG) – the process wherein computers translate data into readable human languages – has become increasingly present in our modern digital climate. In the last decade, numerous companies specialising in the mass-production of computer-generated news articles have emerged; National Novel Generation Month (NaNoGenMo) has become a popular annual event; #botALLY is used to identify those in support of automated agents producing tweets. Yet NLG has not been subject to any systematic study within the humanities.

This paper offers a glimpse into the social and literary implications of computer-generated texts and NLG. More particularly, it examines how NLG output challenges traditional understandings of authorship and what it means to be a reader. Any act of reading engages interpretive faculties; modern readers tend to assume that a text is an effort to communicate a particular pre-determined message. With this assumption, readers assign authorial intention, and hence develop a perceived contract between the author and the reader. NLG, however, brings this contract into question. The author of a computer-generated text is often an obscured figure, an uncertain entanglement of human and computer. How does this obscuration of authorship change how a text is received? What does this obscuration say about our new digital humanity?

This paper will present the results of a series of studies conducted by the researcher to discern how readers attribute authorship to computer-generated texts. These results suggest that many everyday readers regard NLG systems as more than just tools for manifesting human vision. Indeed, readers attribute agency to the systems themselves. Consideration of such implications of NLG is vital as we venture deeper into the digital age. Computer-generated texts may not just challenge traditional understandings of authorship: they may engender new understandings of authorship altogether as readers explore the conceptual gap between human and computer language production.