natural language generation

By leahhenrickson, 12 September, 2019
Abstract (in English)

Natural language generation (NLG) – when computers produce text-based output in readable human languages – is becoming increasingly prevalent in our modern digital age. This paper will review the ways in which an NLG system may be framed in popular and scholarly discourse: namely, as a tool or as an agent. It will consider the implications of such perspectives for general perceptions of NLG systems and computer-generated texts. Negotiating claims made by system developers and the opinions of ordinary readers amassed through empirical studies conducted for this research, this paper delves into a theoretical and philosophical exploration of questions of authorial agency related to computer-generated texts by considering whether NLG systems constitute tools for manifesting human intention or agents in themselves.

This paper will begin by considering NLG systems as tools for manifesting human intent, the more commonly expressed view amongst developers and readers. An NLG system arguably serves as an extension of a human self (e.g. the developer or the user). Yet one cannot ignore the increasing autonomy of such systems. At what point does an extension of the self become a distinct entity altogether?

The discussion will then shift to considering NLG systems as agents in themselves. As evidenced by the results of studies conducted for this research, ordinary readers do tend to attribute authorship to computer-generated texts. However, these readers often attribute authorship to the system rather than its developers, indicating that – in some way – the system is distinct enough from its creators to warrant the title of author. Yet conventional modern understandings of the word ‘author’ suggest that authorship at least partly presumes intention-driven agency. Do NLG systems adhere to this expectation? Through reference to various theoretical perspectives, this paper will argue that some NLG systems may surpass the ‘tool’ title and more appropriately be deemed authorial agents. This type of agency, however, is characterised not so much by the free-will intention of human writers as by the intention to fulfil a designated objective that is respected within broader social contexts. When readers attribute authorship to the NLG system itself, that entity is permitted a place within the fluid social networks that humans populate. The NLG system becomes an algorithmic author.

Description (in English)

Eververse is a project which synthesises perspectives from disciplines in the humanities and sciences to develop critical and creative explorations of poetry and poetic identity in the digital age. Eververse sends biometric data from a fitness tracking device worn by the poet to its custom-built poetry generator. This generator utilises NLG techniques to output poetic text published in real time, and 24/7, on the Eververse website.

Technical notes

The Eververse application consists of three main modules. The first module interfaces with the Fitbit device and its data through its Application Programming Interface (API). The activity data of the poet wearing the device is then sent, in JSON form, to the NLG module, referred to as the ‘generator’. This generator carries out a number of steps to generate and return a poetic couplet according to a conceptual model of states derived from the activity information contained in the passed JSON data. The number of words and the frequency of the generated couplets correlate with the heart rate of the poet, whereas the textual content of the couplet is generated from the input corpus that is fed to the generator. The input corpus currently comprises fifty-nine poems on the topic of the body; all are previously published and none is composed by the Eververse poet. A separate, nocturnal corpus is deployed when sleep data is passed to the generator. In order to disassemble and reassemble the corpora for publication in Eververse, they are arranged in a reverse n-gram matrix and further shaped into a frequency lookup table by Poesy, a Markov model-based natural language poetry generator. The lookup table is used to create verse lines, and a Python library is deployed to rhyme the verses. In short, our method takes a language model approach similar to that of Barbieri et al., although we do exploit some semantics, specifically the alignment of couplets with Fitbit activity states. Future work will involve experimenting with language resources such as WordNet and SentiWordNet, similar to previous work by Tobing and Oliveira.
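The reverse n-gram lookup described above can be illustrated with a minimal sketch. This is not Poesy's code: it assumes a bigram (order-1) table, a made-up three-line toy corpus, and a naive suffix check standing in for the unnamed Python rhyming library. All function names here are hypothetical. The idea it shows is that, by recording which words precede each word, a line can be built backwards from a chosen rhyme word.

```python
import random
from collections import Counter, defaultdict

# Toy corpus invented for illustration; it is not the Eververse corpus.
TOY_CORPUS = [
    "the heart beats in the night",
    "my pulse runs to the light",
    "the body holds the breath",
]


def build_reverse_table(lines):
    """Map each word to a Counter of the words that precede it."""
    table = defaultdict(Counter)
    for line in lines:
        words = line.lower().split()
        for prev, word in zip(words, words[1:]):
            table[word][prev] += 1
    return table


def generate_line(table, end_word, length, rng):
    """Build a line backwards from end_word, sampling predecessors by frequency."""
    words = [end_word]
    while len(words) < length:
        options = table.get(words[0])
        if not options:
            break  # no known predecessor: stop early with a shorter line
        candidates, weights = zip(*options.items())
        words.insert(0, rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(words)


def naive_rhyme(a, b):
    """Crude stand-in for a rhyming library: distinct words sharing a 3-letter ending."""
    return a != b and a[-3:] == b[-3:]


def generate_couplet(table, rhyme_a, rhyme_b, length, rng):
    """Generate two lines that end on a (naively) rhyming pair of words."""
    if not naive_rhyme(rhyme_a, rhyme_b):
        raise ValueError("end words do not rhyme under the naive check")
    return [
        generate_line(table, rhyme_a, length, rng),
        generate_line(table, rhyme_b, length, rng),
    ]


if __name__ == "__main__":
    table = build_reverse_table(TOY_CORPUS)
    for line in generate_couplet(table, "night", "light", 4, random.Random(0)):
        print(line)
```

Generating backwards is what makes the reverse table useful: the rhyme word is fixed first, and the rest of the line is sampled to lead up to it.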

The generator is written mainly in the Python programming language using the micro web framework, Flask. It consists of a web interface to display the generated poetry and an administrator interface that is used to define heart rate parameters for different zones and to determine the form and content of the verse that corresponds to these zones.
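As a rough sketch of how administrator-defined zones might drive the form of the verse, the following maps a heart rate to a zone and a corpus choice. The zone names, thresholds, word counts, intervals, and JSON field names (`heart_rate`, `asleep`) are all assumptions made for illustration; they are not taken from the Eververse code or from the Fitbit API's schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    min_bpm: int                   # inclusive lower bound of the zone
    words_per_line: int            # verse form for this zone
    seconds_between_couplets: int  # how often a couplet is published


# Hypothetical values an administrator might enter; higher heart rate
# means longer lines and more frequent couplets in this sketch.
ZONES = [
    Zone("rest", 0, 4, 120),
    Zone("active", 90, 6, 60),
    Zone("peak", 140, 8, 20),
]


def zone_for(bpm, zones=ZONES):
    """Return the zone with the highest lower bound not exceeding bpm."""
    eligible = [z for z in zones if z.min_bpm <= bpm]
    if not eligible:
        raise ValueError(f"no zone covers {bpm} bpm")
    return max(eligible, key=lambda z: z.min_bpm)


def plan_couplet(payload_json, zones=ZONES):
    """Turn a JSON activity payload into generation parameters."""
    data = json.loads(payload_json)
    zone = zone_for(data["heart_rate"], zones)
    # The nocturnal corpus is selected when the payload carries sleep data.
    corpus = "nocturnal" if data.get("asleep", False) else "body"
    return {
        "zone": zone.name,
        "words_per_line": zone.words_per_line,
        "interval_s": zone.seconds_between_couplets,
        "corpus": corpus,
    }


if __name__ == "__main__":
    print(plan_couplet('{"heart_rate": 95, "asleep": false}'))
```

Keeping the zones as plain data rather than hard-coded branches matches the idea of an administrator interface: the thresholds can be edited without touching the generation logic.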

The public user interface created to display the generated poetry relies heavily on a number of open-source JavaScript libraries. These libraries enable display of the generated text (Handlebars.js, Textillate.js), the retrieval of data from the web application’s API and user interface animations (jQuery), and the creation of generative background images (p5.js). The dynamic background images are created in real time and use the activity data as an input to affect their form and colour, providing a visual correlate to the generated poetry.

Multiple versions of this interface were created for deployment on the web, in a live performance environment, and for display in a standalone exhibition setting. Each interface was adapted to the context in which it would be experienced: for example, whether and how user interaction was required, and the differing requirements for text size, line spacing, and overall page layout.

By leahhenrickson, 13 August, 2018
Abstract (in English)

Natural language generation (NLG) – the process wherein computers translate data into readable human languages – has become increasingly present in our modern digital climate. In the last decade, numerous companies specialising in the mass-production of computer-generated news articles have emerged; National Novel Generation Month (NaNoGenMo) has become a popular annual event; #botALLY is used to identify those in support of automated agents producing tweets. Yet NLG has not been subject to any systematic study within the humanities.

This paper offers a glimpse into the social and literary implications of computer-generated texts and NLG. More particularly, and in line with the ELO 2018 Conference’s 'Mind the Gap!' theme, this paper examines how NLG output challenges traditional understandings of authorship and what it means to be a reader. Any act of reading engages interpretive faculties; modern readers tend to assume that a text is an effort to communicate a particular pre-determined message. With this assumption, readers assign authorial intention, and hence develop a perceived contract between the author and the reader. This paper refers to this author-reader contract as ‘the hermeneutic contract’.

NLG output in its current state brings the hermeneutic contract into question. The hermeneutic contract’s communication principle rests on two assumptions: that readers believe that authors want them to be interested in their texts, and that authors want readers to understand their texts. Yet the author of a computer-generated text is often an obscured figure, an uncertain entanglement of human and computer. How does this obscuration of authorship change how text is received?

This paper will begin with an introduction to, and brief history of, NLG geared towards those with no previous knowledge of the subject. The remainder of the paper will review the results of a series of studies conducted by the researcher to discern readers’ emotional responses to NLG and their approaches to attributing authorship to computer-generated texts. Studies have indicated that a sense of agency is assigned to an NLG system, and that a continuum from authorship to generation is perhaps the most suitable schema for considering computer-generated texts. Who is responsible for the text? Are computer-generated texts worthy of serious literary analysis? What do computer-generated texts reveal about human creativity and lived experience?

The paper will conclude with an argument for why consideration of the social and literary implications of NLG and computer-generated texts is vital as we venture deeper into the digital age. Computer-generated texts may not just challenge traditional understandings of authorship: they may engender new understandings of authorship altogether as readers explore the conceptual gap between human and computer language production.

Pull Quotes

Computer-generated texts may not just challenge traditional understandings of authorship: they may engender new understandings of authorship altogether.

By Eric Dean Rasmussen, 22 June, 2012
Abstract (in English)

There is a moment that can happen when reading/playing an interactive fiction. The system just presented some text, perhaps quite engaging or even beautiful. And then one tries to reply, using some of the same language, only to receive an error. The underlying system can't hear the language with which it speaks. The language it displays is written ahead of time, while the language it receives must be parsed and acted upon at runtime.

There is something uneasy about this disjuncture, and one response is to try to avoid all such problems. Will Wright's Sims speak only in gibberish sounds and visual icons, so that the surface representation of language matches the very simple internal representation of what they can discuss. Chris Crawford currently plans for his new storytelling system to avoid the construction of English-like sentences found in Storytron — instead moving to an icon language intended to help players better understand the internal representations (much more complex than those in The Sims) on which his story system will operate.

But surely there must be alternatives. Much of the power of writing is in the construction of natural language — not simply the conveyance of plot structures, characters, and other things such an icon language might help convey. Can't our field pursue both?

One alternative is to remove language from what a responsive system acts upon. In node-link hypertext, the system operates on nodes and the (potentially evolving) connections between them, with text displayed as a function of the current node. In systems such as Text Rain (or my Screen collaboration) the system operates on physical objects with text mapped upon them. These can create powerful experiences, and are perhaps the most artistically successful options at the moment, but they also seem like sidestepping the problem.

So what might a responsive language alternative look like? I think there are two. One is to find ways to engage interactively with language itself, language that somehow follows linguistic rules, rather than language that operates like physical objects. I've been involved in some collaborations in this direction. But what if we want language in systems that operate according to the logics of their fictions? What if we want language to work with a system that doesn't just operate thematically in concert with the fiction, but perhaps allows interactions with characters or shaping of plots?

I am currently involved in two projects exploring this direction. One, Prom Week, is discussed in detail in a separate proposal from Aaron Reed. It uses a template-driven language system. Another, Character Creator (a collaboration with Marilyn Walker and others), is a project attempting to make deep natural language generation (NLG) technologies useful for writers, enabling them to have characters respond to a much wider variety of potential situations than it would be possible to write dialogue variations for by hand. I hope to present some of our current results and then talk about some of my intuitions for the future, including my belief that successful NLG for writers may look something like successful procedural animation for visual artists.

(Source: Author's abstract, 2012 ELO Conference site)
