speech recognition

By Hannah Ackermans, 29 October, 2015
Abstract (in English)

Learn a cutting-edge method of performative creative writing based on human-computer interaction.

You will learn to “write with your voices” (as opposed to typing on a keyboard) by using speech recognition software. We will take turns saying impromptu lines out loud into a microphone. The computer will recognize the lines with varying accuracy and turn the speech into text on the screen. We will develop a set of improvisational tools to enhance dramatic writing by using the computer's errors (misrecognitions) in collaboration with other participants. You will be confronted with situations requiring quick decision-making, because the computer does not reproduce your speech with complete accuracy, a fact that will challenge you to deal with technological dysfunction in the here-and-now of a performative writing situation. You will also be challenged to listen and respond to your human writing partners and their texts. Through guided practice, you will learn to take the writing process in unexpected directions, further into an improvisational realm.
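The core mechanic is easy to reproduce. Below is a minimal sketch, assuming the third-party Python speech_recognition package and its free Google Web Speech backend; the workshop's actual software is not named in the source. The raw transcript is displayed as-is, misrecognitions included, since the errors are the creative material.

```python
# Minimal sketch of the workshop's core mechanic: speak a line, let the
# recognizer transcribe it (errors included), and show the raw result.
# Assumes the third-party `speech_recognition` package and a microphone;
# the workshop's actual software is not specified in the source.
import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)
    print("Say a line...")
    audio = recognizer.listen(source)

try:
    # The transcript is shown verbatim: misrecognitions are kept,
    # because they become material for the improvisation.
    line = recognizer.recognize_google(audio)
    print(f"The computer heard: {line}")
except sr.UnknownValueError:
    # Total recognition failure is also part of the performance.
    print("The computer heard nothing it could turn into words.")
```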

While practicing this collaborative, performative live writing method with human and computer partners, we will work toward creating short fictional scenes. The scenes will be based on dramatic situations that we will come up with together through rehearsals and discussion. In addition to this practical work, we will spend time discussing relevant readings (on live writing, new technology, human-computer interfaces, and drama).

At the end of the workshop, we will present a live, performative writing event, in which you will have the opportunity to perform those aspects of the writing method that you find most compelling. The showing will be planned and performed collaboratively. At the end of this workshop, you will have the tools to continue exploring the relationship between text, digital media, and performance in your own work.

(source: ELO 2015 catalog)

Description (in English)

A conversational labyrinth in a virtual environment, its walls lined with phrases composed in real time from sound and words.

Labylogue, a tribute to Jorge Luis Borges' The Library of Babel, was a simulated three-dimensional, large-scale visual poetry performance. Created across Internet nodes in different French-speaking cities, it incorporated software-generated text triggered by algorithmic recognition of words spoken by participants meeting virtually in the labyrinth. Maurice Benayoun explains that in art spaces and museums in three French-speaking cities (Brussels, Lyon, and Dakar), Labylogue developed eight main themes that invited visitors to meet in the labyrinth and, as they conversed, immerse themselves in the accompanying text on the walls. (Source: Narrabase)
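The coupling described above, recognized speech selecting themes that drive generated wall text, can be sketched roughly as follows. The theme words, templates, and selection rule are all invented for illustration; Jean-Pierre Balpe's actual generator is not publicly documented.

```python
# Hypothetical sketch of Labylogue's trigger loop: recognized words from the
# visitors' dialogue select a theme, and a generator writes wall text from it.
# Theme list and templates are invented stand-ins for Balpe's generator.
import random

THEMES = {
    "livre": "the book", "mur": "the wall", "voix": "the voice",
}

TEMPLATES = [
    "In the labyrinth, {theme} repeats what was almost said.",
    "Every corridor returns to {theme}, misheard and rewritten.",
]

def wall_text(recognized_words):
    """Return a generated sentence for the first theme word heard, if any."""
    for word in recognized_words:
        if word in THEMES:
            return random.choice(TEMPLATES).format(theme=THEMES[word])
    return None

print(wall_text(["alors", "le", "livre", "ouvre"]))
```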

Description (in original language)

Labylogue is a conversation space.

In three different locations connected by the Internet (Brussels, Lyon, and Dakar), visitors wander through a virtual labyrinth in search of one another.

In pairs, they converse in French.

Halfway between the book and Borges' Library of Babel, the walls become covered with sentences generated in real time, each one an interpretation of the dialogue in progress. The text is in turn given an oral interpretation that animates the space of the labyrinth like a synthesized chorus wandering along the shores of language in action.

Digital mediation introduces into communication layers of interpretation that escape intention, sometimes blurring meaning. Speech then reclaims its rights: it glides over the machine's interpretation, favoring contact where the written trace drifts.

Contributors note

Installation by Maurice Benayoun

Sound composition: Jean-Baptiste Barrière

Text generation: Jean-Pierre Balpe

Production: Z-A Production (www.Z-A.net)

Technical direction: David Nahon

Programming: David Nahon, Michael Bry

Speech recognition software: France Telecom R&D

Z-A production team: Stéphane Singier, Karen Benarouche, Corinne Lambert

Description (in English)

Talking Cure is an installation that includes live video processing, speech recognition, and a dynamically composed sound environment. It is about seeing, writing, and speaking: about word pictures, the gaze, and cure. It works with the story of Anna O, Joseph Breuer's patient, who gave him and Freud the concept of the "talking cure" as well as the word pictures to substantiate it.

The reader enters a space with a projection surface at one end and a high-backed chair facing it at the other. In front of the chair are a video camera and a microphone. The camera's image of the person in the chair is displayed, as text, on the screen. This "word picture" is formed by reducing the live image to three colors and then using those colors to determine the mixture between three color-coded layers of text. One layer is drawn from Joseph Breuer's case study of Anna O. Another consists of the words "to torment" repeated, one of the few direct quotations attributed to Anna in the case study. The third layer, which becomes visible only when a person is in the chair, reworks Anna's snake hallucinations through the story of the Gorgon Medusa, reconfiguring the analytic gaze.

Speaking into the microphone triggers a speech-to-text engine that replaces Anna's words with what it (mis)understands the participant to have said. What is said into the microphone is also recorded and becomes part of a sound environment that includes recordings of Breuer's words, Anna's words, our words, and all that has been spoken over the length of the installation. Others in the space observe the person in the chair through word pictures on the screen. Readers move their bodies, at first to create visual effects and then to achieve textual ones, creating new reading experiences for themselves and others in the room. Movements range from slowly sweeping an extended arm to recreate left-to-right reading, to rotating the head or a hand in search of evocative neologisms at the mobile textual borders within the image.

The video processing technique was created by Utterback and has been exhibited separately as Written Forms. The sound environment was designed and implemented by Castiglia, and Nathan Wardrip-Fruin implemented the speech-to-text. Talking Cure was first presented at the 2002 Electronic Literature Organization symposium at UCLA. I have also presented it as a performance/reading, cycling verbally between the layers of text while my image is projected as a different textual mixture on a screen.

(Source: Author's website.)
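The word-picture mapping described above can be sketched as follows. The quantization rule (dominant RGB channel) and the layer texts are stand-ins; Utterback's actual technique, exhibited as Written Forms, is not public.

```python
# Sketch of the word-picture technique as described: reduce each live video
# pixel to one of three colors, then use that color to choose which of three
# text layers supplies the character drawn at that position. The layer
# contents and the quantization rule are assumptions.
import numpy as np

# Three color-coded text layers (trivially short stand-ins).
LAYERS = [
    "from Breuer's case study of Anna O. ",
    "to torment to torment to torment ",
    "the Gorgon Medusa returns the gaze ",
]

def layer_index(pixel):
    """Quantize an RGB pixel to its dominant channel: 0=R, 1=G, 2=B."""
    return int(np.argmax(pixel))

def word_picture(frame):
    """Render an H x W x 3 frame as a grid of characters from the layers."""
    h, w, _ = frame.shape
    rows = []
    for y in range(h):
        row = []
        for x in range(w):
            k = layer_index(frame[y, x])
            text = LAYERS[k]
            row.append(text[(y * w + x) % len(text)])
        rows.append("".join(row))
    return "\n".join(rows)

# A tiny synthetic "frame" stands in for the live camera image.
frame = np.random.randint(0, 256, size=(4, 16, 3))
print(word_picture(frame))
```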

Contributors note

With: Camille Utterback, Clilly Castiglia, and Nathan Wardrip-Fruin.

Description (in English)

Simon Biggs, with Mark Shovman, developed this virtual interactive artwork in response to a commission from the Poetry Beyond Text project. Tower is inspired by the story of the Tower of Babel. Inter-subjective relations are central to the work, which evokes the idea of first-, second-, and third-person perspectives. Tower is an interactive work that creates an immersive 3D textual environment combining visualisation, physical interaction, speech recognition, and predictive text algorithms.

Viewers (or inter-actors) occupy one of three roles: the central inter-actor, wearing a VR head-mounted display; one of several inter-actors wearing 3D spectacles; or a spectator standing outside the interactive zone. The central inter-actor is located at the vertiginous pinnacle of a virtual spiral word structure. When the inter-actor speaks, their spoken words appear to float from their mouth and join the spiralling history of previously spoken words. As each uttered word emerges, other words, predicted on the basis of statistical frequency within a textual corpus, spring from the spoken word. The second-person inter-actors see words appearing from the first-person inter-actor's mouth and the spiral gradually growing, with the first-person inter-actor at its pinnacle, while the third-person observers stand outside the interactive zone, watching the tableau. As it grows, the spiral comes to resemble a Tower of Babel composed of words, spoken and potential. (Source: Poetry Beyond Text)
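A minimal sketch of the predictive element follows, under the assumption that "statistical frequency within a textual corpus" means something like simple bigram counts; the model Biggs and Shovman actually used is not specified.

```python
# Minimal sketch of Tower's predictive text: words that "spring from" a
# spoken word, chosen by frequency in a corpus. Bigram counts are an
# assumption standing in for the unspecified statistics.
from collections import Counter, defaultdict

def bigram_model(corpus_text):
    """Map each word to a Counter of the words that follow it."""
    words = corpus_text.lower().split()
    follows = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def predicted_words(model, spoken_word, n=3):
    """Return the n most frequent successors of the spoken word."""
    return [w for w, _ in model[spoken_word.lower()].most_common(n)]

corpus = "the tower of babel rose and the tower of words rose higher"
model = bigram_model(corpus)
print(predicted_words(model, "tower"))  # ['of']
```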

Technical notes

Tower requires a high-powered gaming-class PC with multi-screen capability, two 3D stereo projection systems, matching 3D spectacles, a head-mounted display system, a spatial tracking system, a wireless microphone optimised for voice recognition, and Windows 7 or XP.

Description (in English)
Description (in English)

Oracle is an interactive performance artwork based on voice recognition and an interpretive grammar. The performer's speech, a series of questions posed by the audience, is captured and presented in a digital projection. The computer system reads the acquired, collective text as the questions are layered upon one another, and generates an answer to each question using a word from each of the prior questions.
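As described, the answer rule can be sketched in a few lines. The choice of which word to take from each prior question is not specified in the source, so the random selection below is an assumption.

```python
# Sketch of Oracle's answer rule as described: each answer is assembled from
# one word drawn from every question asked so far. Random word choice is an
# assumption; the artist's interpretive grammar is unpublished.
import random

questions = []

def ask(question):
    """Record a question and return an answer built from prior questions."""
    answer_words = [random.choice(q.split()) for q in questions]
    questions.append(question)
    return " ".join(answer_words) if answer_words else "..."

print(ask("What is my future?"))
print(ask("Will the rain ever stop?"))
print(ask("Who speaks through the machine?"))
```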

Technical notes

Oracle requires an Intel-based Mac Pro computer, video projection, and a microphone optimised for speech recognition. The software was mainly authored by the artist but also requires MacSpeech Dictate to be installed.