chinese characters

Description (in English)

Yuefu is a poetry generation system built on OpenAI’s GPT, a generative pre-trained language model. The model is pretrained on Chinese newspapers and then fine-tuned on classical Chinese poetry. In the paper describing the system, the developers write that it does not use "human crafted rules or features" or "any additional neural components". The system can generate poems in various formal, classical styles.

The example shown was translated by Ru-Ping Cheng and Jeff Ding for the ChinAI newsletter. It is an example of Cang Tou Shi, a Chinese form of acrostic poetry. "In this case," the translator explains, "the first words of each line form the title of the poem: 神经网络 (neural networks)." Other examples of the system's output appear in a preprint published by the system's creators, and a translation of a Chinese newspaper article (entered into ELMCIP) provides translations of further examples.
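The acrostic constraint of Cang Tou Shi can be illustrated with a minimal sketch. This is not the Yuefu system's code; the function names are hypothetical, and the poem lines are placeholders (the title character followed by filler squares), not actual generated verse.

```python
def acrostic_head(lines):
    """Join the first character of each non-empty line."""
    return "".join(line.strip()[0] for line in lines if line.strip())


def is_cang_tou_shi(lines, title):
    """Return True if the first characters of the lines spell the title."""
    return acrostic_head(lines) == title


# Placeholder lines, NOT Yuefu output: only the first character matters here.
placeholder_poem = ["神□□□□", "经□□□□", "网□□□□", "络□□□□"]

print(acrostic_head(placeholder_poem))                 # prints 神经网络
print(is_cang_tou_shi(placeholder_poem, "神经网络"))    # prints True
```

Python indexes strings by Unicode code point, so taking the first element of each line yields a whole Chinese character for the characters used here.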

Pull Quotes

Neural Networks

Allocating divine status to a soul that has passed—it is natural,
Like the classics that preserve the virtues of ancient wisdom.
The astray scripts of the internet try earnestly to preserve their legacies,
A newfound literary wisdom that shall be passed down for centuries.

Screen shots
Screenshot of one of the generated poems in Chinese
Technical notes

A demo of the system can be accessed on WeChat. According to the developers, testing it requires registering a WeChat account and adding “EI体验空间” or “诺亚实验室”.

By June Hovdenakk, 12 September, 2018
License: All rights reserved
Abstract (in English)

This paper proposes a typology for studying Chinese text-based playable media (e.g. interactive installations, screen-based works, computer games) in terms of how freely users can interact with the Chinese characters. In the last two decades, various typologies, models and categories have been proposed to systematize research on electronic literature and text-based digital art (Seiça, 2012). These classifications focus on different aspects of digital works, including but not limited to the visual experience of users, aesthetic principles, interactive features, the technologies applied and the structure of the code (Campas, 2004; Hayles, 2008; Strehovec, 2015). Although they approach electronic literature from such diverse angles, these classifications are all based on examples in alphabetic languages and pay little attention to the user's ability (freedom) to deconstruct and manipulate the basic linguistic units in the works. The Chinese language differs from any alphabet-based language in containing a huge number of graphemes instead of a few dozen letters. This creates the problem of how to input Chinese characters into machines of Western origin, from the typewriter to the computer (Mullaney, 2017). In most modern commercial digital systems, every usable character must be listed in the Unicode table under an alphanumeric code. However, these codes are arbitrarily assigned and mean nothing to human users. Users must therefore enter another set of codes or data, often based on the phonetic or written structure of a character, into input software, which then calls up the corresponding Unicode code from the operating system. This handling of characters through “reinterpreting and rendering” (Cayley, 2003, p.281) is the norm in all Chinese computer systems. However, many Chinese text-based playable works intentionally or unintentionally sabotage this process flow and challenge the limitations imposed by computer systems.
Since this condition is unique to Chinese-based works, the proposed typology is based on differences in how users manipulate the characters in the examined works and on what extra freedom the works provide in comparison to consumer applications. The typology is needed not only for categorizing the characteristics of Chinese text-based playable media for future research; it can also provide a basis for systematically analyzing the differences between character-based and alphabet-based languages in digital interactive environments.
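The input process described in the abstract, in which a user types a phonetic code, picks a candidate and the software resolves it to a Unicode code, can be sketched as a toy lookup. This is an illustration under stated assumptions, not the behaviour of any real input method: the table CANDIDATES and the function select_candidate are hypothetical, and the table holds only a handful of characters.

```python
# Hypothetical, tiny pinyin-to-candidates table (tones omitted).
# A real input method covers thousands of characters and ranks candidates.
CANDIDATES = {
    "shen": ["神", "深", "身"],
    "jing": ["经", "京", "精"],
    "wang": ["网", "王", "望"],
    "luo": ["络", "落", "罗"],
}


def select_candidate(pinyin, index=0):
    """Resolve a typed phonetic code and a candidate index to a character
    and its Unicode code point in U+XXXX notation."""
    character = CANDIDATES[pinyin][index]
    code_point = f"U+{ord(character):04X}"
    return character, code_point


# The user types phonetic codes; the software returns characters and codes.
for typed in ["shen", "jing", "wang", "luo"]:
    print(typed, *select_candidate(typed))
```

The code point (for instance the value returned for "shen") is what the system stores, while the user only ever deals with the phonetic code and the candidate list, which is the layer of reinterpretation the abstract refers to.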

Pull Quotes

The Chinese language differs from any alphabet-based language in containing a huge number of graphemes instead of a few dozen letters. This creates the problem of how to input Chinese characters into machines of Western origin, from the typewriter to the computer (Mullaney, 2017).