Date of Award
2022
Document Type
Thesis
Degree Name
Bachelors
Department
Natural Sciences
First Advisor
Gillman, David
Area of Concentration
Computer Science
Abstract
This thesis explores the field of Text Style Transfer using Natural Language Processing techniques. This task falls under the larger topic of Text Attribute Transfer, a field that also encompasses closely related tasks such as Machine Translation and Paraphrase Generation. Style transfer differs from these existing tasks in that it tries to capture the more fine-grained aspects of writing, such as diction and semantic syntax. The goal of this thesis is to explore some of the fundamental topics in Natural Language Processing necessary for understanding Attribute Transfer tasks. These topics include text normalization techniques, tokenization and word embedding techniques such as Word2Vec and FastText, and an introductory discussion of Machine Translation using Sequence-to-Sequence modelling with an attention mechanism. Following this, the thesis briefly discusses modern techniques for monolingual-corpus Style Transfer, such as Style-Content Disentanglement and Pseudo-parallel Corpus Construction. We attempt to apply techniques for parallel-corpus Style Transfer to a dataset of Shakespearean texts rewritten in Modern English. Metrics for evaluating the success of these modeling techniques are also discussed, with a demonstration on the outcome of our parallel-corpus experiment and a discussion of some of the shortcomings of existing evaluation metrics.
Recommended Citation
LoPresto, Austin, "AUTOMATED MODERNIZATION OF SHAKESPEAREAN ENGLISH: USING NATURAL LANGUAGE PROCESSING TO CAPTURE WRITING STYLE" (2022). Theses & ETDs. 6264.
https://digitalcommons.ncf.edu/theses_etds/6264