Author

Emma Schwarz

Date of Award

2019

Document Type

Thesis

Degree Name

Bachelors

Department

Natural Sciences

First Advisor

Doucette, John

Area of Concentration

Computer Science

Abstract

I extract lexical representations of target words related to the concepts of gender and sensuality in Tolstoy using the word2vec algorithm. Word2vec takes a corpus as input and outputs a vector space in which each word is represented by a vector and the words that lie closest together are the most similar. Similarity is measured as the cosine similarity between word vectors. The corpus, retrieved from Project Gutenberg, consists of Anna Karenina and The Kreutzer Sonata, two works that address the woman question in nineteenth-century Russia. I provide visualizations of the words most similar to each target word within the two works, along with a list of preliminary literary analyses of each target word’s results.
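
The similarity measure named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `cosine_similarity` and `most_similar` are hypothetical, and the three-dimensional vectors are made-up toy values standing in for the much larger vectors a trained word2vec model would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(target, vectors, topn=3):
    """Rank every other word by cosine similarity to the target word."""
    scores = [
        (word, cosine_similarity(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors (assumed values for illustration, not results from the thesis).
toy_vectors = {
    "wife": [0.9, 0.1, 0.3],
    "husband": [0.8, 0.2, 0.4],
    "railway": [0.1, 0.9, 0.2],
}

for word, score in most_similar("wife", toy_vectors):
    print(f"{word}: {score:.3f}")
```

With these toy values, "husband" ranks above "railway" as a neighbor of "wife" because its vector points in nearly the same direction.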
