Date of Award
2013
Document Type
Thesis
Degree Name
Bachelors
Department
Natural Sciences
First Advisor
Rahal, Imad
Keywords
Computer Science, Plagiarism, String Comparison
Area of Concentration
Natural Sciences
Abstract
Source code plagiarism is easy to perform and difficult to catch. Detection approaches vary, with little consensus. This thesis compares several string comparison techniques borrowed from biology on a large collection of student work containing various types of plagiarism. All of the algorithms matched a plagiarized file to its original files more than 90% of the time. A modification to these algorithms is proposed that drastically improves their runtimes with little or no effect on accuracy. The strengths and weaknesses of each algorithm are explored, in the hope of improving future plagiarism detection techniques.
Recommended Citation
Wielga, Colin, "A COMPARISON OF BIOLOGICAL STRING SIMILARITY ALGORITHMS APPLICATION TO SOURCE CODE PLAGIARISM DETECTION" (2013). Theses & ETDs. 6852.
https://digitalcommons.ncf.edu/theses_etds/6852
Rights
The author has granted New College of Florida the nonexclusive right to archive, make accessible, and distribute for educational purposes this work in whole or in part in all forms of media, now or hereafter known. The copyright of this work remains with the author.