Author

John McIntosh

Date of Award

2025

Document Type

Thesis

Degree Name

Bachelors

Department

Natural Sciences

First Advisor

Skripnikov, Andrey

Second Advisor

Loveland, Rohan

Area of Concentration

Computer Science

Abstract

This thesis examines the evolution of Major League Baseball (MLB) hitting strategies from 1950 to 2010 using statistical analysis and machine learning techniques. The study investigates changes in player performance metrics to determine how hitting styles have evolved, focusing specifically on distinguishing "power hitters" from "contact hitters." Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), and clustering methods such as k-means were applied to historical MLB data from Baseball Reference to reveal underlying trends and shifts in player roles over time. An interactive dashboard was developed with Streamlit to visualize these trends dynamically. The dashboard includes year-by-year and decade-level displays of statistics, a comparison of hitting trends across decades, a PCA-based view of contact versus power hitting, and a player comparison tool that lets the user select players from 1950 to 2010 classified as contact or power hitters and compare their statistics to see what differentiated them.
