Date of Award
5-2026
Document Type
Thesis
Degree Name
Bachelor of Arts (BA)
Department
Natural Sciences
First Advisor
Roy, Tania
Second Advisor
Skripnikov, Andrey
Area of Concentration
Computer Science
Abstract
E-commerce businesses collect large amounts of transaction data every day, but most companies lack the tools to turn that data into useful decisions. Managers either rely on intuition or need a data scientist to answer every question. This thesis presents the design and development of an analytics dashboard that aims to close this gap. The platform combines descriptive analysis, predictive modeling, and natural language querying in a single interactive application built with Streamlit and PostgreSQL. The dashboard has five analytical modules. The Sales Overview and Category Explorer module shows high-level business metrics alongside customer demographics and channel performance. The Demand Forecasting module uses a log-transformed Random Forest model to predict daily order volume per product category, using lag features and calendar-based indicators as inputs. The Churn Prediction module uses Logistic Regression, comparing a linear and a quadratic version, with class-weight balancing and five-fold cross-validation, to identify customers at risk of not returning. The Customer Segmentation module applies K-Means clustering to group customers into four distinct profiles and suggests specific business actions for each one. Finally, the Natural Language Analyst module lets users ask questions in plain English, converts them into SQL queries through a Groq-powered LLaMA pipeline, and returns the results as plain-English answers. Together, these five modules form a natural progression from describing what happened, to predicting what will happen, to letting anyone ask their own questions. The platform is designed so that both technical and non-technical users can get value from it, making data more accessible across the organization.
Link for referenced dashboard: https://thesis-brvaldez.azurewebsites.net/
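The abstract names lag features, calendar-based indicators, and a log transform as the inputs to the Demand Forecasting module. The following is a minimal sketch of how such inputs could be built for one product category, not the thesis's actual code. The function names (build_rows, to_order_count), the choice of 1-day and 7-day lags, the specific calendar fields, the sample order counts, and the decision to log-transform the lag values as well as the target are all assumptions made for illustration. The Random Forest fit itself is not shown.

```python
import math
from datetime import date, timedelta


def build_rows(daily_orders, start, lags=(1, 7)):
    """Turn a daily order-count series into (features, target) rows.

    daily_orders: non-negative counts, one per consecutive day (hypothetical data).
    start: calendar date of daily_orders[0].
    lags: how many days back each lag feature looks (assumed values).
    """
    max_lag = max(lags)
    rows = []
    # Skip the first max_lag days, which have no complete lag history.
    for i in range(max_lag, len(daily_orders)):
        day = start + timedelta(days=i)
        # Lag features: log1p keeps zero-order days well defined.
        features = {f"lag_{k}": math.log1p(daily_orders[i - k]) for k in lags}
        # Calendar-based indicators.
        features["day_of_week"] = day.weekday()  # Monday = 0
        features["is_weekend"] = int(day.weekday() >= 5)
        features["month"] = day.month
        # Log-transformed target that a regressor would be trained on.
        target = math.log1p(daily_orders[i])
        rows.append((features, target))
    return rows


def to_order_count(log_prediction):
    """Invert the log1p transform to recover a value on the order-count scale."""
    return math.expm1(log_prediction)


# Ten days of made-up order counts for a single category.
orders = [12, 15, 9, 20, 22, 30, 28, 14, 16, 11]
rows = build_rows(orders, start=date(2024, 1, 1))
first_features, first_target = rows[0]
```

Each row pairs one day's features with that day's log-scale order volume. A model trained on these rows predicts on the log scale, so its output would need to pass through to_order_count before being shown as a daily order forecast.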
Recommended Citation
Valdez, Bruno, "Descriptive Analysis of E-Commerce Sales Performance: An End-to-End Analytics Dashboard Using PostgreSQL, Machine Learning, and Natural Language Querying" (2026). Theses & ETDs. 6916.
https://digitalcommons.ncf.edu/theses_etds/6916
Rights
The author has granted New College of Florida the nonexclusive right to archive, make accessible, and distribute for educational purposes this work in whole or in part in all forms of media, now or hereafter known. The copyright of this work remains with the author.