Customer segmentation helps businesses gain actionable insights, make better decisions, stay competitive and improve customer satisfaction. This research investigates customer segmentation in the retail sector by combining demographic information, purchasing behaviour and sentiment analysis with several clustering methods. K-Means, Hierarchical Agglomerative Clustering (HAC) and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) were applied to demographic and behavioural data to identify distinct customer segments. The clustering results were evaluated with the Silhouette Score, Davies-Bouldin Index and Calinski-Harabasz Index, which confirmed the effectiveness of the algorithms. The findings show that all three clustering methods identify distinct customer segments based on demographic and behavioural characteristics across product categories, including behavioural RFM (Recency, Frequency, Monetary) clusters. Sentiment analysis of product reviews adds insight into customer feelings and opinions and illustrates how these sentiments affect purchasing behaviour. Read the full report below.
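The clustering and evaluation workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the report: the transaction data is synthetic, and the column names (`customer_id`, `days_ago`, `amount`), the number of clusters and the DBSCAN parameters (`eps`, `min_samples`) are all assumptions chosen only to make the example runnable. Because the data is random, the printed scores say nothing about the report's findings.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Synthetic transactions with hypothetical columns (not the Kaggle dataset).
rng = np.random.default_rng(0)
transactions = pd.DataFrame({
    "customer_id": rng.integers(0, 200, size=3000),
    "days_ago": rng.integers(0, 365, size=3000),
    "amount": rng.gamma(shape=2.0, scale=30.0, size=3000).round(2),
})

# RFM features per customer: days since last purchase, number of purchases,
# and total spend.
rfm = transactions.groupby("customer_id").agg(
    recency=("days_ago", "min"),
    frequency=("days_ago", "size"),
    monetary=("amount", "sum"),
)

# Standardise so that no single feature dominates the distance computations.
X = StandardScaler().fit_transform(rfm)

# Assumed hyperparameters; in practice these would be tuned on the real data.
models = {
    "K-Means": KMeans(n_clusters=4, n_init=10, random_state=0),
    "HAC": AgglomerativeClustering(n_clusters=4, linkage="ward"),
    "DBSCAN": DBSCAN(eps=0.8, min_samples=5),
}


def evaluate(X, labels):
    """Return the three validity indices, ignoring DBSCAN noise points (-1)."""
    mask = labels != -1
    if len(set(labels[mask])) < 2:
        return None  # the indices are undefined for fewer than two clusters
    return {
        "silhouette": silhouette_score(X[mask], labels[mask]),
        "davies_bouldin": davies_bouldin_score(X[mask], labels[mask]),
        "calinski_harabasz": calinski_harabasz_score(X[mask], labels[mask]),
    }


results = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    results[name] = evaluate(X, labels)
    print(name, results[name])
```

When reading the scores, a higher Silhouette Score and Calinski-Harabasz Index indicate better-separated clusters, while a lower Davies-Bouldin Index is better. DBSCAN may return a single cluster or only noise for a given `eps`, in which case this sketch reports `None` rather than a score.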
| Language | Python |
| Library | Pandas, Matplotlib, TextBlob, Scikit-Learn |
| Analysis | K-Means, Hierarchical Agglomerative Clustering (HAC), Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Sentiment Analysis |
| Data Source | Kaggle |