Comparative Analysis of the Applicability of Five Clustering Algorithms for Market Segmentation
Date
2023
Authors
Teslenko, Denys
Sorokina, Anna
Smelyakov, Kyrylo
Filipov, Oleksii
Abstract
Customer communication is important for maintaining business sustainability, and market segmentation can improve it considerably. Clustering is a powerful tool for carrying out market segmentation effectively. Every algorithm has its advantages and disadvantages, and its effectiveness may change with the peculiarities of the data, so for marketing it is vital to understand which techniques work better on mixed data and which show poor results or are not applicable at all. This paper presents a comparative analysis of five clustering algorithms (k-means, BIRCH, agglomerative hierarchical clustering, DBSCAN and OPTICS), which represent three different approaches to clustering: centroid-based, connectivity-based and density-based. Using an open-source dataset describing hotel customers, we evaluated the effectiveness of these algorithms in extracting customer groups. To evaluate clustering quality, we used the Davies-Bouldin score and the Silhouette score, and then visualized the results. As a result, we made a complete comparison of the clustering algorithms and revealed the advantage of agglomerative hierarchical clustering over the other algorithms. We also showed that, on mixed data, using Gower's distance brings a significant benefit with the density-based algorithms and a loss with the connectivity-based algorithm. Finally, we summarized the results to draw conclusions about the applicability of each algorithm.
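
Gower's distance, mentioned above, is what lets numeric and categorical attributes be compared in a single measure. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the standard unweighted form: numeric differences are scaled by the feature's range, categorical features contribute 0 on a match and 1 otherwise, and the result is averaged over features. The function name `gower_matrix` and the three customer records are hypothetical and invented for illustration; they are not taken from the paper's dataset.

```python
def gower_matrix(records, numeric_keys, categorical_keys):
    """Pairwise unweighted Gower distances for a list of dict records."""
    # Range of each numeric feature, used to scale differences into [0, 1].
    ranges = {}
    for key in numeric_keys:
        values = [r[key] for r in records]
        ranges[key] = max(values) - min(values)

    n = len(records)
    n_features = len(numeric_keys) + len(categorical_keys)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for key in numeric_keys:
                # A constant feature (range 0) contributes no distance.
                if ranges[key] > 0:
                    total += abs(records[i][key] - records[j][key]) / ranges[key]
            for key in categorical_keys:
                # Simple matching: 0 if the categories agree, 1 otherwise.
                total += 0.0 if records[i][key] == records[j][key] else 1.0
            dist[i][j] = dist[j][i] = total / n_features
    return dist


# Hypothetical hotel-customer records, invented for illustration only.
customers = [
    {"age": 30, "nights": 2, "channel": "agency"},
    {"age": 50, "nights": 2, "channel": "agency"},
    {"age": 30, "nights": 6, "channel": "direct"},
]
D = gower_matrix(customers, ["age", "nights"], ["channel"])
```

A matrix like `D` can be handed to clustering implementations that accept precomputed distances, for example scikit-learn's `DBSCAN(metric="precomputed")`. Whether the authors followed this route is an assumption, not something the abstract states.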
