An Application of PCA Based K-Means Clustering for Customer Segmentation in One Luxury Goods Company
Session
Civil Engineering, Infrastructure and Environment
Description
In this paper, the K-means clustering algorithm is applied to partition customers into several groups such that similarity within a group is higher than similarity between groups. After the relevant client attributes are identified in a SQL Server database, PCA (Principal Component Analysis) is applied to reduce the number of features, and the K-means algorithm is then run in the MATLAB programming environment with a fixed number of clusters. Each centroid defines one cluster, and each data point is assigned to its nearest centroid based on the squared Euclidean distance. In this research, the centroids are initialized randomly, and the separation between the resulting clusters is analyzed and illustrated using the Silhouette index. The analysis and results presented in this paper could be used to identify similarities in purchasing or service use within a customer cluster in one luxury goods company, to develop market segments, to identify repetitive behavior or trends in order to provide a full assessment of actions, and to create new customer loyalty campaigns.
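The pipeline described above (feature reduction with PCA, K-means with randomly initialized centroids and squared Euclidean assignment, then the Silhouette index) can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: it is a Python stand-in that uses only the standard library, and every name in it (`pca_project`, `kmeans`, `silhouette_index`, the `customers` data) is hypothetical. Two further assumptions: PCA is computed by power iteration with deflation, and the Silhouette index uses plain Euclidean distance, since the abstract does not say which distance the authors used for it.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pca_project(rows, n_components, seed=0):
    """Project rows onto the leading principal components (power iteration)."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(500):  # power iteration towards the top eigenvector
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        components.append(v)
        # Deflate: remove the found component from the covariance matrix.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [tuple(sum(x[j] * c[j] for j in range(d)) for c in components)
            for x in centered]


def kmeans(points, k, n_iter=100, seed=0):
    """K-means with a fixed k and centroids drawn at random from the data."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels, centroids


def silhouette_index(points, labels):
    """Mean silhouette value; needs at least two non-empty clusters."""
    scores = []
    for i, p in enumerate(points):
        dists = {c: [] for c in set(labels)}
        for j, q in enumerate(points):
            if j != i:
                dists[labels[j]].append(math.sqrt(sq_dist(p, q)))
        own = dists[labels[i]]
        if not own:  # singleton cluster: silhouette defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(v) / len(v) for c, v in dists.items() if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 8 customers, 3 made-up numeric attributes each.
customers = [(0, 0, 0), (1, 0, 0.1), (0, 1, -0.1), (1, 1, 0),
             (10, 10, 0), (11, 10, 0.1), (10, 11, -0.1), (11, 11, 0)]
reduced = pca_project(customers, n_components=2)
labels, centroids = kmeans(reduced, k=2)
score = silhouette_index(reduced, labels)
print(labels, round(score, 3))
```

A Silhouette index close to 1 indicates compact, well-separated clusters, while values near 0 suggest overlapping ones. On real customer data the attributes would normally be standardized before PCA, and the run would be repeated with several random initializations, because K-means can converge to different local optima.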
Keywords:
cluster analysis, K-means, Principal Component Analysis (PCA), Silhouette index
Session Chair
Feti Selmani
Session Co-Chair
Anjeza Alaj
Proceedings Editor
Edmond Hajrizi
ISBN
978-9951-550-19-2
First Page
86
Last Page
93
Location
Pristina, Kosovo
Start Date
26-10-2019 1:30 PM
End Date
26-10-2019 3:00 PM
DOI
10.33107/ubt-ic.2019.189
Recommended Citation
Savic, Ana; Bjelobaba, Goran; Janicijevic, Stefana; and Stefanovic, Hana, "An Application of PCA Based K-Means Clustering for Customer Segmentation in One Luxury Goods Company" (2019). UBT International Conference. 189.
https://knowledgecenter.ubt-uni.net/conference/2019/events/189