Using Genetic K-Means Algorithm for PCA Regression Data in Customer Churn Prediction

Huang, Bingquan; Satoh, T.; Huang, Y.; Kechadi, M.-T.; Buckley, B.

doi:10.1007/978-3-642-17313-4_22


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6441)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications


Abstract

An imbalanced distribution of samples between churners and non-churners can severely affect churn prediction results in the telecommunication services field. One way to address this is over-sampling by PCA regression. However, PCA regression may not generate good churn samples if the dataset is not linearly discriminable. To overcome this problem, we employed the Genetic K-means Algorithm to cluster the dataset and find locally optimal small datasets. The experiments were carried out on a real-world telecommunication dataset and assessed on a churn prediction task. They showed that the Genetic K-means Algorithm can improve prediction results for PCA regression and performed as well as SMOTE.
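
The abstract does not specify how the Genetic K-means Algorithm is configured, so the following is only a minimal sketch of one simplified genetic K-means variant, not the authors' implementation. It assumes a population of label vectors evolved by fitness-proportional selection, uniform random mutation, and a one-step K-means operator in place of crossover, with elitism. The function names and the parameters `pop_size`, `generations`, and `p_mut` are hypothetical, and the PCA regression over-sampling step is not shown.

```python
import numpy as np

def centroids(X, labels, k, rng):
    """Mean of each cluster; an empty cluster is re-seeded with a random point."""
    C = np.empty((k, X.shape[1]))
    for j in range(k):
        members = X[labels == j]
        C[j] = members.mean(axis=0) if len(members) else X[rng.integers(len(X))]
    return C

def wcss(X, labels, k, rng):
    """Total within-cluster sum of squared distances (lower is better)."""
    C = centroids(X, labels, k, rng)
    return float(((X - C[labels]) ** 2).sum())

def kmeans_step(X, labels, k, rng):
    """K-means operator: recompute centroids, then reassign to the nearest one."""
    C = centroids(X, labels, k, rng)
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def genetic_kmeans(X, k, pop_size=10, generations=30, p_mut=0.02, seed=0):
    """Simplified genetic K-means (assumed operators, see lead-in)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Each individual is a vector of cluster labels, one per sample.
    pop = [rng.integers(k, size=n) for _ in range(pop_size)]
    best, best_cost = None, np.inf
    for gen in range(generations + 1):
        costs = np.array([wcss(X, p, k, rng) for p in pop])
        i = int(costs.argmin())
        if costs[i] < best_cost:
            best, best_cost = pop[i].copy(), float(costs[i])
        if gen == generations:
            break  # final population evaluated, no further breeding
        # Selection: probability proportional to (worst cost - own cost).
        fitness = costs.max() - costs + 1e-12
        parents = rng.choice(pop_size, size=pop_size - 1, p=fitness / fitness.sum())
        new_pop = [best.copy()]  # elitism: keep the best solution found so far
        for idx in parents:
            child = pop[idx].copy()
            mask = rng.random(n) < p_mut  # mutation: random relabelling
            child[mask] = rng.integers(k, size=int(mask.sum()))
            new_pop.append(kmeans_step(X, child, k, rng))
        pop = new_pop
    return best, best_cost

# Usage on synthetic data: two well-separated groups of points.
data_rng = np.random.default_rng(1)
X_demo = np.vstack([data_rng.normal(0.0, 0.3, (30, 2)),
                    data_rng.normal(5.0, 0.3, (30, 2))])
labels_demo, cost_demo = genetic_kmeans(X_demo, k=2)
```

In a pipeline like the one the abstract describes, each resulting cluster would then serve as a small local dataset on which the over-sampling method is applied separately.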



Author information

Authors and Affiliations

School of Computer Science and Informatics, University College Dublin, Belfield, Dublin 4, Ireland
Bingquan Huang, T. Satoh, Y. Huang, M.-T. Kechadi & B. Buckley


Editor information

Editors and Affiliations

Faculty of Engineering and Information Technology, University of Technology Sydney, 2007, Sydney, NSW, Australia
Longbing Cao
College of Computer Science, Chongqing University, 400030, Chongqing, China
Jiang Zhong & Yong Feng


Cite this paper

Huang, B., Satoh, T., Huang, Y., Kechadi, M.T., Buckley, B. (2010). Using Genetic K-Means Algorithm for PCA Regression Data in Customer Churn Prediction. In: Cao, L., Zhong, J., Feng, Y. (eds) Advanced Data Mining and Applications. ADMA 2010. Lecture Notes in Computer Science, vol 6441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17313-4_22


DOI: https://doi.org/10.1007/978-3-642-17313-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17312-7
Online ISBN: 978-3-642-17313-4
eBook Packages: Computer Science, Computer Science (R0)
