The impact of churn labelling rules on churn prediction in telecommunications

Bugajev, Andrej; Kriauzienė, Rima; Vasilecas, Olegas; Chadyšas, Viktoras

dc.contributor.author	Bugajev, Andrej
dc.contributor.author	Kriauzienė, Rima
dc.contributor.author	Vasilecas, Olegas
dc.contributor.author	Chadyšas, Viktoras
dc.date.accessioned	2023-09-18T16:19:47Z
dc.date.available	2023-09-18T16:19:47Z
dc.date.issued	2022
dc.identifier.issn	0868-4952
dc.identifier.uri	https://etalpykla.vilniustech.lt/handle/123456789/113239
dc.description.abstract	One of the biggest difficulties in the telecommunication industry is retaining customers and preventing churn. In this article, we review the most recent research on churn detection for telecommunication companies. Selected machine learning methods are applied to publicly available datasets, partially reproducing the results of other authors, and are then applied to the private dataset of the company Moremins. Next, we extend the analysis to cover existing research gaps: differences between churn definitions are analysed, and it is shown that the higher accuracy reported in other studies is due to some false assumptions, i.e. labelling rules derived from the definition lead to very good classification accuracy, but this does not imply that such churn detection is useful for subsequent customer retention. The main outcome of the research is a detailed analysis of the impact of differences in churn definitions on the final result; it is shown that the impact of labelling rules derived from the definitions can be large. The data in this study consist of call detail records (CDRs) and other daily aggregated user data; 11,000 user entries covering 275 days were analysed. Six different classification methods were applied, all giving similar results; one of the best results was achieved using the Gradient Boosting Classifier, with an accuracy of 0.832, an F-measure of 0.646, and a recall of 0.769.	eng
dc.format	PDF
dc.format.extent	p. 247-277
dc.format.medium	text / txt
dc.language.iso	eng
dc.relation.isreferencedby	Science Citation Index Expanded (Web of Science)
dc.relation.isreferencedby	Scopus
dc.title	The impact of churn labelling rules on churn prediction in telecommunications
dc.type	Straipsnis Web of Science DB / Article in Web of Science DB
dcterms.accessRights	Open access article under the CC BY license.
dcterms.license	Creative Commons – Attribution – 4.0 International
dcterms.references	36
dc.type.pubtype	S1 - Straipsnis Web of Science DB / Web of Science DB article
dc.contributor.institution	Vilniaus Gedimino technikos universitetas
dc.contributor.faculty	Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences
dc.subject.researchfield	N 001 - Matematika / Mathematics
dc.subject.researchfield	N 009 - Informatika / Computer science
dc.subject.researchfield	T 007 - Informatikos inžinerija / Informatics engineering
dc.subject.vgtuprioritizedfields	FM0101 - Fizinių, technologinių ir ekonominių procesų matematiniai modeliai / Mathematical models of physical, technological and economic processes
dc.subject.ltspecializations	L104 - Nauji gamybos procesai, medžiagos ir technologijos / New production processes, materials and technologies
dc.subject.en	churn prediction
dc.subject.en	churn definition
dc.subject.en	telecom
dc.subject.en	machine learning
dc.subject.en	binary classification
dc.subject.en	customer classification
dc.subject.en	imbalanced learning
dc.subject.en	RFM
dcterms.sourcetitle	Informatica
dc.description.issue	iss. 2
dc.description.volume	vol. 33
dc.publisher.name	Vilnius University Institute of Data Science and Digital Technologies
dc.publisher.city	Vilnius
dc.identifier.doi	000823737700002
dc.identifier.doi	10.15388/22-INFOR484
dc.identifier.elaba	132110365

This item appears in the following collection(s)

Straipsniai Web of Science ir/ar Scopus referuojamuose leidiniuose / Articles in Web of Science and/or Scopus indexed sources [7946]
