| dc.contributor.author | Bugajev, Andrej | |
| dc.contributor.author | Kriauzienė, Rima | |
| dc.contributor.author | Vasilecas, Olegas | |
| dc.contributor.author | Chadyšas, Viktoras | |
| dc.date.accessioned | 2023-09-18T16:19:47Z | |
| dc.date.available | 2023-09-18T16:19:47Z | |
| dc.date.issued | 2022 | |
| dc.identifier.issn | 0868-4952 | |
| dc.identifier.uri | https://etalpykla.vilniustech.lt/handle/123456789/113239 | |
| dc.description.abstract | One of the biggest difficulties in the telecommunication industry is retaining customers and preventing churn. In this article, we review the most recent research on churn detection for telecommunication companies. The selected machine learning methods are applied to publicly available datasets, partially reproducing the results of other authors, and then to the private Moremins company dataset. Next, we extend the analysis to cover existing research gaps: the differences between churn definitions are analysed, and it is shown that the higher accuracy reported in other studies is due to some false assumptions, i.e. labelling rules derived from the definition lead to very good classification accuracy, but this does not imply that such churn detection is useful for subsequent customer retention. The main outcome of the research is a detailed analysis of the impact of differences in churn definitions on the final result; it is shown that the impact of labelling rules derived from definitions can be large. The data in this study consist of call detail records (CDRs) and other user data aggregated daily; 11,000 user entries over 275 days of data were analysed. Six different classification methods were applied, all giving similar results; one of the best results was achieved using the Gradient Boosting Classifier, with an accuracy of 0.832, an F-measure of 0.646 and a recall of 0.769. | eng |
| dc.format | PDF | |
| dc.format.extent | p. 247-277 | |
| dc.format.medium | tekstas / txt | |
| dc.language.iso | eng | |
| dc.relation.isreferencedby | Science Citation Index Expanded (Web of Science) | |
| dc.relation.isreferencedby | Scopus | |
| dc.title | The impact of churn labelling rules on churn prediction in telecommunications | |
| dc.type | Straipsnis Web of Science DB / Article in Web of Science DB | |
| dcterms.accessRights | Open access article under the CC BY license. | |
| dcterms.license | Creative Commons – Attribution – 4.0 International | |
| dcterms.references | 36 | |
| dc.type.pubtype | S1 - Straipsnis Web of Science DB / Web of Science DB article | |
| dc.contributor.institution | Vilniaus Gedimino technikos universitetas | |
| dc.contributor.faculty | Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences | |
| dc.subject.researchfield | N 001 - Matematika / Mathematics | |
| dc.subject.researchfield | N 009 - Informatika / Computer science | |
| dc.subject.researchfield | T 007 - Informatikos inžinerija / Informatics engineering | |
| dc.subject.vgtuprioritizedfields | FM0101 - Fizinių, technologinių ir ekonominių procesų matematiniai modeliai / Mathematical models of physical, technological and economic processes | |
| dc.subject.ltspecializations | L104 - Nauji gamybos procesai, medžiagos ir technologijos / New production processes, materials and technologies | |
| dc.subject.en | churn prediction | |
| dc.subject.en | churn definition | |
| dc.subject.en | telecom | |
| dc.subject.en | machine learning | |
| dc.subject.en | binary classification | |
| dc.subject.en | customer classification | |
| dc.subject.en | imbalanced learning | |
| dc.subject.en | RFM | |
| dcterms.sourcetitle | Informatica | |
| dc.description.issue | iss. 2 | |
| dc.description.volume | vol. 33 | |
| dc.publisher.name | Vilnius University Institute of Data Science and Digital Technologies | |
| dc.publisher.city | Vilnius | |
| dc.identifier.doi | 000823737700002 | |
| dc.identifier.doi | 10.15388/22-INFOR484 | |
| dc.identifier.elaba | 132110365 | |