| dc.rights.license | Visos teisės saugomos / All rights reserved | en_US |
| dc.contributor.author | Jevsejev, Roman | |
| dc.contributor.author | Mažeika, Dalius | |
| dc.contributor.author | Bereiša, Mindaugas | |
| dc.date.accessioned | 2026-01-13T09:08:26Z | |
| dc.date.available | 2026-01-13T09:08:26Z | |
| dc.date.issued | 2025 | |
| dc.identifier.isbn | 9798331598747 | en_US |
| dc.identifier.issn | 2831-5634 | en_US |
| dc.identifier.uri | https://etalpykla.vilniustech.lt/handle/123456789/159726 | |
| dc.description.abstract | This study investigates the challenges of preparing datasets for machine learning models from the data of a centralized system for managing IT incidents within an organization. Key challenges include data quality issues, class imbalance, the need for anonymization, and redundant information. Various data preparation techniques are analyzed, such as handling missing values, encoding categorical and textual data, balancing datasets, anonymizing sensitive information, and performing feature selection. By examining a state enterprise's Service Desk incident data, the paper highlights the structural complexities and processing difficulties of such data. Furthermore, the impact of data engineering and cleaning techniques on the accuracy and reliability of machine learning models is assessed. Finally, specific techniques to improve data preparation and to optimize model performance are analyzed. | en_US |
| dc.format.extent | 5 p. | en_US |
| dc.format.medium | Tekstas / Text | en_US |
| dc.language.iso | en | en_US |
| dc.relation.uri | https://etalpykla.vilniustech.lt/handle/123456789/159405 | en_US |
| dc.source.uri | https://ieeexplore.ieee.org/document/11016852 | en_US |
| dc.subject | IT Service Management | en_US |
| dc.subject | Incident Management Systems | en_US |
| dc.subject | Data Preprocessing | en_US |
| dc.subject | Data Filtering | en_US |
| dc.subject | Multilingual Translation | en_US |
| dc.subject | M2M100 Model | en_US |
| dc.subject | Data Anonymization | en_US |
| dc.title | An Approach for Building IT Support Dataset for Machine Learning Models | en_US |
| dc.type | Konferencijos publikacija / Conference paper | en_US |
| dcterms.accrualMethod | Rankinis pateikimas / Manual submission | en_US |
| dcterms.issued | 2025-06-02 | |
| dcterms.references | 10 | en_US |
| dc.description.version | Taip / Yes | en_US |
| dc.contributor.institution | Vilniaus Gedimino technikos universitetas | en_US |
| dc.contributor.institution | Vilnius Gediminas Technical University | en_US |
| dc.contributor.institution | SE Ignalina Nuclear Power Plant | en_US |
| dc.contributor.faculty | Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences | en_US |
| dc.contributor.department | Informacinių technologijų katedra / Department of Information Technologies | en_US |
| dcterms.sourcetitle | 2025 IEEE Open Conference of Electrical, Electronic and Information Sciences (eStream), April 24, 2025, Vilnius, Lithuania | en_US |
| dc.identifier.eisbn | 9798331598730 | en_US |
| dc.identifier.eissn | 2690-8506 | en_US |
| dc.publisher.name | IEEE | en_US |
| dc.publisher.country | United States of America | en_US |
| dc.publisher.city | New York | en_US |
| dc.identifier.doi | https://doi.org/10.1109/eStream66938.2025.11016852 | en_US |