| dc.contributor.author | Raudys, Šarūnas | |
| dc.date.accessioned | 2023-09-18T19:37:58Z | |
| dc.date.available | 2023-09-18T19:37:58Z | |
| dc.date.issued | 2006 | |
| dc.identifier.issn | 0302-9743 | |
| dc.identifier.other | (BIS)VGT02-000012733 | |
| dc.identifier.uri | https://etalpykla.vilniustech.lt/handle/123456789/141554 | |
| dc.description.abstract | We propose a probabilistic framework for analyzing the inaccuracies that arise in feature selection (FS) when flawed estimates of the performance of feature subsets are used. The approach is based on an analysis of the random search FS procedure and on the postulate that the joint distribution of true and estimated classification errors is known a priori. We derive expected values for the FS bias, the difference between the actual classification error after FS and the classification error obtained if ideal FS is performed according to exact estimates. The increase in true classification error due to inaccurate FS is comparable to, or even exceeds, the training bias, the difference between the generalization and Bayes errors. We show that an overfitting phenomenon exists in feature selection, which we call feature over-selection in this paper. The effects of feature over-selection can be reduced if FS is performed on the basis of positional statistics. The theoretical results are supported by experiments on simulated Gaussian data as well as on high-dimensional microarray gene expression data. | eng |
| dc.format | PDF | |
| dc.format.extent | p. 622-631 | |
| dc.format.medium | text / txt | |
| dc.language.iso | eng | |
| dc.relation.isreferencedby | Conference Proceedings Citation Index - Science (Web of Science) | |
| dc.relation.isreferencedby | SpringerLink | |
| dc.relation.isreferencedby | Science Citation Index Expanded (Web of Science) | |
| dc.relation.isreferencedby | Compendex | |
| dc.relation.isreferencedby | MathSciNet | |
| dc.relation.isreferencedby | GeoRef | |
| dc.source.uri | https://doi.org/10.1007/11815921_68 | |
| dc.title | Feature over-selection | |
| dc.type | Article in Web of Science DB | |
| dcterms.accessRights | LBT: Volume title: Structural, syntactic, and statistical pattern recognition : Joint IAPR international workshops, SSPR 2006 and SPR 2006 : Hong Kong, China, August 17-19, 2006 : Proceedings. | |
| dcterms.references | 17 | |
| dc.type.pubtype | S1 - Web of Science DB article | |
| dc.contributor.institution | Vilniaus Gedimino technikos universitetas | |
| dc.contributor.faculty | Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences | |
| dc.subject.researchfield | N 009 - Computer science | |
| dc.subject.researchfield | T 007 - Informatics engineering | |
| dcterms.sourcetitle | Structural, Syntactic, and Statistical Pattern Recognition : Joint IAPR International Workshops SSPR 2006 and SPR 2006 Hong Kong, China, August 2006 : proceedings. Lecture Notes in Computer Science | |
| dc.description.volume | Vol. 4109 | |
| dc.publisher.name | Springer | |
| dc.publisher.city | Berlin | |
| dc.identifier.doi | LBT02-000023268 | |
| dc.identifier.doi | 000240075100068 | |
| dc.identifier.doi | 10.1007/11815921_68 | |
| dc.identifier.elaba | 3742940 | |
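
The FS bias defined in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's derivation: the Gaussian model for true errors, the independent additive noise on the estimates, the numeric parameters, and the function name `mean_fs_bias` are all assumptions made for illustration only.

```python
import random


def mean_fs_bias(n_subsets, noise_sd, n_trials=2000, seed=0):
    """Monte Carlo estimate of the expected FS bias under a toy model.

    Assumed model (hypothetical, not taken from the paper): true errors of
    randomly drawn feature subsets are Gaussian, and each estimate equals
    the true error plus independent Gaussian noise. Values are not clipped
    to [0, 1], so this is only a qualitative illustration.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # True classification errors of the candidate subsets.
        true_err = [rng.gauss(0.20, 0.03) for _ in range(n_subsets)]
        # Flawed estimates used by the random search FS procedure.
        est_err = [t + rng.gauss(0.0, noise_sd) for t in true_err]
        # Pick the subset with the smallest estimated error.
        chosen = min(range(n_subsets), key=est_err.__getitem__)
        # FS bias: true error of the pick minus true error of the ideal pick.
        total += true_err[chosen] - min(true_err)
    return total / n_trials


if __name__ == "__main__":
    for noise_sd in (0.0, 0.03, 0.10):
        print(noise_sd, round(mean_fs_bias(100, noise_sd), 4))
```

With exact estimates (`noise_sd = 0.0`) the selected subset coincides with the ideal one and the bias is zero; as the estimates become noisier, the selected subset tends to be one whose estimate is optimistic rather than one whose true error is small.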