| dc.contributor.author | Kolesau, Aliaksei | |
| dc.contributor.author | Šešok, Dmitrij | |
| dc.date.accessioned | 2023-09-18T20:29:21Z | |
| dc.date.available | 2023-09-18T20:29:21Z | |
| dc.date.issued | 2020 | |
| dc.identifier.uri | https://etalpykla.vilniustech.lt/handle/123456789/150331 | |
| dc.description.abstract | In this paper we examine the results of using different acoustic feature computation pipelines for classifying audio keywords with a convolutional neural network (CNN). We compare Mel-frequency cepstral coefficients (MFCCs) with a simple filterbank averaging technique. We also examine the influence of MFCC computation parameters on the resulting quality. The results show that CNNs benefit from using prior knowledge in acoustic feature computation. In our experiments we observed a 30% drop in accuracy when switching from MFCCs to filterbank averaging. Furthermore, the default values of MFCC parameters used in many libraries might not be the best for the voice activation problem: a frame length of 55 ms showed better results than the default length of 20 ms. | eng |
| dc.format | PDF | |
| dc.format.extent | p. 1-4 | |
| dc.format.medium | text / txt | |
| dc.language.iso | eng | |
| dc.relation.isreferencedby | Scopus | |
| dc.relation.isreferencedby | IEEE Xplore | |
| dc.source.uri | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9108867 | |
| dc.title | Investigation of acoustic features for voice activation problem | |
| dc.type | Straipsnis konferencijos darbų leidinyje Scopus DB / Paper in conference publication in Scopus DB | |
| dcterms.references | 21 | |
| dc.type.pubtype | P1b - Straipsnis konferencijos darbų leidinyje Scopus DB / Article in conference proceedings Scopus DB | |
| dc.contributor.institution | Vilniaus Gedimino technikos universitetas | |
| dc.contributor.faculty | Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences | |
| dc.subject.researchfield | T 007 - Informatikos inžinerija / Informatics engineering | |
| dc.subject.vgtuprioritizedfields | IK0303 - Dirbtinio intelekto ir sprendimų priėmimo sistemos / Artificial intelligence and decision support systems | |
| dc.subject.ltspecializations | L106 - Transportas, logistika ir informacinės ir ryšių technologijos (IRT) / Transport, logistics and information and communication technologies (ICT) | |
| dc.subject.en | voice activation | |
| dc.subject.en | convolutional neural network | |
| dc.subject.en | MFCC | |
| dcterms.sourcetitle | 2020 IEEE Open Conference of Electrical, Electronic and Information Sciences (eStream), 30 April 2020, Vilnius, Lithuania: proceedings of the conference / organized by Vilnius Gediminas Technical University | |
| dc.publisher.name | IEEE | |
| dc.publisher.city | New York | |
| dc.identifier.doi | 10.1109/eStream50540.2020.9108867 | |
| dc.identifier.elaba | 62634235 | |