dc.contributor.author: Atliha, Viktar
dc.contributor.author: Šešok, Dmitrij
dc.date.accessioned: 2023-09-18T16:16:47Z
dc.date.available: 2023-09-18T16:16:47Z
dc.date.issued: 2022
dc.identifier.issn: 2076-3417
dc.identifier.uri: https://etalpykla.vilniustech.lt/handle/123456789/112616
dc.description.abstract [eng]: Image captioning is a very important task, which is on the edge between natural language processing (NLP) and computer vision (CV). The current quality of the captioning models allows them to be used for practical tasks, but they require both large computational power and considerable storage space. Despite the practical importance of the image-captioning problem, only a few papers have investigated model size compression in order to prepare them for use on mobile devices. Furthermore, these works usually only investigate decoder compression in a typical encoder–decoder architecture, while the encoder traditionally occupies most of the space. We applied the most efficient model-compression techniques such as architectural changes, pruning and quantization to several state-of-the-art image-captioning architectures. As a result, all of these models were compressed by no less than 91% in terms of memory (including encoder), but lost no more than 2% and 4.5% in metrics such as CIDEr and SPICE, respectively. At the same time, the best model showed results of 127.4 CIDEr and 21.4 SPICE, with a size equal to only 34.8 MB, which sets a strong baseline for compression problems for image-captioning models, and could be used for practical applications.
dc.format: PDF
dc.format.extent: p. 1-14
dc.format.medium: text / txt
dc.language.iso: eng
dc.relation.isreferencedby: Science Citation Index Expanded (Web of Science)
dc.relation.isreferencedby: Scopus
dc.relation.isreferencedby: DOAJ
dc.relation.isreferencedby: INSPEC
dc.relation.isreferencedby: J-Gate
dc.relation.isreferencedby: Gale's Academic OneFile
dc.source.uri: https://www.mdpi.com/2076-3417/12/3/1638/pdf
dc.title: Image-captioning model compression
dc.type: Straipsnis Web of Science DB / Article in Web of Science DB
dcterms.accessRights: This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/)
dcterms.license: Creative Commons – Attribution – 4.0 International
dcterms.references: 61
dc.type.pubtype: S1 - Straipsnis Web of Science DB / Web of Science DB article
dc.contributor.institution: Vilniaus Gedimino technikos universitetas
dc.contributor.faculty: Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences
dc.subject.researchfield: T 007 - Informatikos inžinerija / Informatics engineering
dc.subject.researchfield: N 009 - Informatika / Computer science
dc.subject.studydirection: B04 - Informatikos inžinerija / Informatics engineering
dc.subject.vgtuprioritizedfields: IK0303 - Dirbtinio intelekto ir sprendimų priėmimo sistemos / Artificial intelligence and decision support systems
dc.subject.ltspecializations: L106 - Transportas, logistika ir informacinės ir ryšių technologijos (IRT) / Transport, logistic and information and communication technologies
dc.subject.en: image captioning
dc.subject.en: model compression
dc.subject.en: pruning
dc.subject.en: quantization
dcterms.sourcetitle: Applied sciences
dc.description.issue: iss. 3
dc.description.volume: vol. 12
dc.publisher.name: MDPI
dc.publisher.city: Basel
dc.identifier.doi: 000756109800001
dc.identifier.doi: 10.3390/app12031638
dc.identifier.elaba: 118832017

