dc.contributor.author: Grigalis, Tomas
dc.contributor.author: Radvilavičius, Lukas
dc.contributor.author: Čenys, Antanas
dc.contributor.author: Gordevičius, Juozas
dc.date.accessioned: 2023-09-18T19:17:29Z
dc.date.available: 2023-09-18T19:17:29Z
dc.date.issued: 2012
dc.identifier.other: (BIS)VGT02-000025155
dc.identifier.uri: https://etalpykla.vilniustech.lt/handle/123456789/137424
dc.description.abstract: We propose ClustVX, a novel approach to extracting structured web data. It clusters visually similar web page elements by exploiting their visual formatting and structural features. The clusters are then used to derive extraction rules. In an experimental evaluation on three publicly available benchmark data sets, ClustVX outperforms state-of-the-art structured data extraction systems. [eng]
dc.format: PDF
dc.format.extent: p. 435-438
dc.format.medium: text / txt
dc.language.iso: eng
dc.relation.ispartofseries: Lecture Notes in Computer Science, vol. 7387, ISSN 0302-9743
dc.relation.isreferencedby: Scopus
dc.relation.isreferencedby: SpringerLink
dc.source.uri: http://link.springer.com/chapter/10.1007%2F978-3-642-31753-8_38
dc.title: Clustering visually similar web page elements for structured web data extraction
dc.type: Straipsnis konferencijos darbų leidinyje Scopus DB / Paper in conference publication in Scopus DB
dcterms.references: 8
dc.type.pubtype: P1b - Straipsnis konferencijos darbų leidinyje Scopus DB / Article in conference proceedings Scopus DB
dc.contributor.institution: Vilniaus Gedimino technikos universitetas
dc.contributor.institution: Vilniaus universitetas
dc.contributor.faculty: Fundamentinių mokslų fakultetas / Faculty of Fundamental Sciences
dc.contributor.department: Informacinių sistemų katedra / Department of Information Systems
dc.subject.researchfield: T 007 - Informatikos inžinerija / Informatics engineering
dcterms.sourcetitle: Web Engineering : 12th International Conference, ICWE 2012, Berlin, Germany, July 23-27, 2012 : proceedings
dc.publisher.name: Springer
dc.publisher.city: New York
dc.identifier.doi: 10.1007/978-3-642-31753-8_38
dc.identifier.elaba: 3994444


Files in this item

There are no files associated with this item.
