Event
A Scoring Map Algorithm for Automatically Detecting Structural Similarity of DOM Elements
Contributors:
Domínguez Mayo, Francisco José; Marchiori, Massimo; Filipe, Joaquim
Event type:
Conference
Event name:
17th International Conference on Web Information Systems and Technologies
Event date:
26/10/2021
Organizing institution:
Polytechnic Institute of Setúbal
Book title:
Proceedings of the 17th International Conference on Web Information Systems and Technologies
Publisher:
SciTePress
ISBN:
978-989-758-536-4
Language:
English
Abstract
Most documents on the WWW are generated from templates that represent user interface (UI) elements and are later filled with content. In the field of information extraction, many approaches have emerged to analyze the documents' structure, obtain similar features among them, and generate wrappers that extract the raw content from such documents. As a result, most techniques documented in the literature are optimized to compare full documents, but other fields of application, such as web augmentation or transcoding, require analyzing structural similarity on smaller UI components. In this paper we present two flexible algorithms to measure similarity between DOM elements by using a mixed approach that considers both the elements' location and inner structure. The proposed algorithms were used in the context of two projects: an approach for automatic usability refactoring, and a web accessibility helper. We also present a wrapper induction technique based on these algorithms. Additionally, we present a precision and recall evaluation of our algorithms compared with other known approaches, applied to DOM elements of different sizes, all smaller than full-scale documents. The proposed algorithms run in linear time, so they are faster than most approaches that analyze structural similarity.
Keywords:
INFORMATION EXTRACTION, WEB ADAPTATION, REFACTORING FOR USABILITY
Collections
Events (CCT - LA PLATA)
Events of CTRO.CIENTIFICO TECNOL.CONICET - LA PLATA
Citation
A Scoring Map Algorithm for Automatically Detecting Structural Similarity of DOM Elements; 17th International Conference on Web Information Systems and Technologies; Setúbal; Portugal; 2021; 174-185