dc.contributor.author
Baya, Ariel Emilio

dc.contributor.author
Larese, Monica Graciela

dc.date.available
2024-03-25T12:57:02Z
dc.date.issued
2023-06
dc.identifier.citation
Baya, Ariel Emilio; Larese, Monica Graciela; DStab: estimating clustering quality by distance stability; Springer; Pattern Analysis And Applications; 26; 3; 6-2023; 1463-1479
dc.identifier.issn
1433-7541
dc.identifier.uri
http://hdl.handle.net/11336/231430
dc.description.abstract
Most commonly, stability analyses are performed using an external validation measure. For example, the Jaccard index is one of the indexes of choice for stability measurement. The index is wrapped around a resampling method to sense the model’s stability. Other methods use classifiers to look for stable partitions instead. In these cases, a resampling method is also used with an external index, an error measure driven by a classifier, and a clustering algorithm aiming to find stable clustering model configurations. Contrary to previous stability-based methods, we propose a novel validation procedure consisting of an internal validation index within a resampling strategy. We propose an index based on the distance between cluster centroids coupled with a twofold cross-validation resampling approach. Moreover, we use a threshold based on a null hypothesis to detect meaningful clustering partitions. As part of our experimental study, we have selected the K-means algorithm for its simplicity and, primarily, for its instability compared to other algorithms, such as hierarchical methods. Finally, we compare our approach with several known validation indexes and discuss the results. Our findings show that our method can not only find meaningful clustering partitions but is also helpful as an unsupervised data analysis tool.
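The general idea named in the abstract (centroid distances measured across a twofold split, with K-means as the clustering algorithm) can be illustrated with a rough sketch. This is not the authors' DStab index, whose exact definition is not given in this record; it is a minimal illustration under stated assumptions. The function names (`kmeans`, `centroid_shift`, `twofold_instability`), the best-matching rule between centroid sets, and the averaging over repeated splits are all hypothetical choices made for this example, and the null-hypothesis threshold mentioned in the abstract is not implemented here.

```python
import math
import random
from itertools import permutations


def kmeans(points, k, rng, iters=100):
    """Plain Lloyd's K-means with random initial centroids; returns k centroids."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        updated = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def centroid_shift(a, b):
    """Mean distance between two centroid sets under the best one-to-one matching.

    Brute force over permutations, so only suitable for small k (assumption).
    """
    k = len(a)
    return min(
        sum(math.dist(a[i], b[perm[i]]) for i in range(k)) / k
        for perm in permutations(range(k))
    )


def twofold_instability(points, k, repeats=10, seed=0):
    """Average centroid shift between the two halves of repeated random splits.

    Hypothetical score: lower values suggest a more stable partition for this k.
    """
    rng = random.Random(seed)
    shifts = []
    for _ in range(repeats):
        shuffled = points[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first = kmeans(shuffled[:half], k, rng)
        second = kmeans(shuffled[half:], k, rng)
        shifts.append(centroid_shift(first, second))
    return sum(shifts) / len(shifts)


if __name__ == "__main__":
    # Synthetic data: two well-separated Gaussian blobs (illustrative only).
    data_rng = random.Random(1)
    data = [
        (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (10, 10)]
        for _ in range(100)
    ]
    for k in (2, 3, 4):
        print(k, twofold_instability(data, k))
```

In a sketch like this, the score for each candidate number of clusters would still need a reference value to be interpretable, which is the role the abstract assigns to a threshold derived from a null hypothesis.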
dc.format
application/pdf
dc.language.iso
eng
dc.publisher
Springer

dc.rights
info:eu-repo/semantics/restrictedAccess
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.subject
Clustering
dc.subject
Clustering validation
dc.subject
K-Means
dc.subject.classification
Information Science and Bioinformatics

dc.subject.classification
Computer and Information Sciences

dc.subject.classification
NATURAL AND EXACT SCIENCES

dc.title
DStab: estimating clustering quality by distance stability
dc.type
info:eu-repo/semantics/article
dc.type
info:ar-repo/semantics/artículo
dc.type
info:eu-repo/semantics/publishedVersion
dc.date.updated
2024-03-25T11:56:55Z
dc.identifier.eissn
1433-755X
dc.journal.volume
26
dc.journal.number
3
dc.journal.pagination
1463-1479
dc.journal.pais
Germany

dc.description.fil
Fil: Baya, Ariel Emilio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
dc.description.fil
Fil: Larese, Monica Graciela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
dc.journal.title
Pattern Analysis And Applications

dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://link.springer.com/10.1007/s10044-023-01175-7
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1007/s10044-023-01175-7