Institutional Repository
CONICET Digital
Article

DStab: estimating clustering quality by distance stability

Baya, Ariel Emilio; Larese, Monica Graciela
Publication date: 06/2023
Publisher: Springer
Journal: Pattern Analysis And Applications
ISSN: 1433-7541
e-ISSN: 1433-755X
Language: English
Resource type: Published article
Subject classification:
Information Science and Bioinformatics

Abstract

Most commonly, stability analyses are performed using an external validation measure. For example, the Jaccard index is one of the indexes of choice for stability measurement. The index is wrapped in a resampling method to sense the model's stability. Other methods use classifiers to look for stable partitions instead. In these cases, a resampling method is also used, together with an external index, an error measure driven by a classifier, and a clustering algorithm, with the aim of finding stable clustering model configurations. Contrary to previous stability-based methods, we propose a novel validation procedure consisting of an internal validation index within a resampling strategy. We propose an index based on the distance between cluster centroids coupled with a twofold cross-validation resampling approach. Moreover, we use a threshold based on a null hypothesis to detect meaningful clustering partitions. For our experimental study, we selected the K-means algorithm because of its simplicity but primarily because of its instability compared to other algorithms, such as hierarchical methods. Finally, we compare our approach with several known validation indexes and discuss the results. Our findings show that our method can not only find meaningful clustering partitions but is also helpful as an unsupervised data analysis tool.
Keywords: Clustering, Clustering validation, K-Means
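
The abstract describes the approach only in outline: an index built on distances between cluster centroids, evaluated under twofold cross-validation resampling. The Python sketch below illustrates that general idea and should not be read as the DStab index itself, whose exact definition is not given in this record. Everything in it is an assumption made for illustration: the function names, the farthest-point initialization (chosen only to keep the example reproducible), the brute-force centroid matching, and the averaging over repeated splits. The null-hypothesis threshold mentioned in the abstract is left out.

```python
import itertools
import math
import random


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from the centroids chosen so far (deterministic; an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return [tuple(c) for c in centroids]


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns the k centroids as tuples."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]  # keep an empty cluster's centroid
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def matched_centroid_distance(a, b):
    """Mean distance between two centroid sets under the best one-to-one
    matching (brute force over permutations, so only suitable for small k)."""
    return min(
        sum(math.dist(x, y) for x, y in zip(a, perm)) / len(a)
        for perm in itertools.permutations(b)
    )


def twofold_centroid_instability(points, k, repeats=10, seed=0):
    """Average centroid displacement between the two halves of random
    twofold splits. Lower values suggest a more stable partition."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        shuffled = list(points)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        total += matched_centroid_distance(
            kmeans(shuffled[:half], k), kmeans(shuffled[half:], k)
        )
    return total / repeats


if __name__ == "__main__":
    # Hypothetical toy data: two tight, well-separated blobs.
    blob_a = [(0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
    blob_b = [(10 + 0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
    for k in (2, 3, 4):
        print(k, twofold_centroid_instability(blob_a + blob_b, k))
```

In this sketch, a candidate number of clusters whose centroids barely move between the two halves would count as stable. Deciding how small a displacement is meaningful is the role the abstract assigns to the null-hypothesis threshold, which this example does not attempt to reproduce.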
Associated files
Size: 2.771Mb
Format: PDF
License
info:eu-repo/semantics/restrictedAccess Except where explicitly stated otherwise, this item is published under the following terms: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Identifiers
URI: http://hdl.handle.net/11336/231430
URL: https://link.springer.com/10.1007/s10044-023-01175-7
DOI: http://dx.doi.org/10.1007/s10044-023-01175-7
Collections
Articles (CIFASIS)
Articles of CENTRO INT.FRANCO ARG.D/CS D/L/INF.Y SISTEM.
Citation
Baya, Ariel Emilio; Larese, Monica Graciela; DStab: estimating clustering quality by distance stability; Springer; Pattern Analysis And Applications; 26; 3; 6-2023; 1463-1479