Article
Silhouette + Attraction: A Simple and Effective Method for Text Clustering
Publication date:
14/08/2015
Publisher:
Cambridge University Press
Journal:
Natural Language Engineering
ISSN:
1351-3249
Language:
English
Resource type:
Published article
Abstract
This article presents Sil-Att, a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. Combining both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil-Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance across all the considered corpora indicates that it is a very robust method. This is a notable advantage of Sil-Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora being considered.
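The abstract names the silhouette coefficient as one of the two building blocks but does not say how Sil-Att computes or combines them. As background only, the following is a minimal sketch of the standard silhouette coefficient of a single item, s(i) = (b - a) / max(a, b), where a is the mean distance from the item to the other members of its own cluster and b is the smallest mean distance to any other cluster. The function names, the Euclidean distance, and the toy points are assumptions made for illustration; this is not the Sil-Att algorithm and it does not cover attraction.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def silhouette(points, labels, i, dist=euclidean):
    """Standard silhouette coefficient s(i) of points[i] under the given labels.

    Hypothetical helper for illustration; not taken from the Sil-Att paper.
    """
    own = labels[i]
    # Collect the distances from item i to every other item, grouped by cluster.
    by_cluster = {}
    for j, (p, lab) in enumerate(zip(points, labels)):
        if j != i:
            by_cluster.setdefault(lab, []).append(dist(points[i], p))
    # Usual convention: an item alone in its cluster scores 0.
    if own not in by_cluster:
        return 0.0
    a = sum(by_cluster[own]) / len(by_cluster[own])
    other_means = [sum(d) / len(d) for lab, d in by_cluster.items() if lab != own]
    # With a single cluster there is no neighbouring cluster to compare against.
    if not other_means:
        return 0.0
    b = min(other_means)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0


# Toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(points, [0, 0, 1, 1], 0)  # item 0 in its natural cluster
bad = silhouette(points, [1, 0, 1, 1], 0)   # item 0 assigned to the far cluster
```

In this toy setting `good` is close to 1 and `bad` is negative, which is the usual reading of the coefficient: values near 1 suggest an item sits well inside its cluster, and negative values suggest it may fit a neighbouring cluster better.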
Keywords:
Clustering, Short Texts Corpora, Attraction, Silhouette
Collections
Articles (CCT - SAN LUIS)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - SAN LUIS
Citation
Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo; Silhouette + Attraction: A Simple and Effective Method for Text Clustering; Cambridge University Press; Natural Language Engineering; 1; 14-8-2015; 1-40