Article
Information Approach to Co-occurrence of Words in Written Language
Publication date:
06/2015
Publisher:
Complex Systems Publications
Journal:
Complex Systems
ISSN:
0891-2513
Language:
English
Resource type:
Published article
Abstract
In this paper we study the distribution of words across the different parts of a book using tools from information theory. In particular, the mutual information between words in the text and parts of the text is compared with the mutual information of a shuffled version of the book. This analysis allows us to extract not only relevant words of the text but also relationships between the different words, such as co-occurrence and repulsion between them. With the connections due to co-occurrence of words, we show how to construct a network that reflects the semantic organization of the book. This method can be applied to other types of sequences, measuring the relations between the different symbols that compose such sequences.
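The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact estimator: it assumes the book is already tokenized, that it is cut into equal-sized parts, and that the shuffled baseline is a plain average over random permutations. The function names `word_part_information` and `excess_information` are hypothetical and chosen only for this illustration.

```python
import math
import random
from collections import Counter


def word_part_information(tokens, n_parts):
    """Per-word contribution (in bits) to the mutual information
    between words and parts, with the text cut into n_parts
    consecutive chunks of (nearly) equal size."""
    n = len(tokens)
    counts = Counter()      # occurrences of each word in the whole text
    joint = Counter()       # occurrences of each (word, part) pair
    part_sizes = Counter()  # number of tokens in each part
    for i, word in enumerate(tokens):
        part = i * n_parts // n
        counts[word] += 1
        joint[word, part] += 1
        part_sizes[part] += 1

    info = Counter()
    for (word, part), c in joint.items():
        # p(w, part) * log2( p(w, part) / (p(w) * p(part)) )
        ratio = c * n / (counts[word] * part_sizes[part])
        info[word] += (c / n) * math.log2(ratio)
    return info


def excess_information(tokens, n_parts, n_shuffles=20, seed=0):
    """Information of each word in the real text minus its average
    over shuffled copies (assumed baseline, not the paper's formula)."""
    rng = random.Random(seed)
    real = word_part_information(tokens, n_parts)
    baseline = Counter()
    shuffled = list(tokens)
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        for word, value in word_part_information(shuffled, n_parts).items():
            baseline[word] += value / n_shuffles
    return {word: real[word] - baseline[word] for word in real}


# Toy text: "whale" appears only in the first half, "ship" only in the
# second, and "the" is spread evenly over both halves.
tokens = ["the", "whale"] * 10 + ["the", "ship"] * 10
excess = excess_information(tokens, n_parts=2)
ranked = sorted(excess, key=excess.get, reverse=True)
```

In this toy text the evenly spread word "the" carries no information about the parts, while "whale" and "ship" each contribute 0.25 bits, so ranking by excess information puts the topical words first. The same counting scheme could be extended to pairs of words to look for co-occurrence or repulsion, which is a further assumption beyond what the abstract specifies.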
Keywords:
Information, Co-occurrence, Words, Language
Collections
Articles (CCT - PATAGONIA NORTE)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - PATAGONIA NORTE
Citation
Hernández Lahme, Damián Gabriel; Information Approach to Co-occurrence of Words in Written Language; Complex Systems Publications; Complex systems; 24; 2; 6-2015; 1-21