Article
Decoding the structure of the WWW: a comparative analysis of web crawls
Serrano, Maria Angeles; Maguitman, Ana Gabriela; Boguña, Marian; Fortunato, Santo; Vespignani, Alessandro

Publication date:
08/2007
Publisher:
Association for Computing Machinery
Journal:
ACM Transactions on the Web
ISSN:
1559-1131
Language:
English
Resource type:
Published article
Abstract
The understanding of the immense and intricate topological structure of the World Wide Web (WWW) is a major scientific and technological challenge. This has been recently tackled by characterizing the properties of its representative graphs, in which vertices and directed edges are identified with Web pages and hyperlinks, respectively. Data gathered in large-scale crawls have been analyzed by several groups resulting in a general picture of the WWW that encompasses many of the complex properties typical of rapidly evolving networks. In this article, we report a detailed statistical analysis of the topological properties of four different WWW graphs obtained with different crawlers. We find that, despite the very large size of the samples, the statistical measures characterizing these graphs differ quantitatively, and in some cases qualitatively, depending on the domain analyzed and the crawl used for gathering the data. This spurs the issue of the presence of sampling biases and structural differences of Web crawls that might induce properties not representative of the actual global underlying graph. In short, the stability of the widely accepted statistical description of the Web is called into question. In order to provide a more accurate characterization of the Web graph, we study statistical measures beyond the degree distribution, such as degree-degree correlation functions or the statistics of reciprocal connections. The latter appears to enclose the relevant correlations of the WWW graph and carry most of the topologica.
Keywords:
World Wide Web
Collections
Articles (CCT - BAHIA BLANCA)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - BAHIA BLANCA
Citation
Serrano, Maria Angeles; Maguitman, Ana Gabriela; Boguña, Marian; Fortunato, Santo; Vespignani, Alessandro; Decoding the structure of the WWW: a comparative analysis of web crawls; Association for Computing Machinery; ACM Transactions on the Web; 1; 2; 8-2007; 1131-1155