dc.contributor.author | Dikaiakos, Marios D. | en |
dc.contributor.author | Stassopoulou, Athena | en |
dc.contributor.author | Papageorgiou, Loizos | en |
dc.creator | Dikaiakos, Marios D. | en |
dc.creator | Stassopoulou, Athena | en |
dc.creator | Papageorgiou, Loizos | en |
dc.date.accessioned | 2019-11-13T10:39:52Z | |
dc.date.available | 2019-11-13T10:39:52Z | |
dc.date.issued | 2005 | |
dc.identifier.uri | http://gnosis.library.ucy.ac.cy/handle/7/53844 | |
dc.description.abstract | In this paper, we present a characterization study of search-engine crawlers. For the purposes of our work, we use Web-server access logs from five academic sites in three different countries. Based on these logs, we analyze the activity of different crawlers that belong to five search engines: Google, AltaVista, Inktomi, FastSearch and CiteSeer. We compare crawler behavior to the characteristics of the general World-Wide Web traffic and to general characterization studies. We analyze crawler requests to derive insights into the behavior and strategy of crawlers. We propose a set of simple metrics that describe qualitative characteristics of crawler behavior, vis-à-vis a crawler's preference for resources of a particular format, its frequency of visits to a Web site, and the pervasiveness of its visits to a particular site. To the best of our knowledge, this is the first extensive and in-depth characterization of search-engine crawlers. Our results and observations provide useful insights into crawler behavior and serve as a basis for our ongoing work on the automatic detection of Web crawlers. © 2005 Elsevier B.V. All rights reserved. | en |
dc.source | Computer Communications | en |
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-17644390582&doi=10.1016%2fj.comcom.2005.01.003&partnerID=40&md5=bbaae7e21912c52febdc917fe80e21e6 | |
dc.subject | World Wide Web | en |
dc.subject | Search engines | en |
dc.subject | Resource allocation | en |
dc.subject | Servers | en |
dc.subject | Java programming language | en |
dc.subject | Web crawlers | en |
dc.subject | Crawlers | en |
dc.subject | HTTP | en |
dc.subject | Local area networks | en |
dc.subject | Web characterization | en |
dc.subject | Web servers | en |
dc.subject | Web traffic | en |
dc.title | An investigation of web crawler behavior: Characterization and metrics | en |
dc.type | info:eu-repo/semantics/article | |
dc.identifier.doi | 10.1016/j.comcom.2005.01.003 | |
dc.description.volume | 28 | |
dc.description.issue | 8 | |
dc.description.startingpage | 880 | |
dc.description.endingpage | 897 | |
dc.author.faculty | 002 Σχολή Θετικών και Εφαρμοσμένων Επιστημών / Faculty of Pure and Applied Sciences | |
dc.author.department | Τμήμα Πληροφορικής / Department of Computer Science | |
dc.type.uhtype | Article | en |
dc.description.notes | Cited By: 45 | en |
dc.source.abbreviation | Comput.Commun. | en |
dc.contributor.orcid | Dikaiakos, Marios D. [0000-0002-4350-6058] | |
dc.gnosis.orcid | 0000-0002-4350-6058 | |