dc.contributor.author | Zeinalipour-Yazdi, Constantinos D. | en |
dc.contributor.author | Dikaiakos, Marios D. | en |
dc.contributor.editor | Gal A. | en |
dc.contributor.editor | Halevy A. | en |
dc.creator | Zeinalipour-Yazdi, Constantinos D. | en |
dc.creator | Dikaiakos, Marios D. | en |
dc.date.accessioned | 2019-11-13T10:43:03Z | |
dc.date.available | 2019-11-13T10:43:03Z | |
dc.date.issued | 2002 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://gnosis.library.ucy.ac.cy/handle/7/55178 | |
dc.description.abstract | Web crawlers are the key component of services running on Internet and providing searching and indexing support for the entire Web, for corporate Intranets and large portal sites. More recently, crawlers have also been used as tools to conduct focused Web searches and to gather data about the characteristics of the WWW. In this paper, we study the employment of crawlers as a programmable, scalable, and distributed component in future Internet middleware infrastructures and proxy services. In particular, we present the architecture and implementation of, and experimentation with WebRACE, a high-performance, distributed Web crawler, filtering server and object cache. We address the challenge of designing and implementing modular, open, distributed, and scalable crawlers, using Java. We describe our design and implementation decisions, and various optimizations. We discuss the advantages and disadvantages of using Java to implement the WebRACE-crawler, and present an evaluation of its performance. WebRACE is designed in the context of eRACE, an extensible Retrieval Annotation Caching Engine, which collects, annotates and disseminates information from heterogeneous Internet sources and protocols, according to XML-encoded user profiles that determine the urgency and relevance of collected information. © Springer-Verlag Berlin Heidelberg 2002. | en |
dc.source | 5th International Workshop on Next Generation Information Technologies and Systems, NGITS 2002 | en |
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-84937389622&partnerID=40&md5=d4388abbc8563ad5c4a2c10c99e37230 | |
dc.subject | World Wide Web | en |
dc.subject | Internet | en |
dc.subject | Search engines | en |
dc.subject | Design and implementations | en |
dc.subject | Internet protocols | en |
dc.subject | Middleware | en |
dc.subject | Social networking (online) | en |
dc.subject | User profile | en |
dc.subject | Integrated circuit design | en |
dc.subject | Distributed components | en |
dc.subject | Corporate intranet | en |
dc.subject | Distributed crawler | en |
dc.subject | Future internet | en |
dc.subject | Internet sources | en |
dc.subject | Proxy services | en |
dc.title | Design and implementation of a distributed crawler and filtering processor | en |
dc.type | info:eu-repo/semantics/article | |
dc.description.volume | 2382 | |
dc.description.startingpage | 58 | |
dc.description.endingpage | 74 | |
dc.author.faculty | 002 Σχολή Θετικών και Εφαρμοσμένων Επιστημών / Faculty of Pure and Applied Sciences | |
dc.author.department | Τμήμα Πληροφορικής / Department of Computer Science | |
dc.type.uhtype | Article | en |
dc.description.notes | Sponsors: | en |
dc.description.notes | Conference code: 121059 | en |
dc.description.notes | Cited By: 17 | en |
dc.source.abbreviation | Lect. Notes Comput. Sci. | en |
dc.contributor.orcid | Zeinalipour-Yazdi, Constantinos D. [0000-0002-8388-1549] | |
dc.contributor.orcid | Dikaiakos, Marios D. [0000-0002-4350-6058] | |
dc.gnosis.orcid | 0000-0002-8388-1549 | |
dc.gnosis.orcid | 0000-0002-4350-6058 | |