Characterization and analysis of a web search benchmark

Hadjilambrou, Zacharias; Kleanthous, Marios M.; Sazeides, Yiannakis

doi:10.1109/ISPASS.2015.7095818

Conference Object

Date

2015

Author

Hadjilambrou, Zacharias
Kleanthous, Marios M.
Sazeides, Yiannakis

ISBN

978-1-4799-1956-7

Publisher

Institute of Electrical and Electronics Engineers Inc.

Source

ISPASS 2015 - IEEE International Symposium on Performance Analysis of Systems and Software
2015 15th IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2015

Pages

328-337

Keyword(s):

World Wide Web

Websites

Information retrieval

Search engines

Social networking (online)

Abstract

Web search as a service is very impressive. Web search runs on thousands of servers that search an index of billions of web pages. The search results must both be relevant to the user queries and reach the user in a fraction of a second. A web search service must guarantee the same QoS at all times, even at peak incoming traffic load. Not unjustifiably, web search has attracted a lot of research attention. Despite this high research interest, much is still unknown about the functionality and the architecture of web search benchmarks. Much research has been done using commercial web search engines, like Bing or Google, but many details of these search engines are, of course, not disclosed to the public. We take an academically accepted web search benchmark and perform a thorough characterization and analysis of it. We shed light on the architecture and the functionality of the benchmark. We also investigate some prominent web search research issues. In particular, we study how intra-server index partitioning affects response time and throughput, and we explore the potential use of low-power servers for web search. Our results show that intra-server partitioning can reduce tail latencies and that low-power servers, given enough partitioning, can provide the same response times as conventional high-performance servers. © 2015 IEEE.