Comprehensive Characterization of an Open Source Document Search Engine
Date
2019
Author
Hadjilambrou, Zacharias
Kleanthous, Marios
Antoniou, Georgia
Portero, Antoni
Sazeides, Yiannakis
ISSN
1544-3566
Source
ACM Transactions on Architecture and Code Optimization
Volume
16
Issue
2
Abstract
This work performs a thorough characterization and analysis of the open source Lucene search library. The article describes in detail the architecture, functionality, and micro-architectural behavior of the search engine, and investigates prominent online document search research issues. In particular, we study how intra-server index partitioning affects response time and throughput, explore the potential use of low power servers for document search, and examine the sources of performance degradation and the causes of tail latencies. Some of our main conclusions are the following: (a) intra-server index partitioning can reduce tail latencies, but with diminishing benefits as incoming query traffic increases; (b) low power servers, given enough partitioning, can provide the same average and tail response times as conventional high performance servers; (c) index search is a CPU-intensive, cache-friendly application; and (d) C-states are the main culprits for performance degradation in document search.