research-article

Efficient and Effective Query Auto-Completion

Authors:
Simon Gog

eBay Inc., San Jose, CA, USA

eBay Inc., San Jose, CA, USA
View Profile

,
Giulio Ermanno Pibiri

ISTI-CNR, Pisa, Italy

ISTI-CNR, Pisa, Italy
View Profile

,
Rossano Venturini

University of Pisa, Pisa, Italy

University of Pisa, Pisa, Italy
View Profile

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information RetrievalJuly 2020Pages 2271–2280https://doi.org/10.1145/3397271.3401432

Published:25 July 2020Publication History

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2271–2280

ABSTRACT

Query Auto-Completion (QAC) is an ubiquitous feature of modern textual search systems, suggesting possible ways of completing the query being typed by the user. Efficiency is crucial to make the system have a real-time responsiveness when operating in the million-scale search space. Prior work has extensively advocated the use of a trie data structure for fast prefix-search operations in compact space. However, searching by prefix has little discovery power in that only completions that are prefixed by the query are returned. This may impact negatively the effectiveness of the QAC system, with a consequent monetary loss for real applications like Web Search Engines and eCommerce.

In this work we describe the implementation that empowers a new QAC system at eBay, and discuss its efficiency/effectiveness in relation to other approaches at the state-of-the-art. The solution is based on the combination of an inverted index with succinct data structures, a much less explored direction in the literature. This system is replacing the previous implementation based on Apache SOLR that was not always able to meet the required service-level- agreement.

References

Z. Bar-Yossef and N. Kraus. Context-sensitive query auto-completion. In Proceedings of the 20th international conference on World wide web, pages 107--116. ACM, 2011.Google ScholarDigital Library
H. Bast and I. Weber. Type less, find more: fast autocompletion search with a succinct index. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 364--371. ACM, 2006.Google ScholarDigital Library
S. Bhatia, D. Majumdar, and P. Mitra. Query suggestions in the absence of query logs. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pages 795--804, 2011.Google ScholarDigital Library
F. Cai, M. De Rijke, et al. A survey of query auto completion in information retrieval. Foundations and Trends® in Information Retrieval, 10 (4): 273--363, 2016.Google ScholarDigital Library
H. Cao, D. Jiang, J. Pei, Q. He, Z. Liao, E. Chen, and H. Li. Context-aware query suggestion by mining click-through and session data. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 875--883, 2008.Google ScholarDigital Library
S. Chaudhuri and R. Kaushik. Extending autocompletion to tolerate errors. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, pages 707--718. ACM, 2009.Google ScholarDigital Library
G. Di Santo, R. McCreadie, C. Macdonald, and I. Ounis. Comparing approaches for query autocompletion. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 775--778. ACM, 2015.Google ScholarDigital Library
P. Elias. Efficient storage and retrieval by content and address of static files. Journal of the ACM, 21 (2): 246--260, 1974.Google ScholarDigital Library
R. M. Fano. On the number of bits required to implement an associative memory. Memorandum 61, Computer Structures Group, MIT, 1971.Google Scholar
J. Fischer and V. Heun. Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM Journal on Computing, 40 (2): 465--492, 2011.Google ScholarDigital Library
E. Fredkin. Trie memory. Communications of the ACM, 3 (9): 490--499, 1960.Google ScholarDigital Library
B.-J. P. Hsu and G. Ottaviano. Space-efficient data structures for top-k completion. In Proceedings of the 22nd international conference on World Wide Web, pages 583--594. ACM, 2013.Google ScholarDigital Library
M. Inc. Msn query log, https://www.microsoft.com/en-us/research/people/nickcr. 2006.Google Scholar
S. Ji, G. Li, C. Li, and J. Feng. Efficient interactive fuzzy keyword search. In Proceedings of the 18th international conference on World wide web, pages 371--380, 2009.Google ScholarDigital Library
R. Jones, B. Rey, O. Madani, and W. Greiner. Generating query substitutions. In Proceedings of the 15th international conference on World Wide Web, pages 387--396, 2006.Google ScholarDigital Library
U. Krishnan, A. Moffat, and J. Zobel. A taxonomy of query auto completion modes. In Proceedings of the 22nd Australasian Document Computing Symposium, page 6. ACM, 2017.Google Scholar
M. A. Martinez-Prieto, N. Brisaboa, R. Cánovas, F. Claude, and G. Navarro. Practical compressed string dictionaries. Information Systems, 56: 73--108, 2016.Google ScholarDigital Library
B. Mitra and N. Craswell. Query auto-completion for rare prefixes. In Proceedings of the 24th ACM international on conference on information and knowledge management, pages 1755--1758. ACM, 2015.Google ScholarDigital Library
B. Mitra, M. Shokouhi, F. Radlinski, and K. Hofmann. On user interactions with query auto-completion. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages 1055--1058. ACM, 2014.Google ScholarDigital Library
A. Moffat and L. Stuiver. Binary interpolative coding for effective index compression. Information Retrieval Journal, 3 (1): 25--47, 2000.Google ScholarDigital Library
A. Moffat and J. Zobel. Self-indexing inverted files for fast text retrieval. ACM Transactions on Information Systems (TOIS), 14 (4): 349--379, 1996.Google ScholarDigital Library
S. Muthukrishnan. Efficient algorithms for document retrieval problems. In Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pages 657--666. Society for Industrial and Applied Mathematics, 2002.Google ScholarDigital Library
G. Ottaviano and R. Venturini. Partitioned elias-fano indexes. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages 273--282. ACM, 2014.Google ScholarDigital Library
G. Pass, A. Chowdhury, and C. Torgeson. A picture of search. In International Conference on Scalable Information Systems, volume 152, page 1, 2006.Google ScholarDigital Library
G. E. Pibiri. On slicing sorted integer sequences. CoRR, abs/1907.01032, 2019. URL http://arxiv.org/abs/1907.01032.Google Scholar
G. E. Pibiri and R. Venturini. Efficient data structures for massive n-gram datasets. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 615--624. ACM, 2017.Google ScholarDigital Library
G. E. Pibiri and R. Venturini. Handling massive n-gram datasets efficiently. ACM Transactions on Information Systems (TOIS), 37 (2): 25, 2019 a .Google Scholar
G. E. Pibiri and R. Venturini. On optimally partitioning variable-byte codes. IEEE Transactions on Knowledge and Data Engineering, 2019 b.Google Scholar
]surveyG. E. Pibiri and R. Venturini. Techniques for inverted index compression. CoRR, abs/1908.10598, 2019 c. URL http://arxiv.org/abs/1908.10598.Google Scholar
G. E. Pibiri, M. Petri, and A. Moffat. Fast dictionary-based compression for inverted indexes. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, pages 6--14. ACM, 2019.Google ScholarDigital Library
J. Plaisance, N. Kurz, and D. Lemire. Vectorized VByte decoding. In International Symposium on Web Algorithms, 2015.Google Scholar
M. Shokouhi. Learning to personalize query auto-completion. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, pages 103--112. ACM, 2013.Google ScholarDigital Library
M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pages 601--610. ACM, 2012.Google ScholarDigital Library
J. Zhang, X. Long, and T. Suel. Performance of compressed inverted list caching in search engines. In International World Wide Web Conference (WWW), pages 387--396, 2008.Google ScholarDigital Library

Index Terms

Efficient and Effective Query Auto-Completion
1. Information systems
  1. Information retrieval

Recommendations

Query Auto-Completion for Rare Prefixes
CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

Query auto-completion (QAC) systems typically suggest queries that have previously been observed in search logs. Given a partial user query, the system looks up this query prefix against a precomputed set of candidates, then orders them using ranking ...
Read More
Learning to personalize query auto-completion
SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

Query auto-completion (QAC) is one of the most prominent features of modern search engines. The list of query candidates is generated according to the prefix entered by the user in the search box and is updated on each new key stroke. Query prefixes ...
Read More
Diversifying Query Auto-Completion

Query auto-completion assists web search users in formulating queries with a few keystrokes, helping them to avoid spelling mistakes and to produce clear query expressions, and so on. Previous work on query auto-completion mainly centers around ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2020
2548 pages
ISBN:9781450380164
DOI:10.1145/3397271
General Chairs:
Jimmy Huang
York University, Canada
,
Yi Chang
Jilin University, China
,
Xueqi Cheng
Chinese Academy of Sciences, China
,
Program Chairs:
Jaap Kamps
University of Amsterdam, Netherlands
,
Vanessa Murdock
Amazon, U.S.A.
,
Ji-Rong Wen
Renmin University of China, China
,
Yiqun Liu
Tsinghua University, China
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 July 2020
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
conjunctive-search
prefix-search
query auto-completion
top-k retrieval
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate792of3,983submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 5
  Total Citations
  View Citations
- 408
  Total Downloads
- Downloads (Last 12 months)66
- Downloads (Last 6 weeks)17
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Efficient and Effective Query Auto-Completion

SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

ABSTRACT

References

Cited By

Index Terms

Recommendations

Query Auto-Completion for Rare Prefixes

Learning to personalize query auto-completion

Diversifying Query Auto-Completion