DOI: 10.1145/3404835.3463088
Short paper

Learning Early Exit Strategies for Additive Ranking Ensembles

Published: 11 July 2021

ABSTRACT

Modern search engine ranking pipelines are commonly based on large machine-learned ensembles of regression trees. We propose LEAR, a novel learned technique aimed at reducing the average number of trees traversed by documents to accumulate their scores, thus reducing the overall query response time. LEAR exploits a classifier that predicts whether a document can exit the ensemble early because it is unlikely to be ranked among the final top-k results. The early exit decision occurs at a sentinel point, i.e., after a limited number of trees has been evaluated, and the partial scores accumulated so far are exploited to filter out non-promising documents. We evaluate LEAR by deploying it in a production-like setting that adopts a state-of-the-art algorithm for ensemble traversal, and we provide a comprehensive experimental evaluation on two public datasets. The experiments show that LEAR significantly improves the efficiency of query processing without hindering ranking quality: on the first dataset, LEAR achieves a 3x speedup with no loss in NDCG@10, while on the second dataset the speedup exceeds 5x with a negligible NDCG@10 loss (< 0.05%).
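The sentinel mechanism described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: LEAR uses a learned classifier at the sentinel, which is approximated here by a simple partial-score threshold, and the function and parameter names (`score_with_early_exit`, `sentinel`, `threshold`) are hypothetical.

```python
def score_with_early_exit(docs, trees, sentinel, threshold):
    """Score documents with an additive tree ensemble, exiting early
    at a sentinel point for documents with non-promising partial scores.

    docs:      iterable of document identifiers
    trees:     list of callables, each mapping a document to a score
    sentinel:  number of trees evaluated before the exit decision
    threshold: stand-in for the learned exit classifier
    """
    partial = {d: 0.0 for d in docs}

    # Phase 1: evaluate the first `sentinel` trees for every document.
    for tree in trees[:sentinel]:
        for d in docs:
            partial[d] += tree(d)

    # Early exit: documents below the threshold keep their partial score
    # and skip the rest of the ensemble.
    survivors = [d for d in docs if partial[d] >= threshold]

    # Phase 2: evaluate the remaining trees only for the survivors.
    for tree in trees[sentinel:]:
        for d in survivors:
            partial[d] += tree(d)

    return partial
```

With ten identical toy trees, a sentinel after four trees, and a threshold of 0.5, `score_with_early_exit([1, 2, 3], [lambda d: d * 0.1] * 10, 4, 0.5)` fully scores documents 2 and 3 but stops document 1 at its partial score of 0.4, saving six tree traversals for it. The saving is the point of the scheme: cost drops in proportion to how many documents exit at the sentinel.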


Published in

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021, 2998 pages
ISBN: 9781450380379
DOI: 10.1145/3404835

Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions, 20%
