ABSTRACT
This full-day tutorial weaves together diverse strands of modern Learning to Rank (LtR) research and presents them in a unified form. First, we will introduce the fundamentals of LtR and give an overview of its various sub-fields. Then, we will discuss recent advances in gradient boosting methods such as LambdaMART, focusing on their efficiency/effectiveness trade-offs and optimizations. Subsequently, we will present TF-Ranking, a new open source TensorFlow package for neural LtR models, and show how it can be used for modeling sparse textual features. Finally, we will conclude by covering unbiased LtR, a new research field that aims to learn from biased implicit user feedback. The tutorial will consist of three two-hour sessions, each focusing on one of the topics described above. It will provide a mix of theoretical and hands-on sessions, and should benefit both academics interested in learning more about the current state of the art in LtR and practitioners who want to use LtR techniques in their applications.