ABSTRACT
Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, hence bound to be intractable on at least some inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates the training of such restricted model classes that are amenable to efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds, using standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting.
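To give an intuition for why the large-spread restriction can make verification tractable, the sketch below checks the robustness of a majority-voted binary tree ensemble against an L-infinity attacker with budget eps, under the assumption (which the large-spread condition is meant to guarantee) that each tree can be attacked independently of the others. All names and data structures are illustrative rather than the paper's actual implementation, and the per-tree reachability check conservatively over-approximates when the same feature appears more than once on a root-to-leaf path.

```python
# Minimal sketch, assuming per-tree independence of attacks (large-spread
# intuition). Hypothetical data structures; not the paper's algorithm.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    feature: Optional[int] = None      # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None      # taken when x[feature] <= threshold
    right: Optional["Node"] = None     # taken when x[feature] >  threshold
    label: int = 0                     # class in {0, 1}, used at leaves

def reachable_labels(node: Node, x: List[float], eps: float) -> set:
    """Labels the attacker can force a single tree to output when each
    feature may be perturbed by at most eps (L-infinity). Each node is
    visited once, so the cost is linear in the tree size."""
    if node.feature is None:
        return {node.label}
    labels = set()
    v = x[node.feature]
    if v - eps <= node.threshold:      # left branch is reachable
        labels |= reachable_labels(node.left, x, eps)
    if v + eps > node.threshold:       # right branch is reachable
        labels |= reachable_labels(node.right, x, eps)
    return labels

def is_robust(trees: List[Node], x: List[float], y: int, eps: float) -> bool:
    """Robust iff, even when every tree that can be pushed to the wrong
    class actually is, the correct class still wins the majority vote.
    Counting trees independently is exactly what the large-spread
    assumption is taken to justify here."""
    wrong = 1 - y
    attackable = sum(1 for t in trees if wrong in reachable_labels(t, x, eps))
    return 2 * attackable < len(trees)

# Tiny usage example on a single decision stump.
stump = Node(feature=0, threshold=0.5,
             left=Node(label=0), right=Node(label=1))
print(is_robust([stump], x=[0.9], y=1, eps=0.1))   # True: 0.9 - 0.1 > 0.5
```

The point of the sketch is that the whole check is a single linear pass over each tree, so its cost grows polynomially with the ensemble size rather than requiring an exhaustive search over combinations of perturbations across trees.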