On using Bayesian networks for complexity reduction in decision trees

  • Original Article
  • Statistical Methods and Applications

Abstract

In this paper we use the Bayesian network as a tool for exploratory analysis: its theory guarantees that, given the structure and some assumptions, the Markov blanket of a variable is the minimal conditioning set given which the variable is independent of all the others. We use the Markov blanket of a target variable to extract the relevant features for constructing a decision tree (DT). Our proposal reduces the complexity of the DT, yielding a simpler visualization that is more easily interpretable, while maintaining good classification performance.
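The selection step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the network structure is assumed to be already learned, and the helper name (`markov_blanket`) and toy edge list are hypothetical. In a DAG, the Markov blanket of a node consists of its parents, its children, and the children's other parents (spouses); only those variables would then be passed to the DT learner.

```python
def markov_blanket(edges, target):
    """Markov blanket of `target` in a DAG given as (parent, child) pairs:
    its parents, its children, and the children's other parents."""
    parents = {p for p, c in edges if c == target}
    children = {c for p, c in edges if p == target}
    spouses = {p for p, c in edges if c in children and p != target}
    return (parents | children | spouses) - {target}

# Toy structure (hypothetical): Age -> Smoking -> Cancer <- Pollution,
# with Cancer -> Xray and Cancer -> Dyspnoea.
edges = [("Smoking", "Cancer"), ("Pollution", "Cancer"),
         ("Cancer", "Xray"), ("Cancer", "Dyspnoea"), ("Age", "Smoking")]

print(sorted(markov_blanket(edges, "Cancer")))
# -> ['Dyspnoea', 'Pollution', 'Smoking', 'Xray']
```

Note that `Age` is excluded: it is screened off by `Smoking`, so under the faithfulness assumption it can be dropped without losing predictive information about the target, which is exactly how the DT's input dimension (and hence its complexity) is reduced.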



Author information

Correspondence to Adriana Brogini.


About this article

Cite this article

Brogini, A., Slanzi, D. On using Bayesian networks for complexity reduction in decision trees. Stat Methods Appl 19, 127–139 (2010). https://doi.org/10.1007/s10260-009-0116-1

