Abstract
In this paper we use the Bayesian network as a tool of explorative analysis: its theory guarantees that, given the structure and some assumptions, the Markov blanket of a variable is the minimal conditioning set through which the variable is independent from all the others. We use the Markov blanket of a target variable to extract the relevant features for constructing a decision tree (DT). Our proposal reduces the complexity of the DT so it has a simpler visualization and it can be more easily interpretable. On the other hand, it maintains a good classification performance.
Similar content being viewed by others
References
Acid S, de Campos L (2003) Searching for Bayesian network structures in the space of restricted acyclic partially directed graphs. J Artif Intell Res 18: 445–490
Aliferis CF, Tsamardinos I, Statnikov A (2003) HITON: a novel Markov Blanket algorithm for optimal variable selection. In: Proceedings of the 2003 American Medical Informatics Association (AMIA) annual symposium, pp 21–25
Bouckaert RR (1995) Bayesian belief networks: from construction to inference. PhD Thesis, University of Utrecht
Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth International Group, Belmont
Breslow LA, Aha DW (1997) Simplifying decision trees: a survey. Knowl Eng Rev 12(1): 1–40
Buntine W (1991) Theory refinement on Bayesian networks. In: Proceeding of the seventh conference on uncertainty in artificial intelligence, pp 52–60
Chickering DM (1996) Learning Bayesian networks is NP-complete. In: Fisher D, Lenz HJ (eds) Learning from data: artificial intelligence and statistics. Springer, New York
Chow CK, Liu CN (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inf Theory 14(3): 462–467
Cooper G, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9: 309–347
Cowell RG, Dawid AP, Lauritzen SL, Spiegelhalter DJ (1999) Probabilistic networks and expert systems. Springer, New York
Frey L, Fisher D, Tsamardinos I, Aliferis CF, Statnikov A (2003) Identifying Markov Blankets with decision tree induction. In: Proceedings of third IEEE international conference on data mining (ICDM), Melbourne, pp 59–66
Friedman N, Goldszmidt M (1996) Learning Bayesian networks with local structures. In: Proceedings of the twelfth conference on uncertainty in artificial intelligence, pp 252–262
Glymour C, Cooper GF (1999) Computation, causation and discovery. MIT Press, Cambridge
Hartemink AJ, Gifford DK, Jaakkola TS, Young RA (2002) Combining location and expression data for principled discovery of genetic regulatory network models. In: Pacific symposium on biocomputing, pp 437–449
Heckerman D, Geiger D, Chickering DM (1995) Learning Bayesian networks: the combinations of knowledge and statistical data. Mach Learn 20: 197–243
Herskovits E, Cooper GF (1990) Kutato: an entropy-driven system for the construction of probabilistic expert systems from databases. In: Proceedings of the sixth conference on uncertainty in artificial intelligence, pp 54–62
Hornik K, Zeileis A, Hothorn T, Buchta C (2007) RWeka: an R Interface to Weka. R package version 0.3-2
Jensen FV (2001) Bayesian networks and decision graphs. Springer, New York
John GH, Kohavi R, Pleger K (1994) Irrelevant features and the subset selection problem. In: Proceedings of the eleventh international machine learning conference, pp 121–129
Lam W, Bacchus F (1994) Learning Bayesian belief networks. An approach based on the MDL principle. Comput Intell 10(4): 269–293
Liu H, Motoda H (2008) Computational methods of feature selection. Chapman & Hall/CRC, Taylor and Francis Group LLC, London
Madden MG (2003) The performance of Bayesian network classifiers constructed using different techniques. In: Working notes of the ECML PkDD-03 Workshop, pp 59–70
Margaritis D, Thrun S (1999) Bayesian network induction via local neighborhoods. In: Solla S, Leen T, Müller KR (eds) Proceedings of conference on neural information processing systems (NIPS-12). MIT Press, Cambridge
Meek C (1995) Strong completeness and faithfulness in Bayesian networks. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence, pp 403–410
Mitchell T (1997) Machine learning. Mc Graw-Hill, New York
Pearl J (1988) Probabilistic reasoning in intelligence systems. Morgan Kaufmann, Los Altos
Quinlan JR (1986) Induction of decision trees. Mach Learn 1: 81–106
Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, Los Altos
Ripley B (2007) The tree package. R package version 1.0-26
Saeys Y, Inza I, Larrañaga P (2007) A review of feature selection techniques in bioinformatics. Bioinformatics 23(19): 2507–2517
Schauerhuber M, Zeileis A, Meyer D, Hornik K (2008) Benchmarking open-source tree learners in R/RWeka. Data analysis, machine learning and applications. In: Proceedings of the 31st annual conference of the Gesellschaft für Klassification
Tsamardinos I, Aliferis C, Statnikov A (2003) Algorithms for large scale markov blanket discovery. In: The sixteenth international flairs conference, St. Augustine, USA
Witten I, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Brogini, A., Slanzi, D. On using Bayesian networks for complexity reduction in decision trees. Stat Methods Appl 19, 127–139 (2010). https://doi.org/10.1007/s10260-009-0116-1
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10260-009-0116-1