Several Computational Studies About Variable Selection for Probabilistic Bayesian Classifiers

Brogini, Adriana; Slanzi, Debora

doi:10.1007/978-3-642-03739-9_23

Conference paper
First Online: 25 November 2009

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

A Bayesian network can be regarded as a probabilistic classifier that also gives clear insight into the structural relationships in the domain under investigation. In this paper we apply several feature subset selection methods to determine the relevant variables, which are then used to construct the Bayesian network. To test how the chosen feature selection methods affect classification, we consider several Bayesian classifiers: Naïve Bayes, Tree Augmented Naïve Bayes, and the general Bayesian network, which serves as the benchmark for the comparison.
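
The pipeline outlined in the abstract (select a subset of variables, then fit a Bayesian classifier on that subset) can be illustrated with a minimal sketch. The abstract does not say which selection methods the authors use, so the mutual-information filter below is an assumption chosen for illustration, not the paper's method. The names `mutual_information`, `select_features`, and `NaiveBayes`, as well as the toy data, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_features(rows, labels, k):
    """Filter step (assumed): keep the k features sharing most information with the class."""
    n_features = len(rows[0])
    scores = [
        mutual_information([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


class NaiveBayes:
    """Categorical Naive Bayes with Laplace smoothing, restricted to chosen features."""

    def __init__(self, features):
        self.features = features

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n = len(labels)
        self.counts = defaultdict(Counter)  # (feature, class) -> value counts
        self.values = defaultdict(set)      # feature -> values seen in training
        for row, y in zip(rows, labels):
            for j in self.features:
                self.counts[(j, y)][row[j]] += 1
                self.values[j].add(row[j])
        return self

    def predict(self, row):
        def log_posterior(y):
            # log P(y) + sum_j log P(x_j | y), with add-one smoothing
            lp = math.log(self.class_counts[y] / self.n)
            for j in self.features:
                num = self.counts[(j, y)][row[j]] + 1
                den = self.class_counts[y] + len(self.values[j])
                lp += math.log(num / den)
            return lp

        return max(self.class_counts, key=log_posterior)


# Toy data (hypothetical): feature 0 tracks the class, feature 1 is noisy,
# feature 2 is constant and therefore carries no information.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 1, 1)]
labels = [0, 0, 1, 1, 0, 1]

selected = select_features(rows, labels, k=1)
model = NaiveBayes(selected).fit(rows, labels)
print(selected, model.predict((1, 0, 1)))
```

A filter of this kind scores each variable independently of the classifier. A wrapper approach would instead evaluate candidate subsets by the accuracy of the classifier itself, at a higher computational cost.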

Author information

Authors and Affiliations

Department of Statistics, University of Padova, via Cesare Battisti 241, 35121, Padova, Italy
Adriana Brogini

Authors

Adriana Brogini
Debora Slanzi

Corresponding author

Correspondence to Adriana Brogini.

Editor information

Editors and Affiliations

Fac. Economia, Università Macerata, Via Crescimbeni 20, Macerata, 62100, Italy
Francesco Palumbo
Dipto. Matematica e Statistica, Università Federico II di Napoli, Via Cinthia (Monte S. Angelo), Napoli, 80126, Italy
Carlo Natale Lauro
Depto. Economía y Empresa, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, Barcelona, 08005, Spain
Michael J. Greenacre

About this paper

Cite this paper

Brogini, A., Slanzi, D. (2010). Several Computational Studies About Variable Selection for Probabilistic Bayesian Classifiers. In: Palumbo, F., Lauro, C., Greenacre, M. (eds) Data Analysis and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03739-9_23

DOI: https://doi.org/10.1007/978-3-642-03739-9_23
Published: 25 November 2009
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03738-2
Online ISBN: 978-3-642-03739-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
