Abstract
Food recognition is an interesting and challenging problem with applications in medical, social and anthropological research. The high variability of food images makes the recognition task difficult for current state-of-the-art methods. Previous work has shown that exploiting multiple features, which capture complementary aspects of the image content, improves discrimination among food items. In this paper we exploit an image representation based on the consensus among visual vocabularies built on different feature spaces. Starting from a set of visual codebooks, a consensus clustering technique is used to build a consensus vocabulary, which is then used to represent food pictures under the Bag-of-Visual-Words paradigm. This new representation is used together with an SVM for recognition.
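The idea of a consensus vocabulary can be illustrated with a minimal sketch. It assumes that every image patch has already been assigned a visual word in each of several codebooks (one label list per feature space), and it merges those assignments with a simple co-association rule: two patches share a consensus word when they fall in the same cluster in more than half of the codebooks. This rule, the threshold, and the names `co_association`, `consensus_labels` and `bovw_histogram` are illustrative assumptions, not necessarily the consensus clustering technique used in the paper.

```python
from collections import Counter


def co_association(partitions):
    """Fraction of codebooks in which each pair of patches shares a visual word."""
    n = len(partitions[0])
    m = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_labels(partitions, threshold=0.5):
    """Assign consensus words as connected components of the thresholded
    co-association graph (assumed rule, chosen for simplicity)."""
    coassoc = co_association(partitions)
    n = len(coassoc)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


def bovw_histogram(word_ids, vocab_size):
    """Normalized Bag-of-Visual-Words histogram for the patches of one image."""
    counts = Counter(word_ids)
    total = len(word_ids)
    return [counts[w] / total for w in range(vocab_size)]


# Toy data: six patches, labelled by three hypothetical codebooks.
codebook_a = [0, 0, 1, 1, 2, 2]
codebook_b = [1, 1, 0, 0, 0, 2]  # disagrees with the others on patch 4
codebook_c = [2, 2, 0, 0, 1, 1]

words = consensus_labels([codebook_a, codebook_b, codebook_c])
vocab_size = max(words) + 1

# An image made of patches 0, 1 and 2 is described by its consensus words.
image_histogram = bovw_histogram([words[i] for i in (0, 1, 2)], vocab_size)
```

In a full pipeline, histograms like `image_histogram` would be the feature vectors passed to the SVM. The sketch only labels patches seen while building the vocabulary; assigning a consensus word to a new patch would need an extra step that is not shown here.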
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Farinella, G.M., Moltisanti, M., Battiato, S. (2015). Food Recognition Using Consensus Vocabularies. In: Murino, V., Puppo, E., Sona, D., Cristani, M., Sansone, C. (eds) New Trends in Image Analysis and Processing -- ICIAP 2015 Workshops. ICIAP 2015. Lecture Notes in Computer Science, vol 9281. Springer, Cham. https://doi.org/10.1007/978-3-319-23222-5_47
DOI: https://doi.org/10.1007/978-3-319-23222-5_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23221-8
Online ISBN: 978-3-319-23222-5
eBook Packages: Computer Science, Computer Science (R0)