Abstract
This paper introduces a new approach to representing multi-variant data. Current approaches rely on either hard-core coding techniques or conceptual/logical models to integrate structured and semi-structured data in customized, application-specific ways. The representation introduced here relies instead on an unfolding technique to represent multi-variant data uniformly. This leads to a framework with core functionalities for organizing structured and semi-structured data. The paper also presents an efficient methodology for retrieving data from the proposed storage, along with a comparative performance analysis against existing practices. The accuracy, precision, and recall of the proposed technique are quantitatively evaluated and reported.
Cite this article
Chakraborty, S., Cortesi, A. & Chaki, N. A uniform representation of multi-variant data in intensive-query databases. Innovations Syst Softw Eng 12, 163–176 (2016). https://doi.org/10.1007/s11334-016-0275-9
DOI: https://doi.org/10.1007/s11334-016-0275-9