A uniform representation of multi-variant data in intensive-query databases

  • S.I. : ICACNI 2015
Innovations in Systems and Software Engineering

Abstract

This paper introduces a new approach to representing multi-variant data. Current approaches rely on either hard-coding techniques or conceptual/logical models to integrate structured and semi-structured data in customized, application-specific ways. The representation introduced here relies instead on an unfolding technique to represent multi-variant data uniformly. This leads to a framework with core functionalities for organizing structured and semi-structured data. The paper also presents an efficient methodology for retrieving data from the proposed storage, along with a comparative performance analysis against existing practices. The accuracy, precision, and recall of the proposed technique are evaluated quantitatively and reported.

Author information

Corresponding author

Correspondence to Nabendu Chaki.

About this article

Cite this article

Chakraborty, S., Cortesi, A. & Chaki, N. A uniform representation of multi-variant data in intensive-query databases. Innovations Syst Softw Eng 12, 163–176 (2016). https://doi.org/10.1007/s11334-016-0275-9

  • DOI: https://doi.org/10.1007/s11334-016-0275-9
