skip to main content
10.1145/3448016.3457304acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections

EIRES: Efficient Integration of Remote Data in Event Stream Processing

Published:18 June 2021Publication History

ABSTRACT

To support reactive and predictive applications, complex event processing (CEP) systems detect patterns in event streams based on predefined queries. To determine the events that constitute a query match, their payload data may need to be assessed together with data from remote sources. Such dependencies are problematic, since waiting for remote data to be fetched interrupts the processing of the stream. Yet, without event selection based on remote data, the query state to maintain may grow exponentially. In either case, the performance of the CEP system degrades drastically.

To tackle these issues, we present EIRES, a framework for efficient integration of static data from remote sources in CEP. It employs a cost-model to determine when to fetch certain remote data elements and how long to keep them in a cache for future use. EIRES combines strategies for (i) prefetching that queries remote data based on anticipated use and (ii) lazy evaluation that postpones the event selection based on remote data without interrupting the stream processing. Our experiments indicate that the combination of these strategies improves the latency of query evaluation by up to 3,725x for synthetic data and 47x for real-world data.

Skip Supplemental Material Section

Supplemental Material

3448016.3457304.mp4

mp4

204.2 MB

References

  1. N. Giatrakos, E. Alevizos, A. Artikis, A. Deligiannakis, and M. N. Garofalakis, "Complex event recognition in the big data era: a survey," VLDB J., vol. 29, no. 1, pp. 313--352, 2020. [Online]. Available: https://doi.org/10.1007/s00778-019-00557-wGoogle ScholarGoogle ScholarDigital LibraryDigital Library
  2. E. Wu, Y. Diao, and S. Rizvi, "High-performance complex event processing over streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, June 27--29, 2006, S. Chaudhuri, V. Hristidis, and N. Polyzotis, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2006, pp. 407--418. [Online]. Available: http://doi.acm.org/10.1145/1142473.1142520Google ScholarGoogle Scholar
  3. J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10--12, 2008, J. T. Wang, Ed.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2008, pp. 147--160. [Online]. Available: http://doi.acm.org/10.1145/1376616.1376634Google ScholarGoogle Scholar
  4. R. C. Fernandez, M. Weidlich, P. R. Pietzuch, and A. Gal, "Scalable stateful stream processing for smart grids," in The 8th ACM International Conference on Distributed Event-Based Systems, DEBS '14, Mumbai, India, May 26--29, 2014, U. Bellur and R. Kothari, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2014, pp. 276--281. [Online]. Available: http://doi.acm.org/10.1145/2611286.2611326Google ScholarGoogle Scholar
  5. Y. Mei and S. Madden, "Zstream: a cost-based query processor for adaptively detecting composite events," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29 - July 2, 2009, U. cC etintemel, S. B. Zdonik, D. Kossmann, and N. Tatbul, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2009, pp. 193--206. [Online]. Available: http://doi.acm.org/10.1145/1559845.1559867Google ScholarGoogle Scholar
  6. H. Zhang, Y. Diao, and N. Immerman, "On complexity and optimization of expensive queries in complex event processing," in International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22--27, 2014, C. E. Dyreson, F. Li, and M. T. Ö zsu, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2014, pp. 217--228. [Online]. Available: http://doi.acm.org/10.1145/2588555.2593671Google ScholarGoogle Scholar
  7. M. Ray, C. Lei, and E. A. Rundensteiner, "Scalable pattern sharing on event streams," in Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, F. Ö zcan, G. Koutrika, and S. Madden, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2016, pp. 495--510. [Online]. Available: https://doi.org/10.1145/2882903.2882947Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. Ding, S. Chen, E. A. Rundensteiner, J. Tatemura, W. Hsiung, and K. S. Candan, "Runtime semantic query optimization for event stream processing," in Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7--12, 2008, Cancú n, Mexico, G. Alonso, J. A. Blakeley, and A. L. P. Chen, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2008, pp. 676--685. [Online]. Available: https://doi.org/10.1109/ICDE.2008.4497476Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Weidlich, H. Ziekow, A. Gal, J. Mendling, and M. Weske, "Optimizing event pattern matching using business process models," IEEE Trans. Knowl. Data Eng., vol. 26, no. 11, pp. 2759--2773, 2014. [Online]. Available: https://doi.org/10.1109/TKDE.2014.2302306Google ScholarGoogle ScholarCross RefCross Ref
  10. B. Zhao, N. Q. V. Hung, and M. Weidlich, "Load shedding for complex event processing: Input-based and state-based techniques," in 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, TX, USA, April 20--24, 2020. hskip 1em plus 0.5em minus 0.4emrelax IEEE, 2020, pp. 1093--1104. [Online]. Available: https://doi.org/10.1109/ICDE48307.2020.00099Google ScholarGoogle Scholar
  11. Y. He, S. Barman, and J. F. Naughton, "On load shedding in complex event processing," in Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24--28, 2014., 2014, pp. 213--224. [Online]. Available: https://doi.org/10.5441/002/icdt.2014.23Google ScholarGoogle Scholar
  12. A. Kundu, S. Panigrahi, S. Sural, and A. K. Majumdar, "Blast-ssaha hybridization for credit card fraud detection," IEEE transactions on dependable and Secure Computing, vol. 6, no. 4, pp. 309--315, 2009.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. feedzai.com, "Modern Payment Fraud Prevention at Big Data Scale," https://feedzai.com/wp-content/uploads/2015/10/Feedzai-Whitepaper-Modern-Payment-Fraud-Prevention-at-Big-Data-Scale.pdf, 2013, last access: 01/02/21.Google ScholarGoogle Scholar
  14. A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, and J. Widom, "STREAM: the stanford data stream management system," in Data Stream Management - Processing High-Speed Data Streams, ser. Data-Centric Systems and Applications, M. N. Garofalakis, J. Gehrke, and R. Rastogi, Eds.hskip 1em plus 0.5em minus 0.4emrelax Springer, 2016, pp. 317--336. [Online]. Available: https://doi.org/10.1007/978--3--540--28608-0_16Google ScholarGoogle Scholar
  15. G. Cugola and A. Margara, "Processing flows of information: From data stream to complex event processing," ACM Comput. Surv., vol. 44, no. 3, pp. 15:1--15:62, 2012. [Online]. Available: http://doi.acm.org/10.1145/2187671.2187677Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. L. Brenna, A. J. Demers, J. Gehrke, M. Hong, J. Ossher, B. Panda, M. Riedewald, M. Thatte, and W. M. White, "Cayuga: a high-performance event processing engine," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12--14, 2007, 2007, pp. 1100--1102. [Online]. Available: http://doi.acm.org/10.1145/1247480.1247620Google ScholarGoogle Scholar
  17. G. Cugola and A. Margara, "Complex event processing with T-REX," J. Syst. Softw., vol. 85, no. 8, pp. 1709--1728, 2012. [Online]. Available: https://doi.org/10.1016/j.jss.2012.03.056Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. Adi and O. Etzion, "Amit - the situation manager," VLDB J., vol. 13, no. 2, pp. 177--203, 2004. [Online]. Available: https://doi.org/10.1007/s00778-003-0108-yGoogle ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Chakravarthy and D. Mishra, "Snoop: An expressive event specification language for active databases," Data Knowl. Eng., vol. 14, no. 1, pp. 1--26, 1994. [Online]. Available: https://doi.org/10.1016/0169-023X(94)90006-XGoogle ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Kingman, Poisson Processes, ser. Oxford Studies in Probability.hskip 1em plus 0.5em minus 0.4emrelax Clarendon Press, 1992. [Online]. Available: https://books.google.de/books?id=VEiM-OtwDHkCGoogle ScholarGoogle Scholar
  21. M. Akdere, U. cC etintemel, and N. Tatbul, "Plan-based complex event detection across distributed sources," Proc. VLDB Endow., vol. 1, no. 1, pp. 66--77, 2008. [Online]. Available: http://www.vldb.org/pvldb/vol1/1453869.pdfGoogle ScholarGoogle ScholarDigital LibraryDigital Library
  22. H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack problems. hskip 1em plus 0.5em minus 0.4emrelax Springer, 2004.Google ScholarGoogle Scholar
  23. C. Chekuri and S. Khanna, "A polynomial time approximation scheme for the multiple knapsack problem," SIAM J. Comput., vol. 35, no. 3, pp. 713--728, 2005. [Online]. Available: https://doi.org/10.1137/S0097539700382820Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. NOAA, "GOES-16: A GAME-CHANGER FOR FIGHTING DEADLY WILDFIRES," https://www.goes-r.gov/featureStories/goes16Wildfires.html, 2018, last access: 01/02/21.Google ScholarGoogle Scholar
  25. NOAA, "GOES-R Fire Detection and Characterization," https://www.goes-r.gov/education/docs/fs_fire.pdf, 2019, last access: 01/02/21.Google ScholarGoogle Scholar
  26. C. Reiss, J. Wilkes, and J. L. Hellerstein, "Google cluster-usage traces: format+schema," Google Inc., Mountain View, CA, USA, Technical Report, Nov. 2011, revised 2014--11--17 for version 2.1. Posted at https://github.com/google/cluster-data.Google ScholarGoogle Scholar
  27. S. P. Vanderwiel and D. J. Lilja, "Data prefetch mechanisms," ACM Comput. Surv., vol. 32, no. 2, pp. 174--199, 2000. [Online]. Available: https://doi.org/10.1145/358923.358939Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. S. Mittal, "A survey of recent prefetching techniques for processor caches," ACM Comput. Surv., vol. 49, no. 2, pp. 35:1--35:35, 2016. [Online]. Available: https://doi.org/10.1145/2907071Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. A. J. Smith, "Sequentiality and prefetching in database systems," ACM Trans. Database Syst., vol. 3, no. 3, pp. 223--247, 1978. [Online]. Available: https://doi.org/10.1145/320263.320276Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. H. Wedekind and G. Zö rntlein, "Prefetching in realtime database applications," in Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 28--30, 1986, C. Zaniolo, Ed.hskip 1em plus 0.5em minus 0.4emrelax ACM Press, 1986, pp. 215--226. [Online]. Available: https://doi.org/10.1145/16894.16876Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry, "Improving hash join performance through prefetching," in Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA, Z. M. Ö zsoyoglu and S. B. Zdonik, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2004, pp. 116--127. [Online]. Available: https://doi.org/10.1109/ICDE.2004.1319989Google ScholarGoogle Scholar
  32. M. Annavaram, J. M. Patel, and E. S. Davidson, "Call graph prefetching for database applications," ACM Trans. Comput. Syst., vol. 21, no. 4, pp. 412--444, 2003. [Online]. Available: https://doi.org/10.1145/945506.945509Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Y. O. Kocc berber, B. Falsafi, and B. Grot, "Asynchronous memory access chaining," Proc. VLDB Endow., vol. 9, no. 4, pp. 252--263, 2015. [Online]. Available: http://www.vldb.org/pvldb/vol9/p252-kocberber.pdfGoogle ScholarGoogle ScholarDigital LibraryDigital Library
  34. I. T. Bowman and K. Salem, "Optimization of query streams using semantic prefetching," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13--18, 2004, G. Weikum, A. C. Kö nig, and S. Deßloch, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2004, pp. 179--190. [Online]. Available: https://doi.org/10.1145/1007568.1007591Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. L. Battle, R. Chang, and M. Stonebraker, "Dynamic prefetching of data tiles for interactive visualization," in Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, F. Özcan, G. Koutrika, and S. Madden, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2016, pp. 1363--1375. [Online]. Available: https://doi.org/10.1145/2882903.2882919Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. D. Kossmann, "The state of the art in distributed query processing," ACM Comput. Surv., vol. 32, no. 4, pp. 422--469, 2000. [Online]. Available: https://doi.org/10.1145/371578.371598Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. K. Amiri, S. Park, R. Tewari, and S. Padmanabhan, "Dbproxy: A dynamic data cache for web applications," in Proceedings of the 19th International Conference on Data Engineering, March 5--8, 2003, Bangalore, India, U. Dayal, K. Ramamritham, and T. M. Vijayaraman, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2003, pp. 821--831. [Online]. Available: https://doi.org/10.1109/ICDE.2003.1260881Google ScholarGoogle Scholar
  38. C. Bornhö vd, M. Altinel, C. Mohan, H. Pirahesh, and B. Reinwald, "Adaptive database caching with dbcache," IEEE Data Eng. Bull., vol. 27, no. 2, pp. 11--18, 2004. [Online]. Available: ftp://ftp.research.microsoft.com/pub/debull/A04june/bornhoevd.psGoogle ScholarGoogle Scholar
  39. P. Larson, J. Goldstein, and J. Zhou, "Mtcache: Transparent mid-tier database caching in SQL server," in Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA, Z. M. Özsoyoglu and S. B. Zdonik, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2004, pp. 177--188. [Online]. Available: https://doi.org/10.1109/ICDE.2004.1319994Google ScholarGoogle Scholar
  40. S. Podlipnig and L. Böszörményi, "A survey of web cache replacement strategies," ACM Comput. Surv., vol. 35, no. 4, pp. 374--398, 2003. [Online]. Available: https://doi.org/10.1145/954339.954341Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. J. Mertz and I. Nunes, "Understanding application-level caching in web applications: A comprehensive introduction and survey of state-of-the-art approaches," ACM Comput. Surv., vol. 50, no. 6, pp. 98:1--98:34, 2018. [Online]. Available: https://doi.org/10.1145/3145813Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. EIRES: Efficient Integration of Remote Data in Event Stream Processing

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
      June 2021
      2969 pages
      ISBN:9781450383431
      DOI:10.1145/3448016

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 18 June 2021

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate785of4,003submissions,20%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader