ABSTRACT
To support reactive and predictive applications, complex event processing (CEP) systems detect patterns in event streams based on predefined queries. To determine the events that constitute a query match, their payload data may need to be assessed together with data from remote sources. Such dependencies are problematic, since waiting for remote data to be fetched interrupts the processing of the stream. Yet, without event selection based on remote data, the query state to maintain may grow exponentially. In either case, the performance of the CEP system degrades drastically.
To tackle these issues, we present EIRES, a framework for efficient integration of static data from remote sources in CEP. It employs a cost-model to determine when to fetch certain remote data elements and how long to keep them in a cache for future use. EIRES combines strategies for (i) prefetching that queries remote data based on anticipated use and (ii) lazy evaluation that postpones the event selection based on remote data without interrupting the stream processing. Our experiments indicate that the combination of these strategies improves the latency of query evaluation by up to 3,725x for synthetic data and 47x for real-world data.
Supplemental Material
Available for Download
- N. Giatrakos, E. Alevizos, A. Artikis, A. Deligiannakis, and M. N. Garofalakis, "Complex event recognition in the big data era: a survey," VLDB J., vol. 29, no. 1, pp. 313--352, 2020. [Online]. Available: https://doi.org/10.1007/s00778-019-00557-wGoogle ScholarDigital Library
- E. Wu, Y. Diao, and S. Rizvi, "High-performance complex event processing over streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, USA, June 27--29, 2006, S. Chaudhuri, V. Hristidis, and N. Polyzotis, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2006, pp. 407--418. [Online]. Available: http://doi.acm.org/10.1145/1142473.1142520Google Scholar
- J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, Canada, June 10--12, 2008, J. T. Wang, Ed.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2008, pp. 147--160. [Online]. Available: http://doi.acm.org/10.1145/1376616.1376634Google Scholar
- R. C. Fernandez, M. Weidlich, P. R. Pietzuch, and A. Gal, "Scalable stateful stream processing for smart grids," in The 8th ACM International Conference on Distributed Event-Based Systems, DEBS '14, Mumbai, India, May 26--29, 2014, U. Bellur and R. Kothari, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2014, pp. 276--281. [Online]. Available: http://doi.acm.org/10.1145/2611286.2611326Google Scholar
- Y. Mei and S. Madden, "Zstream: a cost-based query processor for adaptively detecting composite events," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29 - July 2, 2009, U. cC etintemel, S. B. Zdonik, D. Kossmann, and N. Tatbul, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2009, pp. 193--206. [Online]. Available: http://doi.acm.org/10.1145/1559845.1559867Google Scholar
- H. Zhang, Y. Diao, and N. Immerman, "On complexity and optimization of expensive queries in complex event processing," in International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22--27, 2014, C. E. Dyreson, F. Li, and M. T. Ö zsu, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2014, pp. 217--228. [Online]. Available: http://doi.acm.org/10.1145/2588555.2593671Google Scholar
- M. Ray, C. Lei, and E. A. Rundensteiner, "Scalable pattern sharing on event streams," in Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, F. Ö zcan, G. Koutrika, and S. Madden, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2016, pp. 495--510. [Online]. Available: https://doi.org/10.1145/2882903.2882947Google ScholarDigital Library
- L. Ding, S. Chen, E. A. Rundensteiner, J. Tatemura, W. Hsiung, and K. S. Candan, "Runtime semantic query optimization for event stream processing," in Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7--12, 2008, Cancú n, Mexico, G. Alonso, J. A. Blakeley, and A. L. P. Chen, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2008, pp. 676--685. [Online]. Available: https://doi.org/10.1109/ICDE.2008.4497476Google ScholarDigital Library
- M. Weidlich, H. Ziekow, A. Gal, J. Mendling, and M. Weske, "Optimizing event pattern matching using business process models," IEEE Trans. Knowl. Data Eng., vol. 26, no. 11, pp. 2759--2773, 2014. [Online]. Available: https://doi.org/10.1109/TKDE.2014.2302306Google ScholarCross Ref
- B. Zhao, N. Q. V. Hung, and M. Weidlich, "Load shedding for complex event processing: Input-based and state-based techniques," in 36th IEEE International Conference on Data Engineering, ICDE 2020, Dallas, TX, USA, April 20--24, 2020. hskip 1em plus 0.5em minus 0.4emrelax IEEE, 2020, pp. 1093--1104. [Online]. Available: https://doi.org/10.1109/ICDE48307.2020.00099Google Scholar
- Y. He, S. Barman, and J. F. Naughton, "On load shedding in complex event processing," in Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24--28, 2014., 2014, pp. 213--224. [Online]. Available: https://doi.org/10.5441/002/icdt.2014.23Google Scholar
- A. Kundu, S. Panigrahi, S. Sural, and A. K. Majumdar, "Blast-ssaha hybridization for credit card fraud detection," IEEE transactions on dependable and Secure Computing, vol. 6, no. 4, pp. 309--315, 2009.Google ScholarDigital Library
- feedzai.com, "Modern Payment Fraud Prevention at Big Data Scale," https://feedzai.com/wp-content/uploads/2015/10/Feedzai-Whitepaper-Modern-Payment-Fraud-Prevention-at-Big-Data-Scale.pdf, 2013, last access: 01/02/21.Google Scholar
- A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, and J. Widom, "STREAM: the stanford data stream management system," in Data Stream Management - Processing High-Speed Data Streams, ser. Data-Centric Systems and Applications, M. N. Garofalakis, J. Gehrke, and R. Rastogi, Eds.hskip 1em plus 0.5em minus 0.4emrelax Springer, 2016, pp. 317--336. [Online]. Available: https://doi.org/10.1007/978--3--540--28608-0_16Google Scholar
- G. Cugola and A. Margara, "Processing flows of information: From data stream to complex event processing," ACM Comput. Surv., vol. 44, no. 3, pp. 15:1--15:62, 2012. [Online]. Available: http://doi.acm.org/10.1145/2187671.2187677Google ScholarDigital Library
- L. Brenna, A. J. Demers, J. Gehrke, M. Hong, J. Ossher, B. Panda, M. Riedewald, M. Thatte, and W. M. White, "Cayuga: a high-performance event processing engine," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, China, June 12--14, 2007, 2007, pp. 1100--1102. [Online]. Available: http://doi.acm.org/10.1145/1247480.1247620Google Scholar
- G. Cugola and A. Margara, "Complex event processing with T-REX," J. Syst. Softw., vol. 85, no. 8, pp. 1709--1728, 2012. [Online]. Available: https://doi.org/10.1016/j.jss.2012.03.056Google ScholarDigital Library
- A. Adi and O. Etzion, "Amit - the situation manager," VLDB J., vol. 13, no. 2, pp. 177--203, 2004. [Online]. Available: https://doi.org/10.1007/s00778-003-0108-yGoogle ScholarDigital Library
- S. Chakravarthy and D. Mishra, "Snoop: An expressive event specification language for active databases," Data Knowl. Eng., vol. 14, no. 1, pp. 1--26, 1994. [Online]. Available: https://doi.org/10.1016/0169-023X(94)90006-XGoogle ScholarDigital Library
- J. Kingman, Poisson Processes, ser. Oxford Studies in Probability.hskip 1em plus 0.5em minus 0.4emrelax Clarendon Press, 1992. [Online]. Available: https://books.google.de/books?id=VEiM-OtwDHkCGoogle Scholar
- M. Akdere, U. cC etintemel, and N. Tatbul, "Plan-based complex event detection across distributed sources," Proc. VLDB Endow., vol. 1, no. 1, pp. 66--77, 2008. [Online]. Available: http://www.vldb.org/pvldb/vol1/1453869.pdfGoogle ScholarDigital Library
- H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack problems. hskip 1em plus 0.5em minus 0.4emrelax Springer, 2004.Google Scholar
- C. Chekuri and S. Khanna, "A polynomial time approximation scheme for the multiple knapsack problem," SIAM J. Comput., vol. 35, no. 3, pp. 713--728, 2005. [Online]. Available: https://doi.org/10.1137/S0097539700382820Google ScholarDigital Library
- NOAA, "GOES-16: A GAME-CHANGER FOR FIGHTING DEADLY WILDFIRES," https://www.goes-r.gov/featureStories/goes16Wildfires.html, 2018, last access: 01/02/21.Google Scholar
- NOAA, "GOES-R Fire Detection and Characterization," https://www.goes-r.gov/education/docs/fs_fire.pdf, 2019, last access: 01/02/21.Google Scholar
- C. Reiss, J. Wilkes, and J. L. Hellerstein, "Google cluster-usage traces: format+schema," Google Inc., Mountain View, CA, USA, Technical Report, Nov. 2011, revised 2014--11--17 for version 2.1. Posted at https://github.com/google/cluster-data.Google Scholar
- S. P. Vanderwiel and D. J. Lilja, "Data prefetch mechanisms," ACM Comput. Surv., vol. 32, no. 2, pp. 174--199, 2000. [Online]. Available: https://doi.org/10.1145/358923.358939Google ScholarDigital Library
- S. Mittal, "A survey of recent prefetching techniques for processor caches," ACM Comput. Surv., vol. 49, no. 2, pp. 35:1--35:35, 2016. [Online]. Available: https://doi.org/10.1145/2907071Google ScholarDigital Library
- A. J. Smith, "Sequentiality and prefetching in database systems," ACM Trans. Database Syst., vol. 3, no. 3, pp. 223--247, 1978. [Online]. Available: https://doi.org/10.1145/320263.320276Google ScholarDigital Library
- H. Wedekind and G. Zö rntlein, "Prefetching in realtime database applications," in Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, May 28--30, 1986, C. Zaniolo, Ed.hskip 1em plus 0.5em minus 0.4emrelax ACM Press, 1986, pp. 215--226. [Online]. Available: https://doi.org/10.1145/16894.16876Google ScholarDigital Library
- S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry, "Improving hash join performance through prefetching," in Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA, Z. M. Ö zsoyoglu and S. B. Zdonik, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2004, pp. 116--127. [Online]. Available: https://doi.org/10.1109/ICDE.2004.1319989Google Scholar
- M. Annavaram, J. M. Patel, and E. S. Davidson, "Call graph prefetching for database applications," ACM Trans. Comput. Syst., vol. 21, no. 4, pp. 412--444, 2003. [Online]. Available: https://doi.org/10.1145/945506.945509Google ScholarDigital Library
- Y. O. Kocc berber, B. Falsafi, and B. Grot, "Asynchronous memory access chaining," Proc. VLDB Endow., vol. 9, no. 4, pp. 252--263, 2015. [Online]. Available: http://www.vldb.org/pvldb/vol9/p252-kocberber.pdfGoogle ScholarDigital Library
- I. T. Bowman and K. Salem, "Optimization of query streams using semantic prefetching," in Proceedings of the ACM SIGMOD International Conference on Management of Data, Paris, France, June 13--18, 2004, G. Weikum, A. C. Kö nig, and S. Deßloch, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2004, pp. 179--190. [Online]. Available: https://doi.org/10.1145/1007568.1007591Google ScholarDigital Library
- L. Battle, R. Chang, and M. Stonebraker, "Dynamic prefetching of data tiles for interactive visualization," in Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, F. Özcan, G. Koutrika, and S. Madden, Eds.hskip 1em plus 0.5em minus 0.4emrelax ACM, 2016, pp. 1363--1375. [Online]. Available: https://doi.org/10.1145/2882903.2882919Google ScholarDigital Library
- D. Kossmann, "The state of the art in distributed query processing," ACM Comput. Surv., vol. 32, no. 4, pp. 422--469, 2000. [Online]. Available: https://doi.org/10.1145/371578.371598Google ScholarDigital Library
- K. Amiri, S. Park, R. Tewari, and S. Padmanabhan, "Dbproxy: A dynamic data cache for web applications," in Proceedings of the 19th International Conference on Data Engineering, March 5--8, 2003, Bangalore, India, U. Dayal, K. Ramamritham, and T. M. Vijayaraman, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2003, pp. 821--831. [Online]. Available: https://doi.org/10.1109/ICDE.2003.1260881Google Scholar
- C. Bornhö vd, M. Altinel, C. Mohan, H. Pirahesh, and B. Reinwald, "Adaptive database caching with dbcache," IEEE Data Eng. Bull., vol. 27, no. 2, pp. 11--18, 2004. [Online]. Available: ftp://ftp.research.microsoft.com/pub/debull/A04june/bornhoevd.psGoogle Scholar
- P. Larson, J. Goldstein, and J. Zhou, "Mtcache: Transparent mid-tier database caching in SQL server," in Proceedings of the 20th International Conference on Data Engineering, ICDE 2004, 30 March - 2 April 2004, Boston, MA, USA, Z. M. Özsoyoglu and S. B. Zdonik, Eds.hskip 1em plus 0.5em minus 0.4emrelax IEEE Computer Society, 2004, pp. 177--188. [Online]. Available: https://doi.org/10.1109/ICDE.2004.1319994Google Scholar
- S. Podlipnig and L. Böszörményi, "A survey of web cache replacement strategies," ACM Comput. Surv., vol. 35, no. 4, pp. 374--398, 2003. [Online]. Available: https://doi.org/10.1145/954339.954341Google ScholarDigital Library
- J. Mertz and I. Nunes, "Understanding application-level caching in web applications: A comprehensive introduction and survey of state-of-the-art approaches," ACM Comput. Surv., vol. 50, no. 6, pp. 98:1--98:34, 2018. [Online]. Available: https://doi.org/10.1145/3145813Google ScholarDigital Library
Index Terms
- EIRES: Efficient Integration of Remote Data in Event Stream Processing
Recommendations
Efficient Integration of Compiler-Directed Cache Coherence and Data Prefetching
Cache coherence enforcement and memory latency reduction and hiding are very important and challenging problems in the design of large-scale distributed shared-memory (DSM) multiprocessors. We propose an integrated approach to solve these problems ...
Efficient Integration of Compiler-Directed Cache Coherence and Data Prefetching
IPDPS '00: Proceedings of the 14th International Symposium on Parallel and Distributed ProcessingCache coherence enforcement and memory latency reduction and hiding are very important and challenging problems in the design of large-scale distributed shared-memory (DSM) multiprocessors. We propose an integrated framework to solve these problems ...
Maintaining Cache Coherence through Compiler-Directed Data Prefetching
In this paper, we propose a compiler-directed cache coherence scheme which makes use of data prefetching to enforce cache coherence in large-scale distributed shared-memory (DSM) systems. TheCache Coherence With Data Prefetching(CCDP) scheme uses ...
Comments