Abstract
Let T be a text of length n on an alphabet \(\Sigma \) of size \(\sigma \), and let \(H_0\) be the zero-order empirical entropy of T. We show that the LZ77 factorization of T can be computed in \(nH_0+o(n\log \sigma ) + \mathcal {O}(\sigma \log n)\) bits of working space with an online algorithm running in \(\mathcal {O}(n\log n)\) time. Previous space-efficient online solutions either work in compact space and \(\mathcal {O}(n\log n)\) time, or in succinct space and \(\mathcal {O}(n\log ^3 n)\) time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Belazzougui, D., Cunial, F., Gagie, T., Prezza, N., Raffinot, M.: Composite repetition-aware data structures. In: Cicalese, F., Porat, E., Vaccaro, U. (eds.) CPM 2015. LNCS, vol. 9133, pp. 26–39. Springer, Heidelberg (2015)
Crochemore, M., Ilie, L.: Computing longest previous factor in linear time and applications. Information Processing Letters 106(2), 75–80 (2008)
Crochemore, M., Ilie, L., Smyth, W.F.: A simple algorithm for computing the Lempel-Ziv factorization. In: 18th Data Compression Conference (DCC 2008), pp. 482–488. IEEE Computer Society Press, Los Alamitos (2008)
Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000, pp. 390–398. IEEE (2000)
Ferragina, P., Manzini, G.: Indexing compressed text. Journal of the ACM (JACM) 52(4), 552–581 (2005)
Ferragina, P., Manzini, G., Mäkinen, V., Navarro, G.: An alphabet-friendly FM-index. In: Apostolico, A., Melucci, M. (eds.) SPIRE 2004. LNCS, vol. 3246, pp. 150–160. Springer, Heidelberg (2004)
Kärkkäinen, J., Kempa, D., Puglisi, S.J.: Linear time Lempel-Ziv factorization: simple, fast, small. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 189–200. Springer, Heidelberg (2013)
Kreft, S., Navarro, G.: Self-index based on LZ77 (Ph.D. thesis) (2011). arXiv preprint arXiv:1112.4578
Kreft, S., Navarro, G.: Self-indexing based on LZ77. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 41–54. Springer, Heidelberg (2011)
Lempel, A., Ziv, J.: On the complexity of finite sequences. IEEE Transactions on Information Theory 22(1), 75–81 (1976)
Navarro, G., Nekrich, Y.: Optimal dynamic sequence representations. SIAM Journal on Computing 43(5), 1781–1806 (2014)
Navarro, G., Raffinot, M.: Practical and flexible pattern matching over Ziv-Lempel compressed text. Journal of Discrete Algorithms 2(3), 347–371 (2004)
Ohlebusch, E., Gog, S.: Lempel-Ziv factorization revisited. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 15–26. Springer, Heidelberg (2011)
Okanohara, D., Sadakane, K.: An online algorithm for finding the longest previous factors. In: Halperin, D., Mehlhorn, K. (eds.) ESA 2008. LNCS, vol. 5193, pp. 696–707. Springer, Heidelberg (2008)
Policriti, A., Gigante, N., Prezza, N.: Average linear time and compressed space construction of the Burrows-Wheeler transform. In: Dediu, A.-H., Formenti, E., MartÃn-Vide, C., Truthe, B. (eds.) LATA 2015. LNCS, vol. 8977, pp. 587–598. Springer, Heidelberg (2015)
Starikovskaya, T.: Computing Lempel-Ziv factorization online. In: Rovan, B., Sassone, V., Widmayer, P. (eds.) MFCS 2012. LNCS, vol. 7464, pp. 789–799. Springer, Heidelberg (2012)
Yamamoto, J., I, T., Bannai, H., Inenaga, S., Takeda, M.: Faster compact on-line Lempel-Ziv factorization. In: 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014). Leibniz International Proceedings in Informatics (LIPIcs), vol. 25, pp. 675–686. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl (2014)
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Transactions on information theory 23(3), 337–343 (1977)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Policriti, A., Prezza, N. (2015). Fast Online Lempel-Ziv Factorization in Compressed Space. In: Iliopoulos, C., Puglisi, S., Yilmaz, E. (eds) String Processing and Information Retrieval. SPIRE 2015. Lecture Notes in Computer Science(), vol 9309. Springer, Cham. https://doi.org/10.1007/978-3-319-23826-5_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-23826-5_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23825-8
Online ISBN: 978-3-319-23826-5
eBook Packages: Computer ScienceComputer Science (R0)