Fast Online Lempel-Ziv Factorization in Compressed Space

  • Conference paper
String Processing and Information Retrieval (SPIRE 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9309)

Included in the following conference series:

  • International Symposium on String Processing and Information Retrieval

Abstract

Let T be a text of length n on an alphabet \(\Sigma \) of size \(\sigma \), and let \(H_0\) be the zero-order empirical entropy of T. We show that the LZ77 factorization of T can be computed in \(nH_0+o(n\log \sigma ) + \mathcal {O}(\sigma \log n)\) bits of working space with an online algorithm running in \(\mathcal {O}(n\log n)\) time. Previous space-efficient online solutions either work in compact space and \(\mathcal {O}(n\log n)\) time, or in succinct space and \(\mathcal {O}(n\log ^3 n)\) time.
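
To make the two quantities in the abstract concrete, the following is a minimal sketch, not the paper's algorithm: it computes the zero-order empirical entropy \(H_0\) of a string and a greedy LZ77 parse by brute force, using quadratic time and plain uncompressed memory. The function names and the phrase format (source position, copy length, explicit next character) are assumptions chosen for illustration; the abstract does not fix which LZ77 variant is used.

```python
import math
from collections import Counter


def zero_order_entropy(text: str) -> float:
    """Return H_0(T) in bits per symbol: sum over symbols c of (n_c/n) * log2(n/n_c)."""
    n = len(text)
    if n == 0:
        return 0.0
    return sum((k / n) * math.log2(n / k) for k in Counter(text).values())


def lz77_factorize(text: str):
    """Naive greedy LZ77 parse into (source, length, next_char) triples.

    `source` is the start of an earlier occurrence of the copied part
    (None when length == 0). The copy may overlap the phrase being built.
    This brute-force scan is only an illustration of the output, not of
    the compressed-space online method described in the abstract.
    """
    n = len(text)
    phrases = []
    i = 0
    while i < n:
        best_len, best_src = 0, None
        for j in range(i):
            length = 0
            # Stop one character early so an explicit trailing symbol remains.
            while i + length < n - 1 and text[j + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_src = length, j
        phrases.append((best_src, best_len, text[i + best_len]))
        i += best_len + 1
    return phrases


if __name__ == "__main__":
    t = "abababa"
    print(lz77_factorize(t))   # [(None, 0, 'a'), (None, 0, 'b'), (0, 4, 'a')]
    print(len(t) * zero_order_entropy(t))  # n * H_0, the leading term of the space bound
```

The parse is online in the sense that each phrase depends only on the text read so far plus the phrase being extended; the contribution claimed in the abstract is doing this within \(nH_0+o(n\log \sigma ) + \mathcal {O}(\sigma \log n)\) bits rather than with the full text held in memory as above.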



Author information

Corresponding author

Correspondence to Nicola Prezza.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Policriti, A., Prezza, N. (2015). Fast Online Lempel-Ziv Factorization in Compressed Space. In: Iliopoulos, C., Puglisi, S., Yilmaz, E. (eds) String Processing and Information Retrieval. SPIRE 2015. Lecture Notes in Computer Science, vol 9309. Springer, Cham. https://doi.org/10.1007/978-3-319-23826-5_2

  • DOI: https://doi.org/10.1007/978-3-319-23826-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23825-8

  • Online ISBN: 978-3-319-23826-5

  • eBook Packages: Computer Science, Computer Science (R0)
