Skip to main content

Optimal Wheeler Language Recognition

  • Conference paper
  • First Online:
String Processing and Information Retrieval (SPIRE 2023)

Abstract

A Wheeler automaton is a finite state automaton whose states admit a total Wheeler order, reflecting the co-lexicographic order of the strings labeling source-to-node paths). A Wheeler language is a regular language admitting an accepting Wheeler automaton. Wheeler languages admit efficient and elegant solutions to hard problems such as automata compression and regular expression matching, therefore deciding whether a regular language is Wheeler is relevant in applications requiring efficient solutions to those problems. In this paper, we show that it is possible to decide whether a DFA with n states and m transitions recognizes a Wheeler language in O(mn) time. This is a significant improvement over the running time \(O(n^{13} + m\log n)\) of the previous polynomial-time algorithm (Alanko et al. Information and Computation 2021). A proof-of-concept implementation of this algorithm is available in a public repository. We complement this upper bound with a conditional matching lower bound stating that, unless the strong exponential time hypothesis (SETH) fails, the problem cannot be solved in strongly subquadratic time. The same problem is known to be PSPACE-complete when the input is an NFA (D’Agostino et al. Theoretical Computer Science 2023). Together with that result, our paper essentially closes the algorithmic problem of Wheeler language recognition.

The full version of this paper can be found in [3]. Ruben Becker, Davide Cenzato, Sung-Hwan Kim, Bojana Kodric, and Nicola Prezza are funded by the European Union (ERC, REGINDEX, 101039208. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. Alberto Policriti is supported by project National Biodiversity Future Center-NBFC (CN_00000033, CUP G23C22001110007) under the National Recovery and Resilience Plan of Italian Ministry of University and Research funded by European Union–NextGenerationEU.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 59.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 74.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    While the authors only claim \(m^{O(1)}\) time, a finer analysis yields this bound.

  2. 2.

    Our lower bound states that there is no algorithm solving all instances in \(O(m^{2-\eta })\) time. On sparse DFAs (\(m \in \varTheta (n)\)) our algorithm runs in \(O(mn) = O(m^2)\) time.

  3. 3.

    In the full version [3], we discuss more in detail how to apply [2, Sec. 4] on \(\mathcal A_{min}\).

References

  1. Alanko, J., D’Agostino, G., Policriti, A., Prezza, N.: Wheeler languages. Inf. Comput. 281, 104820 (2021). https://doi.org/10.1016/j.ic.2021.104820

    Article  MathSciNet  MATH  Google Scholar 

  2. Becker, R., et al.: Sorting Finite Automata via Partition Refinement (2023). 10.48550/arXiv. 2305.05129, to appear at ESA 2023

    Google Scholar 

  3. Becker, R., Cenzato, D., Kim, S.H., Kodric, B., Policriti, A., Prezza, N.: Optimal wheeler language recognition (2023). https://doi.org/10.48550/arXiv.2306.04737

  4. Conte, A., Cotumaccio, N., Gagie, T., Manzini, G., Prezza, N., Sciortino, M.: Computing matching statistics on wheeler DFAs. In: Data Compression Conference (DCC), pp. 150–159 (2023). https://doi.org/10.1109/DCC55655.2023.00023

  5. Cotumaccio, N., D’Agostino, G., Policriti, A., Prezza, N.: Co-lexicographically ordering automata and regular languages - part I. J. ACM 70, 1–73 (2023). https://doi.org/10.1145/3607471, (online available)

  6. Cotumaccio, N., Prezza, N.: On indexing and compressing finite automata. In: Proceedings of the 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 2585–2599 (2021). https://doi.org/10.1137/1.9781611976465.153

  7. D’Agostino, G., Martincigh, D., Policriti, A.: Ordering regular languages and automata: complexity. Theoret. Comput. Sci. 949, 113709 (2023). https://doi.org/10.1016/j.tcs.2023.113709

    Article  MathSciNet  MATH  Google Scholar 

  8. Eizenga, J.M., et al.: Pangenome graphs. Ann. Rev. Genomics Hum. Genet. 21(1), 139–162 (2020). https://doi.org/10.1146/annurev-genom-120219-080406, pMID: 32453966

  9. Equi, M., Grossi, R., Mäkinen, V., Tomescu, A.I.: On the complexity of string matching for graphs. In: Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP), pp. 55:1–55:15 (2019). https://doi.org/10.4230/LIPIcs.ICALP.2019.55

  10. Gagie, T., Manzini, G., Sirén, J.: Wheeler graphs: a framework for BWT-based data structures. Theoret. Comput. Sci. 698, 67–78 (2017). https://doi.org/10.1016/j.tcs.2017.06.016

    Article  MathSciNet  MATH  Google Scholar 

  11. Hopcroft, J.: An \(n\log n\) algorithm for minimizing states in a finite automaton. In: Proceedings of an International Symposium on the Theory of Machines and Computations, pp. 189–196 (1971). https://doi.org/10.1016/B978-0-12-417750-5.50022-1

  12. Impagliazzo, R., Paturi, R.: On the complexity of k-SAT. J. Comput. Syst. Sci. 62(2), 367–375 (2001). https://doi.org/10.1006/jcss.2000.1727

    Article  MathSciNet  MATH  Google Scholar 

  13. Kim, S.H., Olivares, F., Prezza, N.: Faster prefix-sorting algorithms for deterministic finite automata. In: Proceedings of the 34th Annual Symposium on Combinatorial Pattern Matching (CPM), pp. 16:1–16:16 (2023). https://doi.org/10.4230/LIPIcs.CPM.2023.16

  14. Nerode, A.: Linear automaton transformations. In: Proceedings of the American Mathematical Society, vol. 9, no. 4, pp. 541–544 (1958). https://doi.org/10.2307/2033204

  15. Williams, R.: A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci. 348(2–3), 357–365 (2005). https://doi.org/10.1016/j.tcs.2005.09.023

    Article  MathSciNet  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Nicola Prezza .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Becker, R., Cenzato, D., Kim, SH., Kodric, B., Policriti, A., Prezza, N. (2023). Optimal Wheeler Language Recognition. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-43980-3_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43979-7

  • Online ISBN: 978-3-031-43980-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics