Abstract
A Wheeler automaton is a finite state automaton whose states admit a total Wheeler order, reflecting the co-lexicographic order of the strings labeling source-to-node paths). A Wheeler language is a regular language admitting an accepting Wheeler automaton. Wheeler languages admit efficient and elegant solutions to hard problems such as automata compression and regular expression matching, therefore deciding whether a regular language is Wheeler is relevant in applications requiring efficient solutions to those problems. In this paper, we show that it is possible to decide whether a DFA with n states and m transitions recognizes a Wheeler language in O(mn) time. This is a significant improvement over the running time \(O(n^{13} + m\log n)\) of the previous polynomial-time algorithm (Alanko et al. Information and Computation 2021). A proof-of-concept implementation of this algorithm is available in a public repository. We complement this upper bound with a conditional matching lower bound stating that, unless the strong exponential time hypothesis (SETH) fails, the problem cannot be solved in strongly subquadratic time. The same problem is known to be PSPACE-complete when the input is an NFA (D’Agostino et al. Theoretical Computer Science 2023). Together with that result, our paper essentially closes the algorithmic problem of Wheeler language recognition.
The full version of this paper can be found in [3]. Ruben Becker, Davide Cenzato, Sung-Hwan Kim, Bojana Kodric, and Nicola Prezza are funded by the European Union (ERC, REGINDEX, 101039208. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. Alberto Policriti is supported by project National Biodiversity Future Center-NBFC (CN_00000033, CUP G23C22001110007) under the National Recovery and Resilience Plan of Italian Ministry of University and Research funded by European Union–NextGenerationEU.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
While the authors only claim \(m^{O(1)}\) time, a finer analysis yields this bound.
- 2.
Our lower bound states that there is no algorithm solving all instances in \(O(m^{2-\eta })\) time. On sparse DFAs (\(m \in \varTheta (n)\)) our algorithm runs in \(O(mn) = O(m^2)\) time.
- 3.
References
Alanko, J., D’Agostino, G., Policriti, A., Prezza, N.: Wheeler languages. Inf. Comput. 281, 104820 (2021). https://doi.org/10.1016/j.ic.2021.104820
Becker, R., et al.: Sorting Finite Automata via Partition Refinement (2023). 10.48550/arXiv. 2305.05129, to appear at ESA 2023
Becker, R., Cenzato, D., Kim, S.H., Kodric, B., Policriti, A., Prezza, N.: Optimal wheeler language recognition (2023). https://doi.org/10.48550/arXiv.2306.04737
Conte, A., Cotumaccio, N., Gagie, T., Manzini, G., Prezza, N., Sciortino, M.: Computing matching statistics on wheeler DFAs. In: Data Compression Conference (DCC), pp. 150–159 (2023). https://doi.org/10.1109/DCC55655.2023.00023
Cotumaccio, N., D’Agostino, G., Policriti, A., Prezza, N.: Co-lexicographically ordering automata and regular languages - part I. J. ACM 70, 1–73 (2023). https://doi.org/10.1145/3607471, (online available)
Cotumaccio, N., Prezza, N.: On indexing and compressing finite automata. In: Proceedings of the 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 2585–2599 (2021). https://doi.org/10.1137/1.9781611976465.153
D’Agostino, G., Martincigh, D., Policriti, A.: Ordering regular languages and automata: complexity. Theoret. Comput. Sci. 949, 113709 (2023). https://doi.org/10.1016/j.tcs.2023.113709
Eizenga, J.M., et al.: Pangenome graphs. Ann. Rev. Genomics Hum. Genet. 21(1), 139–162 (2020). https://doi.org/10.1146/annurev-genom-120219-080406, pMID: 32453966
Equi, M., Grossi, R., Mäkinen, V., Tomescu, A.I.: On the complexity of string matching for graphs. In: Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP), pp. 55:1–55:15 (2019). https://doi.org/10.4230/LIPIcs.ICALP.2019.55
Gagie, T., Manzini, G., Sirén, J.: Wheeler graphs: a framework for BWT-based data structures. Theoret. Comput. Sci. 698, 67–78 (2017). https://doi.org/10.1016/j.tcs.2017.06.016
Hopcroft, J.: An \(n\log n\) algorithm for minimizing states in a finite automaton. In: Proceedings of an International Symposium on the Theory of Machines and Computations, pp. 189–196 (1971). https://doi.org/10.1016/B978-0-12-417750-5.50022-1
Impagliazzo, R., Paturi, R.: On the complexity of k-SAT. J. Comput. Syst. Sci. 62(2), 367–375 (2001). https://doi.org/10.1006/jcss.2000.1727
Kim, S.H., Olivares, F., Prezza, N.: Faster prefix-sorting algorithms for deterministic finite automata. In: Proceedings of the 34th Annual Symposium on Combinatorial Pattern Matching (CPM), pp. 16:1–16:16 (2023). https://doi.org/10.4230/LIPIcs.CPM.2023.16
Nerode, A.: Linear automaton transformations. In: Proceedings of the American Mathematical Society, vol. 9, no. 4, pp. 541–544 (1958). https://doi.org/10.2307/2033204
Williams, R.: A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci. 348(2–3), 357–365 (2005). https://doi.org/10.1016/j.tcs.2005.09.023
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Becker, R., Cenzato, D., Kim, SH., Kodric, B., Policriti, A., Prezza, N. (2023). Optimal Wheeler Language Recognition. In: Nardini, F.M., Pisanti, N., Venturini, R. (eds) String Processing and Information Retrieval. SPIRE 2023. Lecture Notes in Computer Science, vol 14240. Springer, Cham. https://doi.org/10.1007/978-3-031-43980-3_6
Download citation
DOI: https://doi.org/10.1007/978-3-031-43980-3_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43979-7
Online ISBN: 978-3-031-43980-3
eBook Packages: Computer ScienceComputer Science (R0)