Succinct non-overlapping indexing
Given a text T of n characters, we consider the non-overlapping indexing problem, defined as follows: preprocess T into a data structure such that, given a query pattern P, we can report a maximal set of non-overlapping occurrences of P in T. The best known solution for this problem takes linear space: a suffix tree of T is augmented with O(n)-word data structures, and a query P is answered in optimal O(|P| + nocc) time, where nocc is the output size [Cohen and Porat, ISAAC 2009]. We present the following new result. Let CSA (not necessarily a compressed suffix array) be an index of T that can compute (i) the suffix range of P in search(P) time, and (ii) a suffix array or an inverse suffix array value in tSA time. Then, using CSA alone, we can answer a query P in O(search(P) + nocc · tSA) time. Additionally, we present an improved result for a generalized version of this problem called range non-overlapping indexing.
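To make the problem statement concrete, the following is a minimal Python sketch of a naive baseline, not the method of the paper. It assumes a plain, uncompressed suffix array built by sorting, finds the suffix range of P by binary search, and then greedily keeps occurrences from left to right. All function names are hypothetical and chosen for illustration. Because it sorts every occurrence in the suffix range, its query cost depends on the total number of occurrences rather than on nocc alone, which is the overhead the result above avoids.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def suffix_range(text, sa, pattern):
    # Return the half-open range [lo, hi) of suffix array entries
    # whose suffixes start with `pattern`, using two binary searches.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-length prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-length prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def non_overlapping_occurrences(text, pattern):
    # Greedy left-to-right selection: keep an occurrence whenever it
    # starts at or after the end of the previously kept occurrence.
    if not pattern:
        return []
    sa = build_suffix_array(text)
    lo, hi = suffix_range(text, sa, pattern)
    chosen = []
    next_free = 0
    for pos in sorted(sa[lo:hi]):
        if pos >= next_free:
            chosen.append(pos)
            next_free = pos + len(pattern)
    return chosen
```

For example, with T = "aaaaa" and P = "aa", the occurrences start at positions 0, 1, 2 and 3, and the greedy pass keeps positions 0 and 2.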
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Ganguly, A., Shah, R., & Thankachan, S. (2015). Succinct non-overlapping indexing. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9133, 185-195. https://doi.org/10.1007/978-3-319-19929-0_16