Institut für Informatik

Technical Report No. 129 - Abstract

Stefan Edelkamp
Dictionary Automaton in Optimal Space

In this paper we describe the data structure of a time and space efficient string dictionary automaton, providing insertion and deletion of strings and finite state machine based substring searching. If the input alphabet is bounded and the pattern are mutually substring free an optimal worst case bound of O(|m|) for both insertion and deletion of a pattern m is achieved. The underlying structure is a multi suffix tree. Let M be the set of strings stored in the dictionary resulting from an arbitrary sequence of \emph{Insert} and \emph{Delete} operations. One main result in this paper is that the space complexity of the augmented multi suffix tree is O(d) with d=\sum_{m \in M} |m|. This bound is optimal if we assume that each pattern in the dictionary uses at least linear space. Additionally, we present a new on-line substring search algorithm to find one substring of x in time O(|x|). The novelity of the approach is to realize a multi suffix tree based finite state automaton for substring searching that performs a state transition in amortized constant time.

Report No. 129 (PostScript)