Uni-Logo

Department of Computer Science
 

Technical Report No. 34, July 1991 - Abstract


Petteri Jokinen, Esko Ukkonen:
Using Suffix Automata as an Index for Approximate String Searches

This report is not available in electronic form!

The problem of finding all approximate occurences p' of a pattern string P in a text string T such that the edit distance between P and P' is greater than or equal k is considered. We concentrate on a scheme in which T is first preprocessed to make the subsequent searches with different P fast. It is shown that if T is preprocessed into an annotated suffix automaton then the search phase can be performed fast by applying dynamic programming over P and a part of the transition graph of the automaton. The method is applicable for edit distances with general edit operation costs. The preprocessing needs time and space O(|T|). The search algorithm runs in the worst case in time O(|P||T|), and in the best case in time O(|P|).