Speaker: | Jan Janousek |
Department of Theoretical Computer Science | |
Czech Technical University in Prague |
Title: Pattern matching in trees inspired by pattern matching in strings
Abstract: Trees are one of the most ubiquitous data structures in Computer Science. In particular, pattern matching in trees is an important tool in compiler optimization. Since many standard linear notations of subtrees are substrings of the linear notations for trees, many principles for pattern matching algorithms on strings can be used to design effective algorithms for pattern matching in trees. Though the principles may be the same, the information on the structure of the tree must be processed differently. We present an overview of using some basic principles known from pattern matching in strings, which is a well-researched algorithmic discipline, in pattern matching algorithms for trees, a much younger and smaller discipline. Many pattern matching problems for strings were successfully solved by the use of finite automata as a useful model of computation. For pattern matching in trees, it is possible to construct analogously pushdown automata reading the linear notations of the tree, or, as we call it, finite tree automata. Using the principles from pattern matching in strings to pattern matching in trees is not limited to the use of automata: as an example, we present a new algorithm of backward tree pattern matching inspired by the standard Boyer-Moore-Horspool's backward pattern-matching algorithm for strings.