### Technical Report No. 38, October 1991 - Abstract

**Anne Brüggemann-Klein, Derick Wood:**

*Deterministic Regular Languages*

The ISO standard for Standard Generalized Markup Language (SGML)
provides a syntactic meta-language for the definition of textual markup systems.
In the standard the right hand sides of productions are called *content
models* and they are based on regular expressions.
The allowable regular expressions are those that are "unambiguous"
as defined by the standard.
Unfortunately, the standard's use of the term "unambiguous" does not
correspond to the two well known notions, since not all
regular languages are denoted by "unambiguous" expressions.
Furthermore, the standard's definition of "unambiguous" is somewhat
vague.
Therefore, we provide a precise definition of "unambiguous expressions"
and rename them
deterministic regular expressions to avoid any confusion.
A regular expression *E* is *deterministic* if the canonical
epsilon-free
finite automaton *M_E* recognizing L*(E)*
is deterministic.
A regular language is *deterministic*
if there is a deterministic expression that denotes it.
We give a Kleene-like theorem for deterministic regular languages and
we characterize them in terms of the structural
properties of the minimal deterministic automata recognizing them.
The latter result enables us to decide if a
given regular expression denotes a deterministic regular
language and, if so, to construct an equivalent deterministic expression.

report38.ps.gz