A (co)algebraic theory of succinct automata✩
Gerco van Heerdt (a), Joshua Moerman (a,b), Matteo Sammartino (a), Alexandra Silva (a)

(a) University College London
(b) Radboud University

arXiv:1905.05519v1 [cs.FL] 14 May 2019


Abstract
The classical subset construction for non-deterministic automata can be generalized to other side-eﬀects captured by a monad. The key insight is that both the
state space of the determinized automaton and its semantics—languages over
an alphabet—have a common algebraic structure: they are Eilenberg-Moore algebras for the powerset monad. In this paper we study the reverse question to
determinization. We will present a construction to associate succinct automata
to languages based on diﬀerent algebraic structures. For instance, for classical
regular languages the construction will transform a deterministic automaton
into a non-deterministic one, where the states represent the join-irreducibles of
the language accepted by a (potentially) larger deterministic automaton. Other
examples will yield alternating automata, automata with symmetries, CABA-structured automata, and weighted automata.

1. Introduction
Non-deterministic automata are often used to provide compact representations of regular languages. Take, for instance, the language
L = {w ∈ {a, b}∗ | |w| > 2 and the 3rd symbol from the right is an a}.
There is a simple non-deterministic automaton accepting it (below, top automaton) and it is not very diﬃcult to see that the smallest deterministic automaton
(below, bottom automaton) will have 8 states.
[Figure: the non-deterministic automaton, with states s1, s2, s3, s4 and transitions labelled a and a, b.]

✩ This work was partially supported by ERC starting grant ProFoundNet (679127) and a
Leverhulme Prize (PLP-2016-129).

Preprint submitted to Elsevier

May 15, 2019

[Figure: the deterministic automaton, with eight states labelled 1, 12, 13, 14, 123, 124, 134, 1234 and transitions labelled a and b.]

The labels we chose for the states of the deterministic automaton are not
coincidental—they represent the subsets of states of the non-deterministic automaton that would be obtained when constructing a deterministic one using
the classical subset construction.
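
To make the state labels concrete, the following minimal Python sketch runs the classical subset construction on an NFA for L. The transition table `NFA` is our assumption about the automaton drawn above (state 1 loops on both symbols and guesses the marked a, and states 2, 3, 4 count the remaining symbols), and the helper names `determinize` and `label` are purely illustrative.

```python
# Assumed NFA for "the 3rd symbol from the right is an a".
# The state names 1..4 stand for s1..s4; state 4 is the accepting one.
NFA = {
    (1, "a"): {1, 2}, (1, "b"): {1},
    (2, "a"): {3}, (2, "b"): {3},
    (3, "a"): {4}, (3, "b"): {4},
}


def determinize(nfa, initial, alphabet="ab"):
    """Classical subset construction, restricted to reachable subsets."""
    start = frozenset(initial)
    seen, todo, delta = {start}, [start], {}
    while todo:
        subset = todo.pop()
        for symbol in alphabet:
            # The successor of a subset is the union of its members' successors.
            target = frozenset(
                q for s in subset for q in nfa.get((s, symbol), ())
            )
            delta[subset, symbol] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen, delta


def label(subset):
    """Render {1, 2, 4} as "124", matching the labels used in the text."""
    return "".join(str(q) for q in sorted(subset))


states, delta = determinize(NFA, {1})
labels = sorted(label(s) for s in states)
```

Under this assumption, `labels` contains exactly the eight state names of the deterministic automaton.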
The starting point of this paper is precisely the observation that
non-deterministic automata provide compact representations of languages, which makes them better suited for use in algorithms and
improves scalability. In fact, the origin of our study goes back to our own work
on automata learning [15], where we encountered large nominal automata that,
in order for the algorithm to work for more realistic examples, had to be represented non-deterministically. In other recent work [7, 3], diﬀerent forms of nondeterminism are used to learn compact representations of regular languages.
This left us wondering whether other side-effects could be used to overcome
scalability issues.
Moggi [16] introduced the idea that monads could be used as a general abstraction for side-effects. A monad is a triple (T, η, µ) in which T is an endofunctor
over a category whose objects can be thought of as capturing pure computations.
The monad is equipped with a unit η : X → T X, a natural transformation that
enables embedding any pure computation into an eﬀectful one, and a multiplication µ : T T X → T X that allows ﬂattening nested eﬀectful computations.
Examples of monads capturing side-eﬀects include powerset (non-determinism)
and distributions (randomness).
Monads have been used extensively in programming language semantics (see
e.g. [22] and references therein). More recently, they were used in categorical
studies of automata theory [6]. One example of a construction in which they
play a key role is a generalization of the classical subset construction to a class
of automata [21, 20], which we will describe next.
The classical subset construction, connecting non-deterministic and deterministic automata, can be described concisely by the following diagram.
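\begin{tikzcd}
X \arrow[r, "\{-\}"] \arrow[dr, "\delta"'] & \mathcal{P}(X) \arrow[r, "l"] \arrow[d, "\delta^\sharp"] & 2^{A^*} \arrow[d, "{\langle \epsilon?, \partial \rangle}"] \\
& 2 \times \mathcal{P}(X)^A \arrow[r, "\mathrm{id} \times l^A"'] & 2 \times (2^{A^*})^A
\end{tikzcd}
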

We omit initial states and represent a non-deterministic automaton as a pair
(X, δ), where X is the state space and δ : X → 2 × P(X)^A is the transition
function, whose first component is the (non-)final state classifier. The
language semantics of a non-deterministic automaton (X, δ) is obtained by first
constructing a deterministic automaton (P(X), δ♯), which has a larger state space
consisting of subsets of the original state space, and then computing the accepted
language of the determinized automaton. The language map l associating the accepted language to a state is a universal map: for every deterministic automaton
(Q, Q → 2 × Q^A) the map l is the unique map into the automaton of languages
\[ (2^{A^*},\; 2^{A^*} \xrightarrow{\langle \epsilon?, \partial \rangle} 2 \times (2^{A^*})^A). \]
The universal property of the automaton of languages inspired the development of a categorical generalization of automata theory, including of the subset
construction, which we detail below. In particular, we can consider general automata as pairs (X, t : X → FX), where the transition dynamics t is parametric on
a functor F. Such pairs are usually called coalgebras for the functor F [18]. For
a wide class of functors F, the category of coalgebras has a final object (Ω, ω),
the so-called final coalgebra, which plays a role analogous to that of languages.
The classical subset construction was generalized in previous work [21] by
replacing deterministic automata with coalgebras for a functor F and the powerset monad with a suitable monad T . As above, it can be summarized in a
diagram:
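\begin{tikzcd}
X \arrow[r, "\eta"] \arrow[dr, "\delta"'] & TX \arrow[r, "l"] \arrow[d, "\delta^\sharp"] & \Omega \arrow[d, "\omega"] \\
& FTX \arrow[r, "Fl"'] & F\Omega
\end{tikzcd}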

The monad T will be the structure we will explore to enable succinct representations. The crucial ingredient in generalizing the subset construction was the
observation that the target of the transition dynamics, 2 × P(−)^A, and the
set of languages, 2^{A^*}, both have a complete join-semilattice structure. This
enables one to define the determinized automaton as a unique lattice extension
of the non-deterministic one, and, moreover, the language map l preserves the
semantics: l({s1, s2}) = l({s1}) ∪ l({s2}).
This latter somewhat trivial observation was also exploited in the work of
Bonchi and Pous [8] in deﬁning an eﬃcient algorithm for language equivalence of
NFAs by using coinduction-up-to. Join-semilattices are precisely the Eilenberg-Moore algebras of the powerset monad, and one can show that if a functor has
a ﬁnal coalgebra in Set, this can be lifted to the category of Eilenberg-Moore
algebras of a monad T (T -algebras). This makes it possible to construct the
more general diagram above, where the coalgebra structure is generalized using
a functor F and a monad T . The only assumptions for the existence of T -algebra
maps δ ♯ and l are the existence of a ﬁnal coalgebra for F in Set and that F T X
can be given a T -algebra structure.
In this paper we ask the reverse question—given a deterministic automaton,
if we assume the state space has a join-semilattice structure, can we build a corresponding succinct non-deterministic one? More generally, given an F -coalgebra
in the category of T -algebras, can we build a succinct F T -coalgebra in the base
category that represents the same behavior?
We will provide an abstract framework to understand this construction,
based on previous work by Arbib and Manes [4]. Our abstract framework relies on an alternative, more modern presentation of some of their results. Due to
our focus on set-based structures, we will conduct our investigation within the
category Set, which enables us to provide eﬀective procedures. This does mean
that not all of the results due to Arbib and Manes will be given in their original
generality. We present a comprehensive set of examples that will illustrate the
versatility of the framework. We also discuss more algorithmic aspects that are
essential if the present framework is to be used as an optimization, for instance
as part of a learning algorithm.
After recalling basic facts about monads and structured automata in Section 2, the rest of this paper is organized as follows:
• In Section 3 we introduce a general notion of generators for a T-algebra,
and we show that automata whose state space forms a T-algebra (which we
call T-automata) admit an equivalent T-succinct automaton, defined over
generators. We also characterize minimal generators and give a condition
under which they are globally minimal in size.
• In Section 4 we give an eﬀective procedure to ﬁnd a minimal set of generators for a T -algebra, and we present an algorithm that uses that procedure
to compute the T -succinct version of a given T -automaton. The algorithm
works by ﬁrst minimising the T -automaton: the explicit algebraic structure
allows states that correspond to algebraic combinations of other states to
be detected, and then discarded when generators are computed.
• In Section 5 we show how the algorithm of Section 4 can be applied to
“plain” ﬁnite automata—without any algebraic structure—in order to derive an equivalent T -succinct automaton. We conclude with a result about
the compression power of our construction: it produces an automaton that
is at least as small as the minimal version of the original automaton.
• Finally, in Section 6 we give several examples, and in Section 7 we discuss
related and future work.
2. Preliminaries
Side-effects and different notions of non-determinism can be conveniently
captured as a monad T on a category C. A monad T = (T, η, µ) is a triple
consisting of an endofunctor T on C and two natural transformations: a unit
η : Id ⇒ T and a multiplication µ : T² ⇒ T. They satisfy the following laws:
\[ \mu \circ \eta T = \mathrm{id} = \mu \circ T\eta \qquad\qquad \mu \circ \mu T = \mu \circ T\mu. \]
An example is the triple (P, {−}, ⋃), where P denotes the powerset functor in
Set that assigns to each set the set of all its subsets, {−} is the function that
returns a singleton set, and ⋃ is just union of sets.
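
As a sanity check, the monad laws for the powerset example can be tested on small finite sets. The sketch below is only an illustration: it models finite sets as Python `frozenset` values, and the names `unit`, `mult`, `fmap` and the two checking functions are ours.

```python
def unit(x):
    """eta: wrap an element in a singleton set."""
    return frozenset({x})


def mult(sets):
    """mu: flatten a set of sets by taking the union."""
    return frozenset(x for s in sets for x in s)


def fmap(f, s):
    """Action of the powerset functor on a function f."""
    return frozenset(f(x) for x in s)


def unit_laws_hold(s):
    """mu . eta_T = id = mu . T eta, checked on a single set s."""
    return mult(unit(s)) == s and mult(fmap(unit, s)) == s


def assoc_law_holds(sss):
    """mu . mu_T = mu . T mu, checked on a single set of sets of sets."""
    return mult(mult(sss)) == mult(fmap(mult, sss))


sample = frozenset({1, 2, 3})
nested = frozenset({
    frozenset({frozenset({1}), frozenset({2, 3})}),
    frozenset({frozenset({3}), frozenset()}),
})
```

Checking the laws on `sample` and `nested` is of course no proof, but it shows how the two sides of each equation are computed.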
Given a monad T, the category C^T of Eilenberg-Moore algebras over T, or
simply T-algebras, has as objects pairs (X, h) consisting of an object X, called
the carrier, and a morphism h : TX → X such that h ∘ µ_X = h ∘ Th and h ∘ η_X = id_X.
A T-homomorphism between two T-algebras (X, h) and (Y, k) is a morphism
f : X → Y such that f ∘ h = k ∘ Tf.
We will often refer to a T-algebra (X, h) as X if h is understood or if its
specific definition is irrelevant. Given an object X, (TX, µ_X) is a T-algebra
called the free T-algebra on X. Given an object U and a T-algebra (V, v), there
is a bijective correspondence between T-algebra homomorphisms TU → V and
morphisms U → V: for a T-algebra homomorphism f : TU → V, define f† =
f ∘ η : U → V; for a morphism g : U → V, define g♯ = v ∘ Tg : TU → V. Then
g♯ is a T-algebra homomorphism called the free T-extension of g, and we have
\[ f^{\dagger\sharp} = f \qquad\qquad g^{\sharp\dagger} = g. \tag{1} \]
Furthermore, for all objects S and morphisms h : S → U,
\[ g^\sharp \circ Th = (g \circ h)^\sharp. \tag{2} \]


Example 2.1. For the monad P the associated Eilenberg-Moore category is
the category of (complete) join-semilattices. Given a set X, the free P-algebra
on X is the join-semilattice (PX, ⋃) of subsets of X with the union operation
as join.
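
For the powerset monad, the free extension g♯ = v ∘ Pg simply takes the join of the images. The sketch below illustrates this together with equations (1) and (2), under the assumption that the target algebra V is the finite chain {0, 1, 2, 3} with maximum as join (a complete join-semilattice with bottom 0); the functions `g` and `h` and all helper names are hypothetical.

```python
def join(values):
    """Algebra structure v : P(V) -> V on the chain V = {0, 1, 2, 3}."""
    return max(values, default=0)


def extend(g):
    """Free extension g# = v . P(g) of a function g : U -> V."""
    return lambda subset: join(g(u) for u in subset)


def restrict(f):
    """f-dagger = f . eta for a homomorphism f : P(U) -> V."""
    return lambda u: f(frozenset({u}))


g = {"p": 1, "q": 3, "r": 2}.get  # an arbitrary function U -> V
h = {"s": "p", "t": "r"}.get      # an arbitrary function S -> U
g_sharp = extend(g)

# Left- and right-hand side of equation (2), evaluated on W = {s, t}.
W = frozenset({"s", "t"})
lhs = g_sharp(frozenset(h(x) for x in W))
rhs = extend(lambda x: g(h(x)))(W)
```

Here `restrict(g_sharp)` agrees with `g` on every element of U, which is the second identity in (1), and `lhs` equals `rhs`, which is an instance of (2).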
Although some results are completely abstract, the central definition of minimal generators in Section 3 is specific to monads T on the category Set. Therefore we restrict ourselves to this setting. More precisely, we consider automata
over a finite alphabet A with outputs in a set O. In order to define automata
in Set^T as (pointed) coalgebras for the functor O × (−)^A, we need to lift this
functor from Set to Set^T. Such a lifting corresponds to a distributive law of T
over O × (−)^A [see e.g., 13]. A distributive law of the monad T over a functor
F : Set → Set is a natural transformation ρ : TF ⇒ FT satisfying ρ ∘ ηF = Fη
and Fµ ∘ ρT ∘ Tρ = ρ ∘ µF. In most examples we will define a T-algebra
structure β : TO → O on O, which is well known to induce a distributive law
ρ : T(O × (−)^A) ⇒ O × T(−)^A given by
\[ \rho_X = T(O \times X^A) \xrightarrow{\langle T\pi_1, T\pi_2 \rangle} TO \times T(X^A) \xrightarrow{\beta \times \rho'_X} O \times T(X)^A \tag{3} \]
for any set X, where ρ′_X(U)(a) = T(λf : A → X. f(a))(U). In general, we assume
an arbitrary distributive law ρ : T(O × (−)^A) ⇒ O × T(−)^A, which gives us the
following notion of automaton.
Definition 2.2 (T -automaton). A T -automaton is a triple (X, i : 1 → X, δ : X →
O × X A ), where X is an object of SetT denoting the state space of the automaton, i is a function designating the initial state, and δ is a T -algebra map
assigning an output and transitions to each state.
Notice that the initial state map i : 1 → X in the above deﬁnition is not
required to be a T -algebra map. However, it corresponds to the T -algebra map
i♯ : T 1 → X. Thus, a T -automaton is an automaton in SetT .
The functor F (X) = O × X A has a ﬁnal coalgebra in SetT [12] that can be
used to deﬁne the language accepted by a T -automaton.
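Definition 2.3 (Language accepted). Given a T-automaton (X, i : 1 → X, δ : X →
O × X^A), the language accepted by X is l ∘ i : 1 → O^{A^*}, where l is the final
coalgebra map. In the diagram below, ω is the final coalgebra.
\begin{tikzcd}
1 \arrow[r, "i"] & X \arrow[r, "l"] \arrow[d, "\delta"'] & O^{A^*} \arrow[d, "\omega"] \\
& O \times X^A \arrow[r, "\mathrm{id} \times l^A"'] & O \times (O^{A^*})^A
\end{tikzcd}
\[ \omega(\varphi) = (\varphi(\varepsilon), \lambda a\, w.\, \varphi(aw)) \qquad l(x)(\varepsilon) = \pi_1(\delta(x)) \qquad l(x)(aw) = l(\pi_2(\delta(x))(a))(w) \]
We use ε to denote the empty word.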
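
The two defining equations of l translate directly into a recursive function. The following sketch assumes that an automaton is given as a Python dictionary mapping each state x to the pair δ(x) = (output, transitions); the two-state automaton `DELTA`, which tracks the parity of the number of a's, is a hypothetical example and not taken from the text.

```python
def language(delta, x):
    """Return l(x) as a Python function from words to outputs."""
    def l_x(word):
        output, transitions = delta[x]
        if word == "":
            # l(x)(eps) = pi_1(delta(x))
            return output
        a, w = word[0], word[1:]
        # l(x)(aw) = l(pi_2(delta(x))(a))(w)
        return language(delta, transitions[a])(w)
    return l_x


# Hypothetical automaton over A = {a, b} with outputs in O = {0, 1}.
DELTA = {
    "even": (1, {"a": "odd", "b": "even"}),
    "odd": (0, {"a": "even", "b": "odd"}),
}
accepts = language(DELTA, "even")
```

With the initial state "even", `accepts` outputs 1 exactly on the words containing an even number of a's.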

If the monad T is finitary, then the category Set^T is locally finitely presentable, and hence it admits (strong epi, mono)-factorizations [2]. As in [4], we
use these factorizations to quotient the state space of an automaton under language equivalence. The transition structure γ is obtained by diagonalization
via the factorization system. Diagrammatically:
\[
\begin{tikzcd}
& 1 \arrow[dl, "i"'] \arrow[d, "j"] & \\
X \arrow[r, "e"] \arrow[d, "\delta"'] & M \arrow[r, "m"] \arrow[d, "\gamma"] & O^{A^*} \arrow[d, "\omega"] \\
O \times X^A \arrow[r, "\mathrm{id} \times e^A"'] & O \times M^A \arrow[r, "\mathrm{id} \times m^A"'] & O \times (O^{A^*})^A
\end{tikzcd}
\tag{4}
\]
Here the epi e and mono m are obtained by factorizing the final coalgebra map
l : X → O^{A^*}. We call the quotient automaton (M, j, γ) the observable quotient
of (X, i, δ).
3. T -succinct automata
Given a T -automaton X = (X, i, δ), our aim is to obtain an equivalent
automaton in Set with transition function Y → O × T (Y )A , where Y is smaller
than X.¹ The key idea is to find generators for X. Our definition of generators
is equivalent to the deﬁnition of a scoop due to Arbib and Manes [4, Section 7,
Deﬁnition 8].
Definition 3.1 (Generators for an algebra). We say that a set G is a set of
generators for a T -algebra X whenever there exists a function g : G → X such
that g ♯ : T G → X is a split epi in Set.
The intuition of requiring a split epi is that every element of X can now be
decomposed into a “combination” (deﬁned by T ) of elements of G. We show two
simple results on generators, which will allow us to ﬁnd initial sets of generators
for a given T -algebra.
Lemma 3.2. The carrier of any T-algebra X is a set of generators for it.
Proof. Let χ : TX → X be the T-algebra structure on X. Then id_X satisfies id_X♯ =
χ, and χ is a split epi because it is required to satisfy χ ∘ η_X = id_X.
Lemma 3.3. Any set X is a set of generators for the free T-algebra TX.
Proof. Follows directly from the fact that η_X : X → TX satisfies η_X♯ = id_{TX}.

Once we have a set of generators G for X, we can deﬁne an equivalent free
representation of X , that is, an automaton whose state space is freely generated
from G.
1 Here, we are abusing notation and using O and A for both the objects in SetT and in the
base category Set. In particular, we use T C to also denote the free T -algebra over C.

Proposition 3.4 (Free representation of an automaton [4, Section 7, Proposition 9]). The free algebra TG forms the state space of an automaton equivalent
to X.
Proof. Let g : G → X witness G being a set of generators for X and let s : X →
TG be a right inverse of g♯. Recall that X = (X, i, δ) and define
\begin{align*}
j &= 1 \xrightarrow{i} X \xrightarrow{s} TG \\
\gamma &= G \xrightarrow{g} X \xrightarrow{\delta} O \times X^A \xrightarrow{\mathrm{id} \times s^A} O \times (TG)^A.
\end{align*}
Then (TG, j, γ♯) is an automaton. We will show that g♯ : TG → X is an automaton homomorphism. We have g♯ ∘ j = g♯ ∘ s ∘ i = i, and, writing F for the
functor O × (−)^A and χ for the T-algebra structure on X, the diagram
\begin{tikzcd}[column sep=small]
TG \arrow[r, "Tg"] \arrow[dr, "g^\sharp"'] & TX \arrow[r, "T\delta"] \arrow[d, "\chi"] & TFX \arrow[r, "TFs"] \arrow[d, "\rho"] & TFTG \arrow[r, "\rho"] & FT^2G \arrow[r, "F\mu"] \arrow[d, "FTg^\sharp"] & FTG \arrow[d, "Fg^\sharp"] \\
& X \arrow[rrrr, bend right, "\delta"'] & FTX \arrow[urr, "FTs"'] \arrow[rr, "\mathrm{id}"'] & & FTX \arrow[r, "F\chi"'] & FX
\end{tikzcd}
commutes. Here region 1 (bounded by χ, Tδ, the left ρ, id, Fχ and δ) commutes because δ is a T-algebra homomorphism, region 2 (bounded by TFs, the right ρ, FTs and the left ρ) commutes by naturality of the distributive law ρ, and region 3 (bounded by Fµ, Fg♯, FTg♯ and Fχ) commutes because g♯ is a
T-algebra homomorphism. The triangle on the left unfolds the definition of g♯,
and the remaining triangle commutes by s being right inverse to g♯. Note that
the composition in the top row of the diagram is γ♯. We conclude that g♯ is
an automaton homomorphism, which using the finality in Definition 2.3 implies
that (TG, j, γ♯) accepts the same language as X.
The state space T G of this free representation can be extremely large. Fortunately, the fact that T G is a free algebra allows for a much more succinct
version of this automaton.
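Definition 3.5 (T-succinct automaton). Given an automaton of the form
(TX, i, δ), where TX is the free T-algebra on X, the corresponding T-succinct
automaton is the triple (X, i, δ ∘ η). The language accepted by the T-succinct
automaton is the language l ∘ i accepted by (TX, i, δ):
\begin{tikzcd}
& 1 \arrow[d, "i"] & \\
X \arrow[r, "\eta"] \arrow[dr, "\delta \circ \eta"'] & TX \arrow[r, "l"] \arrow[d, "\delta"] & O^{A^*} \arrow[d, "\omega"] \\
& O \times (TX)^A \arrow[r, "\mathrm{id} \times l^A"'] & O \times (O^{A^*})^A
\end{tikzcd}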

The goal of our construction is to build a T-succinct automaton from a set
of generators that is minimal in a way that we will define now. Below we use
the following piece of notation: if U and V are sets such that
U ⊆ V, then we write ι^U_V for the inclusion map U → V.

Definition 3.6 (Minimal generators). Given a T-algebra X and a set of generators G for X witnessed by g : G → X, we say that r ∈ G is redundant if there
exists a U ∈ T(G \ {r}) satisfying (g ∘ ι^{G\{r}}_G)♯(U) = g(r); all other elements
are said to be isolated [4].² We call G a minimal set of generators for X if G
contains no redundant elements.
A minimal set of generators is not necessarily minimal in size. However,
under certain conditions this is the case. The following result was mentioned
but not proved by Arbib and Manes [4], who showed that its conditions are
satisﬁed for any ﬁnitely generated P-algebra. We note that these conditions do
not apply (in general) to any of the further examples in Section 6.
Proposition 3.7. If a T-algebra X is generated by the isolated elements I of
the set of generators X (Lemma 3.2) with their inclusion map ι^I_X and I is finite,
then there is no set of generators for X smaller than I, and every minimal set
of generators for X has the same size as I.
Proof. Let g : G → X be a set of generators for X, and assume towards a contradiction that G is smaller than I. Then there must be an i ∈ I such that there
is no v ∈ G satisfying g(v) = i. Let g′ : G → X \ {i} be pointwise equal to
g. Because g♯ is a split epi and thus surjective, there is a U ∈ TG such that
g♯(U) = i. Note that by (2),
\[ g^\sharp = (\iota^{X \setminus \{i\}}_X \circ g')^\sharp = TG \xrightarrow{T(g')} T(X \setminus \{i\}) \xrightarrow{(\iota^{X \setminus \{i\}}_X)^\sharp} X. \]
Then (id ∘ ι^{X\{i}}_X)♯(T(g′)(U)) = i, contradicting the fact that i is isolated in
the full set of generators X. Thus, G cannot be smaller than I. In fact, we see
that for every i ∈ I there is a v ∈ G satisfying g(v) = i. This yields a function
h : I → G such that g ∘ h = ι^I_X.
Suppose G is a minimal set of generators, and take any v ∈ G not in the
image of h. We will show that v is redundant in G. Since I constitutes a set of
generators for X, there exists a U ∈ TI such that (ι^I_X)♯(U) = g(v). Then
\[ g^\sharp(T(h)(U)) = (g \circ h)^\sharp(U) = (\iota^I_X)^\sharp(U) = g(v). \]
It follows that v is redundant in G, which contradicts G being minimal. Therefore, h is surjective and G has the same size as I.
4. T -minimization
In this section we describe a construction to compute a “minimal” succinct
T -automaton equivalent to a given T -automaton. This crucially relies on a procedure that ﬁnds a minimal set of generators by removing redundant elements
one by one. All that needs to be done for speciﬁc monads is determining whether
an element is redundant.
2 Arbib and Manes [4] define isolated elements only for the full set X rather than relative
to a set of generators for X. Our refinement plays an important role in finding a minimal set
of generators.

Proposition 4.1 (Generator reduction). Given a T -algebra X and a set of
generators G for X, if r ∈ G is redundant, then G \ {r} is a set of generators
for X.
Proof. Let G′ = G \ {r} and let g′ : G′ → X be the restriction of g : G → X to
G′. Since r is redundant, there is a U ∈ T(G′) such that g′♯(U) = g(r). Define
e : G → T(G′) by
\[ e(x) = \begin{cases} U & \text{if } x = r \\ \eta(x) & \text{if } x \neq r. \end{cases} \]
We will show that g′♯ ∘ e = g. Consider any x ∈ G. If x = r, then
\[ g'^\sharp(e(x)) = g'^\sharp(e(r)) = g'^\sharp(U) = g(r) = g(x). \]
If x ≠ r, then, using (1),
\[ g'^\sharp(e(x)) = g'^\sharp(\eta(x)) = g'^{\sharp\dagger}(x) = g'(x) = g(x). \]
Let χ : TX → X be the algebra structure on X and take any right inverse
s : X → TG of g♯. Then
\begin{align*}
g'^\sharp \circ e^\sharp \circ s &= g'^\sharp \circ \mu \circ Te \circ s && \text{(definition of } e^\sharp\text{)} \\
&= \chi \circ T(g'^\sharp) \circ Te \circ s && \text{(} g'^\sharp \text{ is a } T\text{-algebra homomorphism)} \\
&= \chi \circ T(g'^\sharp \circ e) \circ s && \text{(functoriality of } T\text{)} \\
&= \chi \circ Tg \circ s && \text{(} g'^\sharp \circ e = g \text{ as shown above)} \\
&= g^\sharp \circ s && \text{(definition of } g^\sharp\text{)} \\
&= \mathrm{id}_X && \text{(} s \text{ is right inverse to } g^\sharp\text{).}
\end{align*}
We thus see that e♯ ∘ s is right inverse to g′♯, which means that G′ is a set of
generators for X.
If we determine that an element is isolated, there is no need to check this
again later when the set of generators has been reduced. This is thanks to the
following result.
Proposition 4.2. If g : G → X and g′ : G′ → X are sets of generators for a T-algebra X such that G′ ⊆ G and g′ is the restriction of g to the domain G′, then
whenever an element r ∈ G′ is isolated in G, it is also isolated in G′.
Proof. We will show that redundant elements in G′ are also redundant in G.
If r ∈ G′ is redundant in G′, then there exists U ∈ T(G′ \ {r}) such that
(g′ ∘ ι^{G′\{r}}_{G′})♯(U) = g′(r). Note that g′ = g ∘ ι^{G′}_G. We have
\begin{align*}
(g \circ \iota^{G \setminus \{r\}}_G)^\sharp(T(\iota^{G' \setminus \{r\}}_{G \setminus \{r\}})(U)) &= (g \circ \iota^{G \setminus \{r\}}_G \circ \iota^{G' \setminus \{r\}}_{G \setminus \{r\}})^\sharp(U) && \text{(2)} \\
&= (g \circ \iota^{G'}_G \circ \iota^{G' \setminus \{r\}}_{G'})^\sharp(U) \\
&= (g' \circ \iota^{G' \setminus \{r\}}_{G'})^\sharp(U) \\
&= g'(r) \\
&= g(r),
\end{align*}
so r is redundant in G.

Finally, taking the observable quotient M of a T -automaton Q preserves
generators, considering that the T -automaton homomorphism m : Q → M is a
split epi in Set under the axiom of choice.
Proposition 4.3. If Q and M are T-algebras, m : Q → M is a T-algebra
homomorphism that is a split epi in Set, and g : G → Q is a set of generators for
Q, then m ∘ g : G → M is a set of generators for M.
Proof. Let a : T Q → Q be the T -algebra structure on Q and b : T M → M the
one on M . We have
(m ◦ g)♯ = b ◦ T (m ◦ g) = b ◦ T (m) ◦ T (g) = m ◦ a ◦ T g = m ◦ g ♯
using that m is a T -algebra homomorphism. It is well known that compositions
of split epis are split epis themselves, so G is a set of generators for M .
Now we are ready to deﬁne the construction that builds a T -succinct automaton accepting the same language as a T -automaton.
Construction 4.4 (T -minimization). Starting from a T -automaton (X, i, δ),
where X has a ﬁnite set of generators, we execute the following steps.
1. Take the observable quotient (M, i0 , δ0 ) of (X, i, δ).
2. Compute a minimal set of generators G of M by starting from the full set
M and applying Proposition 4.1.
3. Compute and return the corresponding T -succinct automaton as deﬁned
in Deﬁnition 3.5 via Proposition 3.4.
Generic minimization algorithms have been proposed in the literature. For
example, Adámek et al. give a general procedure to compute the observable
quotient [1], and König and Küpper provide a generic partition reﬁnement algorithm for coalgebras, with a focus on instantiations to weighted automata [14].
None of these works provide any complexity analysis. Recently, Dorsch et al. [11]
have presented a coalgebraic Paige–Tarjan algorithm and provided a complexity analysis for a class of functors in categories with image-factorization. These
restrictions match well the ones we make, and therefore their algorithm could
be applied in our ﬁrst step. Given a ﬁnite set of generators G, the loop in the
second step involves considering each element of G and checking whether it is
redundant. If so, we will remove the element from G and continue the loop.
The redundancy check is the only part for which computability needs to be
determined in each speciﬁc setting.
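Example 4.5 (Join-semilattices). We give an example of the construction in
the category JSL of complete join-semilattices. We start from a minimal P-automaton (in JSL) that has 4 states and is depicted below on the left. The
dashed blue lines indicate the JSL structure.

[Figure: on the left, the P-automaton with states ⊥, x, y, z, transitions labelled a and b, and the JSL structure drawn as dashed blue lines; on the right, the corresponding non-deterministic automaton with states x and y.]
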
Since the automaton is minimal, it is isomorphic to its observable quotient.
We start from the full set of generators {⊥, x, y, z}. Note that z is the union
of x and y, so we can eliminate it. Additionally, ⊥ is the empty union and can
be removed as well. Both x and y are isolated elements and form the unique
minimal set of generators G = {x, y} (see the remark above Proposition 3.7).
These are exactly the join-irreducibles of M . They induce by Proposition 3.4
an automaton (T G, j, γ), where γ is the same transition structure as the above
automaton, but with {x, y} substituted for z; the initial state is the singleton set
{x}. The P-succinct automaton corresponding to this minimal set of generators
(Deﬁnition 3.5) is the non-deterministic automaton shown on the right.
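
The generator reduction in this example can be replayed mechanically. The sketch below is only an illustration under our own encoding: each state is represented by the set of join-irreducibles below it, so that the join is set union. For a finite join-semilattice, an element is a join of other generators exactly when it is the join of those generators lying below it, which gives a simple redundancy check; all names are hypothetical.

```python
from functools import reduce

# Assumed encoding of the four states: z is the join of x and y.
BOTTOM, X, Y, Z = frozenset(), frozenset("x"), frozenset("y"), frozenset("xy")


def join(elements):
    """Join in this semilattice is union; the empty join is BOTTOM."""
    return reduce(frozenset.union, elements, BOTTOM)


def is_redundant(r, generators):
    """r is a join of other generators iff it is the join of those below it."""
    below = [g for g in generators if g != r and g <= r]
    return join(below) == r


def minimal_generators(elements):
    """Remove redundant generators one by one, as in Proposition 4.1."""
    generators = list(elements)
    for r in list(generators):
        if is_redundant(r, generators):
            generators.remove(r)
    return generators


G = minimal_generators([BOTTOM, X, Y, Z])
```

A single pass over the elements suffices here because, by Proposition 4.2, an element found to be isolated stays isolated after further removals.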
Note that the deﬁnition of the automaton deﬁned in Proposition 3.4 depends
on the right inverse chosen for the extension of the generator map. When the
original JSL automaton is reachable (every state is reached by some set of
words, where a set of words reaches the join of the states reached by the words it
contains), this right inverse may be chosen in such a way to recover the canonical
residual finite state automaton (RFSA), as well as the simpliﬁed canonical RFSA,
both due to Denis et al. [10]. Details are given in [23]. See [17] for conditions
under which the canonical RFSA, referred to as the jiromaton, is a state-minimal
NFA.
5. Main construction
In this section we present the main construction of the paper. Given a finite
automaton (X, i, δ) in Set, i.e., an automaton where X is ﬁnite, this construction
builds an equivalent T -succinct automaton.
The ﬁrst step is taking the reachable part R of X and converting this automaton into a T -automaton recognising the same language.
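Proposition 5.1. Let (TR, î, δ̂) be the T-automaton defined as follows:
\begin{tikzcd}
1 \arrow[r, "i"] \arrow[rr, bend left, "\hat{\imath} = \eta_R \circ i"] & R \arrow[r, "\eta_R"] \arrow[d, "\delta"'] & TR \arrow[d, "\hat{\delta} = ((\mathrm{id} \times \eta_R^A) \circ \delta)^\sharp"] \\
& O \times R^A \arrow[r, "\mathrm{id} \times \eta_R^A"'] & O \times T(R)^A
\end{tikzcd}
Then (R, i, δ) and (TR, î, δ̂) accept the same language.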
Proof. The diagram above means that ηR is a coalgebra homomorphism, and
as such it preserves language. Explicitly: x ∈ R accepts the same language as
ηR (x), which in particular holds for i(⋆) and î(⋆).
Now we can T -minimize (T R, î, δ̂) (Construction 4.4), which yields an equivalent T -automaton. Notice that, R being ﬁnite, any quotient of T R has a ﬁnite
set of generators. This is a consequence of R being a set of generators for T R
(Lemma 3.3) and of generators being preserved by quotients (Proposition 4.3).
It follows that every step of the T -minimization construction terminates.
Proposition 5.2. The T -succinct automaton defined above is at least as small
as the minimal deterministic automaton equivalent to X.
Proof. The situation is summed up in the following commutative diagram:
\begin{tikzcd}
G \arrow[r, hook] \arrow[rrr, bend right, "g"'] & R \arrow[r, "\eta"] & TR \arrow[r, "e"] & M \arrow[r, "m"] & O^{A^*}
\end{tikzcd}
Here G is the final minimal set of generators for M resulting from the construction. Commutativity follows from G being a subset of the set of generators
R.
The minimal deterministic automaton equivalent to X is obtained from R
by merging language-equivalent states. Recalling (4) and the proof of Proposition 5.1, we see that e ◦ ηR is a coalgebra homomorphism. Together with
commutativity of the above diagram, this means that the language accepted by
r ∈ G (seen as a state of R) is given by (m ◦ g)(r). Since G is a subset of R,
to show that G is at least as small as the minimal deterministic automaton, we
only have to show that diﬀerent states in G accept diﬀerent languages. That is,
we will show that m ◦ g is injective. We know that m is injective by deﬁnition;
to see that g is injective, consider r1 , r2 ∈ G such that g(r1 ) = g(r2 ). Then
g(r1) = g(r2) = g♯(η(r2)). Assuming r1 ≠ r2 leads to the contradiction that G
is not a minimal set of generators because in this case η(r2 ) ∈ T (G \ {r1 }).
Computing the determinization T R is an expensive operation that only terminates if T preserves ﬁnite sets. One could devise an optimized version of
Construction 4.4 in which the determinization is not computed completely in
order to minimize it. Instead, we could choose to work with data structures as
Böllig et al. [7] did for non-deterministic automata, and which we generalized
in recent work [23]. In these papers, partial representations of the determinized
automaton are used in an iterative process to compute the generators of the
state space of the minimal one.
6. Examples
6.1. Monads preserving finite sets
If T preserves ﬁnite sets, then there is a naive method to ﬁnd a redundant
element: assuming a ﬁnite set of generators G for a T -algebra X, the set T (G \
{r}) is also ﬁnite for any r ∈ G. Thus, we can loop over all U ∈ T (G \ {r}) and
check if the generator map g : G → X satisﬁes g ♯ (U ) = g(r).
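
A minimal sketch of this naive check for T = P (finite powerset) follows. It assumes, purely for illustration, the join-semilattice of divisors of 6 with least common multiple as join and 1 as bottom; the names `algebra` and `is_redundant_naive` are ours, and the generator map g defaults to the identity.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def algebra(subset):
    """Assumed P-algebra on the divisors of 6: the join of a subset is its lcm."""
    return reduce(lcm, subset, 1)


def is_redundant_naive(r, generators, g=lambda x: x):
    """Search all U in P(G minus {r}) for one satisfying g#(U) = g(r)."""
    rest = [x for x in generators if x != r]
    for size in range(len(rest) + 1):
        for U in combinations(rest, size):
            # g#(U) is the algebra structure applied to the images under g.
            if algebra(g(u) for u in U) == g(r):
                return True
    return False


GENERATORS = [1, 2, 3, 6]
redundant = [r for r in GENERATORS if is_redundant_naive(r, GENERATORS)]
```

In this instance 1 is redundant as the empty join and 6 is redundant as the join of 2 and 3, while 2 and 3 are isolated. The search is exponential in the number of generators, which is why it is only a baseline.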
6.1.1. Alternating automata.
We now use our construction to get small alternating ﬁnite automata (AFAs)
over a ﬁnite alphabet A. AFAs generalize both non-deterministic and universal
automata, where the latter are the dual of non-deterministic automata: a word
is accepted when all paths reading it are accepting. In an AFA, reading a symbol
leads to a DNF formula (without negation) of next states.
We use the characterization of alternating automata due to Bertrand [5].
Given a partially ordered set (P, ≤), an upset is a subset U of P such that
whenever x ∈ U and x ≤ y, then y ∈ U. Given Q ⊆ P, we write ↑Q for the
upward closure of Q, that is, the smallest upset of P containing Q. We consider
the monad TAlt that maps a set X to the set of all upsets of P(X). Its unit is
given by η_X(x) = ↑{{x}} and its multiplication by
\[ \mu_X(U) = \{V \subseteq X \mid \exists W \in U\ \forall Y \in W\ \exists Z \in Y.\ Z \subseteq V\}. \]
The sets of sets in TAlt(X) can be seen as DNF formulae over elements of X:
the outer powerset is interpreted disjunctively and the inner one conjunctively.
Accordingly, we define an algebra structure β : TAlt(2) → 2 on the output set
2 by letting β(U) = 1 if {1} ∈ U and β(U) = 0 otherwise. Recall from (3) in
Section 2 that such an algebra structure induces a distributive law.

[Figure 1: Automata for the language {a, ba, bb, baa}. (a) Deterministic automaton, with states q0, q1, q2, q3, q4. (b) Small corresponding AFA, with states q0, q1, q2.]

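
For very small sets, the monad of upsets can be enumerated explicitly. The following sketch implements the unit, the multiplication and β as given above, and checks the unit law µ ∘ η = id on the two-element set {0, 1}. It is a brute-force illustration with helper names of our choosing, and it does not scale beyond tiny inputs.

```python
from itertools import combinations


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]


def is_upset(family, base):
    """Check that family is upward closed inside P(base)."""
    return all(v in family for u in family for v in powerset(base) if u <= v)


def upsets(base):
    """Enumerate all upsets of P(base) by filtering all families of subsets."""
    return [f for f in powerset(powerset(base)) if is_upset(f, base)]


def unit(x, base):
    """eta(x): the upward closure of {{x}}, i.e. all subsets containing x."""
    return frozenset(v for v in powerset(base) if x in v)


def mult(U, base):
    """mu(U) = {V | some W in U has, for every Y in W, a Z in Y with Z <= V}."""
    return frozenset(
        v for v in powerset(base)
        if any(all(any(z <= v for z in Y) for Y in W) for W in U)
    )


def beta(U):
    """Output algebra on 2 = {0, 1}: 1 exactly when {1} is in U."""
    return 1 if frozenset({1}) in U else 0


X = [0, 1]
T_X = upsets(X)
# Unit law mu . eta = id, where eta is taken at the set T_X.
unit_law_holds = all(mult(unit(U, T_X), X) == U for U in T_X)
```

For X = {0, 1} the enumeration yields six upsets, and β maps η(1) to 1 and η(0) to 0.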
We now explicitly spell out the T-minimization algorithm that turns a DFA
(X, i, δ) into a TAlt-succinct AFA.
1. Compute the reachable states R of (X, i, δ) via a standard visit of its
graph.
2. Compute the corresponding freely-generated TAlt -automaton (TAlt R, î, δ̂),
by generating all DNF formulae TAlt R on R.
3. Compute the observable quotient (M, i0 , δ0 ) of (TAlt R, î, δ̂) via a standard minimization algorithm, such as the coalgebraic Paige–Tarjan algorithm [11].
4. Compute a minimal set of generators for M as follows. Consider the generator map idM : M → M , for which we have that id♯ is the algebra map
of M . Pick r ∈ M , and iterate over all DNF formulae ϕ over M \ {r}; if
there is ϕ which is mapped to r by the algebra map of M (i.e., id♯ ), r is
redundant and can be removed from M . Repeat until no more elements
are removed from M , which yields a minimal set of generators G.
5. Return the TAlt -succinct automaton (G, i0 , i0 ◦ η).
Note that every step of this algorithm terminates, as X is finite and the size of
TAlt R is at most 2^{2^{|R|}}.
Example 6.1. Consider the regular language over A = {a, b} given by the
ﬁnite set {a, ba, bb, baa}. The minimal DFA accepting this language is given in
Figure 1a.
According to our construction, we ﬁrst construct a TAlt -automaton with
state space freely generated from this automaton (which is already reachable).
Then we TAlt -minimize it in order to obtain a small AFA. In this case, there is
a unique minimal subset of 3 generators: G = {q0, q1, q2}. To see this, consider
the languages ⟦q⟧ accepted by states q of the deterministic automaton:
⟦q0⟧ = {a, ba, bb, baa}
⟦q1⟧ = {a, b, aa}
⟦q2⟧ = {ε}
⟦q3⟧ = {ε, a}
⟦q4⟧ = ∅.

These languages generate the states of the minimal TAlt-automaton by interpreting joins as unions and meets as intersections. We note that ⟦q4⟧ is just an
empty join and ⟦q3⟧ = (⟦q0⟧ ∩ ⟦q1⟧) ∪ ⟦q2⟧.³ These are the only redundant generators. Removing them leads to the AFA in Figure 1b. Here the black square
represents a conjunction of next states.
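The redundancy check of step 4 can be replayed on the five languages of this example. The sketch below is a simplification under stated assumptions: it starts from the DFA states as generators (not from the full state space M), it interprets formulae directly on finite languages, and it only uses non-empty conjunctions, since the empty conjunction denotes the full language A∗ and cannot be stored as a finite set. All function names are ours.

```python
from itertools import chain, combinations

# Languages of the DFA states from Example 6.1; "" stands for the empty word.
LANGS = {
    "q0": frozenset({"a", "ba", "bb", "baa"}),
    "q1": frozenset({"a", "b", "aa"}),
    "q2": frozenset({""}),
    "q3": frozenset({"", "a"}),
    "q4": frozenset(),
}


def nonempty_subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(1, len(items) + 1))


def dnf_values(langs):
    """Languages expressible as a union of non-empty intersections of langs."""
    clauses = {frozenset.intersection(*clause) for clause in nonempty_subsets(langs)}
    values = {frozenset()}  # the empty join
    for clause in clauses:
        values |= {value | clause for value in values}
    return values


def is_redundant(name, gens):
    """A generator is redundant if the other generators already express it."""
    others = [lang for other, lang in gens.items() if other != name]
    return gens[name] in dnf_values(others)


def minimal_generators(gens):
    """Remove redundant generators one at a time until none is left."""
    gens = dict(gens)
    removed = True
    while removed:
        removed = False
        for name in sorted(gens):
            if is_redundant(name, gens):
                del gens[name]
                removed = True
                break
    return sorted(gens)
```

Calling minimal_generators(LANGS) returns ["q0", "q1", "q2"]: q3 and q4 are removed, and none of the remaining three languages is a union of intersections of the others.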
6.1.2. Complete Atomic Boolean Algebras
We now consider the monad C given by the double contravariant powerset
functor, namely $CX = 2^{2^X}$. Here the outer powerset is treated disjunctively as
in the case of TAlt, and the sets provided by the inner powerset are interpreted
as valuations. Thus, elements of C(X) can be seen as full DNF formulae over
X: every conjunctive clause contains for each x ∈ X either x or the negation $\overline{x}$
of x. The unit assigns to an element x the disjunction of all full conjunctions
containing x, and the multiplication turns formulae of formulae into full DNF
formulae in the usual way. Algebras for this monad are known as complete
atomic boolean algebras (CABAs).
Using the fact that 2 is a free CABA (2 ≅ C(∅)), we obtain the following
semantics for C-succinct automata: a set of sets of states is accepting if and
only if it contains the exact set F of accepting states. This is diﬀerent from
alternating automata, where a subset of F is suﬃcient. Reading a symbol in a
C-succinct automaton works as follows. Suppose we are in a set of sets of states
S ∈ C(Q), where we read a symbol a. The resulting set of sets contains U ⊆ Q
if and only if there is a set V ∈ S such that every state in V transitions into a
set of sets containing U , and every state not in V does not transition into any
set of sets containing U .
Note that every DNF formula can be converted to a full DNF formula. This
implies that C-succinct automata can always be as small as the smallest AFAs
for a given language. With the following example we show that they can actually
be strictly smaller. The T -minimization algorithm for AFA we have given in
the previous section applies to this setting as well (including negation in DNF
formulae).
Example 6.2. Consider the regular language of words over the singleton alphabet A = {a} whose length is non-zero and even. The minimal DFA accepting this
language is shown in Figure 2a. We start the algorithm with the C-automaton
with state space freely generated from this DFA and merge the language-equivalent states. Initially, the set of generators is the set of states of the original
DFA. By noting that the language accepted by q2 is the negation of the one accepted by q1, in full DNF form $\llbracket q_2 \rrbracket = (\llbracket q_0 \rrbracket \cap \overline{\llbracket q_1 \rrbracket}) \cup (\overline{\llbracket q_0 \rrbracket} \cap \overline{\llbracket q_1 \rrbracket})$ (where for
any language U its complement is defined as $\overline{U} = A^* \setminus U$), we see that q2 is
3 Strictly speaking, we should take the upwards-closure of this disjunction (adding any
possible set of elements to each conjunction as an additional clause). We choose to use the
equivalent succinct formula both here and in the subsequent AFA construction to aid readability.

[Figure 2: Automata for the language of non-zero even words over {a}. (a) Deterministic automaton. (b) C-succinct automaton.]

redundant. The set of generators {q0, q1} is minimal and corresponds to the C-succinct automaton in Figure 2b. We depict C-succinct automata in the same
manner as AFAs, but note that their interpretation is diﬀerent. Here the transition into the black square represents the transition into the conjunction of the
negations of q0 and q1 .
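The full DNF expression for the language of q2 can be checked mechanically. The sketch below is an approximation under an assumption of ours: unary languages are truncated to words of length at most BOUND and a word aⁿ is represented by the number n. It enumerates all elements of C({q0, q1}) and keeps those that evaluate to the language of q2.

```python
from itertools import combinations

BOUND = 12  # compare unary languages on the words a^0 .. a^BOUND only
UNIVERSE = frozenset(range(BOUND + 1))  # the word a^n is represented by n

LANGS = {
    "q0": frozenset(n for n in UNIVERSE if n > 0 and n % 2 == 0),
    "q1": frozenset(n for n in UNIVERSE if n % 2 == 1),
}
Q2 = frozenset(n for n in UNIVERSE if n % 2 == 0)


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]


def full_clause(valuation):
    """Intersect L_x for x in the valuation with the complements of the rest."""
    result = UNIVERSE
    for name, lang in LANGS.items():
        result &= lang if name in valuation else UNIVERSE - lang
    return result


def evaluate(formula):
    """Union of the full conjunctive clauses in an element of C({q0, q1})."""
    return frozenset().union(*(full_clause(v) for v in formula))


VALUATIONS = powerset(LANGS)
solutions = [f for f in powerset(VALUATIONS) if evaluate(f) == Q2]
expected = frozenset({frozenset({"q0"}), frozenset()})
```

The formula expected, with one clause containing q0 positively and q1 negatively and one clause containing both negatively, is among the solutions. A second solution additionally contains the clause with both q0 and q1 positive, which denotes the empty language here.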
We now show that there is no AFA with two states accepting the same
language. Suppose such an AFA exists, and let the state space be X = {x0 , x1 }.
Since a and aaa are not in the language but aa is, one of these states must
be accepting and the other must be rejecting.4 Without loss of generality we
assume that x0 is rejecting and x1 is accepting. The empty word is not in the
language, so our initial conﬁguration has to be ↑{{x0 }}. Since a is also not in
the language, x0 will have to transition to ↑{{x0 }} as well. However, this implies
that aa is not accepted by the AFA, which contradicts the assumption that it
accepts the right language.
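The non-existence claim can also be confirmed by exhaustive search, since there are only finitely many two-state AFAs over a one-letter alphabet. The sketch below uses our own encoding of the TAlt semantics: a state's transition and the initial configuration are upsets of P({0, 1}), and a configuration accepts aⁿ exactly when the set of states accepting aⁿ belongs to it. Agreement is only tested on words up to a bound, so a non-empty result would merely be a list of candidates, whereas an empty result rules out every two-state AFA.

```python
from itertools import combinations, product

STATES = (0, 1)


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]


SUBSETS = powerset(STATES)


def is_upset(family):
    return all(t in family for s in family for t in SUBSETS if s <= t)


UPSETS = [u for u in powerset(SUBSETS) if is_upset(u)]


def membership(trans, final, init, bound):
    """Whether a^n is accepted, for n = 0 .. bound."""
    result, accepting = [], frozenset(final)  # states accepting the empty word
    for _ in range(bound + 1):
        result.append(accepting in init)
        # States accepting one more letter: their formula holds on `accepting`.
        accepting = frozenset(x for x in STATES if accepting in trans[x])
    return result


def two_state_afas(target, bound=8):
    """All two-state AFAs agreeing with `target` on a^0 .. a^bound."""
    wanted = [target(n) for n in range(bound + 1)]
    return [
        (trans, final, init)
        for trans in product(UPSETS, repeat=len(STATES))
        for final in SUBSETS
        for init in UPSETS
        if membership(trans, final, init, bound) == wanted
    ]


nonzero_even = two_state_afas(lambda n: n > 0 and n % 2 == 0)
odd = two_state_afas(lambda n: n % 2 == 1)
```

The search finds no automaton for the non-zero even words, while it does find two-state automata for the odd-length words, which serves as a sanity check of the encoding.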
Unfortunately, the fact that the transition behavior of a set of states depends
on states not in that set generally makes it diﬃcult to work with C-succinct
automata by hand.
6.1.3. Symmetry
We now consider succinct automata that exploit symmetry present in their
accepted language. Given a ﬁnite group G, consider the monad G × (−), where
the unit pairs any element with the unit of G and the multiplication applies the
multiplication of G. The algebras for G × (−) are precisely left group actions.
We assume an action on the alphabet A; if no such action is relevant, one may
consider the trivial action π2 : G × A → A. We also assume an action on the
output set O. Group actions will be denoted by a centered dot. We consider the
distributive law $\rho : G \times (O \times (-)^A) \Rightarrow O \times (G \times (-))^A$ given by
$$\rho_X(g, o, f) = (g \cdot o, \lambda a.(g, f(g^{-1} \cdot a))).$$
We explain the resulting semantics of (G × (−))-succinct automata in an example.
4 If there were no rejecting states, the only way to reject a word is by ending up in the
empty set of sets of states. However, this means that extensions of that word are rejected
as well. Similarly, if there are no accepting states one can only accept by ending up in ↑{∅},
which accepts everything.

[Figure 3: Automata outputting the first symbol to appear twice in a row. (a) Deterministic automaton. (b) Corresponding (G × (−))-succinct automaton.]

Example 6.3. Consider the group Perm({a, b}) = {e, (ab)} of permutations
over elements a and b. Here e is the identity and (ab) swaps a and b. We consider
the alphabet A = {a, b} with an action Perm(A) × A → A given by applying
the permutation to the element of A, and the output set O = A ∪ {⊥} with an
action given by
(ab) · a = b,    (ab) · b = a,    (ab) · ⊥ = ⊥.

Figure 3a shows a deterministic automaton over the alphabet A with outputs
in O. States are labeled by pairs (q, o), where q is a state label and o the output
of the state. The recognized language is the one assigning to a word over A the
ﬁrst input symbol appearing twice in a row, or ⊥ if no such symbol exists. This
deterministic automaton is in fact the minimal (Perm(A)×(−))-automaton. The
action on its state space is deﬁned by
(ab) · q0 = q0,    (ab) · q1 = q2,    (ab) · q2 = q1,    (ab) · q3 = q4,    (ab) · q4 = q3.

We note that in the set of generators given by the full state space, q1 , q2 , q3 , and
q4 are redundant. After removing q2 , only q3 and q4 are redundant. Subsequently
removing q4 leaves no redundant elements.
The final (G × (−))-succinct automaton is shown in Figure 3b. Its actual
configurations are pairs of a group element and a state. Transition labels are of
the form x/g, where x ∈ A and g ∈ Perm(A). If we are in a configuration (g, q)
and state q has an associated output o ∈ O, the actual output is g · o. On reading
a symbol x ∈ A, we find the outgoing transition whose label starts with
the symbol g⁻¹ · x. Supposing this label contains a group element g′ and leads to
a state q′, the resulting configuration is (gg′, q′). For example, consider reading
the word bb. We start in the configuration (e, q0). Reading b here simply takes
the transition corresponding to b, which brings us to ((ab), q1). Now reading the
second b, we actually read (ab)⁻¹ · b = (ab) · b = a. This brings us to ((ab), q3).
The output is then given by (ab) · a = b.
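The run just described can be simulated directly. In the sketch below the transition table is our reading of Figure 3b, derived from the deterministic automaton and the action on its states, and should be treated as an assumption. Group elements are encoded as booleans (False for e, True for (ab)), so that multiplication is exclusive or.

```python
from itertools import product

BOTTOM = None  # stands for the output ⊥


def act(g, x):
    """Action of Perm({a, b}); g is False for e and True for (ab)."""
    if g and x in ("a", "b"):
        return "b" if x == "a" else "a"
    return x


# Transitions (state, symbol) -> (group element, next state) and outputs,
# as we read them off Figure 3b; treat this table as an assumption.
DELTA = {
    ("q0", "a"): (False, "q1"), ("q0", "b"): (True, "q1"),
    ("q1", "a"): (False, "q3"), ("q1", "b"): (True, "q1"),
    ("q3", "a"): (False, "q3"), ("q3", "b"): (False, "q3"),
}
OUTPUT = {"q0": BOTTOM, "q1": BOTTOM, "q3": "a"}


def run(word):
    """Output of the succinct automaton after reading the word."""
    g, q = False, "q0"  # initial configuration (e, q0)
    for x in word:
        # Every element of this group is its own inverse, so g^-1 . x = g . x.
        g2, q = DELTA[(q, act(g, x))]
        g = g != g2  # group multiplication g g' as exclusive or
    return act(g, OUTPUT[q])


def first_double(word):
    """Reference semantics: first symbol occurring twice in a row, else ⊥."""
    for left, right in zip(word, word[1:]):
        if left == right:
            return right
    return BOTTOM


WORDS = ["".join(w) for n in range(7) for w in product("ab", repeat=n)]
agrees = all(run(w) == first_double(w) for w in WORDS)
```

With this table run("bb") returns "b", as in the text, and the simulation agrees with the reference semantics on all words of length at most six.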
In general, sets of generators in this setting correspond to subsets in which
all orbits are represented. The orbits of a set X with a left group action are the
equivalence classes of the relation that identiﬁes elements x, y ∈ X whenever
there exists g ∈ G such that g ·x = y. Minimal sets of generators contain a single
representative for each orbit. The algorithm given for AFAs in Section 6.1.1 can
be applied to this setting as well: step 4 will remove elements until only orbit
representatives are left.
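For a finite group acting on a finite set, orbit representatives can also be computed directly. A small sketch on the state space of Example 6.3; the function names and the string encoding of the group elements are ours.

```python
def orbits(elements, group, act):
    """Partition elements into orbits under the action act(g, x)."""
    remaining, result = list(elements), []
    while remaining:
        orbit = {act(g, remaining[0]) for g in group}
        result.append(orbit)
        remaining = [x for x in remaining if x not in orbit]
    return result


# State space of Figure 3a with the action of (ab) given in Example 6.3.
SWAP = {"q0": "q0", "q1": "q2", "q2": "q1", "q3": "q4", "q4": "q3"}


def act(g, q):
    return SWAP[q] if g == "(ab)" else q


classes = orbits(["q0", "q1", "q2", "q3", "q4"], ["e", "(ab)"], act)
representatives = [min(orbit) for orbit in classes]
```

The three orbits are {q0}, {q1, q2} and {q3, q4}, and picking the smallest label in each gives the generators q0, q1 and q3 of the example.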
6.2. Vector Spaces
We now exploit vector space structures. Given a ﬁeld F, consider the free
vector space monad V . It maps each set X to the set of functions X → F with
ﬁnite support (ﬁnitely many elements of X are mapped to a non-zero value). A
function f : X → Y is mapped to the function V(f) : V(X) → V(Y) given by
$$V(f)(g)(y) = \sum_{x \in X,\ f(x) = y} g(x).$$

The unit η : X → V(X) and multiplication µ : VV(X) → V(X) of the monad
are given by
$$\eta(x)(x') = \begin{cases} 1 & \text{if } x = x' \\ 0 & \text{if } x \neq x' \end{cases} \qquad\qquad \mu(f)(x) = \sum_{g \in V(X)} f(g) \cdot g(x) \in \mathbb{F}.$$
Here 0 and 1, as well as addition and multiplication, are those of the ﬁeld F.
Elements of V (X) can alternatively be written as formal sums v1 x1 + · · · + vn xn
with vi ∈ F and xi ∈ X for all i. We will use this notation in the example below.
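As an illustration, the monad structure can be sketched on formal sums with finite support. The representation below, a frozenset of (element, coefficient) pairs with zero coefficients dropped, is a choice of ours made so that formal sums can themselves be elements of formal sums; integer coefficients stand in for an arbitrary field.

```python
def normalize(pairs):
    """Add up coefficients per element and drop zeros; the result is hashable."""
    total = {}
    for x, c in pairs:
        total[x] = total.get(x, 0) + c
    return frozenset((x, c) for x, c in total.items() if c != 0)


def unit(x):
    """eta: the formal sum 1 x."""
    return normalize([(x, 1)])


def fmap(f, v):
    """V(f): coefficients of elements with the same image are added."""
    return normalize((f(x), c) for x, c in v)


def mu(vv):
    """mu: flatten a formal sum of formal sums by multiplying coefficients."""
    return normalize((x, c * d) for v, c in vv for x, d in v)


v = normalize([("x", 2), ("y", -1)])  # the formal sum 2x - y
```

On this example both unit laws hold, mu(unit(v)) == v and mu(fmap(unit, v)) == v, and mapping both x and y to a single element z yields the formal sum 1 z because the coefficients 2 and -1 add up to 1.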
Algebras for the free vector space monad are precisely vector spaces. We use
the output set O = F, and the alphabet can be any ﬁnite set A. Instantiating
(3), this leads to a pointwise distributive law $\rho : V(O \times (-)^A) \Rightarrow O \times V(-)^A$
given at a set X by
$$\rho(f) = \Bigl( \sum_{(o,g) \in O \times X^A} f(o, g) \cdot o,\ \ \lambda a.\lambda x. \sum_{(o,g) \in O \times X^A,\ g(a) = x} f(o, g) \Bigr).$$

With these deﬁnitions, the V -succinct automata are weighted automata. We
note that if F is inﬁnite, any non-trivial V -automaton will also be inﬁnite. However, we can still start from a given weighted automaton and apply a slight
modiﬁcation of Construction 4.4: minimize from the succinct representation,
use the states of the succinct representation as initial set of generators, and
ﬁnally ﬁnd a minimal set of generators. Moreover, we may add a reachability
analysis, which in this case cannot lead to a larger automaton. Thus, the resulting algorithm essentially comes down to the standard minimization algorithm
for weighted automata [19], where the process of removing redundant generators
is integrated into the minimization. If F is ﬁnite and we do want to start from
a deterministic automaton, we can consider this automaton as a weighted one
by assigning each transition a weight of 1.

Example 6.4. Consider for F = R the deterministic automaton in Figure 4a.
This is a minimal automaton in Set; the freely generated V -automaton is inﬁnite, and so is its minimization. However, that minimization has the states
of the automaton in Figure 4a as a set of generators. To gain insight into this
minimization, we compute the languages accepted by those generators (apart
from q0):
q1: ε ↦ 1, a ↦ 1, b ↦ 1, c ↦ 1
q2: ε ↦ 1, a ↦ 0, b ↦ 0, c ↦ 0
q3: ε ↦ 3, a ↦ 1, b ↦ 1, c ↦ 1
q4: ε ↦ 0, a ↦ 0, b ↦ 0, c ↦ 0
Words not displayed are mapped to 0 by any state. The language of q0 is the
only one assigning non-zero values to certain words of length two, such as aa,
and therefore q0 cannot be a redundant generator. The other generators are
redundant: writing ⟦q⟧ for the language of a state q, ⟦q4⟧ is just a zero-ary sum,
and we have
$$\llbracket q_1 \rrbracket = \llbracket q_3 \rrbracket - 2\llbracket q_2 \rrbracket \qquad \llbracket q_2 \rrbracket = \tfrac{1}{2}\llbracket q_3 \rrbracket - \tfrac{1}{2}\llbracket q_1 \rrbracket \qquad \llbracket q_3 \rrbracket = \llbracket q_1 \rrbracket + 2\llbracket q_2 \rrbracket.$$

[Figure 4: Succinctness via a weighted automaton. (a) Deterministic automaton. (b) Succinct weighted automaton.]

Once q4 is removed, all other generators are still redundant. Further removing
q3 makes q1 and q2 isolated. Therefore, V -minimization yields the weighted
automaton shown in Figure 4b. Here a transition on an input x ∈ A with
weight w ∈ F receives the label x/w, or just x if w = 1. Weights multiply along
a path, and diﬀerent possible paths add up to assign a value to a word. Reading
c from q0, for example, we move to q1 + 2q2, which has an output of 1 + 2 · 1 = 3.
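The way weights multiply along a path and add up across paths can be sketched as follows. The table is an assumption of ours: the transitions of q1 and q2 are read off the languages listed in the example, and for q0 only the c-transition described in the text is included.

```python
# Outputs and transitions, read off the languages listed in Example 6.4.
# Only the c-transition of q0 described in the text is included; the
# remaining transitions of q0 are deliberately left out of this sketch.
OUTPUT = {"q0": 0, "q1": 1, "q2": 1}
DELTA = {
    ("q0", "c"): {"q1": 1, "q2": 2},
    ("q1", "a"): {"q2": 1}, ("q1", "b"): {"q2": 1}, ("q1", "c"): {"q2": 1},
}


def step(config, symbol):
    """Push a formal sum of states through one input symbol."""
    result = {}
    for state, weight in config.items():
        for target, w in DELTA.get((state, symbol), {}).items():
            result[target] = result.get(target, 0) + weight * w
    return result


def value(word, start="q0"):
    """Weight assigned to the word when starting in the given state."""
    config = {start: 1}
    for symbol in word:
        config = step(config, symbol)
    return sum(weight * OUTPUT[state] for state, weight in config.items())
```

Here value("c") is 3, and value("ca") is 1, in line with the language of q3, which maps ε to 3 and a to 1.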
In general, the (sub)sets of generators of a vector space are its subsets that
span the whole space, and such a set of generators is minimal precisely when
it forms a basis. The weighted automaton resulting from our algorithm is the
usual minimal weighted automaton for the language. Redundant elements can
be found using standard techniques such as Gaussian elimination.
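A sketch of the redundancy test by Gaussian elimination, applied to the languages of q1 to q4 from Example 6.4 restricted to the words ε, a, b, c (all other words are mapped to 0 by these states). The state q0 is left out because the text already shows that it is not redundant. Exact rational arithmetic stands in for the field R, and all names are ours.

```python
from fractions import Fraction

# Languages of Example 6.4 as vectors over the words ε, a, b, c.
VECTORS = {
    "q1": [1, 1, 1, 1],
    "q2": [1, 0, 0, 0],
    "q3": [3, 1, 1, 1],
    "q4": [0, 0, 0, 0],
}


def rank(rows):
    """Rank of a list of vectors, by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in rows]
    done = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(done, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[done], rows[pivot] = rows[pivot], rows[done]
        for i in range(len(rows)):
            if i != done and rows[i][col] != 0:
                factor = rows[i][col] / rows[done][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[done])]
        done += 1
    return done


def is_redundant(name, gens):
    """A generator is redundant if it lies in the span of the others."""
    others = [vec for other, vec in gens.items() if other != name]
    return rank(others + [gens[name]]) == rank(others)


def basis(gens, order):
    """Try to remove generators in the given order; what remains is a basis."""
    gens = dict(gens)
    for name in order:
        if is_redundant(name, gens):
            del gens[name]
    return sorted(gens)
```

Removing in the order q4, q3, q2, q1 leaves ["q1", "q2"], as in the example. A different order may leave a different basis, for instance ["q2", "q3"], but always one with two elements.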
7. Conclusions
We have presented a construction to obtain succinct representations of deterministic ﬁnite automata as automata with side-eﬀects. This construction is
very general in that it is based on the abstract characterisation of side-eﬀects
as monads. Nonetheless, it can be easily implemented. An essential part of our
construction is the computation of a minimal set of generators for an algebra.
We have provided an algorithm for this that works for any suitable Set monad.
We have applied the construction to several non-trivial examples: alternating automata, automata with symmetries, CABA-structured automata, and weighted
automata.
Related work. This work revamps and extends results of Arbib and Manes [4],
as discussed throughout the paper. We note that most of their results are formulated in a more general category, whereas here we work speciﬁcally in Set.
The reason for this is that we focus on the procedure for ﬁnding minimal sets
of generators by removing redundant elements, which are defined using set subtraction (Definition 3.6). This limitation is already present in the work of Arbib and Manes, who spend little time on the subject and only study the non-deterministic case in detail. Our main contribution, the general procedure for
ﬁnding a minimal set of generators, is not present in their work. It generalizes
several techniques to obtain compact automaton representations of languages,
some of them presented in the context of learning algorithms [10, 7, 3]. Preliminary results on generalizing succinct automaton constructions within a learning
algorithm can be found in [23].
In [17], Myers et al. present a coalgebraic construction of canonical non-deterministic automata. Their specific examples are the átomaton [9], obtained
from the atoms of the boolean algebra generated by the residual languages (the
languages accepted by the states of the minimal DFA); the canonical RFSA;
the minimal xor automaton [24], actually a weighted automaton over the field with
two elements rather than a non-deterministic one; and what they call the distromaton, obtained from the atoms of the distributive lattice generated by the
residual languages. They further provide speciﬁc algorithms for obtaining some
of their example succinct automata.
The underlying idea in the work of Myers et al. for ﬁnding succinct representations of algebras is similar to ours, and the deterministic structured automata
they start from are equivalent: in their paper the deterministic automata live
in a locally ﬁnite variety, which translates to the category of algebras for a
monad that preserves ﬁnite sets (such as those in Section 6.1). They also deﬁne the succinct automaton using a minimal set of generators for the algebra,
but instead of our algorithmic approach of getting to this set by removing redundant generators, they use a dual equivalence between ﬁnite algebras and a
suitable modiﬁcation of the category of sets and relations between them. This
seems to restrict their work to non-deterministic automata, although there may
be an easy generalization: the equivalence would be with a modiﬁcation of a
Kleisli category. A major diﬀerence with our work is that they have no general
algorithm to construct the succinct automata; as mentioned, speciﬁc ones are
provided for their examples. In fact, they provide no guidelines on how to ﬁnd
a suitable equivalence for a given variety. On the other hand, their equivalences
guarantee uniqueness up to isomorphism of the succinct automata, which is a
desirable property for many applications.
The restriction in the work of Myers et al. to locally ﬁnite varieties means
that our example of weighted automata over an inﬁnite ﬁeld (Section 6.2) cannot
be captured in their work. Conversely, since both the átomaton and the distromaton are NFAs obtained from categories of algebras with
more structure than JSLs, these examples are not covered by our work. Their
other examples, however, the canonical RFSA and the minimal xor automaton,
are obtained using instances of our method as well. The fact that the problem
of ﬁnding in general a suitable equivalence is open means it is not trivial to
determine whether our approach can be seen as a special case of a generalized
version of theirs when we restrict to monads that preserve ﬁnite sets.
Future work. The main question that remains is under which conditions the
notion of a minimal set of generators actually describes a size-minimal set of
generators. Proposition 3.7 provides a partial answer to this question, but its
conditions fail to apply to the majority of our examples, even though in some of
these cases minimal does mean size-minimal. A related question is whether we
can ﬁnd heuristics to increase the state space of a T -automaton in such a way
that the number of generators decreases. The reason the canonical RFSAs of
Denis et al. [10] are not always state-minimal NFAs is that the states of
these NFAs, seen as singletons in the determinized automaton, in general are
not reachable. Hence, removing unreachable states from a T -automaton may
increase the size of minimal sets of generators, which is why Construction 4.4
does not include a reachability analysis. Although ﬁnding state-minimal NFAs
is PSPACE-complete, a moderate gain might still be possible.