\project thesis \startcomponent ordered-nominal-sets \startchapter [title={Fast Computations on Ordered Nominal Sets}, reference=chap:ordered-nominal-sets] \midaligned{~ \author[width=0.5\hsize]{David Venhoek \\ Radboud University} \author[width=0.5\hsize]{Joshua Moerman \\ Radboud University} } \midaligned{~ \author[width=0.5\hsize]{Jurriaan Rot \\ Radboud University} } \startabstract We show how to compute efficiently with nominal sets over the total order symmetry by developing a direct representation of such nominal sets and basic constructions thereon. In contrast to previous approaches, we work directly at the level of orbits, which allows for an accurate complexity analysis. The approach is implemented as the library \ONS{}~(Ordered Nominal Sets). Our main motivation is nominal automata, which are models for recognising languages over infinite alphabets. We evaluate \ONS{} in two applications: minimisation of automata and active automata learning. In both cases, \ONS{} is competitive compared to existing implementations and outperforms them for certain classes of inputs. \stopabstract \basedon[VenhoekMR18] \page \noindent Automata over infinite alphabets are natural models for programs with unbounded data domains. Such automata, often formalised as \emph{register automata}, are applied in modelling and analysis of communication protocols, hardware, and software systems (see \citenp[BojanczykKL14, DAntoniV17, GrigoreT16, KaminskiF94, MontanariP97, Segoufin06] and references therein). Typical infinite alphabets include sequence numbers, timestamps, and identifiers. This means one can model data flow in such automata beside the basic control flow provided by ordinary automata. Recently, it has been shown in a series of papers that such models are amenable to learning \citep[AartsFKV15, BolligHLM13, CasselHJS16, DrewsD17, MoermanS0KS17, Vaandrager17] with the verification of (closed source) TCP implementations by \citet[Fiterau-Brostean16] as a prominent example. A foundational approach to infinite alphabets is provided by the notion of \emph{nominal set}, originally introduced in computer science as an elegant formalism for name binding \citep[GabbayP02, Pitts16]. Nominal sets have been used in a variety of applications in semantics, computation, and concurrency theory (see \citenp[Pitts13] for an overview). \citet[BojanczykKL14] introduce \emph{nominal automata}, which allow one to model languages over infinite alphabets with different symmetries. Their results are parametric in the structure of the data values. Important examples of data domains are ordered data values (e.g., timestamps) and data values that can only be compared for equality (e.g., identifiers). In both data domains, nominal automata and register automata are equally expressive. Important for applications of nominal sets and automata are implementations. A couple of tools exist to compute with nominal sets. Notably, \NLambda{} \citep[KlinS16] and \LOIS{} \citep[KopczynskiT16, KopczynskiT17] provide a general purpose programming language to manipulate infinite sets. \footnote{Other implementations of nominal techniques that are less directly related to our setting (Mihda, Fresh OCaml, and Nominal Isabelle) are discussed in \in{Section}[sec:related].} Both tools are based on SMT solvers and use logical formulas to represent the infinite sets. These implementations are very flexible, and the SMT solver does most of the heavy lifting, which makes the implementations themselves relatively straightforward. Unfortunately, this comes at a cost as SMT solving is in general {\sc Pspace}-hard. Since the formulas used to describe sets tend to grow as more calculations are done, running times can become unpredictable. In this chapter, we use a direct representation based on symmetries and orbits, to represent nominal sets. We focus on the \emph{total order symmetry}, where data values are rational numbers and can be compared for their order. Nominal automata over the total order symmetry are more expressive than automata over the equality symmetry (i.e., traditional register automata of \citenp[KaminskiF94]). A key insight is that the representation of nominal sets from \citet[BojanczykKL14] becomes rather simple in the total order symmetry; each orbit is presented solely by a natural number, intuitively representing the number of variables or registers. \noindent Our main contributions include the following. \startitemize[after] \item We develop the \emph{representation theory} of nominal sets over the total order symmetry. We give concrete representations of nominal sets, their products, and equivariant maps. \item We provide \emph{time complexity bounds} for operations on nominal sets such as intersections and membership. Using those results we give the time complexity of Moore's minimisation algorithm (generalised to nominal automata) and prove that it is polynomial in the number of orbits. \item Using the representation theory, we are able to \emph{implement nominal sets in a C++ library} \ONS{}. The library includes all the results from the representation theory (sets, products, and maps). \item We \emph{evaluate the performance} of \ONS{} and compare it to \NLambda{} and \LOIS{}, using two algorithms on nominal automata: minimisation \citep[BojanczykL12] and automata learning \citep[MoermanS0KS17]. We use randomly generated automata as well as concrete, logically structured models such as FIFO queues. For random automata, our methods are drastically faster than the other tools. On the other hand, \LOIS{} and \NLambda{} are faster in minimising the structured automata as they exploit their logical structure. In automata learning, the logical structure is not available a-priori, and \ONS{} is faster in most cases. \stopitemize The structure of this chapter is as follows. \in{Section}[sec:nomsets] contains background on nominal sets and their representation. \in{Section}[sec:total-order] describes the concrete representation of nominal sets, equivariant maps and products in the total order symmetry. \in{Section}[sec:implementation-ons] describes the implementation \ONS{} with complexity results, and \in{Section}[sec:appl] the evaluation of \ONS{} on algorithms for nominal automata. Related work is discussed in \in{Section}[sec:related], and future work in \in{Section}[sec:fw]. \startsection [title={Nominal sets}, reference=sec:nomsets] Nominal sets are infinite sets that carry certain symmetries, allowing a finite representation in many interesting cases. We recall their formalisation in terms of group actions, following \citet[BojanczykKL14, Pitts13], to which we refer for an extensive introduction. \startsubsection [title={Group actions}] Let $G$ be a group and $X$ be a set. A \emph{(left) $G$-action} is a function ${\cdot} \colon G \times X \to X$ satisfying $1 \cdot x = x$ and $(h g) \cdot x = h \cdot (g \cdot x)$ for all $x \in X$ and $g, h \in G$. A set $X$ with a $G$-action is called a \emph{$G$-set} and we often write $g x$ instead of $g \cdot x$. The \emph{orbit} of an element $x \in X$ is the set $\{g x \mid g \in G\}$. A $G$-set is always a disjoint union of its orbits (in other words, the orbits partition the set). We say that $X$ is \emph{orbit-finite} if it has finitely many orbits, and we denote the number of orbits by $N(X)$. A map $f \colon X \to Y$ between $G$-sets is called \emph{equivariant} if it preserves the group action, i.e., for all $x \in X$ and $g \in G$ we have $g \cdot f(x) = f(g \cdot x)$. If an equivariant map $f$ is bijective, then $f$ is an \emph{isomorphism} and we write $X \cong Y$. A subset $Y \subseteq X$ is equivariant if the corresponding inclusion map is equivariant. The \emph{product} of two $G$-sets $X$ and $Y$ is given by the Cartesian product $X \times Y$ with the point-wise group action on it, i.e., $g (x, y) = (g x, g y)$. Union and intersection of $X$ and $Y$ are well-defined if the two actions agree on their common elements. \stopsubsection \startsubsection [title={Nominal sets}] A \emph{data symmetry} is a pair $(\mathcal{D}, G)$ where $\mathcal{D}$ is a set and $G$ is a subgroup of $\Sym(\mathcal{D})$, the group of bijections on $\mathcal{D}$. Note that the group $G$ naturally acts on $\mathcal{D}$ by defining $g x = g(x)$. In the most studied instance, called the \emph{equality symmetry}, $\mathcal{D}$ is a countably infinite set and $G = \Sym(\mathcal{D})$. In this chapter, we focus on the \emph{total order symmetry} given by $\mathcal{D} = \Q$ and $G = \{\pi \mid \pi \in \Sym(\Q), \pi \text{ is monotone}\}$. Let $(\mathcal{D}, G)$ be a data symmetry and $X$ be a $G$-set. A set of data values $S\subseteq \mathcal{D}$ is called a \emph{support} of an element $x \in X$ if for all $g\in G$ with $\forall s \in S: g s = s$ we have $g x = x$. A $G$-set $X$ is called \emph{nominal} if every element $x\in X$ has a finite support. \startexample We list several examples for the total order symmetry. The set $\Q^2$ is nominal as each element $(q_1, q_2) \in \Q^2$ has the finite set $\{q_1, q_2\}$ as its support. The set has the following three orbits: \startformulas \startformula \{(q_1, q_2) \mid q_1 < q_2\} \stopformula \startformula \{(q_1, q_2) \mid q_1 = q_2\} \stopformula \startformula \{(q_1, q_2) \mid q_1 > q_2\}. \stopformula \stopformulas For a set $X$, the set of all subsets of size $n \in \N$ is denoted by $\mathcal{P}_n(X) = \{Y \subseteq X \mid \#Y = n\}$. The set $\mathcal{P}_n(\Q)$ is a single-orbit nominal set for each $n$, with the action defined by direct image: $g Y = \{ g y \mid y \in Y \}$. The group of monotone bijections also acts by direct image on the full power set $\mathcal{P}(\Q)$, but this is \emph{not} a nominal set. For instance, the set $\mathbb{Z} \in \mathcal{P}(\Q)$ of integers has no finite support. \stopexample If $S \subseteq \mathcal{D}$ is a support of an element $x \in X$, then any set $S' \subseteq \mathcal{D}$ such that $S \subseteq S'$ is also a support of $x$. A set $S \subset \mathcal{D}$ is a \emph{least finite support} of $x \in X$ if it is a finite support of $x$ and $S \subseteq S'$ for any finite support $S'$ of $x$. The existence of least finite supports is crucial for representing orbits. Unfortunately, even when elements have a finite support, in general they do not always have a least finite support. A data symmetry $(\mathcal{D}, G)$ is said to \emph{admit least supports} if every element of every nominal set has a least finite support. Both the equality and the total order symmetry admit least supports. (\citenp[BojanczykKL14] give additional (counter)examples of data symmetries admitting least supports.) Having least finite supports is useful for a finite representation. Henceforth, we will write \emph{least support} to mean least finite support. Given a nominal set $X$, the size of the least support of an element $x \in X$ is denoted by $\dim(x)$, the \emph{dimension} of $x$. We note that all elements in the orbit of $x$ have the same dimension. For an orbit-finite nominal set $X$, we define $\dim(X) = \max \{ \dim(x) \mid x \in X \}$. For a single-orbit set $O$, observe that $\dim(O) = \dim(x)$ where $x$ is any element $x \in O$. \stopsubsection \startsubsection [title={Representing nominal orbits}, reference=subsec:nomorbit] We represent nominal sets as collections of single orbits. The finite representation of single orbits is based on the theory of \citet[BojanczykKL14], which uses the technical notions of \emph{restriction} and \emph{extension}. We only briefly report their definitions here. However, the reader can safely move to the concrete representation theory in \in{Section}[sec:total-order] with only a superficial understanding of \in{Theorem}[representation_tuple] below. The \emph{restriction} of an element $\pi \in G$ to a subset $C \subseteq \mathcal{D}$, written as $\pi|_C$, is the restriction of the function $\pi \colon \mathcal{D} \to \mathcal{D}$ to the domain $C$. The restriction of a group $G$ to a subset $C \subseteq \mathcal{D}$ is defined as $G|_C = \{\pi|_C \mid \pi \in G,\, \pi C = C\}$. The \emph{extension} of a subgroup $S \le G|_C$ is defined as $\ext_G(S) = \{\pi \in G \mid \pi|_C \in S\}$. For $C \subseteq \mathcal{D}$ and $S\le G|_C$, define $[C,S]^{ec} = \{\{g s \mid s \in \ext_G(S)\}\mid g\in G\}$, i.e., the set of right cosets of $\ext_G(S)$ in $G$. Then $[C,S]^{ec}$ is a single-orbit nominal set. Using the above, we can formulate the representation theory from \citet[BojanczykKL14]. This gives a finite description for all single-orbit nominal sets $X$, namely a finite set $C$ together with some of its symmetries. \starttheorem[reference=representation_tuple] Let $X$ be a single-orbit nominal set for a data symmetry $(\mathcal{D}, G)$ that admits least supports and let $C \subseteq \mathcal{D}$ be the least support of some element $x \in X$. Then there exists a subgroup $S \le G|_C$ such that $X \cong [C,S]^{ec}$. \stoptheorem The proof by \citet[BojanczykKL14] uses a bit of category theory: it establishes an equivalence of categories between single-orbit sets and the pairs $(C, S)$. We will not use the language of category theory much in order to keep the chapter self-contained. \stopsubsection \stopsection \startsection [title={Representation in the total order symmetry}, reference=sec:total-order] This section develops a concrete representation of nominal sets over the total order symmetry, as well as their equivariant maps and products. It is based on the abstract representation theory from \in{Section}[subsec:nomorbit]. From now on, by \emph{nominal set} we always refer to a nominal set over the total order symmetry. Hence, our data domain is $\Q$ and we take $G$ to be the group of monotone bijections. \startsubsection [title={Orbits and nominal sets}, reference=sec:orbits-and-nominal-set] From the representation in \in{Section}[subsec:nomorbit], we find that any single-orbit set $X$ can be represented as a tuple $(C,S)$. Our first observation is that the finite group $S$ of \quote{local symmetries} in this representation is always trivial, i.e., $S = I$, where $I = \{1\}$ is the trivial group. This follows from the following lemma and $S \le G|_C$. \startlemma[reference=group_trivial] For every finite subset $C \subset \Q$, we have $G|_C = I$. \stoplemma Immediately, we see that $(C, S) = (C, I)$, and hence that the orbit is fully represented by the set $C$. A further consequence of \in{Lemma}[group_trivial] is that each \emph{element} of an orbit can be uniquely identified by its least support. This leads us to the following characterisation of $[C,I]^{ec}$. \startlemma[reference=orbitrep] Given a finite subset $C \subset \Q$, we have $[C,I]^{ec} \cong \mathcal{P}_{\#C}(\Q)$. \stoplemma By \in{Theorem}[representation_tuple] and the above lemmas, we can represent an orbit by a single integer $n$, the size of the least support of its elements. This naturally extends to (orbit-finite) nominal sets with multiple orbits by using a multiset of natural numbers, representing the size of the least support of each of the orbits. These multisets are formalised here as functions $f \colon \N \to \N$. \startdefinition[reference=def:present-nom] Given a function $f \colon \N \to \N$, we define a nominal set $[f]^{o}$ by \startformula \[f\]^{o} = \bigcup_{\startsubstack n \in \N \NR 1 \le i \le f(n) \stopsubstack} \{i\} \times \mathcal{P}_n(\Q). \stopformula \stopdefinition \startproposition[reference=thm:pres-nom] For every orbit-finite nominal set $X$, there is a function $f \colon \N \to \N$ such that $X \cong [f]^{o}$ and the set $\{n \mid f(n) \neq 0\}$ is finite. Furthermore, the mapping between $X$ and $f$ is one-to-one (up to isomorphism of nominal sets) when restricting to $f \colon \N \to \N$ for which the set $\{n\mid f(n) \ne 0\}$ is finite. \stopproposition The presentation in terms of a function $f \colon \N \to \N$ enforces that there are only finitely many orbits of any given dimension. The first part of the above proposition generalises to arbitrary nominal sets by replacing the codomain of $f$ by the class of all sets and adapting \in{Definition}[def:present-nom] accordingly. However, the resulting correspondence will no longer be one-to-one. As a brief example, let us consider the set $\Q \times \Q$. The elements $(a, b)$ split in three orbits, one for $a < b$, one for $a = b$ and one for $a > b$. These have dimension 2, 1 and 2 respectively, so the set $\Q \times \Q$ is represented by the multiset $\{1, 2, 2\}$. \stopsubsection \startsubsection [title={Equivariant maps}, reference=sec:eq-maps] We show how to represent equivariant maps, using two basic properties. Let $f \colon X \to Y$ be an equivariant map. The first property is that the direct image of an orbit (in $X$) is again an orbit (in $Y$), that is to say, $f$ is defined \quote{orbit-wise}. Second, equivariant maps cannot introduce new elements in the support (but they can drop them). More precisely: \startlemma[reference=lem:eqiorbit] Let $f \colon X \to Y$ be an equivariant map, and $O\subseteq X$ a single orbit. The direct image $f(O) = \{f(x) \mid x \in O\}$ is a single-orbit nominal set. \stoplemma \startlemma[reference=eqimap_contained] Let $f \colon X \to Y$ be an equivariant map between two nominal sets $X$ and $Y$. Let $x\in X$ and let $C$ be a support of $x$. Then $C$ supports $f(x)$. \stoplemma Hence, equivariant maps are fully determined by associating two pieces of information for each orbit in the domain: the orbit on which it is mapped and a string denoting which elements of the least support of the input are preserved. These ingredients are formalised in the first part of the following definition. The second part describes how these ingredients define an equivariant function. \in{Proposition}[eqimap_rep] then states that every equivariant function can be described in this way. \startdefinition[reference=def:eqmap] Let $H = \{(I_1, F_1, O_1), \ldots, (I_n, F_n, O_n)\}$ be a finite set of tuples where the $I_i$'s are disjoint single-orbit nominal sets, the $O_i$'s are single-orbit nominal sets with $\dim(O_i) \leq \dim(I_i)$, and the $F_i$'s are bit strings of length $\dim(I_i)$ with exactly $\dim(O_i)$ ones. Given a set $H$ as above, we define $f_H \colon \bigcup I_i \to \bigcup O_i$ as the unique equivariant function such that, given $x \in I_i$ with least support $C$, $f_H(x)$ is the unique element of $O_i$ with support $\{C(j) \mid F_i(j) = 1\}$, where $F_i(j)$ is the $j$-th bit of $F_i$ and $C(j)$ is the $j$-th smallest element of $C$. \stopdefinition \startproposition[reference=eqimap_rep] For every equivariant map $f \colon X \to Y$ between orbit-finite nominal sets $X$ and $Y$ there is a set $H$ as in \in{Definition}[def:eqmap] such that $f = f_H$. \stopproposition Consider the example function $\min \colon \mathcal{P}_3(\Q) \to \Q$ which returns the smallest element of a 3-element set. Note that both $\mathcal{P}_3(\Q)$ and $\Q$ are single orbits. Since for the orbit $\mathcal{P}_3(\Q)$ we only keep the smallest element of the support, we can thus represent the function $\min$ with $H = \{(\mathcal{P}_3(\Q), 100, \Q)\}$. \stopsubsection \startsubsection [title={Products}, reference=subsec:products] The product $X \times Y$ of two nominal sets is again a nominal set and hence, it can be represented itself in terms of the dimension of each of its orbits as shown in \in{Section}[sec:orbits-and-nominal-set]. However, this approach has some disadvantages. \startexample[reference=ex:product] We start by showing that the orbit structure of products can be non-trivial. Consider the product of $X = \Q$ and the set $Y = \{(a,b) \in \Q^2 \mid a < b\}$. This product consists of \emph{five} orbits, more than one might naively expect from the fact that both sets are single-orbit: \startformula\startalign[width=broad] \NC \{(a,(b,c)) \mid a,b,c \in \Q, a < b < c\}, \NC \quad \{(a,(a,b)) \mid a,b \in \Q, a < b\}, \NR \NC \{(b,(a,c)) \mid a,b,c \in \Q, a < b < c\}, \NC \quad \{(b,(a,b)) \mid a,b \in \Q, a < b\}, \NR \NC \{(c,(a,b)) \mid a,b,c \in \Q, a < b < c\}. \NC \NR \stopalign\stopformula \stopexample We find that this product is represented by the multiset $\{2,2,3,3,3\}$. Unfortunately, this is not sufficient to accurately describe the product as it abstracts away from the relation between its elements with those in $X$ and $Y$. In particular, it is not possible to reconstruct the projection maps from such a representation. The essence of our representation of products is that each orbit $O$ in the product $X \times Y$ is described entirely by the dimension of $O$ together with the two (equivariant) projections $\pi_1 \colon O \to X$ and $\pi_2 \colon O \to Y$. This combination of the orbit and the two projection maps can already be represented using \in{Propositions}[thm:pres-nom] and \in{}[eqimap_rep]. However, as we will see, a combined representation for this has several advantages. For discussing such a representation, let us first introduce what it means for tuples of a set and two functions to be isomorphic: \startdefinition Given nominal sets $X, Y, Z_1$ and $Z_2$, and equivariant functions $l_1 \colon Z_1 \to X$, $r_1 \colon Z_1 \to Y$, $l_2 \colon Z_2 \to X$ and $r_2 \colon Z_2 \to Y$, we define $(Z_1, l_1, r_1) \cong (Z_2, l_2, r_2)$ if there exists an isomorphism $h \colon Z_1 \to Z_2$ such that $l_1 = l_2 \circ h$ and $r_1 = r_2 \circ h$. \stopdefinition Our goal is to have a representation that, for each orbit $O$, produces a tuple $(A, f_1, f_2)$ isomorphic to the tuple $(O, \pi_1, \pi_2)$. The next lemma gives a characterisation that can be used to simplify such a representation. \startlemma[reference=lm:supp-product] Let $X$ and $Y$ be nominal sets and $(x, y) \in X \times Y$. If $C$, $C_x$, and $C_y$ are the least supports of $(x, y)$, $x$, and $y$ respectively, then $C = C_x \cup C_y$. \stoplemma With \in{Proposition}[eqimap_rep] we represent the maps $\pi_1$ and $\pi_2$ by tuples $(O, F_1, O_1)$ and $(O, F_2, O_2)$ respectively. Using \in{Lemma}[lm:supp-product] and the definitions of $F_1$ and $F_2$, we see that at least one of $F_1(i)$ and $F_2(i)$ equals $1$ for each $i$. We can thus combine the strings $F_1$ and $F_2$ into a single string $P \in \{L, R, B\}^{*}$ as follows. We set $P(i) = L$ when only $F_1(i)$ is $1$, $P(i) = R$ when only $F_2(i)$ is $1$, and $P(i) = B$ when both are $1$. The string $P$ fully describes the strings $F_1$ and $F_2$. This process for constructing the string $P$ gives it two useful properties. The number of $L$s and $B$s in the string gives the size dimension of $O_1$. Similarly, the number of $R$s and $B$s in the string gives the dimension of $O_2$. We will call strings with that property \emph{valid}. In conclusion, to describe a single orbit of the product $X \times Y$, a valid string $P$ together with the images of $\pi_1$ and $\pi_2$ is sufficient. \startdefinition[reference=def:orbstringdef] Let $P \in \{L,R,B\}^{*}$, and $O_1 \subseteq X$, $O_2 \subseteq Y$ be single-orbit sets. Given a tuple $(P, O_1, O_2)$, where the string $P$ is valid, define \startformula \[(P,O_1,O_2)\]^t = (\mathcal{P}_{|P|}(\Q), f_{H_1}, f_{H_2}), \stopformula where $H_i = \{(\mathcal{P}_{|P|}(\Q), F_i, O_i)\}$ and the string $F_1$ is defined as the string $P$ with $L$s and $B$s replaced by $1$s and $R$s by $0$s. The string $F_2$ is similarly defined with the roles of $L$ and $R$ swapped. \stopdefinition \startproposition[reference=orbstring] There exists a one-to-one correspondence between the orbits $O \subseteq X \times Y$, and tuples $(P, O_1, O_2)$ satisfying $O_1 \subseteq X$, $O_2 \subseteq Y$, and where $P$ is a valid string, such that $[(P,O_1,O_2)]^{t} \cong (O, \pi_1|_O, \pi_2|_O)$. \stopproposition From the above proposition it follows that we can generate the product $X \times Y$ simply by enumerating all valid strings $P$ for all pairs of orbits $(O_1, O_2)$ of $X$ and $Y$. Given this, we can calculate the multiset representation of a product from the multiset representations of both factors. \starttheorem[reference=thm:prod-combinatorics] For $X \cong [f]^{o}$ and $Y \cong [g]^{o}$ we have $X \times Y \cong [h]^{o}$, where \startformula h(n) = \sum_{\startsubstack 0 \le i, j \le n \NR i+j \ge n \stopsubstack} f(i) g(j) {{n}\choose{j}}{{j}\choose{n-i}}. \stopformula \stoptheorem \startexample To illustrate some aspects of the above representation, let us use it to calculate the product of \in{Example}[ex:product]. First, we observe that both $\Q$ and $S = \{(a,b) \in \Q^2 \mid a < b\}$ consist of a single orbit. Hence any orbit of the product corresponds to a triple $(P, \Q, S)$, where the string $P$ satisfies $|P|_L+|P|_B = \dim(\Q) = 1$ and $|P|_R+|P|_B = \dim(S) = 2$. We can now find the orbits of the product $\Q \times S$ by enumerating all strings satisfying these equations. This yields \startitemize \item LRR, corresponding to the orbit $\{(a,(b,c)) \mid a,b,c \in \Q, a < b < c\}$, \item RLR, corresponding to the orbit $\{(b,(a,c)) \mid a,b,c \in \Q, a < b < c\}$, \item RRL, corresponding to the orbit $\{(c,(a,b)) \mid a,b,c \in \Q, a < b < c\}$, \item RB, corresponding to the orbit $\{(b,(a,b)) \mid a,b \in \Q, a < b\}$, and \item BR, corresponding to the orbit $\{(a,(a,b)) \mid a,b \in \Q, a < b\}$. \stopitemize Each product string fully describes the corresponding orbit. To illustrate, consider the string BR. The corresponding bit strings for the projection functions are $F_1 = 10$ and $F_2 = 11$. From the lengths of the string we conclude that the dimension of the orbit is $2$. The string $F_1$ further tells us that the left element of the tuple consists only of the smallest element of the support. The string $F_2$ indicates that the right element of the tuple is constructed from both elements of the support. Combining this, we find that the orbit is $\{(a,(a,b)) \mid a,b \in \Q, a < b\}$. \stopexample \stopsubsection \startsubsection [title={Summary}] We summarise our concrete representation in the following table. \in{Propositions}[thm:pres-nom], \in{}[eqimap_rep] and \in{}[orbstring] correspond to the three rows in the table. \placetable[here][tab:overview]{Overview of representation.} \starttabulate[|l|p|] \NC \emph{Object} \NC \emph{Representation} \NC\NR \HL %---------------------------------------- \NC Single orbit $O$ \NC Natural number $n = \dim(O)$ \NC\NR \NC Nominal set $X = \bigcup_i O_i$ \NC Multiset of these numbers \NC\NR \HL %---------------------------------------- \NC Map from single orbit $f \colon O \to Y$ \NC The orbit $f(O)$ and a bit string $F$ \NC\NR \NC Equivariant map $f \colon X \to Y$ \NC Set of tuples $(O, F, f(O))$, one for each orbit \NC\NR \HL %---------------------------------------- \NC Orbit in a product $O \subseteq X \times Y$ \NC The corresponding orbits of $X$ and $Y$, and a string $P$ relating their supports \NC\NR \NC Product $X \times Y$ \NC Set of tuples $(P, O_X, O_Y)$, one for each orbit \NC\NR \stoptabulate Notice that in the case of maps and products, the orbits are inductively represented using the concrete representation. As a base case we can represent single orbits by their dimension. \stopsubsection \stopsection \startsection [title={Implementation and Complexity of ONS}, reference=sec:implementation-ons] The ideas outlined above have been implemented in a \cpp{} library, \ONS{}, and a \haskell{} library, \ONShs{}. \footnote{\ONS{} can be found at \citeurl[ons-code] and \ONShs{} can be found at \citeurl[ons-hs-code].} We focus here on the \cpp{} library only, as the \haskell{} one is very similar. The library can represent orbit-finite nominal sets and their products, (disjoint) unions, and maps. A full description of the possibilities is given in the documentation included with \ONS{}. As an example, the following program computes the product from \in{Example}[ex:product]. Initially, the program creates the nominal set $A$, containing the entirety of $\Q$. Then it creates a nominal set $B$, such that it consists of the orbit containing the element $(1,2) \in \Q \times \Q$. For this, the library determines to which orbit of the product $\Q \times \Q$ the element $(1,2)$ belongs, and then stores a description of the orbit as described in \in{Section}[sec:total-order]. Note that this means that it internally never needs to store the element used to create the orbit. The function \kw{nomset_product} then uses the enumeration of product strings mentioned in \in{Section}[subsec:products] to calculate the product of $A$ and $B$. Finally, it prints a representative element for each of the orbits in the product. These elements are constructed based on the description of the orbits stored, filled in to make their support equal to sets of the form $\{1, 2, \ldots, n\}$. \starttyping nomset A = nomset_rationals(); nomset> B({rational(1),rational(2)}); auto AtimesB = nomset_product(A, B); // compute the product for (auto orbit : AtimesB) cout << orbit.getElement() << " "; \stoptyping \noindent Running this gives the following output (where {\tt /1} signifies the denominator): \starttyping (1/1,(2/1,3/1)) (1/1,(1/1,2/1)) (2/1,(1/1,3/1)) (2/1,(1/1,2/1)) (3/1,(1/1,2/1)) \stoptyping Internally, \kw{orbit} is implemented following the theory presented in \in{Section}[sec:total-order], storing the dimension of the orbit it represents. It also contains sufficient information to reconstruct elements given their least support, such as the product string for orbits resulting from a product. The class \kw{nomset} then uses a standard set data structure to store the collection of orbits contained in the nominal set it represents. In a similar way, \kw{eqimap} stores equivariant maps by associating each orbit in the domain with the image orbit and the string representing which of the least support to keep. This is stored using a map data structure. For both nominal sets and equivariant maps, the underlying data structure is currently implemented using trees. \startsubsection [title={Complexity of operations}] Using the concrete representation of nominal sets, we can determine the complexity of common operations. To simplify such an analysis, we will make the following assumptions: \startitemize \item The comparison of two orbits takes $O(1)$. \item Constructing an orbit from an element takes $O(1)$. \item Checking whether an element is in an orbit takes $O(1)$. \stopitemize These assumptions are justified as each of these operations takes time proportional to the size of the representation of an individual orbit, which in practice is small and approximately constant. For instance, the orbit $\mathcal{P}_n(\Q)$ is represented by just the integer $n$ and its type. \starttheorem[reference=thmcomplexity] If nominal sets are implemented with a tree-based set structure (as in \ONS{}), the complexity of the following set operations is as follows. Recall that $\Nsize(X)$ denotes the number of orbits of $X$. We use $p$ and $f$ to denote functions implemented in whatever way the user wants, which we assume to take $O(1)$ time. The software assumes these are equivariant, but this is not verified. \placetable[here][tab:complexity]{Time complexity of operations on nominal sets.} \starttabulate[|r|l|] \NC \emph{Operation} \VL \emph{Complexity} \NC\NR \HL \NC Test $x \in X$ \VL $O(\log \Nsize(X))$ \NC\NR \NC Test $X\subseteq Y$ \VL $O(\min(\Nsize(X)+\Nsize(Y), \Nsize(X)\log \Nsize(Y)))$ \NC\NR \NC Calculate $X \cup Y$ \VL $O(\Nsize(X)+\Nsize(Y))$ \NC\NR \NC Calculate $X \cap Y$ \VL $O(\Nsize(X)+\Nsize(Y))$ \NC\NR \NC Calculate $\{x \in X \mid p(x)\}$ \VL $O(\Nsize(X))$ \NC\NR \NC Calculate $\{f(x) \mid x \in X\}$ \VL $O(\Nsize(X)\log \Nsize(X))$ \NC\NR \NC Calculate $X\times Y$ \VL $O(\Nsize(X\times Y)) \,\subseteq\, O(3^{\dim(X)+\dim(Y)}\Nsize(X)\Nsize(Y))$ \NC\NR \stoptabulate \stoptheorem \startproofnoqed Since most parts are proven similarly, we only include proofs for the first and last item. \description{Membership.} To decide $x \in X$, we first construct the orbit containing $x$, which is done in constant time. Then we use a logarithmic lookup to decide whether this orbit is in our set data structure. Hence, membership checking is $O(\log(\Nsize(X)))$. \startdescription[title={Products.}] Calculating the product of two nominal sets is the most complicated construction. For each pair of orbits in the original sets $X$ and $Y$, all product strings need to be generated. Each product orbit itself is constructed in constant time. By generating these orbits in-order, the resulting set takes $O(\Nsize(X \times Y))$ time to construct. We can also give an explicit upper bound for the number of orbits in terms of the input. Recall that orbits in a product are represented by strings of length at most $\dim(X)+\dim(Y)$. (If the string is shorter, we pad it with one of the symbols.) Since there are three symbols ($L, R$ and $B$), the product of $X$ and $Y$ will have at most $3^{\dim(X)+\dim(Y)}\Nsize(X)\Nsize(Y)$ orbits. It follows that taking products has time complexity of $O(3^{\dim(X)+\dim(Y)}\Nsize(X)\Nsize(Y))$. \QED \stopdescription \stopproofnoqed \stopsubsection \stopsection \startsection [title={Results and evaluation in automata theory}, reference=sec:appl] In this section we consider applications of nominal sets to automata theory. As mentioned in the introduction, nominal sets are used to formalise languages over infinite alphabets. These languages naturally arise as the semantics of register automata. The definition of register automata is not as simple as that of ordinary finite automata. Consequently, transferring results from automata theory to this setting often requires non-trivial proofs. Nominal automata, instead, are defined as ordinary automata by replacing finite sets with orbit-finite nominal sets. The theory of nominal automata is developed by \citet[BojanczykKL14] and it is shown that many algorithms, such as minimisation (based on the Myhill-Nerode equivalence), from automata theory transfer to nominal automata. Not all algorithms work: e.g., the subset construction fails for nominal automata. As an example we consider the following language on rational numbers: \startformula \Lint = \{ a_1 b_1 \cdots a_n b_n \mid a_i, b_i \in \Q, a_i < a_{i+1} < b_{i+1} < b_i \text{ for all } i \}. \stopformula We call this language the \emph{interval language} as a word $w \in \Q^{\ast}$ is in the language when it denotes a sequence of nested intervals. This language contains arbitrarily long words. For this language it is crucial to work with an infinite alphabet as for each finite set $C \subset \Q$, the restriction $\Lint \cap C^{\ast}$ is just a finite language. Note that the language is equivariant: $w \in \Lint \iff wg \in \Lint$ for any monotone bijection $g$, because nested intervals are preserved by monotone maps. \footnote{The $G$-action on words is defined point-wise: $g (w_1 \ldots w_n) = (g w_1) \ldots (g w_n)$.} Indeed, $\Lint$ is a nominal set, although it is not orbit-finite. Informally, the language $\Lint$ can be accepted by the automaton depicted in \in{Figure}[fig:lint-minimal]. Here we allow the automaton to store rational numbers and compare them to new symbols. For example, the transition from $q_2$ to $q_3$ is taken if any value $c$ between $a$ and $b$ is read and then the currently stored value $a$ is replaced by $c$. For any other value read at state $q_2$ the automaton transitions to the sink state $q_4$. Such a transition structure is made precise by the notion of nominal automata. \startplacefigure [title={Example automaton that accepts the language $\Lint$.}, reference=fig:lint-minimal] \hbox{\starttikzpicture[node distance=3.5cm] \node (S 0) [state] {$q_0$}; \node (S 1) [state, right of=S 0] {$q_1(a)$}; \node (S 2) [state, accepting, right of=S 1] {$q_2(a,b)$}; \node (S 3) [state, right of=S 2] {$q_3(a,b)$}; \node (S 4) [state, below=1.5cm of S 2] {$q_4$}; \path[trans] (S 0) edge node[above] {$a$} (S 1) (S 1) edge node[above] {$b > a$} (S 2) (S 1) edge[bend right] node[below left] {$b \le a$} (S 4) (S 2) edge[bend left=10] node[above,align=center] {$a