Archived
1
Fork 0

Learning nominal atuomata

This commit is contained in:
Joshua Moerman 2018-11-09 16:22:50 +01:00
parent deed1efa90
commit 58b3562c44
4 changed files with 255 additions and 166 deletions

View file

@ -628,6 +628,22 @@
bibsource = {dblp computer science bibliography, https://dblp.org} bibsource = {dblp computer science bibliography, https://dblp.org}
} }
@article{CameronST89,
title = {Chains of subgroups in symmetric groups},
author = {Peter J. Cameron and
Ron Solomon and
Alexandre Turull},
journal = {Journal of algebra},
volume = {127},
number = {2},
pages = {340--352},
year = {1989},
publisher = {Academic Press},
url = {https://doi.org/10.1016/0021-8693(89)90256-1},
doi = {10.1016/0021-8693(89)90256-1},
issn = {0021-8693}
}
@article{DBLP:journals/jlp/CasselHJMS15, @article{DBLP:journals/jlp/CasselHJMS15,
author = {Sofia Cassel and author = {Sofia Cassel and
Falk Howar and Falk Howar and
@ -1380,6 +1396,15 @@
bibsource = {dblp computer science bibliography, https://dblp.org} bibsource = {dblp computer science bibliography, https://dblp.org}
} }
@phdthesis{Isberner15,
title = {Foundations of Active Automata Learning: An Algorithmic Perspective},
author = {Malte Isberner},
year = {2015},
month = {October},
school = {Technical University of Dortmund, Germany},
url = {https://eldorado.tu-dortmund.de/bitstream/2003/34282/1/Dissertation.pdf}
}
@inproceedings{DBLP:conf/birthday/JacobsS14a, @inproceedings{DBLP:conf/birthday/JacobsS14a,
author = {Bart Jacobs and author = {Bart Jacobs and
Alexandra Silva}, Alexandra Silva},

View file

@ -18,7 +18,7 @@
\startabstract \startabstract
We present an Angluin-style algorithm to learn nominal automata, which are acceptors of languages over infinite (structured) alphabets. We present an Angluin-style algorithm to learn nominal automata, which are acceptors of languages over infinite (structured) alphabets.
The abstract approach we take allows us to seamlessly extend known variations of the algorithm to this new setting. The abstract approach we take allows us to seamlessly extend known variations of the algorithm to this new setting.
In particular we can learn a subclass of nominal non-deterministic automata. In particular, we can learn a subclass of nominal non-deterministic automata.
An implementation using a recently developed Haskell library for nominal computation is provided for preliminary experiments. An implementation using a recently developed Haskell library for nominal computation is provided for preliminary experiments.
\stopabstract \stopabstract
@ -35,23 +35,23 @@ In this context, it has become popular to infer or learn a model from a given sy
The learned model can then be used to ensure the system is complying to desired properties or to detect bugs and design possible fixes. The learned model can then be used to ensure the system is complying to desired properties or to detect bugs and design possible fixes.
Automata learning, or regular inference, is a widely used technique for creating an automaton model from observations. Automata learning, or regular inference, is a widely used technique for creating an automaton model from observations.
The original algorithm by \cite[authoryears][DBLP:journals/iandc/Angluin87] works for deterministic finite automata, but since then has been extended to other types of automata, including Mealy machines and I/O automata (see \cite[left=,right=,righttext={\sectionmark 8.5}][DBLP:phd/de/Niese2003] and \cite[left=,right=][DBLP:conf/concur/AartsV10]), and even a special class of context-free grammars. The original algorithm by \cite[authoryears][DBLP:journals/iandc/Angluin87] works for deterministic finite automata, but since then has been extended to other types of automata, including Mealy machines and I/O automata (see \citenp[DBLP:phd/de/Niese2003], \sectionmark 8.5, and \citenp[DBLP:conf/concur/AartsV10]), and even a special class of context-free grammars (see \citenp[Isberner15], \sectionmark 6)
\todo{Deze nog citeren \cite[DBLP:conf/colt/AngluinC97]?}
Angluin's algorithm is sometimes referred to as {\em active learning}, because it is based on direct interaction of the learner with an oracle (\quotation{the Teacher}) that can answer different types of queries. Angluin's algorithm is sometimes referred to as {\em active learning}, because it is based on direct interaction of the learner with an oracle (\quotation{the Teacher}) that can answer different types of queries.
This is in contrast with \emph{passive} learning, where a fixed set of positive and negative examples is given and no interaction with the system is possible. This is in contrast with \emph{passive} learning, where a fixed set of positive and negative examples is given and no interaction with the system is possible.
In this paper, staying in the realm of active learning, we will extend Angluin's algorithm to a richer class of automata. In this chapter, staying in the realm of active learning, we will extend Angluin's algorithm to a richer class of automata.
We are motivated by situations in which a program model, besides control flow, needs to represent basic data flow, where data items are compared for equality (or for other theories such as total ordering). We are motivated by situations in which a program model, besides control flow, needs to represent basic data flow, where data items are compared for equality (or for other theories such as total ordering).
In these situations, values for individual symbols are typically drawn from an infinite domain and automata over \emph{infinite alphabets} become natural models, as witnessed by a recent trend \cite[DBLP:conf/dlt/BolligHLM13, DBLP:conf/ictac/AartsFKV15,DBLP:conf/popl/DAntoniV14, DBLP:journals/corr/BojanczykKL14, DBLP:journals/fac/CasselHJS16]. In these situations, values for individual symbols are typically drawn from an infinite domain and automata over \emph{infinite alphabets} become natural models, as witnessed by a recent trend \cite[DBLP:conf/dlt/BolligHLM13, DBLP:conf/ictac/AartsFKV15,DBLP:conf/popl/DAntoniV14, DBLP:journals/corr/BojanczykKL14, DBLP:journals/fac/CasselHJS16].
One of the foundational approaches to formal language theory for infinite alphabets uses the notion of nominal sets \cite[DBLP:journals/corr/BojanczykKL14]. One of the foundational approaches to formal language theory for infinite alphabets uses the notion of nominal sets \cite[DBLP:journals/corr/BojanczykKL14].
The theory of nominal sets originates from the work of Fraenkel in 1922, and they were originally used to prove the independence of the axiom of choice and other axioms. The theory of nominal sets originates from the work of Fraenkel in 1922, and they were originally used to prove the independence of the axiom of choice and other axioms.
They have been rediscovered in Computer Science by Gabbay and Pitts \cite[Pitts13], as an elegant formalism for modeling name binding, and since then they form the basis of many research projects in the semantics and concurrency community. They have been rediscovered in Computer Science by Gabbay and Pitts (see \citenp[Pitts13] for historical notes), as an elegant formalism for modelling name binding, and since then they form the basis of many research projects in the semantics and concurrency community.
In a nutshell, nominal sets are infinite sets equipped with symmetries which make them finitely representable and tractable for algorithms. In a nutshell, nominal sets are infinite sets equipped with symmetries which make them finitely representable and tractable for algorithms.
We make crucial use of this feature in the development of a learning algorithm. We make crucial use of this feature in the development of a learning algorithm.
\noindent
Our main contributions are the following. Our main contributions are the following.
\startitemize \startitemize[after]
\item A generalisation of Angluin's original algorithm to nominal automata. \item A generalisation of Angluin's original algorithm to nominal automata.
The generalisation follows a generic pattern for transporting computation models from finite sets to nominal sets, which leads to simple correctness proofs and opens the door to further generalisations. The generalisation follows a generic pattern for transporting computation models from finite sets to nominal sets, which leads to simple correctness proofs and opens the door to further generalisations.
The use of nominal sets with different symmetries also creates potential for generalisation, e.g., to languages with time features \cite[DBLP:conf/icalp/BojanczykL12] or data dependencies represented as graphs \cite[DBLP:journals/tcs/MontanariS14]. The use of nominal sets with different symmetries also creates potential for generalisation, e.g., to languages with time features \cite[DBLP:conf/icalp/BojanczykL12] or data dependencies represented as graphs \cite[DBLP:journals/tcs/MontanariS14].
@ -61,35 +61,38 @@ It is important to note that, in the nominal setting, NFAs are strictly more exp
We learn a subclass of the languages accepted by nominal NFAs, which includes all the languages accepted by nominal DFAs. We learn a subclass of the languages accepted by nominal NFAs, which includes all the languages accepted by nominal DFAs.
The main advantage of learning NFAs directly is that they can provide exponentially smaller automata when compared to their deterministic counterpart. The main advantage of learning NFAs directly is that they can provide exponentially smaller automata when compared to their deterministic counterpart.
This can be seen both as a generalisation and as an optimisation of the algorithm. This can be seen both as a generalisation and as an optimisation of the algorithm.
\item An implementation using our recently developed Haskell library tailored to nominal computation -- NLambda, or \NLambda{}, by \cite[authoryears][DBLP:journals/corr/KlinS16]. \item An implementation using a recently developed Haskell library tailored to nominal computation -- NLambda, or \NLambda{}, by \cite[authoryears][DBLP:journals/corr/KlinS16].
Our implementation is the first non-trivial application of a novel programming paradigm of functional programming over infinite structures, which allows the programmer to rely on convenient intuitions of searching through infinite sets in finite time. Our implementation is the first non-trivial application of a novel programming paradigm of functional programming over infinite structures, which allows the programmer to rely on convenient intuitions of searching through infinite sets in finite time.
\stopitemize \stopitemize
The paper is organised as follows. This chapter is organised as follows.
In \in{Section}[sec:overview], we present an overview of our contributions (and the original algorithm) highlighting the challenges we faced in the various steps. In \in{Section}[sec:overview], we present an overview of our contributions (and the original algorithm) highlighting the challenges we faced in the various steps.
In \in{Section}[sec:prelim], we revise some basic concepts of nominal sets and automata. In \in{Section}[sec:prelim], we revise some basic concepts of nominal sets and automata.
\in{Section}[sec:nangluin] contains the core technical contributions of our paper: The new algorithm and proof of correctness. \in{Section}[sec:nangluin] contains the core technical contributions:
The new algorithm and proof of correctness.
In \in{Section}[sec:nondet], we describe an algorithm to learn nominal non-deterministic automata. In \in{Section}[sec:nondet], we describe an algorithm to learn nominal non-deterministic automata.
\in{Section}[sec:impl] contains a description of NLambda, details of the implementation, and results of preliminary experiments. \in{Section}[sec:impl] contains a description of NLambda, details of the implementation, and results of preliminary experiments.
\in{Section}[sec:related] contains a discussion of related work. We conclude the paper with a discussion section where also future directions are presented. \in{Section}[sec:related] contains a discussion of related work.
We conclude this chapter with a discussion section where also future directions are presented.
\startsection \startsection
[title={Overview of the Approach}, [title={Overview of the Approach},
reference=sec:overview] reference=sec:overview]
In this section, we give an overview of the work developed in the paper through examples. In this section, we give an overview through examples.
We will start by explaining the original algorithm for regular languages over finite alphabets, and then explain the challenges in extending it to nominal languages. We will start by explaining the original algorithm for regular languages over finite alphabets, and then explain the challenges in extending it to nominal languages.
Angluin's algorithm \LStar{} provides a procedure to learn the minimal DFA accepting a certain (unknown) language $\lang$. Angluin's algorithm \LStar{} provides a procedure to learn the minimal DFA accepting a certain (unknown) language $\lang$.
The algorithm has access to a \emph{teacher} which answers two types of queries: The algorithm has access to a \emph{teacher} which answers two types of queries:
\startitemize \startitemize[after]
\item \emph{membership queries}, consisting of a single word $w \in A^{\ast}$, to which the teacher will reply whether $w \in \lang$ or not; \item \emph{membership queries}, consisting of a single word $w \in A^{\ast}$, to which the teacher will reply whether $w \in \lang$ or not;
\item \emph{equivalence queries}, consisting of a hypothesis DFA $H$, to which the teacher replies {\bf yes} if $\lang(H) = \lang$, and {\bf no} otherwise, providing a counterexample $w \in \lang(H) \triangle \lang$ ($\triangle$ denotes the symmetric difference of two languages). \item \emph{equivalence queries}, consisting of a hypothesis DFA $H$, to which the teacher replies {\bf yes} if $\lang(H) = \lang$, and {\bf no} otherwise, providing a counterexample $w \in \lang(H) \triangle \lang$ (where $\triangle$ denotes the symmetric difference of two languages).
\stopitemize \stopitemize
The learning algorithm works by incrementally building an {\em observation table}, which at each stage contains partial information about the language $\lang$. The learning algorithm works by incrementally building an {\em observation table}, which at each stage contains partial information about the language $\lang$.
The algorithm is able to fill the table with membership queries. The algorithm is able to fill the table with membership queries.
As an example, and to set notation, consider the following table (over $A=\{a, b\}$). As an example, and to set notation, consider the following table (over the alphabet $A=\{a, b\}$).
\placefigure[force, none]{}{ \placefigure[force, none]{}{
\hbox{\starttikzpicture \hbox{\starttikzpicture
@ -135,7 +138,10 @@ An observation table $(S,E)$ is {\em consistent} if, whenever $s_1$ and $s_2$ ar
Each time the algorithm constructs an automaton, it poses an equivalence query to the teacher. Each time the algorithm constructs an automaton, it poses an equivalence query to the teacher.
It terminates when the answer is {\bf yes}, otherwise it extends the table with the counterexample provided. It terminates when the answer is {\bf yes}, otherwise it extends the table with the counterexample provided.
\startplacealgorithm[title={The \LStar{} learning algorithm from \cite[authoryears][DBLP:journals/iandc/Angluin87].}, reference=alg:alg] \startplacealgorithm
[title={The \LStar{} learning algorithm from \cite[authoryears][DBLP:journals/iandc/Angluin87].},
location=top,
reference=alg:alg]
\startalgorithmic \startalgorithmic
\startlinenumbering \startlinenumbering
\STATE $S,E \gets \{\epsilon\}$ \STATE $S,E \gets \{\epsilon\}$
@ -167,19 +173,22 @@ It terminates when the answer is {\bf yes}, otherwise it extends the table with
Angluin's algorithm is displayed in \in{Algorithm}[alg:alg]. Throughout this section, we will consider the language(s) Angluin's algorithm is displayed in \in{Algorithm}[alg:alg]. Throughout this section, we will consider the language(s)
\startformula \startformula
\lang_n = \left{ ww \mid w \in A^{\ast}, |w| = n \right} \lang_n = \left{ ww \mid w \in A^{\ast}, |w| = n \right}.
\stopformula \stopformula
If the alphabet $A$ is finite then $\lang_n$ is regular for any $n \in \N$, and there is a finite DFA accepting it. If the alphabet $A$ is finite then $\lang_n$ is regular for any $n \in \N$, and there is a finite DFA accepting it.
The language $\lang_1 = \{ aa , bb \}$ looks trivial, but the minimal DFA recognizing it has as many as 5 states. The language $\lang_1 = \{ aa , bb \}$ looks trivial, but the minimal DFA recognising it has as many as 5 states.
Angluin's algorithm will terminate in (at most) 5 steps. Angluin's algorithm will terminate in (at most) 5 steps.
We illustrate some relevant ones. We illustrate some relevant ones.
\startstep{1} \startstep{1}
We start from $S, E = \{\epsilon\}$, and we fill the entries of the table below by asking membership queries for $\epsilon$, $a$ and $b$. We start from $S, E = \{\epsilon\}$, and we fill the entries of the table below by asking membership queries for $\epsilon$, $a$ and $b$.
The table is closed and consistent, so we construct the hypothesis $\autom_1$. The table is closed and consistent, so we construct the hypothesis $\autom_1$,
where $q_0 = row(\epsilon) = \{ \epsilon \mapsto 0 \}$:
\starttikzpicture \placefigure[force, none]{}{
\startcombination[nx=2, ny=1, location=middle]
{\hbox{\starttikzpicture
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \\ \phantom{\epsilon} \& \epsilon \\
@ -190,14 +199,13 @@ The table is closed and consistent, so we construct the hypothesis $\autom_1$.
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-3-2.north -| tab.west) -- (tab-3-2.north -| tab.east); \draw (tab-3-2.north -| tab.west) -- (tab-3-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_1 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom_1 = $}] (q0) {$q_0$};
\path (q0) edge[loop right] node[right] {$a,b$} (q0); \path (q0) edge[loop right] node[right] {$a,b$} (q0);
\stoptikzpicture \stoptikzpicture}
\startformula }{}
q_0 = row(\epsilon) = \{ \epsilon \mapsto 0 \} \stopcombination}
\stopformula
The Teacher replies {\bf no} and gives the counterexample $aa$, which is in $\lang_1$ but it is not accepted by $\autom_1$. The Teacher replies {\bf no} and gives the counterexample $aa$, which is in $\lang_1$ but it is not accepted by $\autom_1$.
Therefore, \inline{line}[line:addrow-ex] of the algorithm is triggered and we set $S = \{\epsilon, a, aa\}$. Therefore, \inline{line}[line:addrow-ex] of the algorithm is triggered and we set $S = \{\epsilon, a, aa\}$.
@ -207,10 +215,12 @@ Therefore, \inline{line}[line:addrow-ex] of the algorithm is triggered and we se
The table becomes the one on the left below. The table becomes the one on the left below.
It is closed, but not consistent: It is closed, but not consistent:
Rows $\epsilon$ and $a$ are identical, but appending $a$ leads to different rows, as depicted. Rows $\epsilon$ and $a$ are identical, but appending $a$ leads to different rows, as depicted.
Therefore, line 9 is triggered and an extra column $a$, highlighted in red, is added. Therefore, \inline{line}[line:cons-witness] is triggered and an extra column $a$, highlighted in red, is added.
The new table is closed and consistent and a new hypothesis $\autom_2$ is constructed. The new table is closed and consistent and a new hypothesis $\autom_2$ is constructed.
\starttikzpicture \placefigure[force, none]{}{
\startcombination[nx=3, ny=1, location=middle]
{\hbox{\starttikzpicture[bend angle=45]
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \\ \phantom{\epsilon} \& \epsilon \\
@ -230,10 +240,10 @@ The new table is closed and consistent and a new hypothesis $\autom_2$ is constr
\node[fit=(tab-3-1)(tab-3-2), dotted, draw, inner sep=-2pt] (r2) {}; \node[fit=(tab-3-1)(tab-3-2), dotted, draw, inner sep=-2pt] (r2) {};
\node[fit=(tab-4-1)(tab-4-2), dotted, draw, inner sep=-2pt] (r3) {}; \node[fit=(tab-4-1)(tab-4-2), dotted, draw, inner sep=-2pt] (r3) {};
\path[->] (r1.west) edge[in=160,out=190,looseness=2] node[left] {$a$} (r2.west); \path[->] (r1.west) edge[bend right] node[left] {$a$} (r2.west);
\path[->] (r2.east) edge[in=20,out=-10,looseness=2] node[right] {$a$} (r3.east); \path[->] (r2.east) edge[bend left] node[right] {$a$} (r3.east);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\matrix[obs table, column 3/.style={nodes={color=red}}](tab) \matrix[obs table, column 3/.style={nodes={color=red}}](tab)
{ {
\phantom{\epsilon} \& \epsilon \& a \\ \phantom{\epsilon} \& \epsilon \& a \\
@ -248,8 +258,8 @@ The new table is closed and consistent and a new hypothesis $\autom_2$ is constr
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east); \draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_2 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom_2 = $}] (q0) {$q_0$};
\node[state,right of=q0] (q1) {$q_1$}; \node[state,right of=q0] (q1) {$q_1$};
\node[state,accepting,below of=q1] (q2) {$q_2$}; \node[state,accepting,below of=q1] (q2) {$q_2$};
@ -260,7 +270,8 @@ The new table is closed and consistent and a new hypothesis $\autom_2$ is constr
(q1) edge[bend right] node[above] {$b$} (q0) (q1) edge[bend right] node[above] {$b$} (q0)
(q1) edge node[right] {$a$} (q2) (q1) edge node[right] {$a$} (q2)
(q2) edge node[midway,fill=white] {$a, b$} (q0); (q2) edge node[midway,fill=white] {$a, b$} (q0);
\stoptikzpicture \stoptikzpicture}}{}
\stopcombination}
The Teacher again replies {\bf no} and gives the counterexample $bb$, which should be accepted by $\autom_2$ but it is not. Therefore we put $S \gets S \cup \{b,bb\}$. The Teacher again replies {\bf no} and gives the counterexample $bb$, which should be accepted by $\autom_2$ but it is not. Therefore we put $S \gets S \cup \{b,bb\}$.
\stopstep \stopstep
@ -271,7 +282,9 @@ It is closed, but $\epsilon$ and $b$ violate consistency, when $b$ is appended.
Therefore we add the column $b$ and we get the table on the right, which is closed and consistent. Therefore we add the column $b$ and we get the table on the right, which is closed and consistent.
The new hypothesis is $\autom_3$. The new hypothesis is $\autom_3$.
\starttikzpicture \placefigure[force, none]{}{
\startcombination[nx=3, ny=1, location=middle]
{\hbox{\starttikzpicture[bend angle=45]
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \& a \\ \phantom{\epsilon} \& \epsilon \& a \\
@ -295,10 +308,10 @@ The new hypothesis is $\autom_3$.
\node[fit=(tab-5-1)(tab-5-3), dotted, draw, inner sep=-2pt] (r2) {}; \node[fit=(tab-5-1)(tab-5-3), dotted, draw, inner sep=-2pt] (r2) {};
\node[fit=(tab-6-1)(tab-6-3), dotted, draw, inner sep=-2pt] (r3) {}; \node[fit=(tab-6-1)(tab-6-3), dotted, draw, inner sep=-2pt] (r3) {};
\path[->] (r1.west) edge[in=160,out=190,looseness=2] node[left] {$b$} (r2.west); \path[->] (r1.west) edge[bend right] node[left] {$b$} (r2.west);
\path[->] (r2.east) edge[in=20,out=-10,looseness=2] node[right] {$b$} (r3.east); \path[->] (r2.east) edge[bend left] node[right] {$b$} (r3.east);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\matrix[obs table, column 4/.style={nodes={color=red}}](tab) \matrix[obs table, column 4/.style={nodes={color=red}}](tab)
{ {
\phantom{\epsilon} \& \epsilon \& a \& b \\ \phantom{\epsilon} \& \epsilon \& a \& b \\
@ -317,8 +330,8 @@ The new hypothesis is $\autom_3$.
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-7-2.north -| tab.west) -- (tab-7-2.north -| tab.east); \draw (tab-7-2.north -| tab.west) -- (tab-7-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_3 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom_3 = $}] (q0) {$q_0$};
\node[state,right of=q0] (q1) {$q_1$}; \node[state,right of=q0] (q1) {$q_1$};
\node[state,below of=q0] (q2) {$q_2$}; \node[state,below of=q0] (q2) {$q_2$};
@ -332,7 +345,8 @@ The new hypothesis is $\autom_3$.
(q2) edge[bend right] node[left] {$a$} (q0) (q2) edge[bend right] node[left] {$a$} (q0)
(q2) edge node[above] {$b$} (q3) (q2) edge node[above] {$b$} (q3)
(q3) edge node[fill=white,midway] {$a,b$} (q0); (q3) edge node[fill=white,midway] {$a,b$} (q0);
\stoptikzpicture \stoptikzpicture}}{}
\stopcombination}
The Teacher replies {\bf no} and provides the counterexample $babb$, so $S \gets S \cup \{ba,bab\}$. The Teacher replies {\bf no} and provides the counterexample $babb$, so $S \gets S \cup \{ba,bab\}$.
\stopstep \stopstep
@ -340,7 +354,8 @@ The Teacher replies {\bf no} and provides the counterexample $babb$, so $S \gets
\startstep{4} \startstep{4}
One more step brings us to the correct hypothesis $\autom_4$ (details are omitted). One more step brings us to the correct hypothesis $\autom_4$ (details are omitted).
\starttikzpicture \placefigure[force, none]{}{
\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_4 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom_4 = $}] (q0) {$q_0$};
\node[state,above right= 15pt of q0] (q1) {$q_1$}; \node[state,above right= 15pt of q0] (q1) {$q_1$};
\node[state,below right= 15pt of q0] (q2) {$q_2$}; \node[state,below right= 15pt of q0] (q2) {$q_2$};
@ -356,7 +371,7 @@ One more step brings us to the correct hypothesis $\autom_4$ (details are omitte
(q3) edge node[above] {$a,b$} (q4) (q3) edge node[above] {$a,b$} (q4)
(q2) edge[bend right] node[below] {$a$} (q4) (q2) edge[bend right] node[below] {$a$} (q4)
(q4) edge[loop right] node[right] {$a,b$} (q4); (q4) edge[loop right] node[right] {$a,b$} (q4);
\stoptikzpicture \stoptikzpicture}}
\stopstep \stopstep
@ -370,7 +385,8 @@ Consider now an infinite alphabet $A = \{a,b,c,d,\dots\}$.
The language $\lang_1$ becomes $\{aa,bb,cc,dd,\dots\}$. The language $\lang_1$ becomes $\{aa,bb,cc,dd,\dots\}$.
Classical theory of finite automata does not apply to this kind of languages, but one may draw an infinite deterministic automaton that recognises $\lang_1$ in the standard sense: Classical theory of finite automata does not apply to this kind of languages, but one may draw an infinite deterministic automaton that recognises $\lang_1$ in the standard sense:
\starttikzpicture \placefigure[force, none]{}{
\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_5 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom_5 = $}] (q0) {$q_0$};
\node[state, right of=q0] (qb) {$q_b$}; \node[state, right of=q0] (qb) {$q_b$};
\node[state, above=10 pt of qb] (qa) {$q_a$}; \node[state, above=10 pt of qb] (qa) {$q_a$};
@ -390,13 +406,14 @@ Classical theory of finite automata does not apply to this kind of languages, bu
(qb) edge[bend right] node[below] {${\neq} b$} (q4) (qb) edge[bend right] node[below] {${\neq} b$} (q4)
(qdots) edge[bend right] node {} (q4) (qdots) edge[bend right] node {} (q4)
(q4) edge[loop right] node[right] {$A$} (q4); (q4) edge[loop right] node[right] {$A$} (q4);
\stoptikzpicture \stoptikzpicture}}
where $\tot{A}$ and $\tot{{\neq} a}$ stand for the infinitely-many transitions labelled by elements of $A$ and $A \setminus \{a\}$, respectively. where $\tot{A}$ and $\tot{{\neq} a}$ stand for the infinitely-many transitions labelled by elements of $A$ and $A \setminus \{a\}$, respectively.
This automaton is infinite, but it can be finitely presented in a variety of ways, for example: This automaton is infinite, but it can be finitely presented in a variety of ways, for example:
\starttikzpicture \placefigure[force, none]{}{
\node[initial,state,initial text={}] (q0) {$q_0$}; \hbox{\starttikzpicture
\node[initial,state,initial text={$\autom_6 = $}] (q0) {$q_0$};
\node[state, right of=q0] (qx) {$q_x$}; \node[state, right of=q0] (qx) {$q_x$};
\node[state,accepting,right of=qx] (q3) {$q_3$}; \node[state,accepting,right of=qx] (q3) {$q_3$};
\node[state,right of=q3] (q4) {$q_4$}; \node[state,right of=q3] (q4) {$q_4$};
@ -408,22 +425,22 @@ This automaton is infinite, but it can be finitely presented in a variety of way
(q3) edge node[above] {$A$} (q4) (q3) edge node[above] {$A$} (q4)
(qx) edge[bend right] node[below] {${\neq} x$} (q4) (qx) edge[bend right] node[below] {${\neq} x$} (q4)
(q4) edge[loop right] node[right] {$A$} (q4); (q4) edge[loop right] node[right] {$A$} (q4);
\stoptikzpicture \stoptikzpicture}}
One can formalise the quantifier notation above (or indeed the \quotation{dots} notation above that) in several ways. One can formalise the quantifier notation above (or indeed the \quotation{dots} notation above that) in several ways.
A popular solution is to consider finite {\em register automata} \cite[DBLP:journals/tcs/KaminskiF94, DBLP:journals/tocl/DemriL09], i.e., finite automata equipped with a finite number of registers where alphabet letters can be stored and later compared for equality. A popular solution is to consider finite {\em register automata} \cite[DBLP:journals/tcs/KaminskiF94, DBLP:journals/tocl/DemriL09], i.e., finite automata equipped with a finite number of registers where alphabet letters can be stored and later compared for equality.
Our language $\lang_1$ is recognised by a simple automaton with four states and one register. Our language $\lang_1$ is recognised by a simple automaton with four states and one register.
The problem of learning registered automata has been successfully attacked before The problem of learning registered automata has been successfully attacked before by, for instance,
\cite[DBLP:conf/vmcai/HowarSJC12]. \citet[DBLP:conf/vmcai/HowarSJC12].
In this paper, however, we will consider nominal automata \citep[DBLP:journals/corr/BojanczykKL14] instead. In this chapter, however, we consider nominal automata by \citet[DBLP:journals/corr/BojanczykKL14] instead.
These automata ostensibly have infinitely many states, but the set of states can be finitely presented in a way open to effective manipulation. These automata ostensibly have infinitely many states, but the set of states can be finitely presented in a way open to effective manipulation.
More specifically, in a nominal automaton the set of states is subject to an action of permutations of a set $\EAlph$ of {\em atoms}, and it is finite up to that action. More specifically, in a nominal automaton the set of states is subject to an action of permutations of a set $\EAlph$ of {\em atoms}, and it is finite up to that action.
For example, the set of states of $\autom_5$ is: For example, the set of states of $\autom_5$ is:
\startformula \startformula
\{q_0,q_3,q_4\} \cup \{q_a \mid a\in A\} \{q_0,q_3,q_4\} \cup \{q_a \mid a\in A\}
\stopformula \stopformula
and it is equipped with a canonical action of permutations $\pi \colon \EAlph \to \EAlph$ that maps every $q_a$ to $q_{\pi(a)}$, and leaves $q_0$, $q_3$ and $q_4$ fixed. and it is equipped with a canonical action of permutations $\pi \colon \EAlph \to \EAlph$ that maps every $q_a$ to $q_{\pi(a)}$ and leaves $q_0$, $q_3$ and $q_4$ fixed.
Technically speaking, the set of states has four {\em orbits} (one infinite orbit and three fixed points) of the action of the group of permutations of $\EAlph$. Technically speaking, the set of states has four {\em orbits} (one infinite orbit and three fixed points) of the action of the group of permutations of $\EAlph$.
% %
Moreover, it is required that in a nominal automaton the transition relation is {\em equivariant}, i.e., closed under the action of permutations. Moreover, it is required that in a nominal automaton the transition relation is {\em equivariant}, i.e., closed under the action of permutations.
@ -432,8 +449,8 @@ For example, it has a transition $q_a \tot{a} q_3$, and for any $\pi \colon \EAl
Nominal automata with finitely many orbits of states are equi-expressive with finite register automata \cite[DBLP:journals/corr/BojanczykKL14], but they have an important theoretical advantage: Nominal automata with finitely many orbits of states are equi-expressive with finite register automata \cite[DBLP:journals/corr/BojanczykKL14], but they have an important theoretical advantage:
They are a direct reformulation of the classical notion of finite automaton, where one replaces finite sets with orbit-finite sets and functions (or relations) with equivariant ones. They are a direct reformulation of the classical notion of finite automaton, where one replaces finite sets with orbit-finite sets and functions (or relations) with equivariant ones.
A research programme advocated in \cite[DBLP:journals/corr/BojanczykKL14, DBLP:conf/popl/BojanczykBKL12] is to transport various computation models, algorithms and theorems along this correspondence. A research programme advocated by Boja\'{n}czyk, et al. is to transport various computation models, algorithms and theorems along this correspondence.
This can often be done with remarkable accuracy, and our paper is a witness to this. This can often be done with remarkable accuracy, and our results are a witness to this.
Indeed, as we shall see, nominal automata can be learned with an algorithm that is almost a verbatim copy of the classical Angluin's one. Indeed, as we shall see, nominal automata can be learned with an algorithm that is almost a verbatim copy of the classical Angluin's one.
Indeed, consider applying Angluin's algorithm to our new language $\lang_1$. Indeed, consider applying Angluin's algorithm to our new language $\lang_1$.
@ -441,13 +458,15 @@ The key idea is to change the basic data structure:
Our observation table $(S,E)$ will be such that \emph{$S$ and $E$ are equivariant subsets of $A^{\ast}$}, i.e., they are closed under the canonical action of atom permutations. Our observation table $(S,E)$ will be such that \emph{$S$ and $E$ are equivariant subsets of $A^{\ast}$}, i.e., they are closed under the canonical action of atom permutations.
In general, such a table has {\em infinitely many rows and columns}, so the following aspects of \in{Algorithm}[alg:alg] seem problematic: In general, such a table has {\em infinitely many rows and columns}, so the following aspects of \in{Algorithm}[alg:alg] seem problematic:
\description{\inline{line}[line:checks]} closedness and consistency tests range over infinite sets; \startitemize[after]
\item
\description{\inline{line}[line:clos-witness] and \inline{line}[line:cons-witness]} finding witnesses for closedness or consistency violations potentially require checking all infinitely many rows; \inline{line}[line:begin-closed] and \inline{line}[line:begin-const]:
\todo{Is hetzelfde als die hierboven. Doe maar line 4 en 9: oneindige checks. Extra clause over $S$ en $E$ oneindig?} finding witnesses for closedness or consistency violations potentially require checking all infinitely many rows;
\item
\description{\inline{line}[line:addrow-ex]} every counterexample $t$ has only infinitely many prefixes, so it is not clear how one would construct an infinite set $S$ in finite time. \inline{line}[line:addrow-ex]:
every counterexample $t$ has infinitely many prefixes, so it is not clear how one constructs an infinite set $S$ in finite time.
However, an infinite $S$ is necessary for the algorithm to ever succeed, because no finite automaton recognises $\lang_1$. However, an infinite $S$ is necessary for the algorithm to ever succeed, because no finite automaton recognises $\lang_1$.
\stopitemize
At this stage, we need to observe that due to equivariance of $S$, $E$ and $\lang_1$, the following crucial properties hold: At this stage, we need to observe that due to equivariance of $S$, $E$ and $\lang_1$, the following crucial properties hold:
@ -466,7 +485,9 @@ We have $S \Lext A = A$, which is infinite but admits a finite representation.
In fact, for any $a \in A$, we have $A = \{ \pi(a) \mid \text{$\pi$ is a permutation}\}$. In fact, for any $a \in A$, we have $A = \{ \pi(a) \mid \text{$\pi$ is a permutation}\}$.
Then, by {\bf (P2)}, $row(\pi(a))(\epsilon) = row(a)(\epsilon) = 0$, for all $\pi$, so the first table can be written as: Then, by {\bf (P2)}, $row(\pi(a))(\epsilon) = row(a)(\epsilon) = 0$, for all $\pi$, so the first table can be written as:
\starttikzpicture \placefigure[force, none]{}{
\startcombination[nx=2, ny=1, location=middle]
{\hbox{\starttikzpicture
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \\ \phantom{\epsilon} \& \epsilon \\
@ -476,11 +497,12 @@ Then, by {\bf (P2)}, $row(\pi(a))(\epsilon) = row(a)(\epsilon) = 0$, for all $\p
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-3-2.north -| tab.west) -- (tab-3-2.north -| tab.east); \draw (tab-3-2.north -| tab.west) -- (tab-3-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}{}
\starttikzpicture {\hbox{\starttikzpicture
\node[initial,state,initial text={$\autom'_1 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom'_1 = $}] (q0) {$q_0$};
\path (q0) edge[loop right] node[right] {$A$} (q0); \path (q0) edge[loop right] node[right] {$A$} (q0);
\stoptikzpicture \stoptikzpicture}}{}
\stopcombination}
It is closed and consistent. It is closed and consistent.
Our hypothesis is $\autom_1'$, where $\delta_{\autom_1'}(row(\epsilon),x) = row(x) = q_0$, for all $x \in A$. Our hypothesis is $\autom_1'$, where $\delta_{\autom_1'}(row(\epsilon),x) = row(x) = q_0$, for all $x \in A$.
@ -492,7 +514,8 @@ By equivariance of $\lang_1$, the counterexample tells us that \emph{all} words
Therefore we extend $S$ with the (infinite!) set of such words. Therefore we extend $S$ with the (infinite!) set of such words.
The new symbolic table is: The new symbolic table is:
\starttikzpicture \placefigure[force, none]{}{
\hbox{\starttikzpicture
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \\ \phantom{\epsilon} \& \epsilon \\
@ -506,7 +529,7 @@ The new symbolic table is:
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east); \draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}
The lower part stands for elements of $S \Lext A$. The lower part stands for elements of $S \Lext A$.
For instance, $ab$ stands for words obtained by appending a fresh letter to words of length 1 (row $a$). For instance, $ab$ stands for words obtained by appending a fresh letter to words of length 1 (row $a$).
@ -520,7 +543,8 @@ We found a \quotation{symbolic} witness $a$ for such violation.
In order to fix consistency, while keeping $E$ equivariant, we need to add columns for all $\pi(a)$. In order to fix consistency, while keeping $E$ equivariant, we need to add columns for all $\pi(a)$.
The resulting table is The resulting table is
\starttikzpicture \placefigure[force, none]{}{
\hbox{\starttikzpicture
\matrix[obs table](tab) \matrix[obs table](tab)
{ {
\phantom{\epsilon} \& \epsilon \& a \& b \& c \& \dots \\ \phantom{\epsilon} \& \epsilon \& a \& b \& c \& \dots \\
@ -534,7 +558,7 @@ The resulting table is
\draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east); \draw (tab-2-2.north -| tab.west) -- (tab-2-2.north -| tab.east);
\draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east); \draw (tab-5-2.north -| tab.west) -- (tab-5-2.north -| tab.east);
\draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south); \draw (tab-2-2.west |- tab.north) -- (tab-2-2.west |- tab.south);
\stoptikzpicture \stoptikzpicture}}
where non-specified entries are $0$. where non-specified entries are $0$.
Only finitely many entries of the table are relevant: Only finitely many entries of the table are relevant:
@ -545,7 +569,8 @@ The table is trivially consistent.
Notice that this step encompasses both {\bf Step 2} and {\bf 3}, because the rows $b$ and $bb$ added by {\bf Step 2} are already represented by $a$ and $aa$. Notice that this step encompasses both {\bf Step 2} and {\bf 3}, because the rows $b$ and $bb$ added by {\bf Step 2} are already represented by $a$ and $aa$.
The hypothesis automaton is The hypothesis automaton is
\starttikzpicture[node distance=2cm] \placefigure[force, none]{}{
\hbox{\starttikzpicture[node distance=2cm]
\node[initial,state,initial text={$\autom'_2 = $}] (q0) {$q_0$}; \node[initial,state,initial text={$\autom'_2 = $}] (q0) {$q_0$};
\node[state,right of=q0] (qx) {$q_x$}; \node[state,right of=q0] (qx) {$q_x$};
\node[state,accepting,right of=qx] (q2) {$q_2$}; \node[state,accepting,right of=qx] (q2) {$q_2$};
@ -556,11 +581,12 @@ The hypothesis automaton is
(qx) edge[bend left] node[above] {$\neq x$} (q0) (qx) edge[bend left] node[above] {$\neq x$} (q0)
(qx) edge node[above] {$x$} (q2) (qx) edge node[above] {$x$} (q2)
(q2) edge[bend right=50] node[below] {$A$} (q0); (q2) edge[bend right=50] node[below] {$A$} (q0);
\stoptikzpicture \stoptikzpicture}}
This is again incorrect, but one additional step will give the correct hypothesis automaton, shown earlier in \in{?}[eq:aut]). This is again incorrect, but one additional step will give the correct hypothesis automaton $\autom_6$.
\stopstep \stopstep
\stopsubsection \stopsubsection
\startsubsection \startsubsection
[title={Generalisation to Non-Deterministic Automata}] [title={Generalisation to Non-Deterministic Automata}]
@ -580,29 +606,31 @@ For the learning algorithm the power of non-determinism means that we can make s
If we want to make the table closed, we were previously required to find an equivalent row in the upper part; now we may find a sum of rows which, together, are equivalent to an existing row. If we want to make the table closed, we were previously required to find an equivalent row in the upper part; now we may find a sum of rows which, together, are equivalent to an existing row.
This means that in some cases fewer rows will be added for closedness. This means that in some cases fewer rows will be added for closedness.
\stopsubsection \stopsubsection
\stopsection \stopsection
\startsection \startsection
[title={Preliminaries}, [title={Preliminaries},
reference=sec:prelim] reference=sec:prelim]
We recall the notions of nominal sets, nominal automata and nominal regular languages (see \cite[DBLP:journals/corr/BojanczykKL14] for a detailed account). We recall the notions of nominal sets, nominal automata and nominal regular languages.
We refer to \citet[DBLP:journals/corr/BojanczykKL14] for a detailed account.
Let $\EAlph$ be a countable set and let $\Perm(\EAlph)$ be the set of \emph{permutations on $\EAlph$}, i.e., the bijective functions $\pi \colon \EAlph \to \EAlph$. Let $\EAlph$ be a countable set and let $\Perm(\EAlph)$ be the set of \emph{permutations on $\EAlph$}, i.e., the bijective functions $\pi \colon \EAlph \to \EAlph$.
Permutations form a group where the identity permutation $\id$ is the unit element, inverse is functional inverse and multiplication is function composition. Permutations form a group where the identity permutation $\id$ is the unit element, inverse is functional inverse and multiplication is function composition.
A \emph{nominal set} \cite[Pitts13] is a set $X$ plus a function $\cdot \colon \Perm(\EAlph) \times X \to X$, interpreting permutations over $X$. A \emph{nominal set} \cite[Pitts13] is a set $X$ together with a function $\cdot \colon \Perm(\EAlph) \times X \to X$, interpreting permutations over $X$.
Such function must be a \emph{group action} of $\Perm(\EAlph)$, i.e., it must satisfy $\id \cdot x = x$ and $\pi \cdot (\pi' \cdot x) = (\pi \circ \pi') \cdot x$. Such function must be a \emph{group action} of $\Perm(\EAlph)$, i.e., it must satisfy $\id \cdot x = x$ and $\pi \cdot (\pi' \cdot x) = (\pi \circ \pi') \cdot x$.
We say that a finite $A \subseteq \EAlph$ \emph{supports} $x \in X$ whenever, for all $\pi$ acting as the identity on $A$, we have $\pi \cdot x = x$. We say that a finite $A \subset \EAlph$ \emph{supports} $x \in X$ whenever, for all $\pi$ acting as the identity on $A$, we have $\pi \cdot x = x$.
In other words, permutations that only move elements outside $A$ do not affect $x$. In other words, permutations that only move elements outside $A$ do not affect $x$.
The support of $x \in X$, denoted $\supp(x)$, is the smallest finite set supporting $x$. The support of $x \in X$, denoted $\supp(x)$, is the smallest finite set supporting $x$.
We require nominal sets to have \emph{finite support}, meaning that $\supp(x)$ exists for all $x \in X$. We require nominal sets to have \emph{finite support}, meaning that $\supp(x)$ exists for all $x \in X$.
The \emph{orbit} $\orb(x)$ of $x \in X$ is the set of elements in $X$ reachable from $x$ via permutations, explicitly The \emph{orbit of $x$}, denoted $\orb(x)$, is the set of elements in $X$ reachable from $x$ via permutations, explicitly
\startformula \startformula
\orb(x) = \{ \pi \cdot x \mid \pi \in \Perm(\EAlph) \} . \orb(x) = \{ \pi \cdot x \mid \pi \in \Perm(\EAlph) \} .
\stopformula \stopformula
Then $X$ is \emph{orbit-finite} whenever it is a union of finitely many orbits. We say that $X$ is \emph{orbit-finite} whenever it is a union of finitely many orbits.
Given a nominal set $X$, a subset $Y \subseteq X$ is \emph{equivariant} if it is preserved by permutations, i.e., $\pi \cdot y \in Y$, for all $y \in Y$. Given a nominal set $X$, a subset $Y \subseteq X$ is \emph{equivariant} if it is preserved by permutations, i.e., $\pi \cdot y \in Y$, for all $y \in Y$.
In other words, $Y$ is a union of some orbits of $X$. In other words, $Y$ is a union of some orbits of $X$.
@ -627,7 +655,7 @@ We can form at most $l = l_1 l_2 \ldots l_n$ tuples of the form $O_1 \times \cdo
For $X$ single-orbit, the \emph{local symmetries} are defined by the group For $X$ single-orbit, the \emph{local symmetries} are defined by the group
\startformula \startformula
\{ g \in S_k \mid f(x_1, \ldots, x_k) = f(x_{g(1)}, \ldots, x_{g(k)}) \text{ for all } x_i \in X \}, \{ g \in S_k \mid f_X(x_1, \ldots, x_k) = f_X(x_{g(1)}, \ldots, x_{g(k)}) \text{ for all } x_i \in X \},
\stopformula \stopformula
where $k$ is the dimension of $X$ and $S_k$ is the \emph{symmetric group} of permutations over $k$ distinct elements. where $k$ is the dimension of $X$ and $S_k$ is the \emph{symmetric group} of permutations over $k$ distinct elements.
@ -656,8 +684,8 @@ for all $v \in A^{\ast}$.
This relation is equivariant and the set of equivalence classes $[w]_\lang$ is a nominal set. This relation is equivariant and the set of equivalence classes $[w]_\lang$ is a nominal set.
\starttheorem \starttheorem
(Myhill-Nerode theorem for nominal sets \cite[DBLP:journals/corr/BojanczykKL14]) (Myhill-Nerode theorem for nominal sets by \citenp[DBLP:journals/corr/BojanczykKL14])\\
Let $\lang$ be a regular nominal language. Let $\lang$ be a nominal language.
The following conditions are equivalent: The following conditions are equivalent:
\startitemize[m] \startitemize[m]
\item the set of equivalence classes of $\equiv_\lang$ is orbit-finite; \item the set of equivalence classes of $\equiv_\lang$ is orbit-finite;
@ -675,6 +703,7 @@ Language inclusion and minimality test for nominal DFAs.
Moreover, orbit-finite nominal sets can be finitely-represented, and so can be manipulated by algorithms. Moreover, orbit-finite nominal sets can be finitely-represented, and so can be manipulated by algorithms.
This is the key idea underpinning our implementation. This is the key idea underpinning our implementation.
\startsubsection \startsubsection
[title={Different Atom Symmetries}, [title={Different Atom Symmetries},
reference=sec:other-symms] reference=sec:other-symms]
@ -687,10 +716,10 @@ The theory of nominal automata remains similar, and an example nominal language
\stopformula \stopformula
which is recognised by a nominal DFA over those atoms. which is recognised by a nominal DFA over those atoms.
To simplify the presentation, in this paper we concentrate on the \quotation{equality atoms} only. To simplify the presentation, in this chapter we concentrate on the \quotation{equality atoms} only.
Also our implementation of nominal learning algorithms is restricted to equality atoms.
However, both the theory and the implementation can be generalised to other atom structures, with the \quotation{ordered atoms} $(\Q, <)$ as the simplest other example. However, both the theory and the implementation can be generalised to other atom structures, with the \quotation{ordered atoms} $(\Q, <)$ as the simplest other example.
We leave the details of this for a future extended version of this paper. We investigate the total order symmetry $(\Q, <)$ in \in{Chapter}[chap:ordered-nominal-sets].
\stopsubsection \stopsubsection
\stopsection \stopsection
@ -704,15 +733,15 @@ We fix a target language $\lang$, which is assumed to be a nominal regular langu
The learning algorithm for nominal automata, \nLStar{}, will be very similar to \LStar{} in \in{Algorithm}[alg:alg]. The learning algorithm for nominal automata, \nLStar{}, will be very similar to \LStar{} in \in{Algorithm}[alg:alg].
In fact, we only change the following lines: In fact, we only change the following lines:
\todo{In Alg environment?}
\placeformula[eq:changes] \placeformula[eq:changes]
\startformula[interlinespace=small]\startmathmatrix[n=2, align={middle, left}, distance=1cm] \startformula[interlinespace=small]\startmathmatrix[n=2, align={middle, left}, distance=1cm]
\NC \text{\color[darkgray]{7'}} \NC S \gets S \cup \orb(sa) \NR \NC \text{\color[darkgray]{6'}} \NC S \gets S \cup \orb(sa) \NR
\NC \text{\color[darkgray]{11'}} \NC E \gets E \cup \orb(ae) \NR \NC \text{\color[darkgray]{11'}} \NC E \gets E \cup \orb(ae) \NR
\NC \text{\color[darkgray]{16'}} \NC S \gets S \cup \pref(\orb(t)) \NR \NC \text{\color[darkgray]{16'}} \NC S \gets S \cup \pref(\orb(t)) \NR
\stopmathmatrix\stopformula \stopmathmatrix\stopformula
The basic data structure is an \emph{observation table} $(S, E, T)$ where $S$ and $E$ are orbit-finite subsets of $A^{\ast}$ and $T \colon S \cup S \Lext A \times E \to 2$ is an equivariant function defined by $T(se) = \lang(se)$ for each $s \in S \cup S \Lext A$ and $e \in E$. The basic data structure is an \emph{observation table} $(S, E, T)$ where $S$ and $E$ are orbit-finite subsets of $A^{\ast}$ and $T \colon S \cup S \Lext A \times E \to 2$ is an equivariant function defined by $T(s, e) = \lang(se)$ for each $s \in S \cup S \Lext A$ and $e \in E$.
Since $T$ is determined by $\lang$ we omit it from the notation. Since $T$ is determined by $\lang$ we omit it from the notation.
Let $row \colon S \cup S \Lext A \to 2^E$ denote the curried counterpart of $T$. Let $row \colon S \cup S \Lext A \to 2^E$ denote the curried counterpart of $T$.
Let $u \sim v$ denote the relation $row(u) = row(v)$. Let $u \sim v$ denote the relation $row(u) = row(v)$.
@ -743,19 +772,20 @@ Similarly the curried function $row$ can be represented by a finite structure.
To check whether the table is closed, we observe that if we have a corresponding row $s \in S$ for some $t \in S \Lext A$, this holds for any permutation of $t$. To check whether the table is closed, we observe that if we have a corresponding row $s \in S$ for some $t \in S \Lext A$, this holds for any permutation of $t$.
Hence it is enough to check the following: For all representatives $t \in S \Lext A$ there is a representative $s \in S$ with $row(t) = \pi \cdot row(s)$ for some permutation $\pi$. Hence it is enough to check the following: For all representatives $t \in S \Lext A$ there is a representative $s \in S$ with $row(t) = \pi \cdot row(s)$ for some permutation $\pi$.
Note that we only have to consider finitely many permutations, since the support is finite and so we can decide this property. Note that we only have to consider finitely many permutations, since the support is finite and so we can decide this property.
Furthermore if the property does not hold, we immediately find a witness represented by $t$. Furthermore, if the property does not hold, we immediately find a witness represented by $t$.
Consistency is a bit more complicated, but it is enough to consider the set of inconsistencies, $\left{(s_1, s_2, a, e) \mid row(s_1) = row(s_2) \wedge row(s_1 a)(e) \neq row(s_2 a)(e)\right}$. Consistency is a bit more complicated, but it is enough to consider the set of inconsistencies, $\left{(s_1, s_2, a, e) \mid row(s_1) = row(s_2) \wedge row(s_1 a)(e) \neq row(s_2 a)(e)\right}$.
It is an equivariant subset of $S \times S \times A \times E$ and so it is orbit-finite. It is an equivariant subset of $S \times S \times A \times E$ and so it is orbit-finite.
Hence we can decide emptiness and obtain representatives if it is non-empty. Hence we can decide emptiness and obtain representatives if it is non-empty.
Constructing the hypothesis happens in the same way as before (\in{Section}[sec:overview]), where we note the state space is orbit-finite since it is a quotient of $S$. Constructing the hypothesis happens in the same way as before (\in{Section}[sec:overview]), where we note the state space is orbit-finite since it is a quotient of $S$.
Moreover the function $row$ is equivariant, so all structure ($Q_0$, $F$ and $\delta$) is equivariant as well. Moreover, the function $row$ is equivariant, so all structure ($Q_0$, $F$ and $\delta$) is equivariant as well.
The representation given above is not the only way to represent nominal sets. The representation given above is not the only way to represent nominal sets.
For example, first-order definable sets can be used as well \cite[DBLP:journals/corr/KlinS16]. For example, first-order definable sets can be used as well \cite[DBLP:journals/corr/KlinS16].
From now on we assume to have set theoretic primitives so that each line in \in{Algorithm}[alg:alg] is well defined. From now on we assume to have set theoretic primitives so that each line in \in{Algorithm}[alg:alg] is well defined.
\startsubsection \startsubsection
[title={Correctness}] [title={Correctness}]
@ -765,12 +795,11 @@ We start out by listing some facts about observation tables.
\startlemma \startlemma
[reference=lem:row_equiv] [reference=lem:row_equiv]
%\label{lem:nerode_approx}
The relation $\sim$ is an equivariant equivalence relation. The relation $\sim$ is an equivariant equivalence relation.
Furthermore, for all $u, v \in S$ we have that $u \equiv_\lang v$ implies $u \sim v$. Furthermore, for all $u, v \in S$ we have that $u \equiv_\lang v$ implies $u \sim v$.
\stoplemma \stoplemma
\in{Lemma}[lem:row_equiv] implies that at any stage of the algorithm the number of orbits of $S / {\sim}$ does not exceed the number of orbits of the minimal acceptor with state space $A^{\ast} / {\equiv_\lang}$ (recall that $\equiv_\lang$ is the nominal Myhill-Nerode equivalence relation). This lemma implies that at any stage of the algorithm the number of orbits of $S / {\sim}$ does not exceed the number of orbits of the minimal acceptor with state space $A^{\ast} / {\equiv_\lang}$ (recall that $\equiv_\lang$ is the nominal Myhill-Nerode equivalence relation).
Moreover, the following lemma shows that the dimension of the state space never exceeds the dimension of the minimal acceptor. Moreover, the following lemma shows that the dimension of the state space never exceeds the dimension of the minimal acceptor.
Recall that the dimension is the maximal size of the support of any state, which is different than the number of orbits. Recall that the dimension is the maximal size of the support of any state, which is different than the number of orbits.
@ -783,7 +812,7 @@ We have $\supp([u]_{\sim}) \subseteq \supp([u]_{\equiv_\lang}) \subseteq \supp(u
The automaton constructed from a closed and consistent table is minimal. The automaton constructed from a closed and consistent table is minimal.
\stoplemma \stoplemma
\startproof \startproof
Follows from the categorical perspective given by \cite[authoryears][DBLP:conf/birthday/JacobsS14a]. Follows from the categorical perspective by \cite[authoryears][DBLP:conf/birthday/JacobsS14a].
\stopproof \stopproof
We note that the constructed automaton is consistent with the table (we use that the set $S$ is prefix-closed and $E$ is suffix-closed \cite[DBLP:journals/iandc/Angluin87]). We note that the constructed automaton is consistent with the table (we use that the set $S$ is prefix-closed and $E$ is suffix-closed \cite[DBLP:journals/iandc/Angluin87]).
@ -812,7 +841,7 @@ Let $n$ be the number of orbits of $H$.
The former set has $n$ orbits by the first observation, the latter set has at most $n$ orbits by assumption. The former set has $n$ orbits by the first observation, the latter set has at most $n$ orbits by assumption.
We conclude that the two sets (both being equivariant) must be equal. We conclude that the two sets (both being equivariant) must be equal.
That means that for each $q \in M'$ there is a $s \in S$ such that $row'(q) = row(s)$. That means that for each $q \in M'$ there is a $s \in S$ such that $row'(q) = row(s)$.
We see that $row' \colon M' \twoheadrightarrow \{ row'(\delta'(q_0', s)) \mid s \} = H$ is a surjective map. We see that $row' \colon M' \to \{ row'(\delta'(q_0', s)) \mid s \in S \} = H$ is a surjective map.
Since a surjective map cannot increase the dimensions of orbits and the dimensions of $M'$ are bounded, we note that the dimensions of the orbits in $H$ and $M'$ have to agree. Since a surjective map cannot increase the dimensions of orbits and the dimensions of $M'$ are bounded, we note that the dimensions of the orbits in $H$ and $M'$ have to agree.
Similarly, surjective maps preserve local symmetries. Similarly, surjective maps preserve local symmetries.
This map must hence be an isomorphism of nominal sets. This map must hence be an isomorphism of nominal sets.
@ -869,7 +898,7 @@ It remains to prove that it needs only finitely many equivalence queries.
Let $(S, E)$ be the closed and consistent table and $H$ its corresponding hypothesis. Let $(S, E)$ be the closed and consistent table and $H$ its corresponding hypothesis.
If it is incorrect, then a second hypothesis $H'$ will be constructed which is consistent with the old table $(S, E)$. If it is incorrect, then a second hypothesis $H'$ will be constructed which is consistent with the old table $(S, E)$.
The two hypotheses are nonequivalent, as $H'$ will handle the counter example correctly and $H$ does not. The two hypotheses are nonequivalent, as $H'$ will handle the counterexample correctly and $H$ does not.
Therefore, $H'$ will have at least one orbit more, one local symmetry less, or one orbit will have strictly bigger dimension (\in{Lemma}[lem:minimal_wrt_table]), Therefore, $H'$ will have at least one orbit more, one local symmetry less, or one orbit will have strictly bigger dimension (\in{Lemma}[lem:minimal_wrt_table]),
all of which can only happen finitely often. all of which can only happen finitely often.
\stopproof \stopproof
@ -884,6 +913,9 @@ From the proof \in{Theorem}[thm:termination] we observe moreover that the way we
Any other method which ensures a nonequivalent hypothesis will work. Any other method which ensures a nonequivalent hypothesis will work.
In particular our algorithm is easily adapted to include optimisations such as the ones by \cite[authoryears][DBLP:journals/iandc/RivestS93, DBLP:journals/iandc/MalerP95], where counterexamples are added as columns. In particular our algorithm is easily adapted to include optimisations such as the ones by \cite[authoryears][DBLP:journals/iandc/RivestS93, DBLP:journals/iandc/MalerP95], where counterexamples are added as columns.
\todo{Consistentie is nodig?}
\stopsubsection \stopsubsection
\startsubsection \startsubsection
[title={Example}] [title={Example}]
@ -894,7 +926,7 @@ In particular our algorithm is easily adapted to include optimisations such as t
list={Example automaton with three subsequent tables computed by \nLStar{}.}, list={Example automaton with three subsequent tables computed by \nLStar{}.},
reference=fig:ex, reference=fig:ex,
location=page] location=page]
\startcombination[nx=3,ny=1,location=middle,distance=.5cm] \startcombination[nx=3,ny=1,location=middle,distance=.8cm]
{\starttikzpicture {\starttikzpicture
\node[initial, state] (q0) {$q_0$}; \node[initial, state] (q0) {$q_0$};
\node[state, right of=q0] (q1) {$q_{1,x}$}; \node[state, right of=q0] (q1) {$q_{1,x}$};
@ -965,13 +997,13 @@ We remind the reader that the table is represented in a symbolic way:
The sequences in the rows and columns stand for whole orbits and the cells denote functions from the product of the orbits to $2$. The sequences in the rows and columns stand for whole orbits and the cells denote functions from the product of the orbits to $2$.
Since the cells can consist of multiple orbits, where each orbit is allowed to have a different value, we use a formula to specify which orbits have a $1$. Since the cells can consist of multiple orbits, where each orbit is allowed to have a different value, we use a formula to specify which orbits have a $1$.
The table $T_1$ at some stage of the algorithm has to be checked for closedness and consistency. The table $T_1$ has to be checked for closedness and consistency.
We note that it is definitely closed. We note that it is definitely closed.
For consistency we check the rows $row(\epsilon)$ and $row(a)$ which are equal. For consistency we check the rows $row(\epsilon)$ and $row(a)$ which are equal.
Observe, however, that $row(\epsilon b)(\epsilon) = 0$ and $row(ab)(\epsilon) = 1$, so we have an inconsistency. Observe, however, that $row(\epsilon b)(\epsilon) = 0$ and $row(ab)(\epsilon) = 1$, so we have an inconsistency.
The algorithm adds the orbit $\orb(b)$ as column and extends the table, obtaining $T_2$. The algorithm adds the orbit $\orb(b)$ as column and extends the table, obtaining $T_2$.
We note that, in this process, the number of orbits did grow, as the two rows are split. We note that, in this process, the number of orbits did grow, as the two rows are split.
Furthermore we see that both $row(a)$ and $row(ab)$ have empty support in $T_1$, but not in $T_2$, Furthermore, we see that both $row(a)$ and $row(ab)$ have empty support in $T_1$, but not in $T_2$,
because $row(a)(a')$ depends on $a'$ being equal or different from $a$, similarly for $row(ab)(a')$. because $row(a)(a')$ depends on $a'$ being equal or different from $a$, similarly for $row(ab)(a')$.
The table $T_2$ is still not consistent as we see that $row(ab) = row(ba)$ but $row(abb)(c) = 1$ and $row(bab)(c) = 0$. The table $T_2$ is still not consistent as we see that $row(ab) = row(ba)$ but $row(abb)(c) = 1$ and $row(bab)(c) = 0$.
@ -980,6 +1012,7 @@ We note that in this case, no new orbits are obtained and no support has grown.
In fact, the only change here is that the local symmetry between $row(ab)$ and $row(ba)$ is removed. In fact, the only change here is that the local symmetry between $row(ab)$ and $row(ba)$ is removed.
This last table, $T_3$, is closed and consistent and will produce the correct hypothesis. This last table, $T_3$, is closed and consistent and will produce the correct hypothesis.
\stopsubsection \stopsubsection
\startsubsection \startsubsection
[title={Query Complexity}, [title={Query Complexity},
@ -1022,6 +1055,8 @@ Next we will give a bound for the size of the table.
Since they are suffix closed, the length is at most $n (k + k \log k + 1)$. Since they are suffix closed, the length is at most $n (k + k \log k + 1)$.
\stopproof \stopproof
\todo{Betere bound \citet[CameronST89] = $ceil \frac{3}{2} n - b(n) - 1$.}
Let $p$ and $l$ denote respectively the dimension and the number of orbits of $A$. Let $p$ and $l$ denote respectively the dimension and the number of orbits of $A$.
\startlemma \startlemma
@ -1035,12 +1070,16 @@ Let $p$ and $l$ denote respectively the dimension and the number of orbits of $A
\stopproof \stopproof
Let $C_{n,k,m} = (n + m E_{n,k}) (l f_\EAlph(p(n+m), p) + 1) n (k + k \log k + 1)$ be the maximal number of cells in the table. Let $C_{n,k,m} = (n + m E_{n,k}) (l f_\EAlph(p(n+m), p) + 1) n (k + k \log k + 1)$ be the maximal number of cells in the table.
We note that this number is polynomial in $k, l, m$ and $n$ but not in $p$. We note that this number is polynomial in $k, l, m$ and $n$ but it is not polynomial in $p$.
\startcorollary \startcorollary
The number of membership queries is bounded by $C_{n,k,m} f_\EAlph(p (n + m), p n (k + k \log k + 1))$. The number of membership queries is bounded by
\startformula
C_{n,k,m} f_\EAlph(p (n + m), p n (k + k \log k + 1)).
\stopformula
\stopcorollary \stopcorollary
\stopsubsection \stopsubsection
\stopsection \stopsection
\startsection \startsection
@ -1121,7 +1160,7 @@ Given a RFSA-closed and RFSA-consistent table $(S, E)$, the conjecture automaton
\item $Q = PR^\top(S,E)$; \item $Q = PR^\top(S,E)$;
\item $Q_0 = \{ r \in Q \mid r \rowincl row(\epsilon)\}$; \item $Q_0 = \{ r \in Q \mid r \rowincl row(\epsilon)\}$;
\item $F = \{r \in Q \mid r(\epsilon) = 1\}$; \item $F = \{r \in Q \mid r(\epsilon) = 1\}$;
\item the transition relation $\delta \subseteq Q \times A \times Q$ is given by $\delta(row(s),a) = \{ r \in Q \mid r \rowincl row(sa) \}$. \item the transition relation is given by $\delta(row(s),a) = \{ r \in Q \mid r \rowincl row(sa) \}$.
\stopitemize \stopitemize
\stopdefinition \stopdefinition
@ -1135,16 +1174,17 @@ To deal with this, we introduce a nominal notion of RFSAs.
They are a \emph{proper subclass} of nominal NFAs, because they recognise nominal regular languages. They are a \emph{proper subclass} of nominal NFAs, because they recognise nominal regular languages.
Nonetheless, they are more succinct than nominal DFAs. Nonetheless, they are more succinct than nominal DFAs.
\startsubsection \startsubsection
[title={Nominal Residual Finite-State Automata}, [title={Nominal Residual Finite-State Automata},
reference=subsec:nrfsa] reference=subsec:nrfsa]
Let $\lang$ be a nominal regular language and let $u$ be a finite string. Let $\lang$ be a nominal language and $u$ be a finite string.
The \emph{derivative} of $\lang$ w.r.t. $u$ is The \emph{derivative} of $\lang$ w.r.t. $u$ is
\startformula \startformula
\lder{u}\lang = \{ v \in A^{\ast} \mid uv \in \lang \}. \lder{u}\lang = \{ v \in A^{\ast} \mid uv \in \lang \}.
\stopformula \stopformula
A set $\lang' \subseteq \EAlph^*$ is a \emph{residual} of $\lang$ if there is $u$ with $\lang' = \lder{u}\lang$. A language $\lang' \subseteq \EAlph^*$ is a \emph{residual} of $\lang$ if there is $u$ with $\lang' = \lder{u}\lang$.
Note that a residual might not be equivariant, but it does have a finite support. Note that a residual might not be equivariant, but it does have a finite support.
We write $R(\lang)$ for the set of residuals of $\lang$. We write $R(\lang)$ for the set of residuals of $\lang$.
Residuals form an orbit-finite nominal set: Residuals form an orbit-finite nominal set:
@ -1196,7 +1236,7 @@ Our algorithm is a simple modification of the algorithm in \in{Algorithm}[alg:nf
\startformula[interlinespace=small]\startmathmatrix[n=2, align={middle, left}, distance=1cm] \startformula[interlinespace=small]\startmathmatrix[n=2, align={middle, left}, distance=1cm]
\NC \text{\color[darkgray]{6'}} \NC S \gets S \cup \orb(sa) \NR \NC \text{\color[darkgray]{6'}} \NC S \gets S \cup \orb(sa) \NR
\NC \text{\color[darkgray]{11'}} \NC E \gets E \cup \orb(ae) \NR \NC \text{\color[darkgray]{11'}} \NC E \gets E \cup \orb(ae) \NR
\NC \text{\color[darkgray]{16'}} \NC S \gets S \cup \suff(\orb(t)) \NR \NC \text{\color[darkgray]{16'}} \NC E \gets E \cup \suff(\orb(t)) \NR
\stopmathmatrix\stopformula \stopmathmatrix\stopformula
Switching to nominal sets, several decidability issues arise. Switching to nominal sets, several decidability issues arise.
The most critical one is that rows may be the union of infinitely many component rows, as happens for residuals of nominal languages, so finding all such components can be challenging. The most critical one is that rows may be the union of infinitely many component rows, as happens for residuals of nominal languages, so finding all such components can be challenging.
@ -1206,13 +1246,13 @@ row(t) = \Rowunion \{ row(s) \mid row(s) \rowincls row(t) \}.
\stopformula \stopformula
where $\rowincls$ is \emph{strict} row inclusion; otherwise $row(t)$ is prime. where $\rowincls$ is \emph{strict} row inclusion; otherwise $row(t)$ is prime.
We now check that relevant parts of our algorithm terminate. We now check that three relevant parts of our algorithm terminate.
{\bf Row Containment Check.} \description{1. Row containment check.}
The basic containment check $row(s) \rowincl row(t)$ is decidable, as $row(s)$ and $row(t)$ are supported by the finite supports of $s$ and $t$ respectively. The basic containment check $row(s) \rowincl row(t)$ is decidable, as $row(s)$ and $row(t)$ are supported by the finite supports of $s$ and $t$ respectively.
{\bf \inline{Line}[line:nfa-checks].} \startdescription[title={2. RFSA-Closedness and RFSA-Consistency Checks.}]
RFSA-Closedness and RFSA-Consistency Checks. (\inline{Line}[line:nfa-checks])
We first show that prime rows form orbit-finite nominal sets. We first show that prime rows form orbit-finite nominal sets.
\startlemma[reference=lem:pr-ns] \startlemma[reference=lem:pr-ns]
@ -1236,20 +1276,22 @@ of permutations fixing $\supp(row(t))$.
\stoplemma \stoplemma
We established that $C(row(t))$ can be effectively computed, and the same holds for $\Rowunion C(row(t))$. We established that $C(row(t))$ can be effectively computed, and the same holds for $\Rowunion C(row(t))$.
In fact, $\Rowunion$ is equivariant w.r.t the whole $Perm(\EAlph)$ and then, in particular, w.r.t. $G_t$, so it preserves orbit-finiteness. In fact, $\Rowunion$ is equivariant w.r.t. the whole $Perm(\EAlph)$ and then, in particular, w.r.t. $G_t$, so it preserves orbit-finiteness.
Now, to check $row(t) = \Rowunion C(row(t))$, we can just pick one representative of every orbit of $S \Lext A$, because we have $C(\pi \cdot row(t)) = \pi \cdot C(row(t))$ and permutations distribute over $\rowunion$, so permuting both sides of the equation gives again a valid equation. Now, to check $row(t) = \Rowunion C(row(t))$, we can just pick one representative of every orbit of $S \Lext A$, because we have $C(\pi \cdot row(t)) = \pi \cdot C(row(t))$ and permutations distribute over $\rowunion$, so permuting both sides of the equation gives again a valid equation.
For RFSA-consistency, consider the two sets: For RFSA-consistency, consider the two sets
\startformula\startalign \startformula\startalign
\NC N \NC = \{ (s_1, s_2) \in S \times S \mid row(s_1) \rowincl row(s_2) \} \NR \NC N \NC = \{ (s_1, s_2) \in S \times S \mid row(s_1) \rowincl row(s_2) \}, \text{ and} \NR
\NC M \NC = \{ (s_1, s_2) \in S \times S \mid \forall a \in A : row(s_1a) \rowincl row(s_2a) \} \NR \NC M \NC = \{ (s_1, s_2) \in S \times S \mid \forall a \in A : row(s_1a) \rowincl row(s_2a) \}. \NR
\stopalign\stopformula \stopalign\stopformula
They are both orbit-finite nominal sets, by equivariance of $row$, $\rowincl$ and $A$. They are both orbit-finite nominal sets, by equivariance of $row$, $\rowincl$ and $A$.
We can check RFSA-consistency in finite time by picking orbit representatives from $N$ and $M$. We can check RFSA-consistency in finite time by picking orbit representatives from $N$ and $M$.
For each representative $n \in N$, we look for a representative $m \in M$ and a permutation $\pi$ such that $n = \pi \cdot m$. For each representative $n \in N$, we look for a representative $m \in M$ and a permutation $\pi$ such that $n = \pi \cdot m$.
If no such $m$ and $\pi$ exist, then $n$ does not belong to any orbit of $M$, so it violates RFSA-consistency. If no such $m$ and $\pi$ exist, then $n$ does not belong to any orbit of $M$, so it violates RFSA-consistency.
\stopdescription
{\bf \inline{Lines}[line:nfa-clos-witness] and \inlinerange[line:nfa-cons-witness]: Finding Witnesses for Violations.} \description{3. Finding Witnesses for Violations.}
(\inline{Lines}[line:nfa-clos-witness] and \inlinerange[line:nfa-cons-witness])
We can find witnesses by comparing orbit representatives of orbit-finite sets, as we did with RFSA-consistency. We can find witnesses by comparing orbit representatives of orbit-finite sets, as we did with RFSA-consistency.
Specifically, we can pick representatives in $S \times A$ and $S \times S \times A \times E$ and check them against the following orbit-finite nominal sets: Specifically, we can pick representatives in $S \times A$ and $S \times S \times A \times E$ and check them against the following orbit-finite nominal sets:
\startitemize \startitemize
@ -1335,14 +1377,14 @@ Now we prove that a finite number of equivalence queries is needed to reach the
To do this, we cannot use a suitable version of \in{Lemma}[lem:minimal_wrt_table], because this relies on $N(S, E)$ being consistent with $(S, E)$, which in general is not true (see \cite[BolligHKL08] for an example of this). To do this, we cannot use a suitable version of \in{Lemma}[lem:minimal_wrt_table], because this relies on $N(S, E)$ being consistent with $(S, E)$, which in general is not true (see \cite[BolligHKL08] for an example of this).
We can, however, use an argument similar to that for RFSA-consistency, because the algorithm adds columns in response to counterexamples. We can, however, use an argument similar to that for RFSA-consistency, because the algorithm adds columns in response to counterexamples.
Let $w$ the counterexample provided by the teacher. Let $w$ the counterexample provided by the teacher.
When $12'$ is executed, the table must change. When $16'$ is executed, the table must change.
In fact, by Lemma 2 in \cite[DBLP:conf/ijcai/BolligHKL09], if it does not, then $w$ is already correctly classified by $N(S, E)$, which is absurd. In fact, by Lemma 2 of \citet[DBLP:conf/ijcai/BolligHKL09], if it does not, then $w$ is already correctly classified by $N(S, E)$, which is absurd.
We have the following cases. We have the following cases.
{\bf E1)} orbits of $(S \cup S \Lext A)/{\sim}$ increase ({\bf C1}). {\bf E1)} orbits of $(S \cup S \Lext A)/{\sim}$ increase ({\bf C1}).
Or, {\bf E2)} either: Orbits in $PR(S, E)$ increase, or any of the following happens: Or, {\bf E2)} either: Orbits in $PR(S, E)$ increase, or any of the following happens:
Orbits in $I$ decrease ({\bf C2}), local symmetries/dimension of an orbit of rows change ({\bf C3}). Orbits in $I$ decrease ({\bf C2}), local symmetries/dimension of an orbit of rows change ({\bf C3}).
In fact, if {\bf E1} does not happen and $PR(S, E)$, $I$ and local symmetries/dimension of orbits of rows do not change, the automaton $\autom$ for the new table coincides with $N(S, E)$. In fact, if {\bf E1} does not happen and $PR(S, E)$, $I$ and local symmetries/dimension of orbits of rows do not change, the automaton $\autom$ for the new table coincides with $N(S, E)$.
But $N(S, E) = \autom$ is a contradiction, because $\autom$ correctly classifies $w$ (by Lemma 2 in \cite[DBLP:conf/ijcai/BolligHKL09], as $w$ now belongs to columns), whereas $N(S, E)$ does not. But $N(S, E) = \autom$ is a contradiction, because $\autom$ correctly classifies $w$ (by Lemma 2 of \citet[DBLP:conf/ijcai/BolligHKL09], as $w$ now belongs to columns), whereas $N(S, E)$ does not.
Both {\bf E1} and {\bf E2} can only happen finitely many times. Both {\bf E1} and {\bf E2} can only happen finitely many times.
\stopproof \stopproof
@ -1363,7 +1405,7 @@ Clearly the first two ones can happen at most $n$ times.
We now estimate how many times $I$ can decrease. We now estimate how many times $I$ can decrease.
Suppose $(S \cup S\Lext A)/{\sim}$ has $d$ orbits and $h$ orbits are added to it. Suppose $(S \cup S\Lext A)/{\sim}$ has $d$ orbits and $h$ orbits are added to it.
Recall that, given an orbit $O$ of rows of dimension at most $m$, $f_\EAlph(m, m)$ is an upper bound for the number of orbits in the product $O \times O$. Recall that, given an orbit $O$ of rows of dimension at most $m$, $f_\EAlph(m, m)$ is an upper bound for the number of orbits in the product $O \times O$.
Since the support of rows is bounded by $k$, we can give a bound for the number of orbits added to $I$: $dhf_\EAlph(k, k)$, for new pairs $r \rowincls r'$ with $r$ in a new orbit of rows and $r'$ in an old one (or viceversa); plus $(h(h-1)/2)f_\EAlph(k, k)$, for $r$ and $r'$ both in (distinct) new orbits; plus $hf_\EAlph(k, k)$, for $r$ and $r'$ in the same new orbit. Since the support of rows is bounded by $k$, we can give a bound for the number of orbits added to $I$: $dhf_\EAlph(k, k)$, for new pairs $r \rowincls r'$ with $r$ in a new orbit of rows and $r'$ in an old one (or vice versa); plus $(h(h-1)/2)f_\EAlph(k, k)$, for $r$ and $r'$ both in (distinct) new orbits; plus $hf_\EAlph(k, k)$, for $r$ and $r'$ in the same new orbit.
Notice that, if $PR(S, E)$ grows but $(S \cup S\Lext A)/{\sim}$ does not, $I$ does not increase. Notice that, if $PR(S, E)$ grows but $(S \cup S\Lext A)/{\sim}$ does not, $I$ does not increase.
By \in{Lemma}[lem:rows-min-dfa], $h, d \leq n$, so $I$ cannot decrease more than $(n^2 + n(n-1)/2 + n)f_\EAlph(k,k)$ times. By \in{Lemma}[lem:rows-min-dfa], $h, d \leq n$, so $I$ cannot decrease more than $(n^2 + n(n-1)/2 + n)f_\EAlph(k,k)$ times.
@ -1411,7 +1453,7 @@ The number of membership queries is bounded by $C_{n,k,m}' f_\EAlph(p n, p m E'_
Our algorithms for learning nominal automata operate on infinite sets of rows and columns, and hence it is not immediately clear how to actually implement them on a computer. Our algorithms for learning nominal automata operate on infinite sets of rows and columns, and hence it is not immediately clear how to actually implement them on a computer.
We have used NLambda, a recently developed Haskell library by \cite[authoryears][DBLP:journals/corr/KlinS16] designed to allow direct manipulation of infinite (but orbit-finite) nominal sets, within the functional programming paradigm. We have used NLambda, a recently developed Haskell library by \cite[authoryears][DBLP:journals/corr/KlinS16] designed to allow direct manipulation of infinite (but orbit-finite) nominal sets, within the functional programming paradigm.
The semantics of NLambda is based on \cite[DBLP:conf/popl/BojanczykBKL12], and the library itself is inspired by Fresh O'Caml by \cite[authoryears][DBLP:journals/entcs/Shinwell06], a language for functional programming over nominal data structures with binding. The semantics of NLambda is based by \citet[DBLP:conf/popl/BojanczykBKL12], and the library itself is inspired by Fresh O'Caml by \cite[authoryears][DBLP:journals/entcs/Shinwell06], a language for functional programming over nominal data structures with binding.
\subsection{NLambda} \subsection{NLambda}
@ -1427,7 +1469,7 @@ All these are instances of a type class \kw{NominalType} specified in NLambda fo
NLambda, in addition to all the standard machinery of Haskell, offers primitives to manipulate values of any nominal types $\tau, \sigma$: NLambda, in addition to all the standard machinery of Haskell, offers primitives to manipulate values of any nominal types $\tau, \sigma$:
\startitemize \startitemize[after]
\item $\kw{empty} : \kw{Set}\ \tau$, returns the empty set of any type; \item $\kw{empty} : \kw{Set}\ \tau$, returns the empty set of any type;
\item $\kw{atoms} : \kw{Set}\ \kw{Atoms}$, returns the (infinite but single-orbit) set of all atoms; \item $\kw{atoms} : \kw{Set}\ \kw{Atoms}$, returns the (infinite but single-orbit) set of all atoms;
\item $\kw{insert} : \tau \to \kw{Set}\ \tau \to \kw{Set}\ \tau$, adds an element to a set; \item $\kw{insert} : \tau \to \kw{Set}\ \tau \to \kw{Set}\ \tau$, adds an element to a set;
@ -1436,31 +1478,34 @@ NLambda, in addition to all the standard machinery of Haskell, offers primitives
\item $\kw{isEmpty} : \kw{Set}\ \tau \to \kw{Formula}$, checks whether a set is empty. \item $\kw{isEmpty} : \kw{Set}\ \tau \to \kw{Formula}$, checks whether a set is empty.
\stopitemize \stopitemize
The type \kw{Formula} takes the role of a Boolean type. The type \kw{Formula} has the role of a Boolean type.
For technical reasons it is distinct from the standard Haskell type \kw{Bool}, but it provides standard logical operations such as For technical reasons, it is distinct from the standard Haskell type \kw{Bool}, but it provides standard logical operations, e.g.,
\startformula \startformula
\kw{not} : \kw{Formula} \to \kw{Formula}, \qquad \kw{not} : \kw{Formula} \to \kw{Formula}, \qquad
\kw{or} : \kw{Formula} \to \kw{Formula} \to \kw{Formula}, \kw{or} : \kw{Formula} \to \kw{Formula} \to \kw{Formula},
\stopformula \stopformula
as well as a conditional operator $\kw{ite} : \kw{Formula} \to \tau \to \tau \to \tau$ that mimics the standard \kw{if-then-else} construction. as well as a conditional operator $\kw{ite} : \kw{Formula} \to \tau \to \tau \to \tau$ that mimics the standard \kw{if-then-else} construction.
It is also the result type of a built-in equality test on atoms, $\kw{eq} : \kw{Atoms} \to \kw{Atoms} \to \kw{Formula}$. It is also the result type of a built-in equality test on atoms:
\startformula
\kw{eq} : \kw{Atoms} \to \kw{Atoms} \to \kw{Formula}.
\stopformula
Using these primitives, one builds more functions to operate on orbit-finite sets, such as a function to build singleton sets: Using these primitives, one builds more functions to operate on orbit-finite sets, such as a function to build singleton sets:
\startformula\startalign[align={left}] \startformula\startmathmatrix[align={left}]
\NC \kw{singleton} : \tau \to \kw{Set}\ \tau \NR \NC \kw{singleton} : \tau \to \kw{Set}\ \tau \NR
\NC \kw{singleton}\ x = \kw{insert}\ x\ \kw{empty} \NR \NC \kw{singleton}\ x = \kw{insert}\ x\ \kw{empty} \NR
\stopalign\stopformula \stopmathmatrix\stopformula
or a filtering function to select elements that satisfy a given predicate: or a filtering function to select elements that satisfy a given predicate:
\startformula\startalign[align={left}] \startformula\startmathmatrix[align={left}]
\NC \kw{filter} : (\tau \to \kw{Formula}) \to \kw{Set}\ \tau \to \kw{Set}\ \tau \NR \NC \kw{filter} : (\tau \to \kw{Formula}) \to \kw{Set}\ \tau \to \kw{Set}\ \tau \NR
\NC \kw{filter}\ p\ s = \kw{sum}\ (\kw{map}\ (\lambda x .\ \kw{ite}\ (p\ x)\ (\kw{singleton}\ x)\ \kw{empty})\ s) \NR \NC \kw{filter}\ p\ s = \kw{sum}\ (\kw{map}\ (\lambda x .\ \kw{ite}\ (p\ x)\ (\kw{singleton}\ x)\ \kw{empty})\ s) \NR
\stopalign\stopformula \stopmathmatrix\stopformula
or functions to quantify a predicate over a set: or functions to quantify a predicate over a set:
\startformula\startalign[align={left}] \startformula\startmathmatrix[align={left}]
\NC \kw{exists}, \kw{forall} : (\tau \to \kw{Formula}) \to \kw{Set}\ \tau \to \kw{Formula} \NR \NC \kw{exists}, \kw{forall} : (\tau \to \kw{Formula}) \to \kw{Set}\ \tau \to \kw{Formula} \NR
\NC \kw{exists}\ p\ s = \kw{not}\ (\kw{isEmpty}\ (\kw{filter}\ p\ s)) \NR \NC \kw{exists}\ p\ s = \kw{not}\ (\kw{isEmpty}\ (\kw{filter}\ p\ s)) \NR
\NC \kw{forall}\ p\ s = \kw{isEmpty}\ (\kw{filter}\ (\lambda x .\ \kw{not}\ (p\ x))\ s) \NR \NC \kw{forall}\ p\ s = \kw{isEmpty}\ (\kw{filter}\ (\lambda x .\ \kw{not}\ (p\ x))\ s) \NR
\stopalign\stopformula \stopmathmatrix\stopformula
and so on. and so on.
Note that these functions are written in exactly the same way as they would be for finite sets and the standard \kw{Data.Set} type. Note that these functions are written in exactly the same way as they would be for finite sets and the standard \kw{Data.Set} type.
This is not an accident, and indeed the programmer can use the convenient set-theoretic intuition of NLambda primitives. This is not an accident, and indeed the programmer can use the convenient set-theoretic intuition of NLambda primitives.
@ -1493,8 +1538,9 @@ In the current implementation of NLambda, an external SMT solver Z3 \cite[DBLP:c
For example, to evaluate the expression $\kw{isEmpty}\ \kw{distPairs}$, For example, to evaluate the expression $\kw{isEmpty}\ \kw{distPairs}$,
NLambda makes a system call to the SMT solver to check whether the formula $a \neq b$ is satisfiable in the first-order theory of equality and, after receiving the affirmative answer, returns the value \kw{False}. NLambda makes a system call to the SMT solver to check whether the formula $a \neq b$ is satisfiable in the first-order theory of equality and, after receiving the affirmative answer, returns the value \kw{False}.
For more details about the semantics and implementation of NLambda, see \cite[DBLP:journals/corr/KlinS16]. For more details about the semantics and implementation of NLambda, see \citet[DBLP:journals/corr/KlinS16].
The library itself can be downloaded from \cite[nlambda-code]. The library itself can be downloaded from \cite[url][nlambda-code].
\stopsubsection \stopsubsection
\startsubsection \startsubsection
@ -1503,7 +1549,7 @@ The library itself can be downloaded from \cite[nlambda-code].
Using NLambda we implemented the algorithms from \in{Sections}[sec:nangluin] and \in{}[sec:nondet]. Using NLambda we implemented the algorithms from \in{Sections}[sec:nangluin] and \in{}[sec:nondet].
We note that the internal representation is slightly different than the one discussed in \in{Section}[sec:nangluin]. We note that the internal representation is slightly different than the one discussed in \in{Section}[sec:nangluin].
Instead of representing the table $(S, E)$ with actual representatives of orbits, the sets are represented logically as described above. Instead of representing the table $(S, E)$ with actual representatives of orbits, the sets are represented logically as described above.
Furthermore the control flow of the algorithm is adapted to fit in the functional programming paradigm. Furthermore, the control flow of the algorithm is adapted to fit in the functional programming paradigm.
In particular, recursion is used instead of a while loop. In particular, recursion is used instead of a while loop.
In addition to the nominal adaptation of Angluin's algorithm \nLStar{}, we implemented a variant, \nLStarCol{} which adds counterexamples to the columns instead of rows. In addition to the nominal adaptation of Angluin's algorithm \nLStar{}, we implemented a variant, \nLStarCol{} which adds counterexamples to the columns instead of rows.
@ -1517,9 +1563,9 @@ As an optimisation, we use bisimulation up to congruence as described by \cite[a
Having an approximate teacher is a minor issue since in many applications no complete teacher can be implemented and one relies on testing \cite[DBLP:conf/ictac/AartsFKV15, DBLP:conf/dlt/BolligHLM13]. Having an approximate teacher is a minor issue since in many applications no complete teacher can be implemented and one relies on testing \cite[DBLP:conf/ictac/AartsFKV15, DBLP:conf/dlt/BolligHLM13].
For the experiments listed here the bound was chosen large enough for the learner to terminate with the correct automaton. For the experiments listed here the bound was chosen large enough for the learner to terminate with the correct automaton.
We remark that our algorithms seamlessly merge with teachers written in NLambda, but the current version of the library does not allow generating concrete membership queries for external teachers. %We remark that our algorithms seamlessly merge with teachers written in NLambda, but the current version of the library does not allow generating concrete membership queries for external teachers.
We are currently working on a new version of the library in which this will be possible. %We are currently working on a new version of the library in which this will be possible.
\todo{Dit kan nu al} %\todo{Dit kan nu al}
\stopsubsection \stopsubsection
@ -1527,14 +1573,15 @@ We are currently working on a new version of the library in which this will be p
[title={Test Cases}, [title={Test Cases},
reference=sec:tests] reference=sec:tests]
To provide a benchmark for future improvements, we tested our algorithms on a few simple automata described below. We report results in \in{Table}[tab:results]. To provide a benchmark for future improvements, we tested our algorithms on simple automata described below. We report results in \in{Table}[tab:results].
The experiments were performed on a machine with an Intel Core i5 (Skylake, 2.4 GHz) and 8 GB RAM. The experiments were performed on a machine with an Intel Core i5 (Skylake, 2.4 GHz) and 8 GB RAM.
\placetable[here][tab:results] \startplacetable
{Results of experiments. [title={Results of experiments.
The column DFA (resp. RFSA) shows the number of orbits (left sub-column) and dimension (right sub-column) of the learnt minimal DFA (resp. canonical RFSA). The column DFA (resp. RFSA) shows the number of orbits (left sub-column) and dimension (right sub-column) of the learnt minimal DFA (resp. canonical RFSA).
We use $\infty$ when the running time is too high. We use $\infty$ when the running time is too high.},
\todo{Getallen in mathmode?}} list={Results of learning experiments},
reference=tab:results]
\starttabulate[|l|c|c|cg{.}|cg{.}|c|c|cg{.}|][distance=none] \starttabulate[|l|c|c|cg{.}|cg{.}|c|c|cg{.}|][distance=none]
\NC Model \NC \NC \VL \nLStar{} (s) \NC \nLStar{}$_{\text{col}}$ (s) \VL \NC \VL \nNLStar{} (s) \NC\NR \NC Model \NC \NC \VL \nLStar{} (s) \NC \nLStar{}$_{\text{col}}$ (s) \VL \NC \VL \nNLStar{} (s) \NC\NR
\HL \HL
@ -1556,15 +1603,19 @@ We use $\infty$ when the running time is too high.
\HL \HL
\NC $\lang_{eq}$ \NC n/a \NC n/a \VL n/a \NC n/a \VL 3 \NC 1 \VL 16.3 \NC\NR \NC $\lang_{eq}$ \NC n/a \NC n/a \VL n/a \NC n/a \VL 3 \NC 1 \VL 16.3 \NC\NR
\stoptabulate \stoptabulate
\stopplacetable
{\bf Queue Data Structure.} \startdescription[title={Queue Data Structure.}]
A queue is a data structure to store elements which can later be retrieved in a first-in, first-out order. A queue is a data structure to store elements which can later be retrieved in a first-in, first-out order.
It has two operations: \kw{push} and \kw{pop}. It has two operations: \kw{push} and \kw{pop}.
We define the alphabet $\Sigma_{\text{FIFO}} = \{ \kw{push}(a), \kw{pop}(a) \mid a \in \EAlph \}$. We define the alphabet $\Sigma_{\text{FIFO}} = \{ \kw{push}(a), \kw{pop}(a) \mid a \in \EAlph \}$.
The language $\text{FIFO}_n$ contains all valid traces of \kw{push} and \kw{pop} using a bounded queue of size $n$. The language $\text{FIFO}_n$ contains all valid traces of \kw{push} and \kw{pop} using a bounded queue of size $n$.
The minimal nominal DFA for $\text{FIFO}_2$ is The minimal nominal DFA for $\text{FIFO}_2$ is given in \in{Figure}[fig:fifo2].
\starttikzpicture[node distance=4cm] \startplacefigure
[title={A nominal automaton accepting $\text{FIFO}_2$.},
reference=fig:fifo2]
\hbox{\starttikzpicture[node distance=4cm]
\node[state, initial, accepting] (q0) {$q_0$}; \node[state, initial, accepting] (q0) {$q_0$};
\node[state, accepting, right of=q0] (q1) {$q_{1,x}$}; \node[state, accepting, right of=q0] (q1) {$q_{1,x}$};
\node[state, accepting, right of=q1] (q2) {$q_{2, x, y}$}; \node[state, accepting, right of=q1] (q2) {$q_{2, x, y}$};
@ -1579,23 +1630,31 @@ The minimal nominal DFA for $\text{FIFO}_2$ is
(q1) edge[bend left] node[right] {$\kw{pop}(\neq x)$} (bot) (q1) edge[bend left] node[right] {$\kw{pop}(\neq x)$} (bot)
(q2) edge[bend left] node[below, align=center] {$\kw{pop}(\neq x)$ \\ $\kw{push}(\EAlph)$} (bot) (q2) edge[bend left] node[below, align=center] {$\kw{pop}(\neq x)$ \\ $\kw{push}(\EAlph)$} (bot)
(bot) edge[loop left] node {$\ast$} (bot); (bot) edge[loop left] node {$\ast$} (bot);
\stoptikzpicture \stoptikzpicture}
\stopplacefigure
The state reached from $q_{1,x}$ via $\kw{push}(x)$ is omitted: The state reached from $q_{1,x}$ via $\kw{push}(x)$ is omitted:
Its outgoing transitions are those of $q_{2,x,y}$, where $y$ is replaced by $x$. Its outgoing transitions are those of $q_{2,x,y}$, where $y$ is replaced by $x$.
Similar benchmarks appear in \cite[DBLP:conf/ictac/AartsFKV15, DBLP:journals/ml/IsbernerHS14]. Similar benchmarks appear in \cite[DBLP:conf/ictac/AartsFKV15, DBLP:journals/ml/IsbernerHS14].
\stopdescription
{\bf Double Word.}
\description{Double Word.}
$\lang_n = \{ ww \mid w \in \EAlph^n \}$ from \in{Section}[sec:overview]. $\lang_n = \{ ww \mid w \in \EAlph^n \}$ from \in{Section}[sec:overview].
{\bf NFA.} \startdescription[title={NFA.}]
Consider the language $\lang_{eq} = \bigcup_{a \in \EAlph} \EAlph^{\ast} a \EAlph^{\ast} a \EAlph^{\ast}$ of words where some letter appears twice. Consider the language $\lang_{eq} = \bigcup_{a \in \EAlph} \EAlph^{\ast} a \EAlph^{\ast} a \EAlph^{\ast}$ of words where some letter appears twice.
This is accepted by an NFA which guesses the position of the first occurrence of a repeated letter $a$ and then waits for the second $a$ to appear. This is accepted by an NFA which guesses the position of the first occurrence of a repeated letter $a$ and then waits for the second $a$ to appear.
The language is not accepted by a DFA \cite[DBLP:journals/corr/BojanczykKL14]. The language is not accepted by a DFA \cite[DBLP:journals/corr/BojanczykKL14].
Despite this \nNLStar{} is able to learn the automaton: Despite this \nNLStar{} is able to learn the automaton shown in \in{Figure}[fig:nfa].
\starttikzpicture[node distance=4cm] \startplacefigure
[title={A nominal NFA accepting $\lang_{eq}$.
Here, the transition from $q_2'$ to $q_{1,x}'$ is defined as $\delta(q_2', a) = \{ q_{1,b}' \mid b \in \EAlph \}$.},
list={A nominal NFA accepting $\lang_{eq}$.},
reference=fig:nfa]
\hbox{\starttikzpicture[node distance=4cm]
\node[state, initial] (q0) {$q_0'$}; \node[state, initial] (q0) {$q_0'$};
\node[state, right of=q0] (q1) {$q_{1,x}'$}; \node[state, right of=q0] (q1) {$q_{1,x}'$};
\node[state, accepting, right of=q1] (q2) {$q_{2}'$}; \node[state, accepting, right of=q1] (q2) {$q_{2}'$};
@ -1610,16 +1669,18 @@ Despite this \nNLStar{} is able to learn the automaton:
(q1) edge[loop above] node[above] {$\EAlph$} (q1) (q1) edge[loop above] node[above] {$\EAlph$} (q1)
(q1) edge[loop below] node[below] {$y$ to $q_{2,y}'$} (q1) (q1) edge[loop below] node[below] {$y$ to $q_{2,y}'$} (q1)
(q2) edge[loop above] node[above] {$\EAlph$} (q2); (q2) edge[loop above] node[above] {$\EAlph$} (q2);
\stoptikzpicture \stoptikzpicture}
\stopplacefigure
\stopdescription
where the transition from $q_2'$ to $q_{1,x}'$ is defined as $\delta(q_2', a) = \{ q_{1,b}' \mid b \in \EAlph \}$.
{\bf $n$-last Position.} \description{$n$-last Position.}
A prototypical example of regular languages which are accepted by very small NFAs is the set of words where a distinguished symbol $a$ appears on the $n$-last position \cite[DBLP:conf/ijcai/BolligHKL09]. A prototypical example of regular languages which are accepted by very small NFAs is the set of words where a distinguished symbol $a$ appears on the $n$-last position \cite[DBLP:conf/ijcai/BolligHKL09].
We define a similar nominal language $\lang'_n = \bigcup_{a \in \EAlph} a \EAlph^{\ast} a \EAlph^n$. We define a similar nominal language $\lang'_n = \bigcup_{a \in \EAlph} a \EAlph^{\ast} a \EAlph^n$.
To accept such words non-deterministically, one simply guesses the $n$-last position. To accept such words non-deterministically, one simply guesses the $n$-last position.
This language is also accepted by a much larger deterministic automaton. This language is also accepted by a much larger deterministic automaton.
\stopsubsection \stopsubsection
\stopsection \stopsection
\startsection \startsection
@ -1642,23 +1703,23 @@ Moreover, our algorithm is easier to generalise to other alphabets and computati
More recently papers appeared on learning register automata by \cite[authoryears][DBLP:conf/vmcai/HowarSJC12, DBLP:journals/fac/CasselHJS16]. More recently papers appeared on learning register automata by \cite[authoryears][DBLP:conf/vmcai/HowarSJC12, DBLP:journals/fac/CasselHJS16].
Their register automata are as expressive as our deterministic nominal automata. Their register automata are as expressive as our deterministic nominal automata.
The state-space is similar to our orbit-wise representation: The state space is similar to our orbit-wise representation:
It is formed by finitely many locations with registers. It is formed by finitely many locations with registers.
Transitions are defined symbolically using propositional logic. Transitions are defined symbolically using propositional logic.
We remark that the most recent paper \cite[DBLP:journals/fac/CasselHJS16] generalises the algorithm to alphabets with different structures (which correspond to different atom symmetries in our work), but at the cost of changing Angluin's framework. We remark that the most recent paper by \citet[DBLP:journals/fac/CasselHJS16] generalises the algorithm to alphabets with different structures (which correspond to different atom symmetries in our work), but at the cost of changing Angluin's framework.
Instead of membership queries the algorithm requires more sophisticated tree queries. Instead of membership queries the algorithm requires more sophisticated tree queries.
In our approach, using a different symmetry does not affect neither the algorithm nor its correctness proof. In our approach, using a different symmetry does not affect neither the algorithm nor its correctness proof.
Tree queries can be reduced to membership queries by enumerating all $n$-types for some $n$ ($n$-types in logic correspond to orbits in the set of $n$-tuples). Tree queries can be reduced to membership queries by enumerating all $n$-types for some $n$ ($n$-types in logic correspond to orbits in the set of $n$-tuples).
Keeping that in mind, their complexity results are roughly the same as ours, although this is hard to verify, as they do not give bounds on the length of individual tree queries. Keeping that in mind, their complexity results are roughly the same as ours, although this is hard to verify, as they do not give bounds on the length of individual tree queries.
Finally, our approach lends itself better to be extended to other variations on \LStar{} (of which many exist), as it is closer to Angluin's original work. Finally, our approach lends itself better to be extended to other variations on \LStar{} (of which many exist), as it is closer to Angluin's original work.
Another class of learning algorithms for systems with large alphabets is based on abstraction and refinement, which is orthogonal to the approach in the present paper but connections and possible transference of techniques are worth exploring in the future. Another class of learning algorithms for systems with large alphabets is based on abstraction and refinement, which is orthogonal to the approach in this thesis but connections and possible transference of techniques are worth exploring in the future.
\cite[authoryears][DBLP:conf/ictac/AartsFKV15] reduce the alphabet to a finite alphabet of abstractions, and \LStar{} for ordinary DFAs over such finite alphabet is used. \cite[authoryears][DBLP:conf/ictac/AartsFKV15] reduce the alphabet to a finite alphabet of abstractions, and \LStar{} for ordinary DFAs over such finite alphabet is used.
Abstractions are refined by counterexamples. Abstractions are refined by counterexamples.
Other similar approaches are \cite[DBLP:conf/vmcai/HowarSM11, DBLP:conf/nfm/IsbernerHS13], where global and local per-state abstractions of the alphabet are used, and \cite[DBLP:phd/hal/Mens17], where the alphabet can also have additional structure (e.g., an ordering relation). Other similar approaches are by \citet[DBLP:conf/vmcai/HowarSM11, DBLP:conf/nfm/IsbernerHS13], where global and local per-state abstractions of the alphabet are used, and by \citet[DBLP:phd/hal/Mens17], where the alphabet can also have additional structure (e.g., an ordering relation).
We can also mention \cite[DBLP:conf/popl/BotincanB13], a framework for learning symbolic models of software behaviour. We also mention that \citet[DBLP:conf/popl/BotincanB13] give a framework for learning symbolic models of software behaviour.
In \cite[DBLP:conf/fase/BergJR06, DBLP:conf/fase/BergJR08], authors cope with an infinite alphabet by running \LStar{} (adapted to Mealy machines) using a finite approximation of the alphabet, which may be augmented when equivalence queries are answered. \citet[DBLP:conf/fase/BergJR06, DBLP:conf/fase/BergJR08] cope with an infinite alphabet by running \LStar{} (adapted to Mealy machines) using a finite approximation of the alphabet, which may be augmented when equivalence queries are answered.
A smaller symbolic model is derived subsequently. A smaller symbolic model is derived subsequently.
Their approach, unlike ours, does not exploit the symmetry over the full alphabet. Their approach, unlike ours, does not exploit the symmetry over the full alphabet.
The symmetry allows our algorithm to reduce queries and to produce the smallest possible automaton at every step. The symmetry allows our algorithm to reduce queries and to produce the smallest possible automaton at every step.
@ -1678,13 +1739,14 @@ We chose NLambda for its convenient set-theoretic primitives, but the other opti
\startsection \startsection
[title={Discussion and Future Work}] [title={Discussion and Future Work}]
In this paper we defined and implemented extensions of several versions of \LStar{} and of \NLStar{} for nominal automata. In this chapter we defined and implemented extensions of several versions of \LStar{} and of \NLStar{} for nominal automata.
\noindent
We highlight two features of our approach: We highlight two features of our approach:
\startitemize \startitemize[after]
\item it has strong theoretical foundations: \item It has strong theoretical foundations:
The \emph{theory of nominal languages}, covering different alphabets and symmetries (see \in{Section}[sec:other-symms]); \emph{category theory}, where nominal automata have been characterised as \emph{coalgebras} \cite[DBLP:conf/icalp/KozenMP015, DBLP:journals/iandc/CianciaM10] and many properties and algorithms (e.g., minimisation) have been studied at this abstract level. The \emph{theory of nominal languages}, covering different alphabets and symmetries (see \in{Section}[sec:other-symms]); \emph{category theory}, where nominal automata have been characterised as \emph{coalgebras} \cite[DBLP:conf/icalp/KozenMP015, DBLP:journals/iandc/CianciaM10] and many properties and algorithms (e.g., minimisation) have been studied at this abstract level.
\item it follows a generic pattern for transporting computation models and algorithms from finite sets to nominal sets, which leads to simple correctness proofs. \item It follows a generic pattern for transporting computation models and algorithms from finite sets to nominal sets, which leads to simple correctness proofs.
\stopitemize \stopitemize
These features pave the way to several extensions and improvements. These features pave the way to several extensions and improvements.
@ -1692,7 +1754,7 @@ These features pave the way to several extensions and improvements.
Future work includes a general version of \nNLStar{}, parametric in the notion of side-effect (an example is non-determinism). Future work includes a general version of \nNLStar{}, parametric in the notion of side-effect (an example is non-determinism).
Different notions will yield models with different degree of succinctness w.r.t. deterministic automata. Different notions will yield models with different degree of succinctness w.r.t. deterministic automata.
The key observation here is that many forms of non-determinism and other side effects can be captured via the categorical notion of \emph{monad}, i.e., an algebraic structure, on the state-space. The key observation here is that many forms of non-determinism and other side effects can be captured via the categorical notion of \emph{monad}, i.e., an algebraic structure, on the state-space.
Monads allow generalizing the notion of composed and prime state: Monads allow generalising the notion of composed and prime state:
A state is composed whenever it is obtained from other states via an algebraic operation. A state is composed whenever it is obtained from other states via an algebraic operation.
Our algorithm \nNLStar{} is based on the \emph{powerset} monad, representing classical non-determinism. Our algorithm \nNLStar{} is based on the \emph{powerset} monad, representing classical non-determinism.
We are currently investigating a \emph{substitution} monad, where the operation is \quotation{applying a (possibly non-injective) substitution of atoms in the support}. We are currently investigating a \emph{substitution} monad, where the operation is \quotation{applying a (possibly non-injective) substitution of atoms in the support}.
@ -1722,10 +1784,12 @@ For instance, we can give an automaton over the substitution monad that recognis
Here $[y \mapsto x]$ means that, if that transition is taken, $q_{xy}$ (hence its language) is subject to $y \mapsto x$. Here $[y \mapsto x]$ means that, if that transition is taken, $q_{xy}$ (hence its language) is subject to $y \mapsto x$.
In general, the size of the minimal DFA for $\lang_n$ grows more than exponentially with $n$, but an automaton with substitutions on transitions, like the one above, only needs $\bigO(n)$ states. In general, the size of the minimal DFA for $\lang_n$ grows more than exponentially with $n$, but an automaton with substitutions on transitions, like the one above, only needs $\bigO(n)$ states.
This direction is investigated in \in{Chapter}[chap:separated-nominal-automata].
In principle, thanks to the generic approach we have taken, all our algorithms should work for various kinds of atoms with more structure than just equality, as advocated by \citet[DBLP:journals/corr/BojanczykKL14]. In principle, thanks to the generic approach we have taken, all our algorithms should work for various kinds of atoms with more structure than just equality, as advocated by \citet[DBLP:journals/corr/BojanczykKL14].
Details, such as precise assumptions on the underlying structure of atoms necessary for proofs to go through, remain to be checked. Details, such as precise assumptions on the underlying structure of atoms necessary for proofs to go through, remain to be checked.
For an implementation of automata learning over other kinds of atoms without compromising the generic approach, an extension of NLambda to those atoms will be needed, as the current version of the library only supports equality and totally ordered atoms. In the next chapter (\in{Chapter}[chap:ordered-nominal-sets]), we investigate learning with the total order symmetry.
We implement this in NLambda, as well as a new tool for computing with nominal sets over the total order symmetry.
The efficiency of our current implementation, as measured in \in{Section}[sec:tests], leaves much to be desired. The efficiency of our current implementation, as measured in \in{Section}[sec:tests], leaves much to be desired.
There is plenty of potential for running time optimisation, ranging from improvements in the learning algorithms itself, to optimisations in the NLambda library (such as replacing the external and general-purpose SMT solver with a purpose-built, internal one, or a tighter integration of nominal mechanisms with the underlying Haskell language as it was done by \citenp[DBLP:journals/entcs/Shinwell06]), to giving up the functional programming paradigm for an imperative language such as LOIS \cite[DBLP:conf/cade/KopczynskiT16, DBLP:conf/popl/KopczynskiT17]. There is plenty of potential for running time optimisation, ranging from improvements in the learning algorithms itself, to optimisations in the NLambda library (such as replacing the external and general-purpose SMT solver with a purpose-built, internal one, or a tighter integration of nominal mechanisms with the underlying Haskell language as it was done by \citenp[DBLP:journals/entcs/Shinwell06]), to giving up the functional programming paradigm for an imperative language such as LOIS \cite[DBLP:conf/cade/KopczynskiT16, DBLP:conf/popl/KopczynskiT17].

View file

@ -37,7 +37,7 @@ This chapter is based on the following publication:
\definedescription[description] \definedescription[description]
[headstyle=bold, alternative=serried, width=fit, indenting=yes] [headstyle=bold, alternative=serried, width=fit, indenting=yes]
\definedescription[step] \definedescription[step]
[headstyle=bold, alternative=serried, width=fit, text={Step }] [headstyle=bold, alternative=serried, width=fit, indenting=yes, text={Step }]
% The black dot is missing from Euler? % The black dot is missing from Euler?
\setupitemize[2,joinedup,nowhite,autointro] \setupitemize[2,joinedup,nowhite,autointro]

View file

@ -30,9 +30,9 @@
% Used to draw extended cells in observation tables for the learning nominal automata paper % Used to draw extended cells in observation tables for the learning nominal automata paper
\define[1]\NomCell{ \define[1]\NomCell{
\startmathcases \startmathcases[numberdistance=0pt]
\NC 1 \NC if $#1$ \NR \NC 1 \NC if $#1$ \NR
\NC 0 \NC else \NR \NC 0 \NC else \NR
\stopmathcases \stopmathcases
} }