\project thesis

\startcomponent introduction

\startchapter
[title={Introduction},
reference=chap:introduction]
When I was younger, I often learned how to play with new toys by messing about with them, i.e., by pressing buttons at random, observing their behaviour, pressing more buttons, and so on.
I only resorted to the manual -- or asked \quotation{experts} -- to confirm my beliefs on how the toys worked.
Now that I am older, I do mostly the same with new devices, new tools, and new software.
However, now I know that this is an established computer science technique, called \emph{model learning}.

Model learning
\footnote{There are many names for this type of learning, such as \emph{active automata learning}.
The generic name \quotation{model learning} is chosen as a counterpoint to model checking.}
is an automated technique to construct a state-based model -- often a type of \emph{automaton} -- from a black box system.
The goal of this technique can be manifold:
it can be used to reverse-engineer a system, to find bugs in it, to verify properties of the system, or to understand the system in one way or another.
It is \emph{not} just random testing: the information learned during the interaction with the system is actively used to guide subsequent interactions.
Additionally, the information learned can be inspected and analysed.

This thesis is about model learning and related techniques.
In the first part, I present results concerning \emph{black box testing} of automata.
Testing is a crucial part of learning software behaviour and often remains a bottleneck in applications of model learning.
In the second part, I show how \emph{nominal techniques} can be used to learn automata over structured infinite alphabets.
The study of nominal automata was directly motivated by work on learning network protocols which rely on identifiers or sequence numbers.

But before we get ahead of ourselves, we should first understand what we mean by learning, as learning means very different things to different people.
In educational science, learning may involve concepts such as teaching, blended learning, and interdisciplinarity.
Data scientists may think of data compression, feature extraction, and neural networks.
In this thesis we are mostly concerned with software verification.
But even in the field of verification several types of learning are relevant.

\startsection
[title={Model Learning}]

In the context of software verification, we often look at stateful computations with inputs and outputs.
For this reason, it makes sense to look at \emph{words}, or \emph{traces}.
For an alphabet $\Sigma$, we denote the set of words by $\Sigma^{*}$.
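
To make this concrete, the following Python sketch (purely illustrative, and not notation used later in this thesis) enumerates $\Sigma^{*}$ for a finite alphabet by increasing length.

\starttyping
from itertools import count, product

def words(alphabet):
    """Enumerate all words over `alphabet` by increasing length."""
    for n in count(0):                       # n = 0, 1, 2, ...
        for w in product(alphabet, repeat=n):
            yield "".join(w)                 # the empty word comes first

# The first words over {a, b}: '', 'a', 'b', 'aa', 'ab', 'ba', 'bb', ...
\stoptyping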

The learning problem is defined as follows.
There is some fixed, but unknown, language $\lang \subseteq \Sigma^{*}$.
This language may define the behaviour of a software component, a property in model checking, a set of traces from a protocol, etc.
We wish to infer a description of $\lang$ after having observed only a small part of this language.
For example, we may have seen a hundred words belonging to the language and a few which do not belong to it.
Concluding with a good description of $\lang$ is then difficult, as we are missing information about the infinitely many words we have not observed.

Such a learning problem can be stated and solved in a variety of ways.
In the applications pursued in our research group, we often try to infer a model of a software component.
(\in{Chapter}[chap:applying-automata-learning] describes such an application.)
In these cases, a learning algorithm can interact with the software.
So it makes sense to study a learning paradigm which allows for \emph{queries}, and not just a data set of samples.

A typical query learning framework was established by \citet[Angluin87].
In her framework, the learning algorithm may pose two types of queries to a \emph{teacher}, or \emph{oracle}:

\description{Membership queries (MQs)}
The learner poses such a query by providing a word $w \in \Sigma^{*}$ to the teacher.
The teacher will then reply whether $w \in \lang$ or not.
This type of query is often generalised to richer outputs; in these cases we consider $\lang \colon \Sigma^\ast \to O$ and the teacher replies with $\lang(w)$.
In some papers, such a query is then called an \emph{output query}.

\description{Equivalence queries (EQs)}
The learner can provide a hypothesised description $H$ of $\lang$ to the teacher.
If the hypothesis is correct, the teacher replies with \kw{yes}.
If, however, the hypothesis is incorrect, the teacher replies with \kw{no} together with a counterexample, i.e., a word which is in $\lang$ but not in the hypothesis or vice versa.

By posing many such queries, the learning algorithm is supposed to converge to a correct model.
This type of learning is hence called \emph{exact learning}.
\citet[Angluin87] showed that one can do this efficiently for deterministic finite automata (DFAs), when $\lang$ is in the class of regular languages.
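
To fix intuition, the interaction can be pictured as an interface between learner and teacher.
The following Python sketch is illustrative only; the names \type{Teacher}, \type{mq}, and \type{eq} are mine, the language is hard-coded for the example, and the hypothesis is assumed to expose an \type{accepts} method.

\starttyping
from typing import Optional

class Teacher:
    """A teacher for a fixed (here hard-coded) language L."""

    def mq(self, word: str) -> bool:
        """Membership query: is `word` in L?"""
        return word.count("a") % 2 == 0      # example: even number of a's

    def eq(self, hypothesis) -> Optional[str]:
        """Equivalence query: None if correct, else a counterexample."""
        for w in words("ab"):                # enumeration from the sketch above
            if hypothesis.accepts(w) != self.mq(w):
                return w                     # never returns if correct!
\stoptyping

Note that this naive \type{eq} already hints at a problem we return to below: searching for a counterexample never terminates when the hypothesis is actually correct.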

It should be clear why this is called \emph{query learning} or \emph{active learning}.
The learning algorithm initiates interaction with the teacher by posing queries; it may construct its own data points and ask for their corresponding labels.
Active learning is in contrast to \emph{passive learning}, where all observations are given to the algorithm up front.

Another paradigm which is relevant for our type of applications is \emph{PAC learning with membership queries}.
Here, the algorithm can again use MQs as before, but the EQs are replaced by random sampling.
So the allowed query is:

\description{Random sample queries (EXs)}
If the learner poses this query (there are no parameters), the teacher responds with a random word $w$ together with its label (i.e., whether $w \in \lang$ or not).

Instead of requiring that the learner exactly learns the model, we only require the following.
The learner should \emph{probably} return a model which is \emph{approximately} correct.
This gives the name \emph{probably approximately correct} (PAC).
Note that there are two uncertainties: the probable part and the approximate part.
Both parts are bounded by parameters, so one can determine the confidence.
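
To see how these parameters are used (a standard PAC-style argument, not specific to this thesis): if a hypothesis has error greater than $\epsilon$, i.e., a random sample disagrees with it with probability more than $\epsilon$, then the probability that it nevertheless agrees with $m$ independent samples is at most $(1 - \epsilon)^{m} \leq e^{-\epsilon m}$.
Requiring this to be at most $\delta$ yields
\startformula
m \geq \frac{1}{\epsilon} \ln \frac{1}{\delta},
\stopformula
so polynomially many samples in $\frac{1}{\epsilon}$ and $\ln \frac{1}{\delta}$ suffice to reject such a bad hypothesis with probability at least $1 - \delta$.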

As with many problems in computer science, we are also interested in the \emph{efficiency} of learning algorithms.
Instead of measuring time or space, we analyse the number of queries posed by an algorithm.
Efficiency often means that we require a polynomial number of queries.
But polynomial in what?
The learner has no input, other than access to a teacher.
We ask the algorithms to be polynomial in the \emph{size of the target} (i.e., the size of the description which has yet to be learned).
In the case of PAC learning we additionally require them to be polynomial in the two confidence parameters.

Deterministic automata can be efficiently learned in the PAC model.
In fact, any efficient exact learning algorithm with MQs and EQs can be transformed into an efficient PAC algorithm with MQs (see \citenp[KearnsV94], exercise 8.1).
For this reason, we mostly focus on the former type of learning in this thesis.
The transformation from exact learning to PAC learning is implemented by simply \emph{testing} the hypothesis with random samples.
This testing can be postponed until we actually implement a learning algorithm and apply it.
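
The transformation is short enough to sketch.
Assuming the illustrative interface from before (where \type{hypothesis.accepts} and the membership oracle \type{mq} are stand-ins, not established APIs), an EQ is approximated by a batch of random samples:

\starttyping
import random
from typing import Callable, Optional

def pac_eq(hypothesis,
           mq: Callable[[str], bool],
           sample: Callable[[], str],
           num_samples: int) -> Optional[str]:
    """Approximate an equivalence query by random testing.
    Returns a counterexample, or None if all samples agree."""
    for _ in range(num_samples):
        w = sample()
        if hypothesis.accepts(w) != mq(w):
            return w        # found a counterexample
    return None             # probably approximately correct

def random_word() -> str:
    """One possible sampler: geometric length, uniform letters."""
    w = ""
    while random.random() < 0.9:
        w += random.choice("ab")
    return w
\stoptyping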

When using only EQs, only MQs, or only EXs, there are hardness results for learning DFAs.
So the combinations MQs + EQs and MQs + EXs have been carefully picked:
they provide a minimal basis for efficient learning.
See the book of \citet[KearnsV94] for such hardness results and more information on PAC learning.

So far, all the queries are assumed to be \emph{just there}.
Somehow, these are existing procedures which we can invoke with \kw{MQ}$(w)$, \kw{EQ}$(H)$, or \kw{EX}$()$.
This is a useful abstraction when designing a learning algorithm.
One can analyse the complexity (in terms of the number of queries) independently of how these queries are resolved.
Nevertheless, at some point in time one has to implement them.
In our case of learning software behaviour, membership queries are easily implemented:
simply provide the word $w$ to a running instance of the software and observe the output.\footnote{In reality, it is a bit harder than this.
There are plenty of challenges to solve, such as timing, choosing your alphabet, choosing the kind of observations to make, and being able to reliably reset the software.}
Equivalence queries, however, are in general not doable.
Even if we have the (machine) code, it is often way too complicated to check equivalence.
That is why we resort to testing with EX queries.
The EX query from PAC learning normally assumes a fixed, unknown distribution.
In our case, we choose and implement a distribution to test against.
This cuts both ways:
on the one hand, it allows us to test only the behaviour we really care about; on the other hand, the results are only as good as our choice of distribution.
We deviate even further from the PAC model as we sometimes change our distribution while learning.
Yet, as applications show, this is a useful way of learning software behaviour.
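
One way to picture such a changing distribution (a rough sketch of the general idea only; the actual methods are those of \in{Chapter}[chap:test-methods]) is to sample tests relative to the current hypothesis: first walk to a randomly chosen state via a known access sequence, then continue with a random suffix.
Here \type{hypothesis.states} and \type{hypothesis.access_sequence} are assumed helpers.

\starttyping
import random

def sample_test(hypothesis, alphabet, max_suffix=10):
    """Sample a test relative to the current hypothesis."""
    state = random.choice(hypothesis.states)
    prefix = hypothesis.access_sequence(state)  # word reaching `state`
    suffix = [random.choice(alphabet)
              for _ in range(random.randint(1, max_suffix))]
    return prefix + suffix                      # words as lists of inputs
\stoptyping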

\stopsection
\startsection
[title={Applications of Model Learning}]

Since this thesis contains only one real-world application of learning, in \in{Chapter}[chap:applying-automata-learning], it is good to mention a few others.
Although we remain in the context of learning software behaviour, the applications are quite different from each other.
This is by no means a complete list.

\description{Bug finding in protocols.}
A prominent example is by \citet[Fiterau-Brostean16].
They learn models of TCP implementations -- both client and server sides.
Interestingly, they found bugs in the (closed source) Windows implementation.
Later, \citet[Fiterau-BrosteanH17] also found a bug in the sliding window of the Linux implementation of TCP.
Other protocols have been learned as well, such as the MQTT protocol by \citet[TapplerAB17], TLS by \citet[RuiterP15], and SSH by \citet[Fiterau-BrosteanLPRVV17].
Many of these applications reveal bugs by learning a model and subsequently applying model checking.
The combination of learning and model checking was first described by \citet[PeledVY02].

\description{Bug finding in smart cards.}
\citet[AartsRP13] learn the software on smart cards of several Dutch and German banks.
These cards use the EMV protocol, which is run on the card itself.
So this is an example of a real black box system, where no other monitoring is possible and no code is available.
No vulnerabilities were found, although each card had a slightly different state machine.
The e.dentifier, a card reader implementing a challenge-response protocol, has been learned by \citet[ChaluparPPR14].
They built a Lego machine which could automatically press buttons, and the researchers found a security flaw in this card reader.

\description{Regression testing.}
\citet[HungarNS03] describe the potential of automata learning in regression testing.
The aim is not to find bugs, but to monitor the development process of a system.
By considering the differences between models at different stages, one can generate regression tests.

\description{Refactoring legacy software.}
Model learning can also be used to verify refactored software.
\citet[SchutsHV16] have applied this in a project at Philips.
They learn both an old version and a new version of the same component.
By comparing the learned models, some differences could be seen.
This gave developers opportunities to solve problems before replacing the old component with the new one.

\stopsection

\startsection
[title={Research challenges}]

In this thesis, we will mostly see learning of deterministic automata or Mealy machines.
Although this is restrictive, as many pieces of software require richer models, it has been successfully applied in the above examples.
The restrictions include the following.
\startitemize[after]
\item The system behaves deterministically.
\item One can reliably reset the system.
\item The system can be modelled with a finite state space.
This also means that the model does not incorporate time or data.
\item The set of input actions is finite.
\item One knows when the target is reached.
\stopitemize

\startdescription[title={Research challenge 1: Approximating equivalence queries.}]
Having confidence in a learned model is difficult.
We have PAC guarantees (as discussed before), but sometimes we may want to draw other conclusions.
For example, we may require the hypothesis to be correct, provided that the real system is implemented with a certain number of states.
Efficiency is important here:
we want to obtain those guarantees fast, and we want to quickly find counterexamples when the hypothesis is wrong.
Test generation methods are the topic of the first part of this thesis.
We will review existing algorithms and discuss new algorithms for test generation.
\stopdescription

\startdescription[title={Research challenge 2: Generalisation to infinite alphabets.}]
Automata over infinite alphabets are very useful for modelling protocols which involve identifiers or timestamps.
Not only is the alphabet infinite in these cases, the state space is as well, since the values have to be remembered.
In the second part of this thesis, we will see how nominal techniques can be used to tackle this challenge.

Being able to learn automata over an infinite alphabet is not new.
It has been tackled, for instance, by \citet[HowarSJC12], \citet[BolligHLM13] and in the theses of \citet[Aarts14], \citet[Cassel15], and \citet[Fiterau-Brostean18].
In the first of these theses, the problem is solved by considering abstractions, which reduce the alphabet to a finite one.
These abstractions are automatically refined when a counterexample is presented to the algorithms.
\citet[Fiterau-Brostean18] extends this approach to cope with \quotation{fresh values}, crucial for protocols such as TCP.
In the thesis by \citet[Cassel15], another approach is taken.
The queries are changed to tree queries.
The approach in my thesis will be based on symmetries, which gives yet another perspective on the problem of learning such automata.
\stopdescription

\stopsection

\startsection
[title={Black Box Testing}]

An important step in automata learning is equivalence checking.
Normally, this is abstracted away and done by an oracle, but we intend to implement such an oracle ourselves for our applications.
Concretely, the problem we need to solve is that of \emph{conformance checking}\footnote{Also known as \emph{machine verification} or \emph{fault detection}.} as it was first described by \citet[Moore56].

The problem is as follows:
Given the description of a finite state machine and a black box system, does the system behave exactly like the description?
We wish to determine this by running experiments on the system (as it is black box).
It should be clear that this is a hopelessly difficult task, as an error can be hidden arbitrarily deep in the system.
That is why we often assume some knowledge of the system.
In this thesis we often assume a \emph{bound on the number of states} of the system.
Under these conditions, \citet[Moore56] already solved the problem.
Unfortunately, his experiment is exponential in size, or in his own words: \quotation{fantastically large.}

Years later, \citet[Chow78, Vasilevskii73] independently designed efficient experiments.
In particular, the set of experiments is polynomial in the number of states.
These techniques will be discussed in detail in \in{Chapter}[chap:test-methods].
More background and other related problems, as well as their complexity results, are well exposed in a survey of \citet[LeeY94].
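
To give a flavour of such experiments (the details and proofs follow in \in{Chapter}[chap:test-methods]): one common formulation, in the style of \citet[Chow78, Vasilevskii73], combines a set $P$ of words reaching every transition with a set $W$ of words separating every pair of states.
A rough Python sketch, in which $P$ and $W$ are assumed to be given and words are lists of inputs:

\starttyping
from itertools import product

def w_method(P, W, alphabet, extra_states=0):
    """Roughly the test suite of Chow and Vasilevskii:
    reach a transition, do some extra inputs, then distinguish."""
    tests = []
    for k in range(extra_states + 1):
        for p, infix, w in product(P, product(alphabet, repeat=k), W):
            tests.append(p + list(infix) + w)
    return tests
\stoptyping

The size of this suite is polynomial in the number of states, but note that it still grows exponentially in the number of \emph{extra} states one accounts for.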

To give an example of conformance checking, we model a record player as a finite state machine.
We will not model the audible output -- that would depend not only on the device, but also on the record one chooses to play\footnote{In particular, we would have to add time to the model, as one side of a record only lasts for roughly 25 minutes -- unless we take a record with sound on the locked groove, such as the \emph{Sgt. Pepper's Lonely Hearts Club Band} album by \emph{The Beatles}.}.
Instead, the only observation we can make is looking at how fast the turntable spins.
The device has two buttons: a start-stop button (\playstop) and a speed button (\spin), which toggles between $33 \frac{1}{3}$ rpm and $45$ rpm.
When turned on, the system starts playing immediately at $33 \frac{1}{3}$ rpm -- this is useful for DJing.
The intended behaviour of the record player has four states, as depicted in \in{Figure}[fig:record-player].

\startplacefigure
[title={Behaviour of a record player modelled as a finite state machine.},
reference=fig:record-player]
\hbox{\starttikzpicture[node distance=4cm, bend angle=10]
\node[state, initial above, align=center] (tl) {slow\\spinning};
\node[state, right of=tl, align=center] (tr) {fast\\spinning};
\node[state, below of=tl, align=center] (bl) {no\\spinning};
\node[state, right of=bl, align=center] (br) {no\\spinning};

\path[trans]
(tl) edge [bend right] node [left] {\playstop} (bl)
(tl) edge [bend right] node [below] {\spin} (tr)
(bl) edge [bend right] node [right] {\playstop} (tl)
(bl) edge [bend right] node [below] {\spin} (br)
(tr) edge [bend right] node [above] {\spin} (tl)
(tr) edge [bend right] node [left] {\playstop} (br)
(br) edge [bend right] node [right] {\playstop} (tr)
(br) edge [bend right] node [above] {\spin} (bl);
\stoptikzpicture}
\stopplacefigure
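
The intended machine is small enough to write down directly.
A Python sketch which we can use as a specification (the state encoding is mine):

\starttyping
SPEC_INIT = ("play", "slow")          # turned on: playing at 33 1/3 rpm

def spec_step(state, button):
    """The intended record player of the figure above."""
    playing, speed = state
    if button == "playstop":
        playing = "pause" if playing == "play" else "play"
    elif button == "spin":
        speed = "fast" if speed == "slow" else "slow"
    return (playing, speed)

def spec_output(state):
    """The only observation: how fast the turntable spins."""
    playing, speed = state
    return speed if playing == "play" else "none"
\stoptyping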

Let us consider some faults which could be present in an implementation with four states.
In \in{Figure}[fig:faulty-record-players], two flawed record players are given.
In the first (\in{Figure}{a}[fig:faulty-record-players]), the sequence \playstop\,\spin\,\spin\, leads us to the wrong state.
However, this is not immediately observable: the turntable is in a non-spinning state, as it should be.
The fault is only visible when we press \playstop\, once more: now the turntable is spinning fast instead of slow.
The sequence \playstop\,\spin\,\spin\,\playstop\, is a \emph{counterexample}.
In the second example (\in{Figure}{b}[fig:faulty-record-players]), the fault is again not immediately obvious: after pressing \spin\,\playstop\, we are in the wrong state, as observed by pressing \playstop.
Here, the counterexample is \spin\,\playstop\,\playstop.

When a model of the implementation is given, it is not hard to find counterexamples.
However, in a black box setting we do not have such a model.
In order to test whether a black box system is equivalent to a model, we somehow need to test all possible counterexamples.
In this example, a test suite should include sequences such as \playstop\,\spin\,\spin\,\playstop\, and \spin\,\playstop\,\playstop.
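
Executing such a test suite is then straightforward.
Continuing the sketch above, where \type{impl_step}, \type{impl_output}, and \type{impl_init} stand for the implementation (for a real black box these would be interactions with the device):

\starttyping
def run(step, output, init, word):
    """Run a word and collect the observation after each input."""
    state, outs = init, []
    for button in word:
        state = step(state, button)
        outs.append(output(state))
    return outs

def check(tests, impl_step, impl_output, impl_init):
    for word in tests:
        if run(spec_step, spec_output, SPEC_INIT, word) \
                != run(impl_step, impl_output, impl_init, word):
            return word    # a counterexample, e.g. playstop spin spin playstop
    return None            # all tests pass
\stoptyping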

\startplacefigure
[title={Two faulty record players.},
reference=fig:faulty-record-players]
\startcombination[nx=2, ny=1, location=top, distance=1.5cm]
{\hbox{\starttikzpicture[node distance=3cm, bend angle=10]
\node[state, initial above, align=center] (tl) {slow\\spinning};
\node[state, right of=tl, align=center] (tr) {fast\\spinning};
\node[state, below of=tl, align=center] (bl) {no\\spinning};
\node[state, right of=bl, align=center] (br) {no\\spinning};

\path[trans]
(tl) edge [bend right] node [left] {\playstop} (bl)
(tl) edge [bend right] node [below] {\spin} (tr)
(bl) edge [bend right] node [right] {\playstop} (tl)
(bl) edge [bend right] node [below] {\spin} (br)
(tr) edge [bend right] node [above] {\spin} (tl)
(tr) edge [bend right] node [left] {\playstop} (br)
(br) edge [bend right] node [right] {\playstop} (tr)
(br) edge [loop below, color=red] node [above] {\spin} (br);
\stoptikzpicture}}{(a)}
{\hbox{\starttikzpicture[node distance=3cm, bend angle=10]
\node[state, initial above, align=center] (tl) {slow\\spinning};
\node[state, right of=tl, align=center] (tr) {fast\\spinning};
\node[state, below of=tl, align=center] (bl) {no\\spinning};
\node[state, right of=bl, align=center] (br) {no\\spinning};

\path[trans]
(tl) edge [bend right] node [left] {\playstop} (bl)
(tl) edge [bend right] node [below] {\spin} (tr)
(bl) edge [bend right] node [right] {\playstop} (tl)
(bl) edge [bend right] node [below] {\spin} (br)
(tr) edge [bend right] node [above] {\spin} (tl)
(tr) edge [color=red] node [below right] {\playstop} (bl)
(br) edge [bend right] node [right] {\playstop} (tr)
(br) edge [bend right] node [above] {\spin} (bl);
\stoptikzpicture}}{(b)}
\stopcombination
\stopplacefigure

\stopsection

\startsection
[title={Nominal Techniques}]

In the second part of this thesis, I will present results related to \emph{nominal automata}.
Usually, nominal techniques are introduced in order to solve problems involving name binding in topics like the lambda calculus.
However, we use them in automata theory, specifically to model \emph{register automata}.
These are automata which have an infinite alphabet, often thought of as input actions with \emph{data}.
The control flow of the automaton may actually depend on the data.
However, the data cannot be used in an arbitrary way, as this would render many problems, such as emptiness and equivalence, undecidable.\footnote{The class of automata with arbitrary data operations is sometimes called \emph{extended finite state machines}.}
A principal concept in nominal techniques is that of \emph{symmetries}.

To motivate the use of symmetries, we will look at an example of a register automaton.
In the following automaton we model a (not-so-realistic) login system for a single person.
The alphabet consists of the following actions:
\startformula\startmathmatrix[n=2, distance=1cm]
\NC \kw{sign-up}(p) \NC \kw{logout}() \NR
\NC \kw{login}(p) \NC \kw{view}() \NR
\stopmathmatrix\stopformula
The \kw{sign-up} action allows one to set a password $p$.
This can only be done when the system is initialised.
The \kw{login} and \kw{logout} actions speak for themselves, and the \kw{view} action allows one to see the secret data (we abstract away from what the user actually gets to see here).
A simple automaton with roughly this behaviour is given in \in{Figure}[fig:login-system].
We will only informally discuss its semantics for now.

\startplacefigure
[title={A simple register automaton. The symbol $\ast$ denotes any input not otherwise specified. The $r$ in states $q_1$ and $q_2$ is a register.},
list={A simple register automaton.},
reference=fig:login-system]
\hbox{\starttikzpicture[node distance=4cm]
\node[state,initial] (0) {$q_0$};
\node[state with output,right of=0] (1) {$q_1$ \nodepart{lower} $r$};
\node[state with output,right of=1] (2) {$q_2$ \nodepart{lower} $r$};

\path[trans]
(0) edge [loop below] node [below] {$\ast$ / \checknok}
(0) edge node [above, align=center] {$\kw{sign-up}(p)$ / \checkok \\ set $r := p$} (1)
(1) edge [loop below] node [below] {$\ast$ / \checknok}
(1) edge [bend left, align=center] node [above] {$\kw{login}(p)$ / \checkok \\ if $r = p$} (2)
(2) edge [loop below] node [right, xshift=2mm, align=left] {$\kw{view}()$ / \checkok \\ $\ast$ / \checknok}
(2) edge [bend left] node [below] {$\kw{logout}()$ / \checkok} (1);
\stoptikzpicture}
\stopplacefigure

To model the behaviour, we want the domain of passwords to be infinite.
After all, to be secure, one should allow arbitrarily long passwords.
This means that a register automaton is actually an automaton over an infinite alphabet.
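
Informally, the semantics of \in{Figure}[fig:login-system] can be rendered in a few lines of code.
In the following Python sketch (my own encoding, with the register as part of the state), the alphabet is genuinely infinite, as $p$ ranges over all strings:

\starttyping
def step(state, action, p=None):
    """One transition of the login automaton; a state is (q, register)."""
    q, r = state
    if q == "q0" and action == "sign-up":
        return ("q1", p), "ok"           # set the register r := p
    if q == "q1" and action == "login" and p == r:
        return ("q2", r), "ok"           # password matches the register
    if q == "q2" and action == "view":
        return ("q2", r), "ok"
    if q == "q2" and action == "logout":
        return ("q1", r), "ok"
    return (q, r), "nok"                 # the catch-all self-loops
\stoptyping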

Common algorithms for automata, such as learning, will not work with an infinite alphabet.
Any loop which iterates over the alphabet will diverge.
In order to cope with this, we will use the \emph{symmetries} present in the alphabet.

Let us continue with the example and look at its symmetries.
If a person signs up with a password \quotation{\kw{hello}} and subsequently logs in with \quotation{\kw{hello}}, then this is not distinguishable from a person signing up and logging in with \quotation{\kw{bye}}.
This is an example of symmetry:
the values \quotation{\kw{hello}} and \quotation{\kw{bye}} can be permuted, or interchanged.
Note, however, that the trace $\kw{sign-up}(\kw{hello})\, \kw{login}(\kw{bye})$ is different from the two before:
no permutation of \quotation{\kw{hello}} and \quotation{\kw{bye}} will bring us to a logged-in state with that trace.
So we see that, despite the symmetry, we cannot simply identify the values \quotation{\kw{hello}} and \quotation{\kw{bye}}.
For this reason, we keep the alphabet infinite and explicitly mention its symmetries.
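
This invariance can be stated (and tested) concretely.
A small sketch, re-using the \type{step} function above, where a permutation is represented as a dictionary on atoms:

\starttyping
def run_trace(trace):
    """Run a trace from the initial state; return the outputs."""
    state, outs = ("q0", None), []
    for action, p in trace:
        state, out = step(state, action, p)
        outs.append(out)
    return outs

def rename(trace, perm):
    """Apply a permutation of atoms to all data values in a trace."""
    return [(action, perm.get(p, p)) for action, p in trace]

trace = [("sign-up", "hello"), ("login", "hello"), ("view", None)]
swap = {"hello": "bye", "bye": "hello"}
assert run_trace(trace) == run_trace(rename(trace, swap))
assert run_trace([("sign-up", "hello"), ("login", "bye")]) == ["ok", "nok"]
\stoptyping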

Using symmetries in automata theory is not a new idea.
In the context of model checking, the first to use symmetries were \citet[EmersonS96, IpD96].
But only \citet[IpD96] used them to deal with infinite data domains.
For automata learning with infinite domains, symmetries were used by \citet[Sakamoto97].
He devised an \LStar{} learning algorithm for register automata, much like the one presented in \in{Chapter}[chap:learning-nominal-automata].
The symmetries are crucial to reduce the problem to a finite alphabet and use the regular \LStar{} algorithm.
(\in{Chapter}[chap:learning-nominal-automata] shows how to do this with more general symmetries.)
Around the same time, \citet[FerrariMT05] worked on automata theoretic algorithms for the $\pi$-calculus.
Their approach was based on the same symmetries, and they developed a theory of \emph{named sets} to implement their algorithms.
Named sets are equivalent to nominal sets.
However, nominal sets are defined in a more elementary way.
The nominal sets we will soon see were introduced by \citet[GabbayP02] to solve certain problems of name binding in abstract syntax.
Although this is not really related to automata theory, it was picked up by \citet[BojanczykKL14],
who provide an equivalence between register automata and nominal automata.
(This equivalence is exposed in more detail in the book of \citenp[Bojanczyk18].)
Additionally, they generalise the work on nominal sets to other symmetries.

The symmetries we encounter in this thesis are listed below, but other symmetries can be found in the literature.
The symmetry directly corresponds to the data values (and operations) used in an automaton.
The data values are often called \emph{atoms}.
\startbigitemize
\item
The \emph{equality symmetry}.
Here the domain can be any countably infinite set.
We can take, for example, the set of strings we used before as the domain from which we take passwords.
No further structure is used on this domain, meaning that any value is just as good as any other.
The symmetries therefore consist of all bijections on this domain.
\item
The \emph{total order symmetry}.
In this case, we take a countably infinite set with a dense total order.
Typically, this means we use the rational numbers, $\Q$, as data values, and symmetries which respect the ordering.
\stopbigitemize

\startsubsection
[title={What is a nominal set?}]

So what exactly is a nominal set?
I will not define it here and leave the formalities to the corresponding chapters.
It suffices, for now, to think of nominal sets as abstract sets (often infinite) on which a group of symmetries acts.
This action makes it possible to interpret the symmetries of the data values in the abstract set.
For automata, this allows us to talk about symmetries on the state space, the set of transitions, and the alphabet.

In order to implement these sets algorithmically, we impose two finiteness requirements.
Both properties can be expressed using only the group action.
\startbigitemize
\item
Each element is \emph{finitely supported}.
A way to think of this requirement is that each element is \quotation{constructed} out of finitely many data values.
\item
The set is \emph{orbit-finite}.
This means that we can choose finitely many elements such that any other element is a permuted version of one of those elements.
\stopbigitemize

If we wish to model the automaton from \in{Figure}[fig:login-system] as a nominal automaton, then we can simply define the state space as
\startformula
Q = \{ q_0 \} \cup \{ q_{1, a} \mid a \in \atoms \} \cup \{ q_{2, a} \mid a \in \atoms \},
\stopformula
where $\atoms$ is the set of atoms.
In this example, $\atoms$ is the set of all possible passwords.
The set $Q$ is infinite, but satisfies the two finiteness requirements.
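
These two requirements suggest a finite data structure: store one element per orbit, together with how many atoms its elements depend on.
A very rough Python sketch of this idea for $Q$ above (the actual representations, e.g.\ the one in \in{Chapter}[chap:ordered-nominal-sets], are more refined):

\starttyping
# One entry per orbit of Q: (name, size of the support of its elements).
Q_ORBITS = [("q0", 0), ("q1", 1), ("q2", 1)]

def elements(orbits, atoms):
    """Expand orbits over a finite sample of atoms.  Conceptually the
    set is infinite; `atoms` is merely a window on it."""
    for name, support in orbits:
        if support == 0:
            yield (name,)
        else:
            for a in atoms:
                yield (name, a)     # q_{1,a} and q_{2,a} for each atom a

# Q has infinitely many elements but only three orbits;
# algorithms iterate over orbits instead of elements.
\stoptyping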

The upshot of doing this is that the set $Q$ (and the transition structure) corresponds directly to the semantics of the automaton.
We do not have to \emph{encode} how values relate or how they interact.
Instead, the set (and the transition structure) defines all we need to know.
Algorithms, such as reachability, minimisation, and learning, can be run on such automata, despite the sets being infinite.
These algorithms can be implemented rather easily by using a library such as \NLambda{}, \LOIS{}, or \ONS{} from \in{Chapter}[chap:ordered-nominal-sets].
These libraries implement a data structure for nominal sets, and provide ways to iterate over such (infinite) sets.

One has to be careful, as not all results from automata theory transfer to nominal automata.
A notable example is the powerset construction, which converts a non-deterministic automaton into a deterministic one.
The problem here is that the powerset of a set is generally \emph{not} orbit-finite, and so the finiteness requirement is not met.
Consequently, languages accepted by nominal DFAs are \emph{not} closed under Kleene star, or even concatenation.

\stopsubsection
\stopsection

\startsection
[title={Contributions}]

This thesis is split into two parts.
\in{Part}[part:testing] contains material about black box testing, while \in{Part}[part:nominal] is about nominal techniques.
The chapters can be read in isolation.
However, the chapters do get more technical and mathematical -- especially in \in{Part}[part:nominal].

Detailed discussions of related work and future directions of research are presented in each chapter.

\startcontribution[title={Chapter \ref[default][chap:test-methods]: FSM-based test methods.}]
This chapter introduces test generation methods which can be used for learning or conformance testing.
The methods are presented in a uniform way, which allows us to give a single proof of completeness for all of them.
Moreover, the uniform presentation gives room to develop new test generation methods.
One particular contribution (besides the theoretical framework) is the new \emph{hybrid ADS} test generation method.
The main contributions are:
\startitemize[before, after]
\item Uniform description of known methods: \in{Theorem}[thm:completeness] (p. \at[thm:completeness])
\item A new proof of completeness: \in{Section}[sec:completeness] (p. \at[sec:completeness])
\item New algorithm and its implementation: \in{Section}[sec:hybrid-ads-impementation] (p. \at[sec:hybrid-ads-impementation])
\stopitemize
\stopcontribution

\startcontribution[title={Chapter \ref[default][chap:applying-automata-learning]: Applying automata learning to embedded control software.}]
In this chapter we will apply model learning to an industrial case study.
It is a unique benchmark, as it is much bigger than any of the applications seen before (3410 states and 77 inputs).
This makes it challenging to learn a model, and the main obstacle is finding counterexamples.
Here, my main contribution is the new test generation method from \in{Chapter}[chap:test-methods].
The main contribution is:
\startitemize[before, after]
\item Successfully learning a large-scale system: \in{Section}[sec:learnResults] (p. \at[sec:learnResults])
\stopitemize

This is based on the following publication:

\cite[entry][DBLP:conf/icfem/SmeenkMVJ15].
\stopcontribution

\startcontribution[title={Chapter \ref[default][chap:minimal-separating-sequences]: Minimal separating sequences for all pairs of states.}]
Continuing on test generation methods, this chapter presents an efficient algorithm to construct separating sequences.
Not only is the algorithm efficient -- it runs in $\bigO(n \log n)$ time -- it also constructs minimal-length sequences.
The algorithm is inspired by a minimisation algorithm of \citet[Hopcroft71], but extending it to construct witnesses is non-trivial.
The main contributions are:
\startitemize[before, after]
\item Efficient algorithm for separating sequences: \in{Algorithms}[alg:minimaltree] \& \in{}[alg:increase-k-v3] (p. \at[alg:minimaltree] \& \at[alg:increase-k-v3])
\item Applications to black box testing: \in{Section}[sec:appl-conformance-testing] (p. \at[sec:appl-conformance-testing])
\item Implementation: \in{Section}[sec:implementation-partition] (p. \at[sec:implementation-partition])
\stopitemize

This is based on the following publication:

\cite[entry][DBLP:conf/lata/SmetsersMJ16].
\stopcontribution

\startcontribution[title={Chapter \ref[default][chap:learning-nominal-automata]: Learning nominal automata.}]
In this chapter, we show how to learn automata over infinite alphabets.
We do this by translating the \LStar{} algorithm directly to a nominal version, \nLStar{}.
The correctness proofs mimic the original proofs by \citet[Angluin87].
Since our new algorithm is close to the original, we are able to translate variants of the \LStar{} algorithm as well.
In particular, we provide a learning algorithm for nominal non-deterministic automata.
The main contributions are:
\startitemize[before, after]
\item \LStar{} algorithm for nominal automata: \in{Section}[sec:nangluin] (p. \at[sec:nangluin])
\item Its correctness and complexity: \in{Theorem}[thm:termination] \& \in{Corollary}[cor:complexity-nom-lstar] (p. \at[thm:termination] \& \at[cor:complexity-nom-lstar])
\item Generalisation to non-deterministic automata: \in{Section}[sec:nominal-nlstar] (p. \at[sec:nominal-nlstar])
\item Implementation in \NLambda{}: \in{Section}[sec:implementation-nominal-lstar] (p. \at[sec:implementation-nominal-lstar])
\stopitemize

This is based on the following publication:

\cite[entry][MoermanS0KS17].
\stopcontribution

\startcontribution[title={Chapter \ref[default][chap:ordered-nominal-sets]: Fast computations on ordered nominal sets.}]
In this chapter, we provide a library to compute with nominal sets.
We restrict our attention to nominal sets over the total order symmetry.
This symmetry allows for a rather easy characterisation of orbits, and hence an easy implementation.
We experimentally show that it is competitive with existing tools, which are based on SMT solvers.
The main contributions are:
\startitemize[before, after]
\item Characterisation theorem of orbits: \in{Table}[tab:overview] (p. \at[tab:overview])
\item Complexity results: \in{Theorems}[thmcomplexity] \& \in{}[thm:moore] (p. \at[thmcomplexity] \& \at[thm:moore])
\item Implementation: \in{Section}[sec:implementation-ons] (p. \at[sec:implementation-ons])
\stopitemize

This is based on the following publication:

\cite[entry][VenhoekMR18].
\stopcontribution

\startcontribution[title={Chapter \ref[default][chap:separated-nominal-automata]: Separated nominal automata.}]
We investigate how to reduce the size of certain nominal automata.
This is based on the observation that some languages (with outputs) are not just invariant under symmetries, but invariant under arbitrary \emph{transformations}, or \emph{renamings}.
We define a new type of automaton, the \emph{separated nominal automaton}, and show that these automata exactly accept the languages which are closed under renamings.
All of this is shown by using a theoretical framework: we establish a strong relationship between nominal sets on the one hand, and nominal renaming sets on the other.
The main contributions are:
\startitemize[before, after]
\item Adjunction between nominal sets and renaming sets: \in{Theorem}[thm:adjunction] (p. \at[thm:adjunction])
\item This adjunction is monoidal: \in{Theorem}[thm:monoidal] (p. \at[thm:monoidal])
\item Separated automata have reduced state space: \in{Example}[ex:sep-aut-fifo] (p. \at[ex:sep-aut-fifo])
\stopitemize

This is based on a paper under submission:

\cite[entry][MoermanR19].
\stopcontribution

Besides these chapters, the author has published the following papers.
These are not included in this thesis, but a short summary of each is presented below.

\startcontribution[title={Complementing Model Learning with Mutation-Based Fuzzing.}]
Our group at the Radboud University participated in the RERS challenge 2016.
This is a challenge where reactive software is provided and researchers have to assess the validity of certain properties (given as LTL specifications).
We approached this with model learning:
instead of analysing the source code, we simply learned the external behaviour, and then used model checking on the learned model.
This worked remarkably well, as the models of the external behaviour are not too big.
Our results were presented at the RERS workshop (ISOLA 2016).
The report can be found on arXiv:

\cite[entry][SmetsersMJV16].
\stopcontribution

\startcontribution[title={$n$-Complete test suites for IOCO.}]
In this paper, we investigate complete test suites for labelled transition systems (LTSs), instead of deterministic Mealy machines.
This is a much harder problem than conformance testing of deterministic systems.
The system may adversarially avoid certain states which the tester wishes to test.
We provide a test suite which is $n$-complete (provided the implementation is a suspension automaton).
My main personal contribution here is the proof of completeness, which closely resembles the proof presented in \in{Chapter}[chap:test-methods].
The conference paper was presented at ICTSS:

\cite[entry][BosJM17].

An extended version will appear soon:

\cite[entry][BosJM18].
\stopcontribution

\startcontribution[title={Learning Product Automata.}]
In this article, we consider Moore machines with multiple outputs.
These machines can be decomposed by projecting on each output,
resulting in smaller components that can be learned with fewer queries.
We give experimental evidence that this is a useful technique which can reduce the number of queries substantially.
This is all motivated by the idea that compositional methods are widely used throughout engineering and that we should use them in model learning.
This work was presented at ICGI 2018:

\cite[entry][Moerman18].
\stopcontribution

\stopsection

\startsection
[title=Conclusion and Outlook]

With the current tools for model learning, it is possible to learn big state machines of black box systems.
It involves using clever algorithms for learning (such as the TTT algorithm by \citenp[Isberner15]) and efficient testing methods (see \in{Chapter}[chap:test-methods]).
However, as the industrial case study from \in{Chapter}[chap:applying-automata-learning] shows, the bottleneck often lies in conformance testing.

In order to improve on this bottleneck, one possible direction is to consider \quote{grey box testing.}
The methods discussed in this thesis are all black box methods, which could be considered \quote{too pessimistic.}
Often, we do have (parts of) the source code and we do know relationships between different inputs.
The question is how this additional information can be integrated in a principled manner into the learning and testing of systems.

Black box testing still has theoretical challenges.
Current generalisations to non-deterministic systems or language inclusion (such as black box testing for IOCO) often need exponentially big test suites.
Whether this is necessary is unknown (to me); we only have upper bounds, but no lower bounds.
An interesting approach would be to see if there exists a notion of reduction between test suites.
This is analogous to the reductions used in complexity theory to prove problems hard, or the reductions used in PAC theory to prove learning problems to be inherently unpredictable.

Another path taken in this thesis is the research on nominal automata.
This was motivated by the problem of learning automata over infinite alphabets.
So far, the results on nominal automata are mostly theoretical in nature.
Nevertheless, we show that the nominal algorithms can be implemented and that they can be run concretely on black box systems (\in{Chapter}[chap:learning-nominal-automata]).
The advantage of using the foundations of nominal sets is that the algorithms are closely related to the original \LStar{} algorithm.
Consequently, variations of \LStar{} can easily be implemented.
For instance, we show that the \NLStar{} algorithm for non-deterministic automata works in the nominal case too.
(We have not attempted to implement more recent algorithms such as TTT.)
The nominal learning algorithms can be implemented in just a few hundred lines of code, much less than the approach taken by, e.g., \citet[Fiterau-Brostean18].

In this thesis, we tackle some efficiency issues when computing with nominal sets.
In \in{Chapter}[chap:ordered-nominal-sets] we characterise orbits in order to give an efficient representation (for the total order symmetry).
Another result is the fact that some nominal automata can be \quote{compressed} to separated automata, which can be exponentially smaller (\in{Chapter}[chap:separated-nominal-automata]).
However, the nominal tools still leave much to be desired in terms of efficiency.

Lastly, it would be interesting to marry the two paths taken in this thesis.
I am not aware of $n$-complete test suites for register automata or nominal automata.
The results on learning nominal automata in \in{Chapter}[chap:learning-nominal-automata] show that this should be possible, as an observation table gives a test suite.\footnote{The rows of a table are access sequences, and the columns provide a characterisation set.}
However, there is an interesting twist to this problem.
The test methods from \in{Chapter}[chap:test-methods] can all account for extra states.
For nominal automata, we should be able to cope with extra states \emph{and} extra registers.
It will be interesting to see how the test suite grows as these two dimensions increase.

\stopsection

\referencesifcomponent

\stopchapter
\stopcomponent