diff --git a/content.tex b/content.tex
index 656332c..c2b07e0 100644
--- a/content.tex
+++ b/content.tex
@@ -14,6 +14,7 @@
 \section{Learning and Testing}
 \section{Nominal Techniques}
 \component content/test-methods
+\component content/applying-automata-learning
 \chapter{Learning Embedded Stuff}
 \chapter{Separating Sequences}
 \component content/learning-nominal-automata
diff --git a/content/applying-automata-learning.tex b/content/applying-automata-learning.tex
new file mode 100644
index 0000000..4553b97
--- /dev/null
+++ b/content/applying-automata-learning.tex
@@ -0,0 +1,612 @@
\project thesis
\startcomponent applying-automata-learning

\startchapter
  [title={Applying Automata Learning to Embedded Control Software},
   reference=chap:applying-automata-learning]

\todo{Authors: should I also mention the support of the other authors? Keywords?}

\startabstract
Using an adaptation of state-of-the-art algorithms for black-box automata learning, as implemented in the LearnLib tool,
we succeeded in learning a model of the Engine Status Manager (ESM), a software component that is used in printers and copiers of Oc\'{e}.
The main challenge that we encountered was that LearnLib, although effective in constructing hypothesis models, was unable to find counterexamples for some hypotheses.
In fact, none of the existing FSM-based conformance testing methods that we tried worked for this case study.
We therefore implemented an extension of the algorithm of Lee \& Yannakakis for computing an adaptive distinguishing sequence.
Even when an adaptive distinguishing sequence does not exist, Lee \& Yannakakis' algorithm produces an adaptive sequence that \quotation{almost} identifies states.
In combination with a standard algorithm for computing separating sequences for pairs of states, we managed to verify states with on average 3 test queries.
Altogether, we needed around 60 million queries to learn a model of the ESM with 77 inputs and 3.410 states.
We also constructed a model directly from the ESM software and established equivalence with the learned model.
To the best of our knowledge, this is the first paper in which active automata learning has been applied to industrial control software.
\stopabstract


\startsection
  [title={Introduction}]

Once they have high-level models of the behaviour of software components, software engineers can construct better software in less time.
A key problem in practice, however, is the construction of models for existing software components, for which no or only limited documentation is available.

The construction of models from observations of component behaviour can be performed using regular inference (aka automata learning) techniques \citep[Ang87,Hig10,conf/sfm/SteffenHM11].
The most efficient such techniques use the set-up of \emph{active learning,} illustrated in \in{Figure}[fig:learning], in which a \quotation{learner} has the task of learning a model of a system by actively asking questions to a \quotation{teacher}.

\startplacefigure
  [title={Active learning of reactive systems},
   reference=fig:learning]
\externalfigure[Learner.pdf][width=0.6\textwidth]
\stopplacefigure

The core of the teacher is a \emph{System Under Test (SUT),} a reactive system to which one can apply inputs and whose outputs one may observe.
The learner interacts with the SUT to infer a model by sending inputs and observing the resulting outputs (\quotation{membership queries}).
In order to find out whether an inferred model is correct, the learner may pose an \quotation{equivalence query}.
The teacher uses a model-based testing (MBT) tool to try to answer such queries:
given a hypothesized model, an MBT tool generates a long test sequence using some conformance testing method.
If the SUT passes this test, then the teacher informs the learner that the model is deemed correct.
If the outputs of the SUT and the model differ, this constitutes a counterexample, which is returned to the learner.
Based on such a counterexample, the learner may then construct an improved hypothesis.
Hence, the task of the learner is to collect data by interacting with the teacher and to formulate hypotheses, and the task of the MBT tool is to establish the validity of these hypotheses.
It is important to note that an SUT may pass the test for a hypothesis even though this hypothesis is not valid.

Triggered by various theoretical and practical results, see e.g. the work by \citet[BergGJLRS05,Leucker06,RSBM09,ThesisFides,CasselHJMS15,MertenHSCJ12,HowarSJC12], there is a fast-growing interest in automata learning technology.
In recent years, automata learning has been applied successfully, e.g., to regression testing of telecommunication systems \citep[HNS03], checking conformance of communication protocols to a reference implementation \citep[BRP14], finding bugs in Windows and Linux implementations of TCP \citep[FJV14], analysis of botnet command and control protocols \citep[ChocSS10], and integration testing \citep[LGS:forte06,GLPS:fates08].

In this paper, we explore whether LearnLib \citep[RSBM09], a state-of-the-art automata learning tool, is able to learn a model of the Engine Status Manager (ESM), a piece of control software that is used in many printers and copiers of Oc\'{e}.
Software components like the ESM can be found in many embedded systems in one form or another.
Being able to retrieve models of such components automatically is potentially very useful.
For instance, if the software is fixed or enriched with new functionality, one may use a learned model for regression testing.
Also, if the source code of the software is hard to read and poorly documented, one may use a model of the software for model-based testing of a new implementation, or even for generating an implementation on a new platform automatically.
Using a model checker one may also study the interaction of the software with other components for which models are available.

The ESM software is actually well documented, and an extensive test suite exists.
The ESM, which has been implemented using Rational Rose Real-Time (RRRT), is stable and has been in use for 10 years.
Due to these characteristics, the ESM is an excellent benchmark for assessing the performance of automata learning tools in this area.
The ESM has also been studied in other research projects:
\citet[Plo:05] modelled the ESM and other related managers and verified properties based on the official specifications of the ESM, and \citet[conf/mompes/GraafD07]
have checked the consistency of the behavioural specifications defined in the ESM against the RRRT definitions.

Learning a model of the ESM turned out to be more complicated than expected.
The top-level UML/RRRT statechart from which the software is generated only has 16 states.
However, each of these states contains nested states, and in total there are 70 states that do not have further nested states.
Moreover, the \cpp{} code contained in the actions of the transitions also creates some complexity, and this explains why the minimal Mealy machine that models the ESM has 3.410 states.
LearnLib has been used to learn models with tens of thousands of states by \citet[RaMeSM2009], and therefore we expected that it would be easy to learn a model for the ESM.
However, finding counterexamples for incorrect hypotheses turned out to be challenging due to the large number of inputs (77).
The test algorithms implemented in LearnLib, such as random testing, the W-method by \citet[Ch78,vasilevskii1973failure] and the Wp-method by \citet[FBKAG91], failed to deliver counterexamples within an acceptable time.
Automata learning techniques have been successfully applied to case studies in which the total number of input symbols is much larger, but in these cases it was possible to reduce the number of inputs to a small number ($< 10$) using abstraction techniques \citep[AJUV15,HoStMe2011].
In the case of the ESM, use of abstraction techniques only allowed us to reduce the original 156 concrete actions to 77 abstract actions.

We therefore implemented an extension of the algorithm of \citet[LYa94] for computing an adaptive distinguishing sequence.
Even when an adaptive distinguishing sequence does not exist, Lee \& Yannakakis' algorithm produces an adaptive sequence that \quotation{almost} identifies states.
In combination with a standard algorithm for computing separating sequences for pairs of states, we managed to verify states with on average 3 test queries and to learn a model of the ESM with 77 inputs and 3.410 states.
We also constructed a model directly from the ESM software and established equivalence with the learned model.
To the best of our knowledge, this is the first paper in which active automata learning has been applied to industrial control software.
Preliminary evidence suggests that our adaptation of Lee \& Yannakakis' algorithm outperforms existing FSM-based conformance algorithms.

During recent years most researchers working on active automata learning focused their efforts on efficient algorithms and tools for the construction of hypothesis models.
Our work shows that if we want to further scale automata learning to industrial applications, we also need better algorithms for finding counterexamples for incorrect hypotheses.
Following \citet[BergGJLRS05], our work shows that the context of automata learning provides both new challenges and new opportunities for the application of testing algorithms.
All the models for the ESM case study together with the learning and testing statistics are available at \todo{http://www.mbsd.cs.ru.nl/publications/papers/fvaan/ESM/}, as a benchmark for both the automata learning and testing communities.
\todo{Link to the new benchmark site}


\stopsection
\startsection
  [title={Engine Status Manager},
   reference=sec:esm]

The focus of this article is the \emph{Engine Status Manager (ESM),} a software component that is used to manage the status of the engine of Oc\'{e} printers and copiers.
In this section, the overall structure and context of the ESM will be explained.


\startsubsection
  [title={ESRA}]

The requirements and behaviour of the ESM are defined in a software architecture called Embedded Software Reference Architecture (ESRA).
The components defined in this architecture are reused in many of the products developed by Oc\'{e} and form an important part of these products.
This architecture was developed for \emph{cut-sheet} printers or copiers.
The term cut-sheet refers to the use of separate sheets of paper as opposed to a \emph{continuous feed} of paper.

An \emph{engine} refers to the printing or scanning part of a printer or copier.
Other products can be connected to an engine that pre- or post-process the paper, for example a cutter, folder, stacker or stapler.

\startplacefigure
  [title={Global overview of the engine software},
   reference=fig:esra-overview]
\externalfigure[ESRAOverview.pdf][width=0.5\textwidth]
\stopplacefigure

\in{Figure}[fig:esra-overview] gives an overview of the software in a printer or copier.
The \emph{controller} communicates the required actions to the engine software.
This includes transport of digital images, status control, print or scan actions and error handling.
The controller is responsible for queuing, processing the actions received from the network and operators and delegating the appropriate actions to the engine software.
+% +The \emph{managers} communicate with the controller using the \emph{external interface adapters.} +These adapters translate the external protocols to internal protocols. +The managers manage the different functions of the engine. +They are divided by the different functionalities such as status control, print or scan actions or error handling they implement. +In order to do this a manager may communicate with other managers and functions. +% +A \emph{function} is responsible for a specific set of hardware components. +It translates commands from the managers to the function hardware and reports the status and other information of the function hardware to the managers. +This hardware can for example be the printing hardware or hardware that is not part of the engine hardware such as a stapler. +% +Other functionalities such as logging and debugging are orthogonal to the functions and managers. + + +\stopsubsection +\startsubsection + [title={ESM and connected components}] + +The ESM is responsible for the transition from one status of the printer or copier to another. +It coordinates the functions to bring them in the correct status. +Moreover, it informs all its connected clients (managers or the controller) of status changes. +Finally, it handles status transitions when an error occurs. + +\startplacefigure + [title={Overview of the managers and clients connected to the ESM}, + reference=fig:esm-protocols] +\externalfigure[ESMProtocols.pdf][width=0.5\textwidth] +\stopplacefigure + +\in{Figure}[fig:esm-protocols] shows the different components to which the ESM is connected. +The \emph{Error Handling Manager (EHM),} \emph{Action Control Manager (ACM)} and other clients request engine statuses. +The ESM decides whether a request can be honored immediately, has to be postponed or ignored. +If the requested action is processed the ESM requests the functions to go to the appropriate status. +The EHM has the highest priority and its requests are processed first. 
The EHM can request the engine to go into the defect status.
The ACM has the next highest priority.
The ACM requests the engine to switch between running and standby status.
The other clients request transitions between the other statuses, such as idle, sleep, standby and low power.
All the other clients have the same lowest priority.
%
The Top Capsule instantiates the ESM and communicates with it during the initialization of the ESM.
The Information Manager provides some parameters during the initialization.

There are more managers connected to the ESM but they are of less importance and are thus not mentioned here.


\stopsubsection
\startsubsection
  [title={Rational Rose RealTime}]

The ESM has been implemented using \emph{Rational Rose RealTime (RRRT)}.
In this tool so-called \emph{capsules} can be created.
Each of these capsules defines a hierarchical \emph{statechart diagram}.
Capsules can be connected with each other using \emph{structure diagrams}.
Each capsule contains a number of ports that can be connected to ports of other capsules by adding connections in the associated structure diagram.
Each of these ports specifies which protocol should be used.
This protocol defines which messages may be sent to and from the port.
Transitions in the statechart diagram of the capsule can be triggered by messages arriving on a port of the capsule.
Messages can be sent to these ports using the action code of the transition.
The actions and guards of the transitions between the states are defined in \cpp{} code.
From the state diagram, \cpp{} source files are generated.

The RRRT language and semantics are based on UML \citep[OMGUML2] and ROOM \citep[ROOM94].
One important concept used in RRRT is the run-to-completion execution model \citep[EJW02].
This means that when a received message is processed, the execution cannot be interrupted by other arriving messages.
These messages are placed in a queue to be processed later.
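The run-to-completion discipline can be illustrated with a small sketch (ours, in Python; not the RRRT-generated code). A message that arrives while another message is being handled is only enqueued; it is dispatched after the current handler has run to completion.

```python
from collections import deque

class RunToCompletionActor:
    """Toy illustration of run-to-completion semantics: each message is
    handled fully before the next one is taken from the queue."""

    def __init__(self, handler):
        self.queue = deque()
        self.handler = handler
        self.processing = False
        self.log = []  # order in which messages were actually handled

    def send(self, msg):
        # Arriving messages are always queued first ...
        self.queue.append(msg)
        # ... and only dispatched if no message is currently being handled.
        if not self.processing:
            self._drain()

    def _drain(self):
        self.processing = True
        while self.queue:
            msg = self.queue.popleft()
            self.log.append(msg)
            self.handler(self, msg)  # may call send(), which just enqueues
        self.processing = False
```

For example, a handler that reacts to message \quotation{a} by sending \quotation{b} and \quotation{c} finishes handling \quotation{a} before either of those is processed, so the handling order is a, b, c.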


\stopsubsection
\startsubsection
  [title={The ESM state diagram}]

\startplacefigure
  [title={Top states and transitions of the ESM},
   reference=fig:esm-states]
\externalfigure[ESMStates.pdf][width=\textwidth]
\stopplacefigure

\in{Figure}[fig:esm-states] shows the top states of the ESM statechart.
The statuses that can be requested by the clients and managers correspond to gray states.
The other states are so-called \emph{transitory states}.
In transitory states the ESM is waiting for the functions to report that they have moved to the corresponding status.
Once all functions have reported, the ESM moves to the corresponding status.

The \kw{idle} status indicates that the engine has started up but that
it is still \emph{cold} (uncontrolled temperature).
The \kw{standby} status indicates that the engine is \emph{warm} and ready for printing or scanning.
The \kw{running} status indicates that the engine is printing or scanning.
%
The transitions from the overarching state to the \kw{goingToSleep} and
\kw{goingToDefect} states indicate that it is possible to move to the \kw{sleep} or \kw{defect} status from any state.
In some cases it is possible to awake from sleep status; in other cases the main power is turned off.
%
The \kw{medium} status is designed for diagnostics.
In this status the functions can each be in a different status.
For example, one function can be in standby status while another function is in idle status.

The statechart diagram in \in{Figure}[fig:esm-states] may seem simple, but it hides many details.
Each of the states has up to 5 nested states.
In total there are 70 states that do not have further nested states.
The \cpp{} code contained in the actions of the transitions is in some cases non-trivial.
The possibility to transition from any state to the sleep or defect state also complicates the learning.
+ + +\stopsubsection +\stopsection +\startsection + [title={Learning the ESM}, + reference=sec:learning] + +In order to learn a model of the ESM, we connected it to LearnLib by \citet[conf/tacas/MertenSHM11], a state-of-the-art tool for learning Mealy machines developed at the University of Dortmund. +A \emph{Mealy machine} is a tuple $M = (I, O, Q, q_0, \delta, \lambda)$, where +\startitemize +\item $I$ is a finite set of input symbols, +\item $O$ is a finite set of output symbols, +\item $Q$ is a finite set of states, +\item $q_0 \in Q$ is an initial state, +\item $\delta \colon Q \times I \to Q$ is a transition function, and +\item $\lambda \colon Q \times I \to O$ is an output function. +\stopitemize + +The behaviour of a Mealy machine is \emph{deterministic,} in the sense that the outputs are fully determined by the inputs. +% +%A \emph{transition} of a Mealy machine is denoted as $q \xrightarrow{i/o} q'$ +%meaning $\delta(q,i) = q'$ and $\lambda(q,i) = o$. +Functions $\delta$ and $\lambda$ are extended to accept sequences in the standard way. +We say that Mealy machines $M = (I, O, Q, q_0, \delta, \lambda)$ and $M' = (I', O', Q', q'_0, \delta', \lambda')$ are \emph{equivalent} if they generate an identical sequence of outputs for every sequence of inputs, that is, if $I = I'$ and, for all $w \in I^{\ast}$, $\lambda(q_0, w) = \lambda'(q'_0, w)$. +If the behaviour of an SUT is described by a Mealy machine $M$ then the task of LearnLib is to learn a Mealy machine $M'$ that is equivalent to $M$. + + +\startsubsection + [title={Experimental set-up}, + reference=sec:learningSetup] + +A clear interface to the ESM has been defined in RRRT. +The ESM defines ports from which it receives a predefined set of inputs and to which it can send a predefined set of outputs. +However, this interface can only be used within RRRT. +In order to communicate with the LearnLib software a TCP connection was set up. 
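As an aside, the Mealy-machine definitions at the start of this section are easily made executable. The following sketch (ours, for illustration; not LearnLib code) represents $\delta$ and $\lambda$ as dictionaries, extends $\lambda$ to input sequences, and checks agreement of two machines on a finite sample of input words, which is a necessary condition for equivalence.

```python
class Mealy:
    """Illustrative Mealy machine M = (I, O, Q, q0, delta, lambda)."""

    def __init__(self, q0, delta, lam):
        self.q0 = q0        # initial state
        self.delta = delta  # transition function: dict (state, input) -> state
        self.lam = lam      # output function:     dict (state, input) -> output

    def outputs(self, word):
        """lambda extended to input sequences: the produced output word."""
        q, out = self.q0, []
        for i in word:
            out.append(self.lam[(q, i)])
            q = self.delta[(q, i)]
        return out

def agree(m1, m2, words):
    """True iff both machines produce identical outputs on every sample word."""
    return all(m1.outputs(w) == m2.outputs(w) for w in words)
```

Full equivalence quantifies over all of $I^{\ast}$, which is exactly what the conformance testing methods discussed below have to approximate with a finite test suite.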
An extra capsule was created in RRRT which connects to the ports defined by the ESM.
This capsule opened a TCP connection to LearnLib.
Inputs and outputs are translated to and from a string format and sent over the connection.
Before each membership query, the learner needs to bring the SUT back to its initial state.
This means that LearnLib needs a way to reset the SUT.

Some inputs and outputs sent to and from the ESM carry parameters.
These parameters are enumerations of statuses, or integers bounded by the number of functions connected to the ESM.
Currently, LearnLib cannot handle inputs with parameters; therefore, we introduced a separate input action for every parameter value.
%
Based on domain knowledge and discussions with the Oc\'{e} engineers, we could group some of these inputs together and reduce the total number of inputs.
When learning the ESM using one function, 83 concrete inputs are grouped into four abstract inputs.
When using two functions, 126 concrete inputs can be grouped.
When an abstract input needs to be sent to the ESM, one concrete input of the represented group is randomly selected, as in the approach of \citet[AJUV15].
This is a valid abstraction because all the inputs in the group have exactly the same behaviour in any state of the ESM.
This has been verified by code inspection.
No other abstractions were found during the research.
After the inputs are grouped, a total of 77 inputs remain when learning the ESM using 1 function, and 105 inputs remain when using 2 functions.

It was not immediately obvious how to model the ESM by a Mealy machine, since some inputs trigger no output, whereas other inputs trigger several outputs.
In order to resolve this, we benefited from the run-to-completion execution model used in RRRT.
Whenever an input is sent, all the outputs are collected until quiescence is detected.
Next, all the outputs are concatenated and sent to LearnLib as a single aggregated output.
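This aggregation step can be sketched as follows (our illustration; the helper names and the choice of \quotation{+} as separator are invented, and the question of how quiescence is detected is abstracted behind the try_receive callback).

```python
def aggregated_step(send, try_receive, inp):
    """One Mealy-style step: send the input, collect every output produced
    until the SUT is quiescent, and return them as one aggregated symbol.

    `try_receive` returns the next output of the SUT, or None once the SUT
    is quiescent (detecting quiescence is a separate concern)."""
    send(inp)
    outs = []
    while (o := try_receive()) is not None:
        outs.append(o)
    # An input that triggers no output at all is mapped to the explicit
    # output symbol "Q" for quiescence.
    return "+".join(outs) if outs else "Q"
```

In this way every input is answered by exactly one (possibly composite) output, as the Mealy-machine formalism requires.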
+% +In model-based testing, quiescence is usually detected by waiting for a fixed time-out period. +However, this causes the system to be mostly idle while waiting for the time-out, which is inefficient. +In order to detect quiescence faster, we exploited the run-to-completion execution model used by RRRT: +we modified the ESM to respond to a new low-priority test input with a (single) special output. +This test input is sent after each normal input. +Only after the normal input is processed and all the generated outputs have been sent, the test input is processed and the special output is generated; upon its reception, quiescence can be detected immediately and reliably. + + +\stopsubsection +\startsubsection + [title={Test selection strategies}, + reference=sec:testSelection] + +In the ESM case study the most challenging problem was finding counterexamples for the hypotheses constructed during learning. + +LearnLib implements several algorithms for conformance testing, one of which is a random walk algorithm. +The random walk algorithm works by first selecting the length of the test query according to a geometric distribution, cut off at a fixed upper bound. +Each of the input symbols in the test query is then randomly selected from the input alphabet $I$ from a uniform distribution. +% +In order to find counterexamples, a specific sequence of input symbols is needed to arrive at the state in the SUT that differentiates it from the hypothesis. +The upper bound for the size of this search space is $|I|^{n}$ where $|I|$ is the size of the input alphabet used, and $n$ the length of the +counterexample that needs to be found. +If this sequence is long the chance of finding it is small. +Because the ESM has many different input symbols to choose from, finding the correct one is hard. +When learning the ESM with 1 function there are 77 possible input symbols. 
If, for example, the length of the counterexample needs to be at least 6 inputs to identify a certain state, then the upper bound on the number of test queries would be around $2 \times 10^{11}$.
An average test query takes around 1 ms, so it would take about 7 years to execute these test queries.


\startsubsubsection
  [title={Augmented DS-method.},
   reference=sec:randomPrefix]

In order to reduce the number of tests, \citet[Ch78] and
\citet[vasilevskii1973failure] pioneered the so-called W-method.
In their framework a test query consists of a prefix $p$ bringing the SUT to a specific state, a (random) middle part $m$ and a suffix $s$ assuring that the SUT is in the appropriate state.
This results in a test suite of the form $P I^{\leq k} W$, where $P$ is a set of (shortest) access sequences, $I^{\leq k}$ the set of all sequences of length at most $k$, and $W$ is a characterization set.
Classically, this characterization set is constructed by taking the set of all (pairwise) separating sequences.
For $k=1$ this test suite is complete in the sense that if the SUT passes all tests, then either the SUT is equivalent to the specification or the SUT has strictly more states than the specification.
By increasing $k$ we can check additional states.

We tried using the W-method as implemented by LearnLib to find counterexamples.
The generated test suite, however, was still too big in our learning context.
\citet[FBKAG91] observed that it is possible to let the set $W$ depend on the state the SUT is supposed to be in.
This allows us to only take a subset of $W$ which is relevant for a specific state.
This slightly reduces the test suite without losing the power of the full test suite.
This method is known as the Wp-method.
More importantly, this observation allows for generalizations where we can carefully pick the suffixes.

In the presence of an (adaptive) distinguishing sequence one can take $W$ to be a single suffix, greatly reducing the test suite.
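For concreteness, enumerating a test suite of the shape $P\, I^{\leq k}\, W$ can be sketched as follows (an illustration of the shape of the suite only, not of LearnLib's implementation):

```python
from itertools import product

def w_method_suite(access_seqs, inputs, W, k=1):
    """Enumerate the test suite P . I^{<=k} . W: every access sequence,
    followed by every input word of length at most k, followed by every
    separating sequence from the characterization set W."""
    middles = [()]  # I^0: the empty middle part
    for length in range(1, k + 1):
        middles += list(product(inputs, repeat=length))
    for p, m, w in product(access_seqs, middles, W):
        yield tuple(p) + m + tuple(w)
```

The size of this suite is $|P| \cdot (1 + |I| + \ldots + |I|^{k}) \cdot |W|$, which makes explicit why a large input alphabet and a large characterization set are so costly, and why replacing $W$ by a single (adaptive) distinguishing suffix pays off.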
\citet[LYa94] describe an algorithm (which we will refer to as the LY algorithm) to efficiently construct this sequence, if it exists.
In our case, unfortunately, most hypotheses did not admit an adaptive distinguishing sequence.
In these cases the incomplete result from the LY algorithm still contained a lot of information, which we augmented with pairwise separating sequences.

\startplacefigure
  [title={A small part of an incomplete distinguishing sequence as produced by the LY algorithm. Leaves contain a set of possible initial states, inner nodes have input sequences and edges correspond to different output symbols (of which we only drew some), where Q stands for quiescence.},
   reference=fig:distinguishing-sequence]
\externalfigure[hyp_20_partial_ds.pdf][width=\textwidth]
\stopplacefigure

As an example we show an incomplete adaptive distinguishing sequence for one of the hypotheses in \in{Figure}[fig:distinguishing-sequence].
When we apply the input sequence I46 I6.0 I10 I19 I31.0 I37.3 I9.2 and observe outputs O9 O3.3 Q \ldots{} O28.0, we know for sure that the SUT was in state 788.
Unfortunately, not all paths lead to a singleton set.
When, for instance, we apply the sequence I46 I6.0 I10 and observe the outputs O9 O3.14 Q, we know for sure that the SUT was in one of the states 18, 133, 1287 or 1295.
In these cases we have to perform more experiments and we resort to pairwise separating sequences.

We note that this augmented DS-method is in the worst case not any better than the classical Wp-method.
In our case, however, it greatly reduced the test suites.

Once we have our set of suffixes, which we call $Z$ now, our test algorithm works as follows.
The algorithm first exhausts the set $P I^{\leq 1} Z$.
If this does not provide a counterexample, we will randomly pick test queries from $P I^2 I^\ast Z$, where the algorithm samples uniformly from $P$, $I^2$ and $Z$ (if $Z$ contains more than $1$ sequence for the supposed state) and with a geometric distribution on $I^\ast$.


\stopsubsubsection
\startsubsubsection
  [title={Sub-alphabet selection.},
   reference=sec:subaplha]

Using the above method the algorithm still failed to learn the ESM.
By looking at the RRRT-based model we were able to see why the algorithm failed.
In the initialization phase, the controller exhibits exceptional behaviour when a certain input is provided eight times consecutively.
Of course such a sequence is hard to find with the above testing method.
With this knowledge we could construct a counterexample by hand, with which the algorithm was able to learn the ESM.

In order to automate this process, we defined a sub-alphabet of actions that are important during the initialization phase of the controller.
This sub-alphabet is used a bit more often than the full alphabet, as follows.
%
We start testing with the alphabet which provided a counterexample for the previous hypothesis (for the first hypothesis we take the sub-alphabet).
If no counterexample can be found within a specified query bound, then we repeat with the other alphabet.
If neither alphabet produces a counterexample within the bound, the bound is increased by some factor and the process is repeated.
%
This method only marginally increases the number of tests, but it did find the counterexample that we previously had to construct by hand.


\stopsubsubsection
\stopsubsection
\startsubsection
  [title={Results},
   reference=sec:learnResults]

Using the learning set-up discussed in \in{Section}[sec:learningSetup] and the test selection strategies discussed in \in{Section}[sec:testSelection], a model of the ESM using 1 function could be learned.
+After an additional eight hours of testing no counterexample was found and the experiment was stopped. +The following list gives the most important statistics gathered during the learning: + +\startitemize + \item The learned model has 3.410 states. + \item Altogether, 114 hypotheses were generated. + \item The time needed for learning the final hypothesis was 8 hours, 26 minutes, and 19 seconds. + \item 29.933.643 membership queries were required, with on average 35,77 inputs per query. + \item 30.629.711 test queries were required, with on average 29,06 inputs per query. +\stopitemize + + +\stopsubsection +\stopsection +\startsection + [title={Verification}, + reference=sec:verification] + +To verify the correctness of the model that was learned using LearnLib, +we checked its equivalence with a model that was generated directly from the code. + +\startsubsection + [title={Approach}] + +As mentioned already, the ESM has been implemented using Rational Rose RealTime (RRRT). +Thus a statechart representation of the ESM is available. +However, we have not been able to find a tool that translates RRRT models to Mealy machines, allowing us to compare the RRRT-based model of the ESM with the learned model. +We considered several formalisms and tools that were proposed in the literature to flatten statecharts to state machines. +The first one was a tool for hierarchical timed automata (HTA) \citep[david_02_formal]. +However, we found it hard to translate the output of this tool, a network of Uppaal timed automata, to a Mealy machine that could be compared to the learned model. +The second tool that we considered has been developed by \citet[conf/fmco/HansenKLMPS10]. +This tool misses some essential features, for example the ability to assign new values to state variables on transitions. +Finally, we considered a formalism called object-oriented action systems +(OOAS) \citep[conf/fmco/KrennSA09], but no tools to use this could be found. 
+ +In the end we decided to implement the required model transformations ourselves. +\in{Figure}[fig:formats] displays the different formats for representing models that we used and the transformations between those formats. + +\startplacefigure + [title={Formats for representing models and transformations between formats}, + reference=fig:formats] +\externalfigure[formats.pdf][width=0.7\textwidth] +\stopplacefigure + +We used the bisimulation checker of CADP \citep[conf/tacas/GaravelLMS11] to check the equivalence of labelled transition system models in {\tt .aut} format. +The Mealy machine models learned by LearnLib are represented as {\tt .dot} files. +A small script converts these Mealy machines to labelled transition systems in {\tt .aut} format. +We used the Uppaal \citep[conf/qest/BehrmannDLHPYH06] tool as an editor for defining extended finite state machines (EFSM), represented as {\tt .xml} files. +A script developed in the ITALIA project (\todo{http://www.italia.cs.ru.nl/}) converts these EFSM models to LOTOS, and then CADP takes care of the conversion from LOTOS to the {\tt .aut} format. + +The Uppaal syntax is not sufficiently expressive to directly encode the RRRT definition of the ESM, since this definition makes heavy use of UML \citep[OMGUML2] concepts such as state hierarchy and transitions from composite states, concepts which are not present in Uppaal. +Using Uppaal would force us to duplicate many transitions and states. + +We decided to manually create an intermediate hierarchical EFSM (HEFSM) model using the UML drawing tool PapyrusUML \citep[lanusse2009papyrus]. +The HEFSM model closely resembles the RRRT UML model, but many elements used in UML state machines are left out because they are not needed for modeling the ESM and complicate the transformation process. + + +\stopsubsection +\startsubsection + [title={Model transformations}] + +We explain the transformation from the HEFSM model to the EFSM model using examples. 
+The transformation is divided into five steps, which are executed in order:
+\startitemize[n]
+\item combine transitions without input or output signal,
+\item transform supertransitions,
+\item transform internal transitions,
+\item add input signals that do not generate an output, and
+\item replace invocations of the next function.
+\stopitemize
+
+
+\startdescription{1. Empty transitions.}
+In order to make the model more readable, and to make it easy to model \kw{if} and \kw{switch} statements in the \cpp{} code, the HEFSM model allows transitions without a signal.
+These transitions are called \emph{empty} transitions.
+An empty transition can still contain a guard and an assignment.
+However, empty transitions are only allowed on states whose outgoing transitions are all empty.
+This restriction keeps the transformation simple and the model easy to read.
+
+To transform a state with empty transitions, all its incoming and outgoing transitions are collected.
+For each combination of incoming transition $a$ and outgoing transition $b$, a new transition $c$ is created with the source of $a$ as source and the target of $b$ as target.
+The guard of transition $c$ evaluates to true if and only if the guards of both $a$ and $b$ evaluate to true.
+The assignment of $c$ is the concatenation of the assignments of $a$ and $b$.
+The signal of $c$ is the signal of $a$, because $b$ cannot have a signal.
+Once all the new transitions have been created, all states with empty transitions are removed, together with all their incoming and outgoing transitions.
+
+\in{Figure}[fig:model-example1] shows an example model with empty transitions and its transformed version.
+Each of the incoming transitions of state \kw{B} is combined with each of the outgoing transitions.
+This results in two new transitions.
+The old transitions and state \kw{B} are removed.
+
+\startplacefigure
+ [title={Example of empty transition transformation.
On the left the original version. On the right the transformed version.},
+ reference=fig:model-example1]
+\externalfigure[model-example1.pdf][width=0.5\textwidth]
+\stopplacefigure
+\stopdescription
+
+
+\startdescription{2. Supertransitions.}
+The RRRT model of the ESM contains many transitions originating from a composite state.
+Informally, these \emph{supertransitions} can be taken in each of the substates of the composite state if the guard evaluates to true.
+In order to model the ESM as closely as possible, supertransitions are also supported in the HEFSM model.
+
+In RRRT, transitions are evaluated from bottom to top.
+This means that the transitions of the leaf state are considered first, then those of its parent state, then those of its parent's parent state, and so on.
+Once a transition with the correct signal whose guard evaluates to true has been found, it is taken.
+%
+When flattening the statechart, we modified the guards of supertransitions to ensure the correct priorities.
+%In order to transform a supertransition all the states are processed beginning at the top state and recursively going down to its child states.
+%The supertransitions that can be taken in the parent states of a state are collected.
+%If the current state contains a transition $a$ that has the same signal as a supertransition $b$ the guard of $b$ has to be modified because supertransition $b$ may only be taken if transition $a$ cannot be taken.
+%Thus the negation of the guard of $a$ is added to the guard of $b$.
+%Once a leaf state is reached all the collected supertransitions are added as outgoing transitions of the leaf state.
+
+\startplacefigure
+ [title={Example of supertransition transformation. On the left the original version.
On the right the transformed version.},
+ reference=fig:model-example2]
+\externalfigure[model-example2.pdf][width=0.4\textwidth]
+\stopplacefigure
+
+\in{Figure}[fig:model-example2] shows an example model with supertransitions and its transformed version.
+The supertransition from state \kw{A} can be taken at each of \kw{A}'s leaf states \kw{B} and \kw{C}.
+The transformation removes the original supertransition and creates a new transition at states \kw{B} and \kw{C} with the same target state.
+For leaf state \kw{C} this is easy, because it does not contain a transition with the input signal \kw{IP}.
+In state \kw{B}, the transition to state \kw{C} is taken if a signal \kw{IP} is processed while the state variable \kw{a} equals $1$.
+The supertransition can only be taken if the other transition cannot be taken.
+This is why the negation of the other guard is added to the new transition.
+If the original supertransition is an internal transition, the model needs a further transformation, which is described in the next paragraph.
+If the original supertransition is not an internal transition, the new transitions will have the initial state of \kw{A} as target.
+\stopdescription
+
+
+\startdescription{3. Internal transitions.}
+The ESM model also makes use of \emph{internal transitions} in RRRT.
+When such a transition is taken, the current state does not change.
+If such a transition is defined on a composite state, it can be taken from all of its substates and returns to the same leaf state it originated from; in that case it is thus also a supertransition.
+%
+This is also possible in the HEFSM model.
+To transform an internal transition, it is first treated as a supertransition and the above transformation is applied.
+Then the target of each resulting transition is simply set to the leaf state it originates from.
+%
+An example can be seen in \in{Figure}[fig:model-example2].
+If the supertransition from state \kw{A} were also defined as an internal transition, the transformed version on the right would need another transformation:
+the new transitions that now have state \kw{A} as their target would be redirected to their respective source states.
+\stopdescription
+
+
+\startdescription{4. Quiescent transitions.}
+In order to reduce the number of transitions in the HEFSM model, quiescent transitions are added automatically.
+For every state and every signal, all transitions for that signal are collected in a set $T$, and a new self-transition $a$ is added for that signal.
+The guard of transition $a$ evaluates to true if and only if none of the guards of the transitions in $T$ evaluates to true.
+This makes the HEFSM input-enabled without having to specify all the transitions explicitly.
+\stopdescription
+
+
+\startdescription{5. The next function.}
+In RRRT it is possible to write the guard and assignment in \cpp{} code.
+It is thus possible that the value of a variable changes while an input signal is being processed.
+In the HEFSM, however, assignments only take effect after the input signal has been processed.
+To simulate this behaviour, the \emph{next} function is used.
+This function takes a variable name and evaluates to the value of this variable after the transition.
+\stopdescription
+
+
+\stopsubsection
+\startsubsection
+ [title={Results}]
+
+\in{Figure}[fig:esra-model] shows a visualization, generated using Gephi \citep[conf/icwsm/BastianHJ09], of the learned model.
+States are coloured according to the strongly connected components.
+The number of transitions between two states is represented by the thickness of the edge.
+The large number of states (3.410) and transitions (262.570) makes it hard to visualize this model.
+Nevertheless, the visualization does provide insight into the behaviour of the ESM.
+The three protrusions at the bottom of \in{Figure}[fig:esra-model] correspond to deadlocks in the model.
+These deadlocks are \quotation{error} states that are present in the ESM by design.
+According to the Oc\'{e} engineers, the sequences of inputs that are needed to drive the ESM into these deadlock states will always be followed by a system power reset.
+The protrusion at the top right of the figure corresponds to the initialisation phase of the ESM.
+This phase is performed only once, and thus only transitions from the initialisation cluster to the main body of states are present.
+
+\startplacefigure
+ [title={Final model of the ESM.},
+ reference=fig:esra-model]
+\externalfigure[ESM-manual.png][width=\textwidth]
+\stopplacefigure
+
+During the construction of the RRRT-based model, the ESM code was thoroughly inspected.
+This resulted in the discovery of missing behaviour in one transition of the ESM code.
+An Oc\'{e} software engineer confirmed that this behaviour is a (minor) bug, which will be fixed.
+We have verified the equivalence of the learned model and the RRRT-based model by using CADP \citep[conf/tacas/GaravelLMS11].
+
+
+\stopsubsection
+\stopsection
+\startsection
+ [title={Conclusions and Future Work},
+ reference=sec:future]
+
+Using an extension of the algorithm by \citet[LYa94] for adaptive distinguishing sequences, we succeeded in learning a Mealy machine model of a piece of widely used industrial control software.
+Our extension of Lee \& Yannakakis' algorithm is rather obvious, but nevertheless it appears to be new.
+Preliminary evidence suggests that it outperforms existing conformance testing algorithms.
+We are currently performing experiments in which we compare the new algorithm with other test algorithms on a number of realistic benchmarks.
+
+There are several possibilities for extending the ESM case study.
+To begin with, one could try to learn a model of the ESM with more than one function.
+Another interesting possibility would be to learn models of the EHM, ACM and other managers connected to the ESM.
Using these models, some of the properties discussed by \citet[Plo:05] could be verified at a more detailed level.
+We expect that the combination of LearnLib with the extended Lee \& Yannakakis algorithm can be applied to learn models of many other software components.
+
+In the specific case study described in this article, we know that our learning algorithm succeeded in learning the correct model, since we established equivalence with a reference model that was constructed independently from the RRRT model of the ESM software.
+In the absence of a reference model, we can never guarantee that the actual system behaviour conforms to a learned model.
+In order to deal with this problem, it is important to define metrics that quantify the difference (or distance) between a hypothesis and a correct model of the SUT, and to develop test generation algorithms that guarantee an upper bound on this difference.
+Preliminary work in this area is reported by \citet[SVVV14].
+
+
+\stopsection
+\startsubject
+ [title={Acknowledgements}]
+We thank Lou Somers for suggesting the ESM case study and for his support of our research.
+Fides Aarts and Harco Kuppens helped us with the use of LearnLib and CADP, and Jan Tretmans gave useful feedback.
+
+\stopsubject
+\referencesifcomponent
+\stopchapter
+\stopcomponent
\ No newline at end of file
diff --git a/content/learning-nominal-automata.tex b/content/learning-nominal-automata.tex
index 63e8cf6..36cbdd3 100644
--- a/content/learning-nominal-automata.tex
+++ b/content/learning-nominal-automata.tex
@@ -64,6 +64,7 @@ In \in{Section}[sec:nondet], we describe an algorithm to learn nominal non-deter
 \in{Section}[sec:impl] contains a description of NLambda, details of the implementation, and results of preliminary experiments.
 \in{Section}[sec:related] contains a discussion of related work.
 We conclude the paper with a discussion section where also future directions are presented.
+ \stopsection \startsection [title={Overview of the Approach}, @@ -83,7 +84,7 @@ The algorithm is able to fill the table with membership queries. As an example, and to set notation, consider the following table (over $A=\{a, b\}$). \starttikzpicture -\matrix[matrix of nodes](tab) +\matrix[matrix of nodes, column 1/.style={anchor=base west}](tab) { \phantom{$\epsilon$} & {$\epsilon$} & $a$ & $aa$ \\ $\epsilon$ & $0$ & $0$ & $1$ \\ @@ -132,8 +133,7 @@ It terminates when the answer is {\bf yes}, otherwise it extends the table with \REPEAT \WHILE{$(S, E)$ is not closed or not consistent\someline[line:checks]} \IF{$(S, E)$ is not closed\someline[line:begin-closed]} - \STATE find $s_1\in S$, $a \in A$ such that - \STATE \hfill $row(s_1a) \neq row(s)$ for all $s\in S$\someline[line:clos-witness] + \STATE find $s_1\in S$, $a \in A$ such that $row(s_1a) \neq row(s)$ for all $s\in S$\someline[line:clos-witness] \STATE $S\gets S\cup \{s_1a\}$ \someline[line:addrow-clos] \ENDIF\someline[line:end-closed] \IF{$(S, E)$ is not consistent\someline[line:begin-const]} @@ -169,10 +169,29 @@ We illustrate some relevant ones. \startstep{1} We start from $S, E = \{\epsilon\}$, and we fill the entries of the table below by asking membership queries for $\epsilon$, $a$ and $b$. The table is closed and consistent, so we construct the hypothesis $\autom_1$. 
-\todo{tikz \tt 134-155} + +\starttikzpicture +\matrix[matrix of nodes, column 1/.style={anchor=base west}](tab) +{ + \phantom{$\epsilon$} & {$\epsilon$} \\ + $\epsilon$ & $0$ \\ + $a$ & $0$ \\ + $b$ & $0$ \\ +}; +\draw (tab-1-1.south west) -| (tab-1-2.south east); +\draw (tab-2-1.south west) -| (tab-2-2.south east); +\draw (tab-1-1.north east) -| (tab-4-1.south east); +\stoptikzpicture +\starttikzpicture +\node[initial,state,initial text={$\autom_1 = $}] (q0) {$q_0$}; +\path (q0) edge[loop right] node[trlab,right] {$a,b$} (q0); +\stoptikzpicture +\startformula +q_0 = row(\epsilon) = \{ \epsilon \mapsto 0 \} +\stopformula The Teacher replies {\bf no} and gives the counterexample $aa$, which is in $\lang_1$ but it is not accepted by $\autom_1$. -Therefore, line 12 of the algorithm is triggered and we set $S \gets S \cup \{a, aa\}$. +Therefore, \inline{line}[line:addrow-ex] of the algorithm is triggered and we set $S = \{\epsilon, a, aa\}$. \stopstep \startstep{2} @@ -181,7 +200,58 @@ It is closed, but not consistent: Rows $\epsilon$ and $a$ are identical, but appending $a$ leads to different rows, as depicted. Therefore, line 9 is triggered and an extra column $a$, highlighted in red, is added. The new table is closed and consistent and a new hypothesis $\autom_2$ is constructed. 
-\todo{tikz \tt 163-219}
+
+\starttikzpicture
+\matrix[matrix of nodes, column 1/.style={anchor=base west}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} \\
+ $\epsilon$ & $0$ \\
+ $a$ & $0$ \\
+ $aa$ & $1$ \\
+ $b$ & $0$ \\
+ $ab$ & $0$ \\
+ $aaa$ & $0$ \\
+ $aab$ & $0$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-2.south east);
+\draw (tab-4-1.south west) -| (tab-4-2.south east);
+\draw (tab-1-2.north west) -| (tab-8-2.south west);
+
+\node[fit=(tab-2-1)(tab-2-2), dotted, draw, inner sep=-2pt] (r1) {};
+\node[fit=(tab-3-1)(tab-3-2), dotted, draw, inner sep=-2pt] (r2) {};
+\node[fit=(tab-4-1)(tab-4-2), dotted, draw, inner sep=-2pt] (r3) {};
+
+\path[->] (r1.west) edge[in=160,out=190,looseness=2] node[left] {$a$} (r2.west);
+\path[->] (r2.east) edge[in=20,out=-10,looseness=2] node[right] {$a$} (r3.east);
+\stoptikzpicture
+\starttikzpicture
+\matrix[matrix of nodes, column 1/.style={anchor=base west}, column 3/.style={nodes={color=red}}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} & $a$ \\
+ $\epsilon$ & $0$ & $0$ \\
+ $a$ & $0$ & $1$ \\
+ $aa$ & $1$ & $0$ \\
+ $b$ & $0$ & $0$ \\
+ $ab$ & $0$ & $0$ \\
+ $aaa$ & $0$ & $0$ \\
+ $aab$ & $0$ & $0$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-3.south east);
+\draw (tab-4-1.south west) -| (tab-4-3.south east);
+\draw (tab-1-2.north west) -| (tab-8-2.south west);
+\stoptikzpicture
+\starttikzpicture[node distance=2cm]
+\node[initial,state,initial text={$\autom_2 = $}] (q0) {$q_0$};
+\node[state,right of=q0] (q1) {$q_1$};
+\node[state,accepting,below of=q1] (q2) {$q_2$};
+
+\path[->]
+(q0) edge[loop below] node[trlab,below] {$b$} (q0)
+(q0) edge[bend right] node[trlab,above] {$a$} (q1)
+(q1) edge[bend right] node[trlab,above] {$b$} (q0)
+(q1) edge node[trlab,right] {$a$} (q2)
+(q2) edge node[trlab,midway,fill=white] {$a, b$} (q0);
+\stoptikzpicture
 The Teacher again replies {\bf no} and gives the counterexample $bb$, which should be accepted by $\autom_2$ but it is not.
 Therefore we put $S \gets S \cup \{b,bb\}$.
 \stopstep
@@ -191,16 +261,97 @@ The new table is the one on the left.
It is closed, but $\epsilon$ and $b$ violate consistency, when $b$ is appended.
 Therefore we add the column $b$ and we get the table on the right, which is closed and consistent.
 The new hypothesis is $\autom_3$.
-\todo{tikz \tt 228-296}
+
+\starttikzpicture
+\matrix[matrix of nodes, column 1/.style={anchor=base west}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} & $a$ \\
+ $\epsilon$ & $0$ & $0$ \\
+ $a$ & $0$ & $1$ \\
+ $aa$ & $1$ & $0$ \\
+ $b$ & $0$ & $0$ \\
+ $bb$ & $1$ & $0$ \\
+ $ab$ & $0$ & $0$ \\
+ $aaa$ & $0$ & $0$ \\
+ $aab$ & $0$ & $0$ \\
+ $ba$ & $0$ & $0$ \\
+ $bba$ & $0$ & $0$ \\
+ $bbb$ & $0$ & $0$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-3.south east);
+\draw (tab-6-1.south west) -| (tab-6-3.south east);
+\draw (tab-1-2.north west) -| (tab-12-2.south west);
+
+\node[fit=(tab-2-1)(tab-2-3), dotted, draw, inner sep=-2pt] (r1) {};
+\node[fit=(tab-5-1)(tab-5-3), dotted, draw, inner sep=-2pt] (r2) {};
+\node[fit=(tab-6-1)(tab-6-3), dotted, draw, inner sep=-2pt] (r3) {};
+
+\path[->] (r1.west) edge[in=160,out=190,looseness=2] node[left] {$b$} (r2.west);
+\path[->] (r2.east) edge[in=20,out=-10,looseness=2] node[right] {$b$} (r3.east);
+\stoptikzpicture
+\starttikzpicture
+\matrix[matrix of nodes, column 1/.style={anchor=base west}, column 4/.style={nodes={color=red}}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} & $a$ & $b$ \\
+ $\epsilon$ & $0$ & $0$ & $0$ \\
+ $a$ & $0$ & $1$ & $0$ \\
+ $aa$ & $1$ & $0$ & $0$ \\
+ $b$ & $0$ & $0$ & $1$ \\
+ $bb$ & $1$ & $0$ & $0$ \\
+ $ab$ & $0$ & $0$ & $0$ \\
+ $aaa$ & $0$ & $0$ & $0$ \\
+ $aab$ & $0$ & $0$ & $0$ \\
+ $ba$ & $0$ & $0$ & $0$ \\
+ $bba$ & $0$ & $0$ & $0$ \\
+ $bbb$ & $0$ & $0$ & $0$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-4.south east);
+\draw (tab-6-1.south west) -| (tab-6-4.south east);
+\draw (tab-1-2.north west) -| (tab-12-2.south west);
+\stoptikzpicture
+\starttikzpicture[node distance=2cm]
+\node[initial,state,initial text={$\autom_3 = $}] (q0) {$q_0$};
+\node[state,right of=q0] (q1) {$q_1$};
+\node[state,below of=q0] (q2) {$q_2$};
+\node[state,accepting,right of=q2] (q3) {$q_3$};
+
+\path[->]
+(q0) edge[bend left] node[trlab,above] {$a$} (q1)
+(q1)
edge[bend left] node[trlab,above] {$b$} (q0) +(q1) edge node[trlab,right] {$a$} (q3) +(q0) edge[bend right] node[trlab,left] {$b$} (q2) +(q2) edge[bend right] node[trlab,left] {$a$} (q0) +(q2) edge node[trlab,above] {$b$} (q3) +(q3) edge node[trlab,fill=white,midway] {$a,b$} (q0); +\stoptikzpicture The Teacher replies {\bf no} and provides the counterexample $babb$, so $S \gets S \cup \{ba,bab\}$. \stopstep \startstep{4} One more step brings us to the correct hypothesis $\autom_4$ (details are omitted). -\todo{tikz \tt 304-323} + +\starttikzpicture[node distance=2cm] +\node[initial,state,initial text={$\autom_4 = $}] (q0) {$q_0$}; +\node[state,above right= 15pt of q0] (q1) {$q_1$}; +\node[state,below right= 15pt of q0] (q2) {$q_2$}; +\node[state,accepting,right= 30pt of q0] (q3) {$q_3$}; +\node[state,right of=q3] (q4) {$q_4$}; + +\path[->] +(q0) edge node[trlab,above left] {$a$} (q1) +(q1) edge node[trlab,above right] {$a$} (q3) +(q0) edge node[trlab,below left] {$b$} (q2) +(q2) edge node[trlab,below right] {$b$} (q3) +(q1) edge[bend left] node[trlab,above] {$b$} (q4) +(q3) edge node[trlab,above] {$a,b$} (q4) +(q2) edge[bend right] node[trlab,below] {$a$} (q4) +(q4) edge[loop right] node[trlab,right] {$a,b$} (q4); +\stoptikzpicture + \stopstep + \stopsubsection \startsubsection [title={Learning Nominal Languages}, @@ -209,11 +360,46 @@ One more step brings us to the correct hypothesis $\autom_4$ (details are omitte Consider now an infinite alphabet $A = \{a,b,c,d,\dots\}$. The language $\lang_1$ becomes $\{aa,bb,cc,dd,\dots\}$. Classical theory of finite automata does not apply to this kind of languages, but one may draw an infinite deterministic automaton that recognizes $\lang_1$ in the standard sense: -\todo{tikz \tt 330-355} -where $\xrightarrow{A}$ and $\xrightarrow{\neq a}$ stand for the infinitely-many transitions labelled by elements of $A$ and $A \setminus \{a\}$, respectively. 
+\starttikzpicture[node distance=2cm] +\node[initial,state,initial text={$\autom_5 = $}] (q0) {$q_0$}; +\node[state, right of=q0] (qb) {$q_b$}; +\node[state, above=10 pt of qb] (qa) {$q_a$}; +\node[below=10pt of qb] (qdots) {$\vdots$}; +\node[state,accepting,right of=qb] (q3) {$q_3$}; +\node[state,right of=q3] (q4) {$q_4$}; + +\path[->] +(q0) edge node[trlab,above left] {$a$} (qa) +(qa) edge node[trlab,above right] {$a$} (q3) +(q0) edge node[trlab,above] {$b$} (qb) +(qb) edge node[trlab,above] {$b$} (q3) +(q0) edge node[trlab] {} (qdots) +(qdots) edge node[trlab] {} (q3) +(qa) edge[bend left] node[trlab,above] {${\neq} a$} (q4) +(q3) edge node[trlab,above] {$A$} (q4) +(qb) edge[bend right] node[trlab,below] {${\neq} b$} (q4) +(qdots) edge[bend right] node[trlab] {} (q4) +(q4) edge[loop right] node[trlab,right] {$A$} (q4); +\stoptikzpicture + +where $\tot{A}$ and $\tot{{\neq} a}$ stand for the infinitely-many transitions labelled by elements of $A$ and $A \setminus \{a\}$, respectively. This automaton is infinite, but it can be finitely presented in a variety of ways, for example: -\todo{tikz \tt 361-378} + +\starttikzpicture[node distance=2cm] +\node[initial,state,initial text={}] (q0) {$q_0$}; +\node[state, right of=q0] (qx) {$q_x$}; +\node[state,accepting,right of=qx] (q3) {$q_3$}; +\node[state,right of=q3] (q4) {$q_4$}; +\node[above = 0pt of qx] (qa) {\scriptsize{$\forall x \in A$}}; + +\path[->] +(q0) edge node[trlab,above] {$x$} (qx) +(qx) edge node[trlab,above] {$x$} (q3) +(q3) edge node[trlab,above] {$A$} (q4) +(qx) edge[bend right] node[trlab,below] {${\neq} x$} (q4) +(q4) edge[loop right] node[trlab,right] {$A$} (q4); +\stoptikzpicture One can formalize the quantifier notation above (or indeed the \quotation{dots} notation above that) in several ways. 
A popular solution is to consider finite {\em register automata} \cite[DBLP:journals/tcs/KaminskiF94, DBLP:journals/tocl/DemriL09], i.e., finite automata equipped with a finite number of registers where alphabet letters can be stored and later compared for equality. @@ -221,7 +407,7 @@ Our language $\lang_1$ is recognized by a simple automaton with four states and The problem of learning registered automata has been successfully attacked before \cite[DBLP:conf/vmcai/HowarSJC12]. -In this paper, however, we will consider nominal automata \cite[DBLP:journals/corr/BojanczykKL14] instead. +In this paper, however, we will consider nominal automata \citep[DBLP:journals/corr/BojanczykKL14] instead. These automata ostensibly have infinitely many states, but the set of states can be finitely presented in a way open to effective manipulation. More specifically, in a nominal automaton the set of states is subject to an action of permutations of a set $\EAlph$ of {\em atoms}, and it is finite up to that action. For example, the set of states of $\autom_5$ is: @@ -233,7 +419,7 @@ Technically speaking, the set of states has four {\em orbits} (one infinite orbi % Moreover, it is required that in a nominal automaton the transition relation is {\em equivariant}, i.e., closed under the action of permutations. The automaton $\autom_5$ has this property: -For example, it has a transition $q_a\stackrel{a}{\longrightarrow} q_3$, and for any $\pi \colon \EAlph \to \EAlph$ there is also a transition $\pi(q_a) = q_{\pi(a)}\stackrel{\pi(a)}{\longrightarrow} q_3 = \pi(q_3)$. +For example, it has a transition $q_a \tot{a} q_3$, and for any $\pi \colon \EAlph \to \EAlph$ there is also a transition $\pi(q_a) = q_{\pi(a)} \tot{\pi(a)} q_3 = \pi(q_3)$. 
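The equivariance condition just illustrated can be checked mechanically on a symbolic encoding of $\autom_5$'s transition function. The sketch below (the tuple encoding of states and the small finite sample of atoms and permutations are our own illustration, not part of any nominal library) verifies that applying a permutation to a transition yields another transition:

```python
def delta(state, letter):
    """Transition function of the automaton with states q0, one state
    q_a per atom a, an accepting state q3 and a sink q4."""
    if state == ('q0',):
        return ('q', letter)          # q0 --x--> q_x
    if state[0] == 'q':
        # q_x --x--> q3, and q_x --y--> q4 for every y != x
        return ('q3',) if letter == state[1] else ('q4',)
    return ('q4',)                    # q3 and q4 go to q4 on every letter

def act(pi, state):
    """Apply an atom permutation (given as a dict) to a state."""
    return ('q', pi[state[1]]) if state[0] == 'q' else state

# Equivariance: pi(delta(s, l)) == delta(pi(s), pi(l)) on a sample.
pi = {'a': 'b', 'b': 'a', 'c': 'c'}
for s in [('q0',), ('q', 'a'), ('q3',), ('q4',)]:
    for l in 'abc':
        assert act(pi, delta(s, l)) == delta(act(pi, s), pi[l])
```

Because the transition function only compares the input letter with the atom stored in the state, it is automatically closed under permutations, which is exactly the equivariance property of the text.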
Nominal automata with finitely many orbits of states are equi-expressive with finite register automata \cite[DBLP:journals/corr/BojanczykKL14], but they have an important theoretical advantage: They are a direct reformulation of the classical notion of finite automaton, where one replaces finite sets with orbit-finite sets and functions (or relations) with equivariant ones. @@ -270,7 +456,22 @@ We start from $S, E = \{ \epsilon \}$. We have $S \Lext A = A$, which is infinite but admits a finite representation. In fact, for any $a \in A$, we have $A = \{ \pi(a) \mid \text{$\pi$ is a permutation}\}$. Then, by {\bf (P2)}, $row(\pi(a))(\epsilon) = row(a)(\epsilon) = 0$, for all $\pi$, so the first table can be written as: -\todo{tikz \tt 411-434} + +\starttikzpicture +\matrix[matrix of nodes,column 1/.style={anchor=base west}](tab) +{ + \phantom{$\epsilon$} & {$\epsilon$} \\ + $\epsilon$ & $0$ \\ + $a$ & $0$ \\ +}; +\draw (tab-1-1.south west) -| (tab-1-2.south east); +\draw (tab-2-1.south west) -| (tab-2-2.south east); +\draw (tab-1-1.north east) -| (tab-3-1.south east); +\stoptikzpicture +\starttikzpicture +\node[initial,state,initial text={$\autom'_1 = $}] (q0) {$q_0$}; +\path (q0) edge[loop right] node[trlab,right] {$A$} (q0); +\stoptikzpicture It is closed and consistent. Our hypothesis is $\autom_1'$, where $\delta_{\autom_1'}(row(\epsilon),x) = row(x) = q_0$, for all $x \in A$. @@ -281,7 +482,22 @@ As in {\bf Step 1}, the Teacher replies with the counterexample $aa$. By equivariance of $\lang_1$, the counterexample tells us that \emph{all} words of length 2 with two repeated letters are accepted. Therefore we extend $S$ with the (infinite!) set of such words. 
The new symbolic table is:
-\todo{tikz \tt 440-452}
+
+\starttikzpicture
+\matrix[matrix of nodes,column 1/.style={anchor=base west}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} \\
+ $\epsilon$ & $0$ \\
+ $a$ & $0$ \\
+ $aa$ & $1$ \\
+ $ab$ & $0$ \\
+ $aaa$ & $0$ \\
+ $aab$ & $0$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-2.south east);
+\draw (tab-4-1.south west) -| (tab-4-2.south east);
+\draw (tab-1-2.north west) -| (tab-7-2.south west);
+\stoptikzpicture
 
 The lower part stands for elements of $S \Lext A$.
 For instance, $ab$ stands for words obtained by appending a fresh letter to words of length 1 (row $a$).
@@ -293,8 +509,23 @@ Again, by {\bf (P2)}, it is enough to consider rows of the table above.
 Consistency is violated, because $row(a) \neq row(aa)$.
 We found a \quotation{symbolic} witness $a$ for such violation.
 In order to fix consistency, while keeping $E$ equivariant, we need to add columns for all $\pi(a)$.
-The resulting table is
+The resulting table is
+
+\starttikzpicture
+\matrix[matrix of nodes,column 1/.style={anchor=base west}](tab)
+{
+ \phantom{$\epsilon$} & {$\epsilon$} & $a$ & $b$ & $c$ & $\dots$ \\
+ $\epsilon$ & $0$ & $0$ & $0$ & $0$ & $\dots$ \\
+ $a$ & $0$ & $1$ & $0$ & $0$ & $\dots$ \\
+ $aa$ & $1$ & $0$ & $0$ & $0$ & $\dots$ \\
+ $ab$ & $0$ & $0$ & $0$ & $0$ & $\dots$ \\
+ $aaa$ & $0$ & $0$ & $0$ & $0$ & $\dots$ \\
+ $aab$ & $0$ & $0$ & $0$ & $0$ & $\dots$ \\
+};
+\draw (tab-1-1.south west) -| (tab-1-6.south east);
+\draw (tab-4-1.south west) -| (tab-4-6.south east);
+\draw (tab-1-2.north west) -| (tab-7-2.south west);
+\stoptikzpicture
 where non-specified entries are $0$.
 Only finitely many entries of the table are relevant:
@@ -304,7 +535,19 @@ The table is trivially consistent.
 Notice that this step encompasses both {\bf Step 2} and {\bf 3}, because the rows $b$ and $bb$ added by {\bf Step 2} are already represented by $a$ and $aa$.
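The reason the rows $b$ and $bb$ are already represented by $a$ and $aa$ is that, for the equality symmetry, two words lie in the same orbit of the permutation action precisely when their equality patterns agree. A canonical form makes this executable; a minimal sketch (the encoding of words as plain strings is our own illustration):

```python
def canonical(word):
    """Replace each letter by the index of its first occurrence.
    Two words over the infinite alphabet lie in the same orbit under
    atom permutations iff their canonical forms are equal."""
    first = {}
    return tuple(first.setdefault(letter, len(first)) for letter in word)

def same_orbit(u, v):
    return canonical(u) == canonical(v)
```

For instance `same_orbit('bb', 'aa')` holds while `same_orbit('ab', 'aa')` does not, so the orbit-finite observation table only needs one representative row per equality pattern.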
The hypothesis automaton is -\todo{tikz \tt 475-488} + +\starttikzpicture[node distance=2cm] +\node[initial,state,initial text={$\autom'_2 = $}] (q0) {$q_0$}; +\node[state,right of=q0] (qx) {$q_x$}; +\node[state,accepting,right of=qx] (q2) {$q_2$}; +\node[right of=q2] (dummy) {$\forall x \in A$}; + +\path[->] +(q0) edge[bend left] node[trlab,above] {$x$} (qx) +(qx) edge[bend left] node[trlab,above] {$\neq x$} (q0) +(qx) edge node[trlab,above] {$x$} (q2) +(q2) edge[bend right=50] node[trlab,below] {$A$} (q0); +\stoptikzpicture This is again incorrect, but one additional step will give the correct hypothesis automaton, shown earlier in \in{?}[eq:aut]). \stopstep @@ -451,14 +694,15 @@ In our algorithm, we will assume a teacher as described at the start of \in{Sect In particular, the teacher is able to answer membership queries and equivalence queries, now in the setting of nominal languages. We fix a target language $\lang$, which is assumed to be a nominal regular language. -The learning algorithm for nominal automata, \nLStar{}, will be very similar to \LStar{} in \in{Figure}[fig:alg]. +The learning algorithm for nominal automata, \nLStar{}, will be very similar to \LStar{} in \in{Algorithm}[alg:alg]. 
In fact, we only change the following lines:
 \placeformula[eq:changes]
-\startformula
-6' S \gets S \cup \orb(sa) \\
-9' E \gets E \cup \orb(ae) \\
-12' S \gets S \cup \pref(\orb(t))
-\stopformula
+\startformula[interlinespace=small]\startmathmatrix[n=2, align={middle, left}, distance=1cm]
+\NC \text{\color[darkgray]{7'}} \NC S \gets S \cup \orb(sa) \NR
+\NC \text{\color[darkgray]{11'}} \NC E \gets E \cup \orb(ae) \NR
+\NC \text{\color[darkgray]{16'}} \NC S \gets S \cup \pref(\orb(t)) \NR
+\stopmathmatrix\stopformula
+
 The basic data structure is an \emph{observation table} $(S, E, T)$ where $S$ and $E$ are orbit-finite subsets of $A^{\ast}$ and $T \colon S \cup S \Lext A \times E \to 2$ is an equivariant function defined by $T(se) = \lang(se)$ for each $s \in S \cup S \Lext A$ and $e \in E$.
 Since $T$ is determined by $\lang$ we omit it from the notation.
 Let $row \colon S \cup S \Lext A \to 2^E$ denote the curried counterpart of $T$.
@@ -1300,8 +1544,8 @@ For an implementation of automata learning over other kinds of atoms without com
 The efficiency of our current implementation, as measured in \in{Section}[sec:tests], leaves much to be desired.
 There is plenty of potential for running time optimization, ranging from improvements in the learning algorithms itself, to optimizations in the NLambda library (such as replacing the external and general-purpose SMT solver with a purpose-built, internal one, or a tighter integration of nominal mechanisms with the underlying Haskell language as it was done by \cite[authoryears][DBLP:journals/entcs/Shinwell06]), to giving up the functional programming paradigm for an imperative language such as LOIS \cite[DBLP:conf/cade/KopczynskiT16, DBLP:conf/popl/KopczynskiT17].
 
-\stopsection
+\stopsection
 
 \startsubject
  [title={Acknowledgements}]
diff --git a/environment.tex b/environment.tex
index 5e5b227..0d62855 100644
--- a/environment.tex
+++ b/environment.tex
@@ -3,6 +3,8 @@ % Bug in context?
Zonder de ../ kunnen componenten de files niet vinden \usepath[{environment,../environment}] +\setupexternalfigures[directory={images,../images}] + \definemode[draft][yes] \environment bib diff --git a/environment/bib.tex b/environment/bib.tex index 66946e6..533ff6e 100644 --- a/environment/bib.tex +++ b/environment/bib.tex @@ -13,12 +13,10 @@ \btxremapauthor [Slawomir Lasota] [S\l{}awomir Lasota] \btxremapauthor [Eryk Kopczynski] [Eryk Kopczy\'nski] -%\define[1]\citenp{\cite[left=,right=][#1]} % Bla and Foo, 2015 -%\define[1]\citet{\cite[authoryears][#1]} % Bla and Foo (2015) -%\define[1]\citep{\cite[authoryear][#1]} % (Bla and Foo, 2015) +\def\citenp[#1]{\cite[left=,right=][#1]} % Bla and Foo, 2015 +\def\citet[#1]{\cite[authoryears][#1]} % Bla and Foo (2015) +\def\citep[#1]{\cite[authoryear][#1]} % (Bla and Foo, 2015) -%\cite[authoryear] -> (Bla, Foo & Baz, 2015) -%\cite[authoryears] -> Bla, Foo & Baz (2015) %\cite[lefttext={See },righttext={ p.\nbsp yy}][book] -> (See Bla, Foo & Baz, 2015, p. yy) \stopenvironment diff --git a/environment/font.tex b/environment/font.tex index 4500b69..720656f 100644 --- a/environment/font.tex +++ b/environment/font.tex @@ -18,6 +18,7 @@ % (Neo) Euler heeft niet alle symbolen, maar Pagella wel \definefallbackfamily[mainfamily] [mm] [TeX Gyre Pagella Math] [range=0x0266D-0x0266F,rscale=0.95] % sharp and flat \definefallbackfamily[mainfamily] [mm] [TeX Gyre Pagella Math] [range={0x025A1,0x025B3}] % triangle, square +\definefallbackfamily[mainfamily] [mm] [TeX Gyre Pagella Math] [range={0x022EE}] % vdots % Neo Euler voor wiskunde. Merk op dat dit niet cursief is! 
\definefontfamily[mainfamily] [mm] [Neo Euler] [rscale=0.95]
 % \definefontfamily[mainfamily] [mm] [Latin Modern Math]
diff --git a/environment/notation.tex b/environment/notation.tex
index 09a5695..f89d571 100644
--- a/environment/notation.tex
+++ b/environment/notation.tex
@@ -1,5 +1,10 @@
 \startenvironment notation
 
+% Arrows with text above them
+\definemathstackers [arrowstack] [voffset=-.7\mathexheight, topcommand=\m]
+\definemathextensible [arrowstack] [tot] ["27F6]
+
+
 \define[1]\Fam{\mathcal{#1}}
 \define\lang{\mathcal{L}}
 \define\autom{\mathcal{A}}
@@ -8,6 +13,7 @@
 \define\pref{pref}
 \define\suff{suff}
 
+% Or perhaps \ss after all?
 \define[1]\kw{\text{\tt #1}}
 
 % L* algorithms
@@ -22,6 +28,9 @@
 \define\NLambda{{\sc Nλ}}
 \define\LOIS{{\sc Lois}}
 
+% Libraries
+\define\cpp{{C++}}
+
 % Maths
 \define\N{\naturalnumbers}
 \define\Q{\rationals}
diff --git a/images/ESM-manual.png b/images/ESM-manual.png
new file mode 100644
index 0000000..b434a45
Binary files /dev/null and b/images/ESM-manual.png differ
diff --git a/images/ESMProtocols.pdf b/images/ESMProtocols.pdf
new file mode 100644
index 0000000..643d564
Binary files /dev/null and b/images/ESMProtocols.pdf differ
diff --git a/images/ESMStates.pdf b/images/ESMStates.pdf
new file mode 100644
index 0000000..0c9b64f
Binary files /dev/null and b/images/ESMStates.pdf differ
diff --git a/images/ESRAOverview.pdf b/images/ESRAOverview.pdf
new file mode 100644
index 0000000..330e9f4
Binary files /dev/null and b/images/ESRAOverview.pdf differ
diff --git a/images/Learner.pdf b/images/Learner.pdf
new file mode 100644
index 0000000..94c6bea
Binary files /dev/null and b/images/Learner.pdf differ
diff --git a/images/formats.pdf b/images/formats.pdf
new file mode 100644
index 0000000..6e201a2
Binary files /dev/null and b/images/formats.pdf differ
diff --git a/images/hyp_20_partial_ds.pdf b/images/hyp_20_partial_ds.pdf
new file mode 100644
index 0000000..0daf1dd
Binary files /dev/null and b/images/hyp_20_partial_ds.pdf differ
diff --git
a/images/model-example1.pdf b/images/model-example1.pdf new file mode 100644 index 0000000..356ba15 Binary files /dev/null and b/images/model-example1.pdf differ diff --git a/images/model-example2.pdf b/images/model-example2.pdf new file mode 100644 index 0000000..cea3cf0 Binary files /dev/null and b/images/model-example2.pdf differ