diff --git a/content/introduction.tex b/content/introduction.tex index 7b53627..6b45dc1 100644 --- a/content/introduction.tex +++ b/content/introduction.tex @@ -55,13 +55,13 @@ So it makes sense to study a learning paradigm which allows for \emph{queries}, A typical query learning framework was established by \citet[Angluin87]. In her framework, the learning algorithm may pose two types of queries to a \emph{teacher}, or \emph{oracle}: -\description{Membership queries (MQs)} +\description{Membership queries (MQ)} The learner poses such a query by providing a word $w \in \Sigma^{*}$ to the teacher. The teacher will then reply whether $w \in \lang$ or not. This type of query is often generalised to more general output, in these cases we consider $\lang \colon \Sigma^\ast \to O$ and the teacher replies with $\lang(w)$. In some papers, such a query is then called an \emph{output query}. -\description{Equivalence queries (EQs)} +\description{Equivalence queries (EQ)} The learner can provide a hypothesised description $H$ of $\lang$ to the teacher. If the hypothesis is correct, the teacher replies with \kw{yes}. If, however, the hypothesis is incorrect, the teacher replies with \kw{no} together with a counterexample, i.e., a word which is in $\lang$ but not in the hypothesis or vice versa. @@ -78,9 +78,9 @@ Another paradigm which is relevant for our type of applications is \emph{PAC-lea Here, the algorithm can again use MQs as before, but the EQs are replaced by random sampling. So the allowed query is: -\description{Random sample queries (EXs)} -If the learner poses this query (there are no parameters), the teacher responds with a random word $w$ together with its label (i.e., whether $w \in \lang$ or not). -\todo{Wat is random? Kansverdeling introduceren?} +\description{Random sample queries (EX)} +If the learner poses this query (there are no parameters), the teacher responds with a random word $w$ together with its label, i.e., whether $w \in \lang$ or not. 
+(Here, \emph{random} means that the words are sampled according to some probability distribution known to the teacher.) Instead of requiring that the learner exactly learns the model, we only require the following. The learner should \emph{probably} return a model which is \emph{approximate} to the target. @@ -95,7 +95,6 @@ But polynomial in what? The learner has no input, other than the access to a teacher. We ask the algorithms to be polynomial in the \emph{size of the target} (i.e., the size of the description which has yet to be learned). In the case of PAC learning we also require it to be polynomial in the two parameters for confidence. -\todo{Afhankelijk van de verdeling} Deterministic automata can be efficiently learned in the PAC model. In fact, any efficient exact learning algorithm with MQs and EQs can be transformed into an efficient PAC algorithm with MQs (see \citenp[KearnsV94], exercise 8.1). @@ -120,12 +119,10 @@ There are plentiful of challenges to solve, such as timing, choosing your alphab Equivalence queries, however, are in general not doable. Even if we have the (machine) code, it is often way too complicated to check equivalence. That is why we resort to testing with EX queries. -The EX query from PAC learning normally assumes a fixed, unknown distribution. -\todo{Prob. distribution on words} +The EX query from PAC learning normally assumes a fixed, unknown probability distribution on words. In our case, we choose and implement a distribution to test against. -\todo{Concreter: that we use for membership queries.} This cuts both ways: -On one hand, it allows us to only test behaviour we really care about, on the other hand the results are only as good as our choice of distribution. +On the one hand, it allows us to test only the behaviour we really care about; on the other hand, the results are only as good as our choice of distribution. We deviate even further from the PAC-model as we sometimes change our distribution while learning. 
Yet, as applications show, this is a useful way of learning software behaviour. @@ -173,18 +170,15 @@ This gave developers opportunities to solve problems before replacing the old co [title={Research challenges}] In this thesis, we will mostly see learning of deterministic automata or Mealy machines. -Although this is restrictive as many pieces of software require richer models, it has been successfully applied in the above examples. -The restrictions include the following. -\todo{Limitations / assumptions} +Although this is limited, as many pieces of software require richer models, it has been successfully applied in the above examples. +The limitations include the following. \startitemize[after] \item The system behaves deterministically. \item One can reliably reset the system. \item The system can be modelled with a finite state space. This also means that the model does not incorporate time or data. -\item The set of input actions is finite. -\todo{Alphabet} +\item The input alphabet is finite. \item One knows when the target is reached. -\todo{Weten we wel mbt kansverdeling} \stopitemize \description{Research challenge 1: Approximating equivalence queries.} @@ -194,7 +188,6 @@ For example, we may require the hypothesis to be correct, provided that the real Efficiency is important here: We want to obtain those guarantees fast and we want to quickly find counterexamples when the hypothesis is wrong. Test generation methods is the topic of the first part in this thesis. -\todo{to test if hypothesis is correct} We will review existing algorithms and discuss new algorithms for test generation. \startdescription[title={Research challenge 2: Generalisation to infinite alphabets.}] @@ -235,13 +228,6 @@ In particular, the set of experiments is polynomial in the number of states. These techniques will be discussed in detail in \in{Chapter}[chap:test-methods]. 
More background and other related problems, as well as their complexity results, are well exposed in a survey of \citet[LeeY94]. -To give an example of conformance checking, we model a record player as a finite state machine. -We will not model the audible output -- that would depend not only on the device, but also the record one chooses to play\footnote{In particular, we have to add time to the model as one side of a record only lasts for roughly 25 minutes. Unless we take a record with sound on the locked groove such as the \emph{Sgt. Pepper's Lonely Hearts Club Band} album by \emph{The Beatles}.}. -Instead, the only observation we can make is looking how fast the turntable spins. -The device has two buttons: a start-stop button (\playstop) and a speed button (\spin) which toggles between $33 \frac{1}{3}$ rpm and $45$ rpm. -When turned on, the system starts playing immediately at $33 \frac{1}{3}$ rpm -- this is useful for DJing. -The intended behaviour of the record player has four states as depicted in \in{Figure}[fig:record-player]. - \startplacefigure [title={Behaviour of a record player modelled as a finite state machine.}, reference=fig:record-player] @@ -263,6 +249,13 @@ The intended behaviour of the record player has four states as depicted in \in{F \stoptikzpicture} \stopplacefigure +To give an example of conformance checking, we model a record player as a finite state machine. +We will not model the audible output -- that would depend not only on the device, but also the record one chooses to play\footnote{In particular, we have to add time to the model as one side of a record only lasts for roughly 25 minutes. Unless we take a record with sound on the locked groove such as the \emph{Sgt. Pepper's Lonely Hearts Club Band} album by \emph{The Beatles}.}. +Instead, the only observation we can make is how fast the turntable spins. 
+The device has two buttons: a start-stop button (\playstop) and a speed button (\spin) which toggles between $33 \frac{1}{3}$ rpm and $45$ rpm. +When turned on, the system starts playing immediately at $33 \frac{1}{3}$ rpm -- this is useful for DJing. +The intended behaviour of the record player has four states as depicted in \in{Figure}[fig:record-player]. + Let us consider some faults which could be present in an implementation with four states. In \in{Figure}[fig:faulty-record-players], two flawed record players are given. In the first (\in{Figure}{a}[fig:faulty-record-players]), the sequence \playstop\,\spin\,\spin\, leads us to the wrong state. @@ -608,8 +601,6 @@ This work was presented at ICGI 2018: \startsection [title=Conclusion and Outlook] -\todo{Loopt een beetje door elkaar} - With the current tools for model learning, it is possible to learn big state machines of black box systems. It involves using the clever algorithms for learning (such as the TTT algorithm by \citenp[Isberner15]) and efficient testing methods (see \in{Chapter}[chap:test-methods]). However, as the industrial case study from \in{Chapter}[chap:applying-automata-learning] shows, the bottleneck is often in conformance testing. @@ -621,9 +612,10 @@ A question for future research is how this additional information can be integra Black box testing still has theoretical challenges. Current generalisations to non-deterministic systems or language inclusion (such as black box testing for IOCO) often need exponentially big test suites. -Whether this is necessary is unknown (to me), we only have upper bounds but no lower bounds. +Whether this is necessary is unknown (to me): +we only have upper bounds but no lower bounds. An interesting approach could be to see if there exists a notion of reduction between test suites. -This is analogous to the reductions used complexity theory to prove problems hard, or reductions used in PAC theory to prove learning problems to be inherently unpredictable. 
+This is analogous to the reductions used in complexity theory to prove hardness of problems, or reductions used in PAC theory to prove learning problems to be inherently unpredictable. Another path taken in this thesis is the research on nominal automata. This was motivated by the problem of learning automata over infinite alphabets. @@ -638,7 +630,7 @@ The nominal learning algorithms can be implemented in just a few hundreds lines In this thesis, we tackle some efficiency issues when computing with nominal sets. In \in{Chapter}[chap:ordered-nominal-sets] we characterise orbits in order to give an efficient representation (for the total-order symmetry). Another result is the fact that some nominal automata can be \quote{compressed} to separated automata, which can be exponentially smaller (\in{Chapter}[chap:separated-nominal-automata]). -However, the nominal tools still leave much to desired in terms of efficiency. +However, the nominal tools still leave much to be desired in terms of efficiency. Last, it would be interesting to marry the two paths taken in this thesis. I am not aware of $n$-complete test suites for register automata or nominal automata. diff --git a/content/separated-nominal-automata.tex b/content/separated-nominal-automata.tex index aed6e2a..af33b7b 100644 --- a/content/separated-nominal-automata.tex +++ b/content/separated-nominal-automata.tex @@ -982,7 +982,7 @@ If we know a nominal automaton recognises an $\sb$-language, then we are better From the $\sb$-semantics of separated automata, it follows that we have a Myhill-Nerode theorem, which means that learning is feasible. We expect that this can be useful, since we can achieve an exponential reduction this way. -Boja\'{n}czyk et al. prove that nominal automata are equivalent to register automata in terms of expressiveness \citep[BojanczykKL14]. +\citet[BojanczykKL14] prove that nominal automata are equivalent to register automata in terms of expressiveness. 
However, when translating from register automata with $n$ states to nominal automata, we may get exponentially many orbits. This happens for instance in the FIFO automaton (\in{Example}[ex:fifo]). We have shown that the exponential blow-up is avoidable by using separated automata, for this example and in general for $\sb$-equivariant languages. @@ -993,9 +993,6 @@ We believe that this is the hypothesised \quotation{substitution monad} from \in The monad is monoidal (sending separated products to Cartesian products) and if $X$ is an orbit-finite nominal set, then so is $T(X)$. This means that we can consider nominal $T$-automata and we can perhaps determinise them using coalgebraic methods \citep[SilvaBBR13]. -%Super-separated automata $X \to O \sepprod (\Sigma \wandto X)$. -%\todo{} - \stopsection \startsubject[title={Acknowledgements}]
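As a side note on the EX-style testing these patches describe: the record player from the introduction makes a handy toy example. Below is a minimal Python sketch, not from the thesis, that models the intended machine and a hypothetical faulty implementation, then searches for a counterexample by sampling random input words -- playing the role of the EX oracle whose distribution we choose ourselves. The exact transition structure and the particular fault are my own assumptions based on the prose description (four states, start-stop and speed buttons, starts playing at 33 1/3 rpm).

```python
import random

# State is (playing: bool, speed: int); after each button press we observe the
# turntable speed: the current rpm while playing, 0 while stopped.
# Transitions are assumed from the description, not taken from the thesis model.
PLAYSTOP, SPIN = "playstop", "spin"

def correct_step(state, button):
    playing, speed = state
    if button == PLAYSTOP:
        return (not playing, speed)      # start-stop toggles playing
    return (playing, 78 - speed)         # 33 + 45 = 78, so this toggles the speed

def faulty_step(state, button):
    # Hypothetical fault: the speed button is ignored while the player is stopped.
    playing, speed = state
    if not playing and button == SPIN:
        return state
    return correct_step(state, button)

def observe(state):
    playing, speed = state
    return speed if playing else 0

def run(step, word, start=(True, 33)):   # "starts playing immediately at 33 rpm"
    state, trace = start, []
    for button in word:
        state = step(state, button)
        trace.append(observe(state))
    return trace

def ex_query(rng, max_len=6):
    """Sample a random input word -- our stand-in for the EX oracle's distribution."""
    length = rng.randint(1, max_len)
    return [rng.choice([PLAYSTOP, SPIN]) for _ in range(length)]

def find_counterexample(step_impl, step_spec, samples=1000, seed=1):
    rng = random.Random(seed)
    for _ in range(samples):
        w = ex_query(rng)
        if run(step_impl, w) != run(step_spec, w):
            return w
    return None

cex = find_counterexample(faulty_step, correct_step)
```

Note that the fault is latent: while stopped, both machines output 0 rpm, so a counterexample must press \spin{} while stopped and then \playstop{} again to make the wrong speed observable. This illustrates the point made in the introduction that the guarantees are only as good as the chosen distribution -- a distribution over very short words would rarely expose this fault.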