Final typos etc
This commit is contained in:
parent a312dc4aed
commit 26e2be44d2
2 changed files with 22 additions and 33 deletions
@@ -55,13 +55,13 @@ So it makes sense to study a learning paradigm which allows for \emph{queries},
A typical query learning framework was established by \citet[Angluin87].
In her framework, the learning algorithm may pose two types of queries to a \emph{teacher}, or \emph{oracle}:

-\description{Membership queries (MQs)}
+\description{Membership queries (MQ)}
The learner poses such a query by providing a word $w \in \Sigma^{*}$ to the teacher.
The teacher will then reply whether $w \in \lang$ or not.
This type of query is often generalised to more general output, in these cases we consider $\lang \colon \Sigma^\ast \to O$ and the teacher replies with $\lang(w)$.
In some papers, such a query is then called an \emph{output query}.

-\description{Equivalence queries (EQs)}
+\description{Equivalence queries (EQ)}
The learner can provide a hypothesised description $H$ of $\lang$ to the teacher.
If the hypothesis is correct, the teacher replies with \kw{yes}.
If, however, the hypothesis is incorrect, the teacher replies with \kw{no} together with a counterexample, i.e., a word which is in $\lang$ but not in the hypothesis or vice versa.
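Pinned down in code, the two query types amount to a small teacher interface. The following is an illustrative sketch (not from the thesis): the target language, words over {a, b} with an even number of a's, and the brute-force bounded search inside the equivalence query are assumptions of the example.

```python
from itertools import product

# Toy target language: words over {a, b} with an even number of a's.
def target(word):
    return word.count("a") % 2 == 0

class Teacher:
    def membership_query(self, word):
        # MQ: reply whether the given word is in the target language.
        return target(word)

    def equivalence_query(self, hypothesis, max_len=6):
        # EQ: answer "yes", or "no" together with a counterexample.
        # A real teacher would compare automata directly; this sketch
        # simply enumerates all short words.
        for n in range(max_len + 1):
            for letters in product("ab", repeat=n):
                w = "".join(letters)
                if hypothesis(w) != target(w):
                    return False, w      # "no", with counterexample
        return True, None                # "yes"

teacher = Teacher()
correct, cex = teacher.equivalence_query(lambda w: True)
```

The all-accepting hypothesis is wrong, so the equivalence query returns a counterexample (here the word "a", the first word on which the two disagree).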
@@ -78,9 +78,9 @@ Another paradigm which is relevant for our type of applications is \emph{PAC-lea
Here, the algorithm can again use MQs as before, but the EQs are replaced by random sampling.
So the allowed query is:

-\description{Random sample queries (EXs)}
+\description{Random sample queries (EX)}
-If the learner poses this query (there are no parameters), the teacher responds with a random word $w$ together with its label (i.e., whether $w \in \lang$ or not).
+If the learner poses this query (there are no parameters), the teacher responds with a random word $w$ together with its label, i.e., whether $w \in \lang$ or not.
-\todo{What is random? Introduce a probability distribution?}
+(Here, \emph{random} means that the words are sampled by some probability distribution known to the teacher.)
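A matching sketch of the EX query (again illustrative, not from the thesis; the distribution, a uniformly chosen length followed by uniform letters, is an assumption of the example):

```python
import random

# Same toy target as before: an even number of a's.
def target(word):
    return word.count("a") % 2 == 0

def ex_query(rng, max_len=8):
    # EX: the teacher samples a word from its own distribution
    # (here: uniform length, then uniform letters) and labels it.
    n = rng.randrange(max_len + 1)
    w = "".join(rng.choice("ab") for _ in range(n))
    return w, target(w)

rng = random.Random(0)
samples = [ex_query(rng) for _ in range(5)]  # five labelled random words
```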

Instead of requiring that the learner exactly learns the model, we only require the following.
The learner should \emph{probably} return a model which is \emph{approximate} to the target.

@@ -95,7 +95,6 @@ But polynomial in what?
The learner has no input, other than the access to a teacher.
We ask the algorithms to be polynomial in the \emph{size of the target} (i.e., the size of the description which has yet to be learned).
In the case of PAC learning we also require it to be polynomial in the two parameters for confidence.
-\todo{Depending on the distribution}

Deterministic automata can be efficiently learned in the PAC model.
In fact, any efficient exact learning algorithm with MQs and EQs can be transformed into an efficient PAC algorithm with MQs (see \citenp[KearnsV94], exercise 8.1).
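The idea behind that transformation can be sketched as follows (hypothetical code, not taken from \citenp[KearnsV94]): simulate the $i$-th EQ by $m_i = \lceil \frac{1}{\epsilon} (\ln \frac{1}{\delta} + i \ln 2) \rceil$ EX samples and accept the hypothesis if none of them is a counterexample.

```python
import math
import random

def target(w):
    # Toy target language: an even number of a's.
    return w.count("a") % 2 == 0

def make_ex_query(rng, max_len=8):
    # EX query with an assumed distribution: uniform length, uniform letters.
    def ex_query():
        n = rng.randrange(max_len + 1)
        w = "".join(rng.choice("ab") for _ in range(n))
        return w, target(w)
    return ex_query

def pac_equivalence_check(hypothesis, ex_query, epsilon, delta, round_index=1):
    # Simulate the i-th EQ with m_i random samples; if the hypothesis
    # survives all of them, it is probably approximately correct.
    m = math.ceil((math.log(1 / delta) + round_index * math.log(2)) / epsilon)
    for _ in range(m):
        w, label = ex_query()
        if hypothesis(w) != label:
            return False, w      # counterexample found: keep learning
    return True, None

ok, cex = pac_equivalence_check(target, make_ex_query(random.Random(1)), 0.1, 0.05)
```

The exact target always survives the check; a wrong hypothesis is rejected with high probability, which is all the PAC model asks for.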

@@ -120,12 +119,10 @@ There are plentiful of challenges to solve, such as timing, choosing your alphab
Equivalence queries, however, are in general not doable.
Even if we have the (machine) code, it is often way too complicated to check equivalence.
That is why we resort to testing with EX queries.
-The EX query from PAC learning normally assumes a fixed, unknown distribution.
+The EX query from PAC learning normally assumes a fixed, unknown probability distribution on words.
-\todo{Prob. distribution on words}
In our case, we choose and implement a distribution to test against.
-\todo{More concretely: that we use for membership queries.}
This cuts both ways:
-On one hand, it allows us to only test behaviour we really care about, on the other hand the results are only as good as our choice of distribution.
+On the one hand, it allows us to only test behaviour we really care about, on the other hand the results are only as good as our choice of distribution.
We deviate even further from the PAC-model as we sometimes change our distribution while learning.
Yet, as applications show, this is a useful way of learning software behaviour.

@@ -173,18 +170,15 @@ This gave developers opportunities to solve problems before replacing the old co
[title={Research challenges}]

In this thesis, we will mostly see learning of deterministic automata or Mealy machines.
-Although this is restrictive as many pieces of software require richer models, it has been successfully applied in the above examples.
+Although this is limited, as many pieces of software require richer models, it has been successfully applied in the above examples.
-The restrictions include the following.
+The limitations include the following.
-\todo{Limitations / assumptions}
\startitemize[after]
\item The system behaves deterministically.
\item One can reliably reset the system.
\item The system can be modelled with a finite state space.
This also means that the model does not incorporate time or data.
-\item The set of input actions is finite.
+\item The input alphabet is finite.
-\todo{Alphabet}
\item One knows when the target is reached.
-\todo{Do we actually know this, with respect to the probability distribution?}
\stopitemize

\description{Research challenge 1: Approximating equivalence queries.}
@@ -194,7 +188,6 @@ For example, we may require the hypothesis to be correct, provided that the real
Efficiency is important here:
We want to obtain those guarantees fast and we want to quickly find counterexamples when the hypothesis is wrong.
Test generation methods are the topic of the first part in this thesis.
-\todo{to test if hypothesis is correct}
We will review existing algorithms and discuss new algorithms for test generation.

\startdescription[title={Research challenge 2: Generalisation to infinite alphabets.}]
@@ -235,13 +228,6 @@ In particular, the set of experiments is polynomial in the number of states.
These techniques will be discussed in detail in \in{Chapter}[chap:test-methods].
More background and other related problems, as well as their complexity results, are well exposed in a survey of \citet[LeeY94].

-To give an example of conformance checking, we model a record player as a finite state machine.
-We will not model the audible output -- that would depend not only on the device, but also the record one chooses to play\footnote{In particular, we have to add time to the model as one side of a record only lasts for roughly 25 minutes. Unless we take a record with sound on the locked groove such as the \emph{Sgt. Pepper's Lonely Hearts Club Band} album by \emph{The Beatles}.}.
-Instead, the only observation we can make is looking how fast the turntable spins.
-The device has two buttons: a start-stop button (\playstop) and a speed button (\spin) which toggles between $33 \frac{1}{3}$ rpm and $45$ rpm.
-When turned on, the system starts playing immediately at $33 \frac{1}{3}$ rpm -- this is useful for DJing.
-The intended behaviour of the record player has four states as depicted in \in{Figure}[fig:record-player].

\startplacefigure
[title={Behaviour of a record player modelled as a finite state machine.},
reference=fig:record-player]
@@ -263,6 +249,13 @@ The intended behaviour of the record player has four states as depicted in \in{F
\stoptikzpicture}
\stopplacefigure

+To give an example of conformance checking, we model a record player as a finite state machine.
+We will not model the audible output -- that would depend not only on the device, but also the record one chooses to play\footnote{In particular, we have to add time to the model as one side of a record only lasts for roughly 25 minutes. Unless we take a record with sound on the locked groove such as the \emph{Sgt. Pepper's Lonely Hearts Club Band} album by \emph{The Beatles}.}.
+Instead, the only observation we can make is looking how fast the turntable spins.
+The device has two buttons: a start-stop button (\playstop) and a speed button (\spin) which toggles between $33 \frac{1}{3}$ rpm and $45$ rpm.
+When turned on, the system starts playing immediately at $33 \frac{1}{3}$ rpm -- this is useful for DJing.
+The intended behaviour of the record player has four states as depicted in \in{Figure}[fig:record-player].
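The intended behaviour can also be written down as a small Mealy machine in code (a hypothetical encoding, not the thesis figure: we write 33 for $33 \frac{1}{3}$ rpm, observe 0 rpm when the turntable is stopped, and assume \spin also toggles the speed setting while stopped).

```python
# State = (playing, speed); the two inputs are the two buttons.
def step(state, button):
    playing, speed = state
    if button == "playstop":
        playing = not playing              # toggle playing <-> stopped
    elif button == "spin":
        speed = 45 if speed == 33 else 33  # toggle 33 1/3 <-> 45 rpm
    return (playing, speed), (speed if playing else 0)

def run(state, buttons):
    # Feed a sequence of button presses, collecting the observed speeds.
    outputs = []
    for b in buttons:
        state, out = step(state, b)
        outputs.append(out)
    return state, outputs

# The device starts playing at 33 1/3 rpm; toggle the speed, stop, restart.
final, outs = run((True, 33), ["spin", "playstop", "playstop"])
```

From the initial playing state, these presses observe the speeds 45, 0, 45: the speed setting is remembered while the turntable is stopped.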

Let us consider some faults which could be present in an implementation with four states.
In \in{Figure}[fig:faulty-record-players], two flawed record players are given.
In the first (\in{Figure}{a}[fig:faulty-record-players]), the sequence \playstop\,\spin\,\spin\, leads us to the wrong state.
@@ -608,8 +601,6 @@ This work was presented at ICGI 2018:
\startsection
[title=Conclusion and Outlook]

-\todo{This is a bit mixed up.}

With the current tools for model learning, it is possible to learn big state machines of black box systems.
It involves using the clever algorithms for learning (such as the TTT algorithm by \citenp[Isberner15]) and efficient testing methods (see \in{Chapter}[chap:test-methods]).
However, as the industrial case study from \in{Chapter}[chap:applying-automata-learning] shows, the bottleneck is often in conformance testing.
@@ -621,9 +612,10 @@ A question for future research is how this additional information can be integra

Black box testing still has theoretical challenges.
Current generalisations to non-deterministic systems or language inclusion (such as black box testing for IOCO) often need exponentially big test suites.
-Whether this is necessary is unknown (to me), we only have upper bounds but no lower bounds.
+Whether this is necessary is unknown (to me):
+we only have upper bounds but no lower bounds.
An interesting approach could be to see if there exists a notion of reduction between test suites.
-This is analogous to the reductions used complexity theory to prove problems hard, or reductions used in PAC theory to prove learning problems to be inherently unpredictable.
+This is analogous to the reductions used in complexity theory to prove hardness of problems, or reductions used in PAC theory to prove learning problems to be inherently unpredictable.

Another path taken in this thesis is the research on nominal automata.
This was motivated by the problem of learning automata over infinite alphabets.
@@ -638,7 +630,7 @@ The nominal learning algorithms can be implemented in just a few hundreds lines
In this thesis, we tackle some efficiency issues when computing with nominal sets.
In \in{Chapter}[chap:ordered-nominal-sets] we characterise orbits in order to give an efficient representation (for the total-order symmetry).
Another result is the fact that some nominal automata can be \quote{compressed} to separated automata, which can be exponentially smaller (\in{Chapter}[chap:separated-nominal-automata]).
-However, the nominal tools still leave much to desired in terms of efficiency.
+However, the nominal tools still leave much to be desired in terms of efficiency.

Last, it would be interesting to marry the two paths taken in this thesis.
I am not aware of $n$-complete test suites for register automata or nominal automata.
@@ -982,7 +982,7 @@ If we know a nominal automaton recognises an $\sb$-language, then we are better
From the $\sb$-semantics of separated automata, it follows that we have a Myhill-Nerode theorem, which means that learning is feasible.
We expect that this can be useful, since we can achieve an exponential reduction this way.

-Boja\'{n}czyk et al. prove that nominal automata are equivalent to register automata in terms of expressiveness \citep[BojanczykKL14].
+\citet[BojanczykKL14] prove that nominal automata are equivalent to register automata in terms of expressiveness.
However, when translating from register automata with $n$ states to nominal automata, we may get exponentially many orbits.
This happens for instance in the FIFO automaton (\in{Example}[ex:fifo]).
We have shown that the exponential blow-up is avoidable by using separated automata, for this example and in general for $\sb$-equivariant languages.

@@ -993,9 +993,6 @@ We believe that this is the hypothesised \quotation{substitution monad} from \in
The monad is monoidal (sending separated products to Cartesian products) and if $X$ is an orbit-finite nominal set, then so is $T(X)$.
This means that we can consider nominal $T$-automata and we can perhaps determinise them using coalgebraic methods \citep[SilvaBBR13].

-%Super-separated automata $X \to O \sepprod (\Sigma \wandto X)$.
-%\todo{}

\stopsection
\startsubject[title={Acknowledgements}]