
Intro and style

This commit is contained in:
Joshua Moerman 2019-01-06 14:20:32 +01:00
parent 9ccffdbbea
commit 65638971d3
5 changed files with 73 additions and 73 deletions


@@ -126,14 +126,50 @@ We deviate even further from the PAC-model as we sometimes change our distributi
Yet, as applications show, this is a useful way of learning software behaviour.
\stopsection
\startsection
[title={Applications of Model Learning}]
Since this thesis contains only one real-world application of learning, in \in{Chapter}[chap:applying-automata-learning], it is good to mention a few others.
Although we remain in the context of learning software behaviour, the applications are quite different from each other.
This is by no means a complete list.
\description{Bug finding in protocols.}
A prominent example is by \citet[Fiterau-Brostean16].
They learn models of TCP implementations -- both the client and the server side.
Interestingly, they found bugs in the (closed source) Windows implementation.
Later, \citet[Fiterau-BrosteanH17] also found a bug in the sliding window of the Linux implementation.
Other protocols have been learned as well, such as the MQTT protocol by \citet[TapplerAB17], TLS by \citet[RuiterP15], and SSH by \citet[Fiterau-BrosteanLPRVV17].
Many of these applications reveal bugs by learning a model and subsequently applying model checking.
The combination of learning and model checking was first described by \citet[PeledVY02].
\description{Bug finding in smart cards.}
\citet[AartsRP13] learn the software on smart cards of several Dutch and German banks.
These cards use the EMV protocol, which is run on the card itself.
So this is an example of a real black box system, where no other monitoring is possible and no code is available.
No vulnerabilities were found, although each card had a slightly different protocol.
Card readers have been learned by \citet[ChaluparPPR14].
They built a Lego machine which could automatically press buttons, and they found a security flaw.
\description{Regression testing.}
\citet[HungarNS03] describe the potential of automata learning in regression testing.
The aim is not to find bugs, but to monitor the development process of a system.
By considering the differences between models at different stages, one can generate regression tests.
\description{Refactoring legacy software.}
Model learning can also be used to verify refactored software.
\citet[SchutsHV16] have applied this in a project within Philips.
They learn both an old version and a new version of the same component.
Comparing the learned models revealed some differences.
This gave developers opportunities to solve problems before replacing the old component with the new one.
\stopsection
\startsection
[title={Research challenges}]
In this thesis, we will mostly see learning of deterministic automata or Mealy machines.
Although this is restrictive, as many pieces of software require richer models, it has been successfully applied in the above examples.
The restrictions include the following.
\startitemize[after]
\item The system behaves deterministically.
@@ -169,44 +205,6 @@ The approach in my thesis will be based on symmetries, which gives yet another i
\stopdescription
\stopsection
\startsection
[title={Black Box Testing}]
@@ -443,7 +441,6 @@ One has to be careful as not all results from automata theory transfer to nomina
A notable example is the powerset construction, which converts a non-deterministic automaton into a deterministic one.
The problem here is that the powerset of a set is generally \emph{not} orbit-finite and so the finiteness requirement is not met.
Consequently, languages accepted by nominal DFAs are \emph{not} closed under Kleene star, or even concatenation.
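A folklore example illustrates why determinisation fails (using $\mathbb{A}$ for the set of atoms, a notation this thesis may render differently):

```tex
% The language of words over the atoms in which some atom occurs twice:
\startformula
L = \{\, w \in \mathbb{A}^{*} \mid w_i = w_j \ \text{for some}\ i < j \,\}
\stopformula
% A nominal NFA accepts L by guessing the repeated atom and storing it in
% a single register, so its state space is orbit-finite.  A deterministic
% acceptor would have to remember the whole set of atoms seen so far,
% which is not orbit-finite.
```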
\stopsubsection
@@ -458,7 +455,7 @@ However, the chapters do get more technical and mathematical -- especially in \i
Detailed discussions on related work and future directions of research are presented in each chapter.
\startcontribution[title={Chapter \ref[default][chap:test-methods]: FSM-based test methods.}]
This chapter introduces test generation methods which can be used for learning.
The methods are presented in a uniform way, which allows us to give a single proof of completeness for all of them.
Moreover, the uniform presentation gives room to develop new test generation methods.
@@ -471,7 +468,7 @@ The main contributions are:
\stopitemize
\stopcontribution
\startcontribution[title={Chapter \ref[default][chap:applying-automata-learning]: Applying automata learning to embedded control software.}]
In this chapter we will apply model learning to an industrial case study.
It is a unique benchmark as it is much bigger than any of the applications seen before (3410 states and 77 inputs).
This makes it challenging to learn a model and the main obstacle is finding counterexamples.
@@ -486,7 +483,7 @@ This is based on the following publication:
\cite[entry][DBLP:conf/icfem/SmeenkMVJ15].
\stopcontribution
\startcontribution[title={Chapter \ref[default][chap:minimal-separating-sequences]: Minimal separating sequences for all pairs of states.}]
Continuing on test generation methods, this chapter presents an efficient algorithm to construct separating sequences.
Not only is the algorithm efficient -- it runs in $\bigO(n \log n)$ time -- it also constructs minimal length sequences.
The algorithm is inspired by a minimisation algorithm by \citet[Hopcroft71], but extending it to construct witnesses is non-trivial.
@@ -502,7 +499,7 @@ This is based on the following publication:
\cite[entry][DBLP:conf/lata/SmetsersMJ16].
\stopcontribution
\startcontribution[title={Chapter \ref[default][chap:learning-nominal-automata]: Learning nominal automata.}]
In this chapter, we show how to learn automata over infinite alphabets.
We do this by translating the \LStar{} algorithm directly to a nominal version, \nLStar{}.
The correctness proofs mimic the original proofs by \citet[Angluin87].
@@ -521,7 +518,7 @@ This is based on the following publication:
\cite[entry][MoermanS0KS17].
\stopcontribution
\startcontribution[title={Chapter \ref[default][chap:ordered-nominal-sets]: Fast computations on ordered nominal sets.}]
In this chapter, we provide a library to compute with nominal sets.
We restrict our attention to nominal sets over the total order symmetry.
This symmetry allows for a rather easy characterisation of orbits, and hence an easy implementation.
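The characterisation can be sketched concretely: under order-preserving renamings of the rational numbers, the orbit of a tuple is fully determined by the pattern of equalities and order among its entries. A minimal Python sketch (the function name and representation are illustrative, not the library's actual API):

```python
from fractions import Fraction

def orbit_pattern(tup):
    """Canonical representative of the orbit of a tuple of rationals
    under order-preserving bijections: replace each entry by the rank
    of its value among the distinct values occurring in the tuple."""
    distinct = sorted(set(tup))
    rank = {v: i for i, v in enumerate(distinct)}
    return tuple(rank[v] for v in tup)

# Two tuples lie in the same orbit iff they have the same pattern:
orbit_pattern((Fraction(1, 2), Fraction(3), Fraction(1, 2)))  # (0, 1, 0)
orbit_pattern((Fraction(5), Fraction(7), Fraction(5)))        # (0, 1, 0)
```

Since an orbit is determined by such a finite pattern, orbit-finite sets can be stored as finite lists of patterns, which is what makes an efficient implementation possible.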
@@ -538,7 +535,7 @@ This is based on the following publication:
\cite[entry][VenhoekMR18].
\stopcontribution
\startcontribution[title={Chapter \ref[default][chap:separated-nominal-automata]: Separated nominal automata.}]
We investigate how to reduce the size of certain nominal automata.
This is based on the observation that some languages (with outputs) are not just invariant under symmetries, but invariant under arbitrary \emph{transformations}, or \emph{renamings}.
We define a new type of automaton, the \emph{separated nominal automaton}, and show that they exactly accept those languages which are closed under renamings.
@@ -601,31 +598,36 @@ This work was presented at ICGI 2018:
\startsection
[title=Conclusion and Outlook]
With the current tools for model learning, it is possible to learn big state machines of black box systems.
This involves using clever learning algorithms (such as the TTT algorithm by \citenp[Isberner15]) and efficient testing methods (see \in{Chapter}[chap:test-methods]).
However, as the industrial case study from \in{Chapter}[chap:applying-automata-learning] shows, the bottleneck is often in conformance testing.
In order to improve on this bottleneck, one possible direction is to consider \quote{grey box testing.}
The methods discussed in this thesis are all black box methods; this could be considered \quote{too pessimistic.}
This means that we should be looking into using more information about the system during testing (and learning).
Often, we do have (parts of the) source code and we do know relationships between different inputs.
The question is how this additional information can be integrated in a principled manner in the learning and testing of systems.
Another path taken in this thesis is the research on nominal automata.
This was motivated by the problem of learning automata over infinite alphabets.
So far, the results on nominal automata are mostly theoretical in nature.
Nevertheless, we show that the nominal algorithms can be implemented and that they can be run concretely on black box systems (\in{Chapter}[chap:learning-nominal-automata]).
The advantage of using the foundations of nominal sets is that the algorithms are closely related to the original \LStar{} algorithm.
Consequently, variations of \LStar{} can easily be implemented.
For instance, we show that the \NLStar{} algorithm for non-deterministic automata works in the nominal case too.
The nominal learning algorithms can be implemented in just a few hundred lines of code, much less than the approach taken by, e.g., \citet[Fiterau-Brostean18].
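To give an impression of how compact such a learner can be, here is a minimal, non-nominal sketch of the core loop over a finite alphabet. It uses the counterexample-suffix handling of the Maler-Pnueli variant instead of Angluin's consistency check; the oracles `member` and `equiv` are hypothetical stand-ins for the teacher, not the thesis code:

```python
def lstar(alphabet, member, equiv):
    """Minimal L*-style learner.  `member(word)` answers membership
    queries; `equiv(hyp)` returns None or a counterexample word.
    Words are tuples of symbols.  Counterexample suffixes are added
    as columns (Maler-Pnueli style), so rows of S stay pairwise
    distinct and no separate consistency check is needed."""
    S, E = {()}, {()}  # rows (access sequences) and columns (experiments)

    def row(s):
        return tuple(member(s + e) for e in sorted(E))

    while True:
        # Close the table: every one-letter extension of a row in S
        # must equal some row of S.
        changed = True
        while changed:
            changed = False
            rows = {row(t) for t in S}
            for s in list(S):
                for a in alphabet:
                    if row(s + (a,)) not in rows:
                        S.add(s + (a,))
                        rows.add(row(s + (a,)))
                        changed = True
        # Hypothesis: states are the distinct rows of S.
        reps = {row(s): s for s in S}
        hyp = {
            "init": row(()),
            "accept": {r for r, s in reps.items() if member(s)},
            "delta": {(row(s), a): row(s + (a,)) for s in S for a in alphabet},
        }
        cex = equiv(hyp)
        if cex is None:
            return hyp
        # Add all suffixes of the counterexample as new columns.
        E.update(tuple(cex[i:]) for i in range(len(cex) + 1))
```

The final observation table is exactly the data a test suite needs: its rows are access sequences and its columns form a characterisation set.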
In this thesis we tackle some efficiency issues when computing with nominal sets.
In \in{Chapter}[chap:ordered-nominal-sets] we characterise orbits in order to give an efficient representation (for the total-order symmetry).
Another result is the fact that some nominal automata can be \quote{compressed} to separated automata, which can be exponentially smaller (\in{Chapter}[chap:separated-nominal-automata]).
However, the nominal tools still leave much to be desired in terms of efficiency.
Last, it would be interesting to marry the two paths taken in this thesis.
I am not aware of $n$-complete test suites for register automata or nominal automata.
The results on learning nominal automata in \in{Chapter}[chap:learning-nominal-automata] show that this should be possible, as an observation table gives a test suite.
\footnote{The rows of a table are access sequences, and the columns provide a characterisation set.}
However, there is an interesting twist to this problem.
The test methods from \in{Chapter}[chap:test-methods] can all account for extra states.
For nominal automata, we should be able to cope with extra states \emph{and} extra registers.
It will be interesting to see how the test suite grows as these two dimensions increase.
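In the finite case, the footnoted construction is the classical W-method: every access sequence, followed by every word of length at most $k+1$ (for $k$ extra states), followed by every sequence of the characterisation set. A small sketch (function and parameter names are illustrative):

```python
from itertools import product

def w_method(access_seqs, alphabet, charact_set, extra_states=0):
    """W-method-style test suite: each test is an access sequence,
    then a word of length <= extra_states + 1, then a distinguishing
    sequence from the characterisation set."""
    middles = [()]  # all words of length 0 .. extra_states + 1
    for n in range(1, extra_states + 2):
        middles += list(product(alphabet, repeat=n))
    suite = set()
    for p in access_seqs:
        for m in middles:
            for w in charact_set:
                suite.add(tuple(p) + tuple(m) + tuple(w))
    return suite
```

The interesting open question above is what the analogue of `middles` should be for nominal automata, where both the number of extra states and the number of extra registers can grow.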


@@ -526,7 +526,8 @@ These languages naturally arise as the semantics of register automata.
The definition of register automata is not as simple as that of ordinary finite automata.
Consequently, transferring results from automata theory to this setting often requires non-trivial proofs.
Nominal automata, instead, are defined as ordinary automata by replacing finite sets with orbit-finite nominal sets.
The theory of nominal automata is developed by \citet[BojanczykKL14], and it is shown that many algorithms from automata theory, such as minimisation (based on the Myhill-Nerode equivalence), transfer to nominal automata.
Not all algorithms work: e.g., the subset construction fails for nominal automata.
As an example we consider the following language on rational numbers:
\startformula
@@ -636,7 +637,7 @@ Where applicable, the automata listed above were generated using the code from \
\startsubsection
[title={Minimising nominal automata}]
For languages recognised by nominal DFAs, a Myhill-Nerode theorem holds which relates states to right congruence classes.
This guarantees the existence of unique minimal automata.
We say an automaton is \emph{minimal} if its set of states has the least number of orbits and each orbit has the smallest dimension possible.
\footnote{Abstractly, an automaton is minimal if it has no proper quotients.


@@ -64,9 +64,9 @@ This is relevant for name abstraction \citep[Pitts13], and has also been studied
We apply these connections between nominal sets and renaming sets in the context of automata theory.
Nominal automata are an elegant model for recognising languages over infinite alphabets.
They are expressively equivalent to the more classical register automata,
and have appealing properties that register automata lack, such as unique minimal automata.
However, moving from register automata to nominal automata can lead to an exponential blow-up in the number of states \citep[BojanczykKL14].
\footnote{Here \quote{number of states} refers to the number of orbits in the state space.}
As a motivating example, we consider a language modelling an $n$-bounded FIFO queue.


@@ -12,9 +12,8 @@
\usebtxdataset[../biblio-web.bib]
\setupbtx[apa:cite][etallimit=2,separator:4={ and }]
% interaction in the list seems to be broken? So I am turning it off
\setupbtx[apa:list][interaction=stop]
\setupbtxrendering[pagestate=start] % print page numbers
\setupbtxlist[apa][margin=1.75em]
@@ -32,6 +31,4 @@
\def\citep[#1]{\cite[authoryear][#1]} % (Bla and Foo, 2015)
\def\citeurl[#1]{\cite[alternative=url,left=,right=][#1]} % url
\stopenvironment


@@ -5,7 +5,7 @@
%\setupsynctex[state=start]
% Clickable references
\setupinteraction[state=start, focus=standard, contrastcolor=darkgreen, style=normal]
% Bookmarks in the pdf
\placebookmarks[chapter,section]