mirror of
https://github.com/Jaxan/hybrid-ads.git
synced 2025-04-27 06:57:44 +02:00
Adds some documentation in latex
This commit is contained in:
parent
46540d62c7
commit
ad58d6ac07
2 changed files with 185 additions and 2 deletions
19	.gitignore	vendored

@@ -1,8 +1,23 @@

Resulting file:

# mac
.DS_Store

# qt creator/cmake
*.user
build

# latex
*.aux
*.tdo
*.log

# output
*.pdf
*.png

# data
*.dot

# our own tools
*_seq
*splitting_tree
*test_suite
168	docs/explanation.tex	Normal file

@@ -0,0 +1,168 @@
\documentclass[envcountsame]{llncs}

\usepackage{amsmath}
\usepackage[backgroundcolor=white]{todonotes}

\newcommand{\Def}[1]{\emph{#1}}
\newcommand{\bigO}{\mathcal{O}}

\begin{document}

\maketitle

\section{Introduction}

Recently, automata learning has gained popularity. Learning algorithms are
applied to real-world systems (systems under learning, or SULs for short),
and this exposes some problems.

In classical active learning algorithms such as $L^\ast$, one assumes a
teacher to which the algorithm can pose \Def{membership queries} and
\Def{equivalence queries}. In the former case, the algorithm asks the
teacher for the output (sequence) for a given input sequence. In the latter
case, the algorithm provides the teacher with a hypothesis, and the teacher
answers with either an input sequence on which the hypothesis behaves
differently from the SUL, or affirmatively in case the machines are
behaviorally equivalent.

In real-world applications we have to implement the teacher ourselves,
despite the fact that we do not know all the details of the SUL. Membership
queries are easily implemented by resetting the machine and applying the
input. Equivalence queries, however, are often impossible to implement.
Instead, we have to resort to some form of random testing. Doing random
testing naively is of course hopeless, as the state space is often too big.
Luckily, we have a hypothesis at hand, which we can use for model-based
testing.

One standard framework for model-based testing was pioneered by Chow and
Vasilevski. Briefly, the framework assumes prefix sequences, which allow us
to go from the initial state to a given state $s$ (in the model), or even
to a given transition $t \to s$, and suffix sequences, which test whether
the machine actually is in state $s$. If we have the right suffixes and
test every transition of the model, we can ensure that the SUL is either
equivalent or has strictly more states. Such a test suite can be
constructed with a size polynomial in the number of states of the model.
This is in contrast to exhaustive testing or (naive) random testing, where
there are exponentially many sequences.

For the prefixes we can use any single-source shortest path algorithm. In
fact, if we restrict ourselves to the above framework, this is the best we
can do. This gives $n$ sequences of length at most $n-1$ (in fact, the
total sum is at most $\frac{1}{2}n(n-1)$).

For the suffixes, we can use the standard Hopcroft algorithm to generate
separating sequences. If we want to test a given state $s$, we take the set
of suffixes (we allow the set of suffixes to depend on the state we want to
test) to be all separating sequences for all other states $t$. This set has
at most $n-1$ elements, each of length at most $n$; again, the total sum is
$\frac{1}{2}n(n-1)$. A natural question arises: can we do better?
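As an aside, a separating sequence for a single pair of states can be found without full partition refinement: a breadth-first search over pairs of states yields a shortest word on which the two states disagree. The three-state machine below is a hypothetical example, not one taken from these notes:

```python
from collections import deque

# A hypothetical three-state Mealy machine over I = {a, b}, O = {0, 1}.
# delta[s][i] is the successor state, lam[s][i] the produced output.
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 2, 'b': 1}, 2: {'a': 0, 'b': 2}}
lam   = {0: {'a': 0, 'b': 0}, 1: {'a': 0, 'b': 0}, 2: {'a': 1, 'b': 0}}

def separating_sequence(s, t):
    """BFS over pairs of states: a shortest word on which s and t produce
    different outputs, or None if the two states are equivalent."""
    queue, seen = deque([(s, t, ())]), {(s, t)}
    while queue:
        p, q, word = queue.popleft()
        for i in delta[p]:
            if lam[p][i] != lam[q][i]:
                return list(word) + [i]
            nxt = (delta[p][i], delta[q][i])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], word + (i,)))
    return None
```

Note that this is quadratic per pair; Hopcroft's algorithm computes separating sequences for all pairs at once, far more efficiently.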

In the presence of a distinguishing sequence, Lee and Yannakakis prove that
one can take a set of suffixes consisting of just $1$ element of length at
most $\frac{1}{2}n(n-1)$. This does not provide an improvement in the worst
case. Even worse, such a sequence might not exist.

In this paper we propose a testing algorithm which combines the two methods
described above. The distinguishing sequence might not exist, but the tree
constructed during the Lee and Yannakakis algorithm provides a lot of
information, which we can complement with the classical Hopcroft approach.
Although this is not an improvement in the worst case, this hybrid method
enabled us to learn an industrial-grade machine which was infeasible to
learn with the standard methods provided by LearnLib.


\section{Preliminaries}

We restrict our attention to \Def{Mealy machines}. Let $I$ (resp.\ $O$)
denote the finite set of inputs (resp.\ outputs). A Mealy machine $M$
consists of a set of states $S$ with an initial state $s_0$, together with
a transition function $\delta : I \times S \to S$ and an output function
$\lambda : I \times S \to O$. Note that we assume machines to be
deterministic and total. We also assume that our system under learning is a
Mealy machine. Both functions $\delta$ and $\lambda$ are extended to words
in $I^\ast$.
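These definitions can be made concrete in a short sketch. The three-state machine is a hypothetical example; `delta_star` and `lam_star` are the extensions of $\delta$ and $\lambda$ to words in $I^\ast$:

```python
# A hypothetical three-state Mealy machine over I = {a, b}, O = {0, 1}.
# delta[s][i] is the successor state, lam[s][i] the produced output.
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 2, 'b': 1}, 2: {'a': 0, 'b': 2}}
lam   = {0: {'a': 0, 'b': 0}, 1: {'a': 0, 'b': 0}, 2: {'a': 1, 'b': 0}}
s0 = 0  # initial state

def delta_star(s, word):
    """Extension of the transition function to words in I*."""
    for i in word:
        s = delta[s][i]
    return s

def lam_star(s, word):
    """Extension of the output function to words: the output sequence."""
    out = []
    for i in word:
        out.append(lam[s][i])
        s = delta[s][i]
    return out
```

For instance, `lam_star(0, 'aa')` and `lam_star(1, 'aa')` differ, so the word $aa$ separates states $0$ and $1$.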

We are in the context of learning, so we will generally denote the
hypothesis by $H$ and the system under learning by $SUL$. Note that we can
assume $H$ to be minimal and reachable.

We assume that the alphabets $I$ and $O$ are fixed in these notes.

\begin{definition}
A \Def{set of words $X$ (over $I$)} is a subset $X \subset I^\ast$.

Given a set of states $S$, a \Def{family of sets $X$ (over $I$)} is a
collection $X = \{X_s\}_{s \in S}$ where each $X_s$ is a set of words.
\end{definition}

All words we will consider are over $I$, so we will refer to them simply as
words. The idea of a family of sets was also introduced by Fujiwara. They
are used to collect sequences which are relevant for a certain state. We
define some operations on sets and families:
\newcommand{\tensor}{\otimes}
\begin{itemize}
\item Let $X$ and $Y$ be two sets of words over $I$, then $X \cdot Y$ is
	the set of all concatenations:
	$X \cdot Y = \{ x y \,|\, x \in X, y \in Y \}$.
\item Let $X^n = X \cdots X$ denote the $n$-fold concatenation and
	$X^{\leq k} = \bigcup_{n \leq k} X^n$ all concatenations of length at
	most $k$. In particular, $I^n$ is the set of all words of length
	precisely $n$ and $I^{\leq k}$ the set of all words of length at most
	$k$.
\item Let $X = \{ X_s \}_{s \in S}$ and $Y = \{ Y_s \}_{s \in S}$ be two
	families of sets. We define a new family of sets $X \tensor_H Y$ as
	$(X \tensor_H Y)_s = \{ x y \,|\, x \in X_s, y \in Y_{\delta(s, x)} \}$.
	Note that this depends on the transitions of the machine $H$.
\item Let $X$ be a family and $Y$ just a set of words, then the usual
	concatenation is defined as $(X \cdot Y)_s = X_s \cdot Y$.
\item Let $X$ be a family of sets, then the union $\bigcup X$ forms a set
	of words.
\end{itemize}
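These operations can be sketched directly, representing a set of words as a Python set of strings and a family as a dict indexed by state. The machine and the example families are hypothetical:

```python
# A hypothetical three-state machine; delta[s][i] is the successor of s.
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 2, 'b': 1}, 2: {'a': 0, 'b': 2}}

def delta_star(s, word):
    """Transition function extended to words."""
    for i in word:
        s = delta[s][i]
    return s

def concat(X, Y):
    """X . Y: all concatenations of a word of X with a word of Y."""
    return {x + y for x in X for y in Y}

def tensor(X, Y):
    """(X (x)_H Y)_s = { x y | x in X_s, y in Y_{delta(s, x)} }.
    Note the suffix is taken from the family at the state reached by x."""
    return {s: {x + y for x in X[s] for y in Y[delta_star(s, x)]}
            for s in X}
```

The key difference between `concat` and `tensor` is that the latter consults the machine: the second word is chosen from the set belonging to the state actually reached.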

Let $H$ be a fixed machine and let $\tensor$ denote $\tensor_H$. We define
some useful sets (which depend on $H$):
\begin{itemize}
\item The set of prefixes $P_s = \{ x \,|\, x \text{ a shortest word such
	that } \delta(s, x) = t, t \in S \}$. Note that $P_{s_0}$ is
	particularly interesting. These sets can be constructed by any
	shortest path algorithm. Note that $P \cdot I$ is a set covering all
	transitions in $H$.
\item The set $W_s = \{ x \,|\, x \text{ separates } s \text{ and } t,
	t \in S \}$. This can be constructed using Hopcroft's algorithm, or
	Gill's algorithm if one wants minimal separating sequences.
\item If $x$ is an adaptive distinguishing sequence in the sense of Lee and
	Yannakakis, and $x_s$ denotes the associated UIO for state $s$, we
	define $Z_s = \{ x_s \}$.
\end{itemize}
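The prefix sets in particular follow from a plain breadth-first search from the initial state; a minimal sketch on a hypothetical toy machine (`P[t]` is a shortest word reaching $t$, i.e. an element of $P_{s_0}$):

```python
from collections import deque

# Hypothetical toy machine; delta[s][i] is the successor of s on input i.
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 2, 'b': 1}, 2: {'a': 0, 'b': 2}}

def access_sequences(s0):
    """BFS from s0: map each reachable state to a shortest word reaching it.
    Applied to the initial state this yields the prefix set P_{s0}."""
    P = {s0: ''}
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        for i, t in delta[s].items():
            if t not in P:
                P[t] = P[s] + i   # extend a shortest word to s by one input
                queue.append(t)
    return P
```

Appending each input symbol to each access sequence then covers every transition, which is the role of $P \cdot I$ above.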

We obtain different methods (note that all test suites are expressed as
families of sets $X$; the actual test suite is $X_{s_0}$):
\begin{itemize}
\item The original Chow and Vasilevski (W-method) test suite is given by:
$$ P \cdot I^{\leq k+1} \cdot \bigcup W $$
	which distinguishes $H$ from any non-equivalent machine with at most
	$|S| + k$ states.
\item The Wp-method as described by Fujiwara:
$$ (P \cdot I^{\leq k} \cdot \bigcup W) \cup (P \cdot I^{\leq k+1} \tensor W) $$
	which is a smaller test suite than the W-method, but just as strong.
	Note that the original description by Fujiwara is more detailed in
	order to reduce redundancy.
\item The method proposed by Lee and Yannakakis:
$$ P \cdot I^{\leq k+1} \tensor Z $$
	which is as big as the Wp-method in the worst case (if $Z$ even
	exists) and just as strong.
\end{itemize}
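As an illustration, the W-method suite $P \cdot I^{\leq k+1} \cdot \bigcup W$ can be assembled as follows. The prefixes and the characterisation set are hypothetical and hardcoded rather than computed:

```python
from itertools import product

I = ['a', 'b']          # hypothetical input alphabet
P = {'', 'a', 'aa'}     # shortest access sequences (assumed precomputed)
W_all = {'a', 'aa'}     # union of all separating sequences (assumed)

def words_up_to(k):
    """I^{<= k}: all words over I of length at most k."""
    return {''.join(w) for n in range(k + 1) for w in product(I, repeat=n)}

def w_method(k):
    """P . I^{<= k+1} . union(W): distinguishes the hypothesis from any
    non-equivalent machine with at most |S| + k states."""
    return {p + m + w
            for p in P
            for m in words_up_to(k + 1)
            for w in W_all}
```

The middle factor `words_up_to(k + 1)` is exactly the exponential part discussed below; the rest of the suite grows only polynomially.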

An important observation is that the sizes of $P$, $W$ and $Z$ are
polynomial in the number of states of $H$, but that the middle part
$I^{\leq k+1}$ is exponential. If the number of states of the SUL is known,
one can perform a (big) exhaustive test. In practice this number is not
known, or has a very large bound. To mitigate this we can exhaust
$I^{\leq 1}$ and then resort to randomly sampling $I^\ast$. It is in this
sampling phase that we want $W$ and $Z$ to contain as few elements as
possible, as every element contributes to the exponential blowup.
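The sampling phase admits a simple sketch: each random test covers one transition (a word from $P \cdot I$), takes a random walk of bounded length, and ends with a suffix for the state the hypothesis predicts. All names and the machine below are hypothetical, not taken from these notes:

```python
import random

# Hypothetical ingredients: machine, access sequences, suffix family.
I = ['a', 'b']
P = ['', 'a', 'aa']                             # shortest access sequences
W = {0: ['a', 'aa'], 1: ['a', 'aa'], 2: ['a']}  # suffixes per state
delta = {0: {'a': 1, 'b': 0}, 1: {'a': 2, 'b': 1}, 2: {'a': 0, 'b': 2}}

def delta_star(s, word):
    for i in word:
        s = delta[s][i]
    return s

def random_test(rng, max_middle=4):
    """One random test: access a transition (P . I), walk randomly, then
    verify the predicted state with one of its suffixes."""
    p = rng.choice(P) + rng.choice(I)
    middle = ''.join(rng.choice(I) for _ in range(rng.randint(0, max_middle)))
    s = delta_star(0, p + middle)   # state the hypothesis predicts
    return p + middle + rng.choice(W[s])
```

Each such word is applied to the SUL (after a reset) and its output compared against the hypothesis; a mismatch is a counterexample for the learner.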

Note that $W$ can also be constructed in different ways. For example,
taking $W_s = \{ u_s \}$, where $u_s$ is a UIO for state $s$ (assuming they
all exist), gives valid variants of the first two methods. Moreover, if an
adaptive distinguishing sequence exists, all states have UIOs and we can
use the first two methods. The third method, however, is slightly smaller,
as we do not need $\bigcup W$ in this case, because the UIOs constructed
from an adaptive distinguishing sequence share (non-empty) prefixes.

% fix from http://tex.stackexchange.com/questions/103735/list-of-todos-todonotes-is-empty-with-llncs
\setcounter{tocdepth}{1}
\listoftodos

\end{document}