\documentclass[envcountsame]{llncs}
\usepackage{amsmath}
\usepackage[backgroundcolor=white]{todonotes}

\newcommand{\Def}[1]{\emph{#1}}
\newcommand{\bigO}{\mathcal{O}}

\begin{document}
\maketitle

\section{Introduction}

Recently, automata learning has gained popularity. Learning algorithms are
applied to real world systems (systems under learning, or SUL for short), and
this exposes some practical problems.

In the classical active learning algorithms such as $L^\ast$ one supposes a
teacher to which the algorithm can pose \Def{membership queries} and
\Def{equivalence queries}. In the former case the algorithm asks the teacher for
the output (sequence) for a given input sequence. In the latter case the
algorithm provides the teacher with a hypothesis, and the teacher answers with
either an input sequence on which the hypothesis behaves differently than the
SUL, or affirmatively in case the machines are behaviorally equivalent.

In real world applications we have to implement the teacher ourselves, despite
the fact that we do not know all the details of the SUL. The membership queries
are easily implemented by resetting the machine and applying the input. The
equivalence queries, however, are often impossible to implement. Instead, we
have to resort to some form of random testing. Doing random testing naively is
of course hopeless, as the state space is often too big. Luckily, we have a
hypothesis at hand, which we can use for model based testing.

One standard framework for model based testing was pioneered by Chow and
Vasilevskii. Briefly, the framework supposes prefix sequences which allow us to
go from the initial state to a given state $s$ (in the model) or even a given
transition $t \to s$, and suffix sequences which test whether the machine
actually is in state $s$. If we have the right suffixes and test every
transition of the model, we can ensure that the SUL is either equivalent to the
model or has strictly more states. Such a test suite can be constructed with a
size polynomial in the number of states of the model. This is in contrast to
exhaustive testing or (naive) random testing, where there are exponentially many
sequences.

For the prefixes we can use any single source shortest path algorithm. In fact,
if we restrict ourselves to the above framework, this is the best we can do.
This gives $n$ sequences of length at most $n-1$ (in fact, the total sum of the
lengths is at most $\frac{1}{2}n(n-1)$).

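The shortest path computation for the prefixes can be sketched as follows (a
minimal Python sketch on a hypothetical toy machine, not part of the paper's
development; breadth-first search suffices here since all transitions have unit
length):

```python
# Sketch: computing a shortest access sequence for every state of a
# hypothetical Mealy machine. Transitions are a dict (state, input) -> state.
from collections import deque

def access_sequences(inputs, delta, start=0):
    """Return a shortest input word reaching each state from `start`."""
    prefixes = {start: ()}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for i in inputs:
            t = delta[(s, i)]
            if t not in prefixes:            # first visit is a shortest one
                prefixes[t] = prefixes[s] + (i,)
                queue.append(t)
    return prefixes

# Toy three-state machine over inputs 'a' and 'b'.
delta = {(0, 'a'): 1, (0, 'b'): 0,
         (1, 'a'): 2, (1, 'b'): 0,
         (2, 'a'): 2, (2, 'b'): 1}
P = access_sequences(['a', 'b'], delta)
```

On this toy machine the computed prefixes are the empty word for state 0,
`('a',)` for state 1 and `('a', 'a')` for state 2.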
For the suffixes, we can use the standard Hopcroft algorithm to generate
separating sequences. If we want to test a given state $s$, we take the set of
suffixes (we allow the set of suffixes to depend on the state we want to test)
to be a set of separating sequences for $s$ and every other state $t$. This set
has at most $n-1$ elements, each of length at most $n$; again the total sum of
the lengths is $\frac{1}{2}n(n-1)$. A natural question arises: can we do better?

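For a single pair of states, a shortest separating sequence can also be found
directly by a breadth-first search over pairs of states. The following Python
sketch (a hypothetical toy machine, and not Hopcroft's algorithm itself)
illustrates the idea:

```python
# Sketch: shortest word x with lambda(s, x) != lambda(t, x), by BFS on pairs.
from collections import deque

def separating_sequence(s, t, inputs, delta, lam):
    """Return a shortest separating sequence for s and t, or None."""
    queue = deque([(s, t, ())])
    seen = {(s, t)}
    while queue:
        u, v, word = queue.popleft()
        for i in inputs:
            if lam[(u, i)] != lam[(v, i)]:
                return word + (i,)           # outputs differ: word i separates
            nu, nv = delta[(u, i)], delta[(v, i)]
            if (nu, nv) not in seen and nu != nv:
                seen.add((nu, nv))
                queue.append((nu, nv, word + (i,)))
    return None                              # s and t are equivalent

# Toy machine: a 3-cycle on input 'a'; only state 2 outputs 'd'.
delta = {(0, 'a'): 1, (1, 'a'): 2, (2, 'a'): 0}
lam = {(0, 'a'): 'o', (1, 'a'): 'o', (2, 'a'): 'd'}
```

Here states 1 and 2 are separated by the single input `('a',)`, while states 0
and 1 need the longer word `('a', 'a')`.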
In the presence of a distinguishing sequence, Lee and Yannakakis prove that one
can take a set of suffixes consisting of just one element of length at most
$\frac{1}{2}n(n-1)$. This does not provide an improvement in the worst case.
Even worse, such a sequence might not exist.

In this paper we propose a testing algorithm which combines the two methods
described above. The distinguishing sequence might not exist, but the tree
constructed during the Lee and Yannakakis algorithm provides a lot of
information, which we can complement with the classical Hopcroft approach.
Although this is not an improvement in the worst case, this hybrid method
enabled us to learn an industrial grade machine which was infeasible to learn
with the standard methods provided by LearnLib.


\section{Preliminaries}
\section{Preliminaries}

We restrict our attention to \Def{Mealy machines}. Let $I$ (resp.\ $O$) denote
the finite set of inputs (resp.\ outputs). A Mealy machine $M$ consists of a
set of states $S$ with an initial state $s_0$, together with a transition
function $\delta : I \times S \to S$ and an output function
$\lambda : I \times S \to O$. Note that we assume machines to be deterministic
and total. We also assume that our system under learning is a Mealy machine.
Both functions $\delta$ and $\lambda$ are extended to words in $I^\ast$.

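The extension of $\delta$ and $\lambda$ to words can be sketched as follows (a
minimal Python sketch with a hypothetical toy machine; the dict-based
representation is ours, not the paper's):

```python
# Sketch: a Mealy machine as two dicts, with delta and lambda extended
# from single inputs to input words.

def run(state, word, delta, lam):
    """Return (delta(state, word), lambda(state, word)) for an input word."""
    outputs = []
    for i in word:
        outputs.append(lam[(state, i)])  # output of the transition taken
        state = delta[(state, i)]        # follow the transition
    return state, tuple(outputs)

# Toy two-state machine over the single input 'a'.
delta = {(0, 'a'): 1, (1, 'a'): 0}
lam = {(0, 'a'): 'p', (1, 'a'): 'q'}
final, out = run(0, ('a', 'a', 'a'), delta, lam)
```

Reading `('a', 'a', 'a')` from state 0 visits states 1, 0 and 1 and produces
the output word `('p', 'q', 'p')`.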
We are in the context of learning, so we will generally denote the hypothesis by
$H$ and the system under learning by $SUL$. Note that we may assume $H$ to be
minimal and reachable.

We assume that the alphabets $I$ and $O$ are fixed in these notes.

\begin{definition}
A \Def{set of words $X$ (over $I$)} is a subset $X \subset I^\ast$.

Given a set of states $S$, a \Def{family of sets $X$ (over $I$)} is a
collection $X = \{X_s\}_{s \in S}$ where each $X_s$ is a set of words.
\end{definition}

All words we consider are over $I$, so we will refer to them simply as words.
The idea of a family of sets was also introduced by Fujiwara. Families are used
to collect sequences which are relevant for a certain state. We define some
operations on sets and families:
\newcommand{\tensor}{\otimes}
\begin{itemize}
	\item Let $X$ and $Y$ be two sets of words over $I$; then $X \cdot Y$ is the
	set of all concatenations: $X \cdot Y = \{ x y \,|\, x \in X, y \in Y \}$.
	\item Let $X^n = X \cdots X$ denote the $n$-fold concatenation and
	$X^{\leq k} = \bigcup_{n \leq k} X^n$ the set of all concatenations of
	length at most $k$. In particular, $I^n$ is the set of all words of length
	precisely $n$ and $I^{\leq k}$ the set of all words of length at most $k$.
	\item Let $X = \{ X_s \}_{s \in S}$ and $Y = \{ Y_s \}_{s \in S}$ be two
	families of sets. We define a new family $X \tensor_H Y$ by
	$(X \tensor_H Y)_s = \{ x y \,|\, x \in X_s, y \in Y_{\delta(s, x)} \}$.
	Note that this depends on the transitions of the machine $H$.
	\item Let $X$ be a family and $Y$ a set of words; then the usual
	concatenation is defined pointwise: $(X \cdot Y)_s = X_s \cdot Y$.
	\item Let $X$ be a family of sets; then the union
	$\bigcup X = \bigcup_{s \in S} X_s$ forms a set of words.
\end{itemize}

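The operations above can be sketched in Python, representing a set of words as
a set of tuples and a family as a dict from states to such sets (the machine
$H$ and all names are hypothetical):

```python
# Sketch: concatenation of sets of words and the tensor operation on families.

def concat(X, Y):
    """Plain concatenation X . Y of two sets of words."""
    return {x + y for x in X for y in Y}

def tensor(X, Y, delta):
    """(X (x)_H Y)_s = { x y | x in X_s, y in Y_{delta(s, x)} }."""
    result = {}
    for s, Xs in X.items():
        result[s] = set()
        for x in Xs:
            t = s
            for i in x:                      # run x from s through H
                t = delta[(t, i)]
            result[s] |= {x + y for y in Y[t]}
    return result

# Toy machine: two states flipping on input 'a'.
delta = {(0, 'a'): 1, (1, 'a'): 0}
X = {0: {('a',)}, 1: {()}}
Y = {0: {('b',)}, 1: {('c',)}}
```

For instance, $(X \otimes Y)_0 = \{ac\}$ here, because reading $a$ from state
$0$ leads to state $1$, whose suffix set is $\{c\}$.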
Let $H$ be a fixed machine and let $\tensor$ denote $\tensor_H$. We define some
useful sets (which depend on $H$):
\begin{itemize}
	\item The sets of prefixes $P_s = \{ x \,|\, x \text{ a shortest word such
	that } \delta(s, x) = t, t \in S \}$. Note that $P_{s_0}$ is particularly
	interesting. These sets can be constructed with any shortest path
	algorithm. Note that $P \cdot I$ covers all transitions of $H$.
	\item The sets $W_s = \{ x \,|\, x \text{ separates } s \text{ and } t,
	t \in S \}$. These can be constructed using Hopcroft's algorithm, or
	Gill's algorithm if one wants minimal separating sequences.
	\item If $x$ is an adaptive distinguishing sequence in the sense of Lee and
	Yannakakis, and $x_s$ denotes the associated UIO for state $s$, we define
	$Z_s = \{ x_s \}$.
\end{itemize}

We obtain different methods (note that all test suites are expressed as families
of sets $X$; the actual test suite is $X_{s_0}$):
\begin{itemize}
	\item The original Chow and Vasilevskii test suite (the W-method) is given by
	$$ P \cdot I^{\leq k+1} \cdot \bigcup W $$
	which distinguishes $H$ from any non-equivalent machine with at
	most $|S| + k$ states.
	\item The Wp-method as described by Fujiwara:
	$$ (P \cdot I^{\leq k} \cdot \bigcup W) \cup (P \cdot I^{\leq k+1} \tensor W) $$
	which is a smaller test suite than the W-method, but just as strong. Note
	that the original description by Fujiwara is more detailed in order to
	reduce redundancy.
	\item The method proposed by Lee and Yannakakis:
	$$ P \cdot I^{\leq k+1} \tensor Z $$
	which is as big as the Wp-method in the worst case (if $Z$ even exists) and
	just as strong.
\end{itemize}

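Concretely, the W-method suite can be assembled as follows (a Python sketch on
a hypothetical one-input instance; $P$ and $W$ are given as plain sets of
words):

```python
# Sketch: the test suite P . I^{<= k+1} . W as a set of input words.
from itertools import product

def words_upto(inputs, k):
    """I^{<= k}: all words over `inputs` of length at most k."""
    return {w for n in range(k + 1) for w in product(inputs, repeat=n)}

def w_method(P, inputs, W, k):
    """Assemble the W-method test suite for k extra states."""
    middle = words_upto(inputs, k + 1)
    return {p + m + w for p in P for m in middle for w in W}

# Toy instance over the single input 'a', with k = 0.
suite = w_method({(), ('a',)}, ['a'], {('a',)}, 0)
```

Even in this toy instance the middle part $I^{\leq k+1}$ is what grows
exponentially in $k$; here, with one input and $k=0$, the suite collapses to
three words.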
An important observation is that the sizes of $P$, $W$ and $Z$ are polynomial
in the number of states of $H$, but that the middle part $I^{\leq k+1}$ is
exponential in $k$. If the number of states of the $SUL$ is known, one can
perform a (big) exhaustive test. In practice this number is not known, or only
a very large bound is known. To mitigate this we can exhaust $I^{\leq 1}$ and
then resort to randomly sampling from $I^\ast$. It is in this sampling phase
that we want $W$ and $Z$ to contain as few elements as possible, as every
element contributes to the exponential blowup.

Note that $W$ can also be constructed in different ways. For example, taking
$W_s = \{ u_s \}$, where $u_s$ is a UIO for state $s$ (assuming they all
exist), gives valid variants of the first two methods. Also, if an adaptive
distinguishing sequence exists, all states have UIOs and we can use the first
two methods. The third method, however, is slightly smaller, as we do not need
$\bigcup W$ in this case: the UIOs constructed from an adaptive distinguishing
sequence share (non-empty) prefixes.

% fix from http://tex.stackexchange.com/questions/103735/list-of-todos-todonotes-is-empty-with-llncs
\setcounter{tocdepth}{1}
\listoftodos

\end{document}
\subsubsection{Augmented DS-method}
\label{sec:randomPrefix}

In order to reduce the number of tests, Chow~\cite{Ch78} and
Vasilevskii~\cite{vasilevskii1973failure} pioneered the so-called W-method. In
their framework a test query consists of a prefix $p$ bringing the SUL to a
specific state, a (random) middle part $m$ and a suffix $s$ assuring that the
SUL is in the appropriate state. This results in a test suite of the form
$P I^{\leq k} W$, where $P$ is a set of (shortest) access sequences,
$I^{\leq k}$ the set of all sequences of length at most $k$, and $W$ is a
characterization set. Classically, this characterization set is constructed by
taking the set of all (pairwise) separating sequences. For $k=1$ this test
suite is complete in the sense that if the SUL passes all tests, then either
the SUL is equivalent to the specification or the SUL has strictly more states
than the specification. By increasing $k$ we can check for additional states.

We tried using the W-method as implemented by LearnLib to find counterexamples.
The generated test suite, however, was still too big in our learning context.
Fujiwara et al.~\cite{FBKAG91} observed that it is possible to let the set $W$
depend on the state the SUL is supposed to be in. This allows us to take only
the subset of $W$ which is relevant for a specific state, yielding a slightly
smaller test suite which is just as powerful as the full one. This method is
known as the Wp-method. More importantly, this observation allows for
generalizations where we can carefully pick the suffixes.

In the presence of an (adaptive) distinguishing sequence one can take $W$ to be
a single suffix, greatly reducing the test suite. Lee and Yannakakis~\cite{LYa94}
describe an algorithm (which we will refer to as the LY algorithm) to
efficiently construct this sequence, if it exists. In our case, unfortunately,
most hypotheses did not admit an adaptive distinguishing sequence. In these
cases the incomplete result of the LY algorithm still contained a lot of
information, which we augmented with pairwise separating sequences.

\begin{figure}
\centering \includegraphics[width=\textwidth]{hyp_20_partial_ds.pdf}
\caption{A small part of an incomplete distinguishing sequence as produced by
the LY algorithm. Leaves contain a set of possible initial states, inner nodes
have input sequences and edges correspond to different output symbols (of
which we only drew some), where Q stands for quiescence.}
\label{fig:distinguishing-sequence}
\end{figure}

As an example we show an incomplete adaptive distinguishing sequence for one of
the hypotheses in Figure~\ref{fig:distinguishing-sequence}. When we apply the
input sequence I46 I6.0 I10 I19 I31.0 I37.3 I9.2 and observe the outputs O9
O3.3 Q ... O28.0, we know for sure that the SUL was in state 788.
Unfortunately, not all paths lead to a singleton set. When, for instance, we
apply the sequence I46 I6.0 I10 and observe the outputs O9 O3.14 Q, we only
know that the SUL was in one of the states 18, 133, 1287 or 1295. In these
cases we have to perform more experiments, and we resort to pairwise separating
sequences.

We note that this augmented DS-method is, in the worst case, no better than the
classical Wp-method. In our case, however, it greatly reduced the size of the
test suites.

Once we have our set of suffixes, which we now call $Z$, our test algorithm
works as follows. The algorithm first exhausts the set $P I^{\leq 1} Z$. If
this does not provide a counterexample, we randomly pick test queries from
$P I^2 I^\ast Z$, where the algorithm samples uniformly from $P$, $I^2$ and $Z$
(if $Z$ contains more than one sequence for the supposed state), and with a
geometric distribution on $I^\ast$.
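The sampling step can be sketched as follows (a Python sketch with hypothetical
names; `state_of` is assumed to compute the state the hypothesis reaches on a
given word, and `p_stop` parameterizes the geometric length distribution):

```python
# Sketch: sampling one test query from P I^2 I^* Z.
import random

def random_test_query(P, inputs, Z, state_of, p_stop=0.5, rng=random):
    """Uniform choices from P, I^2 and Z; geometric number of extra inputs."""
    prefix = rng.choice(sorted(P))
    middle = [rng.choice(inputs), rng.choice(inputs)]   # the I^2 part
    while rng.random() > p_stop:                        # geometric I^* part
        middle.append(rng.choice(inputs))
    word = prefix + tuple(middle)
    state = state_of(word)               # state the hypothesis predicts
    suffix = rng.choice(sorted(Z[state]))
    return word + suffix

# Toy setup: one state, one input; the suffix family maps state 0 to {('a',)}.
q = random_test_query({()}, ['a'], {0: {('a',)}}, lambda w: 0)
```

Each query thus consists of an access sequence, at least two random inputs, and
a suffix chosen for the state the hypothesis predicts after the random walk.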