\documentclass[envcountsame]{llncs}
\usepackage{amsmath}
\usepackage[backgroundcolor=white]{todonotes}

\newcommand{\Def}[1]{\emph{#1}}
\newcommand{\bigO}{\mathcal{O}}

\begin{document}
\maketitle

\section{Introduction}

Recently, automata learning has gained popularity. Learning algorithms are
applied to real world systems (systems under learning, or SUL for short), and
this exposes some practical problems.

In the classical active learning algorithms such as $L^\ast$ one supposes a
teacher to which the algorithm can pose \Def{membership queries} and
\Def{equivalence queries}. In the former case the algorithm asks the teacher for
the output (sequence) for a given input sequence. In the latter case the
algorithm provides the teacher with a hypothesis, and the teacher answers with
either an input sequence on which the hypothesis behaves differently than the
SUL, or affirmatively in case the machines are behaviorally equivalent.

In real world applications we have to implement the teacher ourselves, despite
the fact that we do not know all the details of the SUL. The membership queries
are easily implemented by resetting the machine and applying the input. The
equivalence queries, however, are often impossible to implement. Instead, we
have to resort to some form of random testing. Doing random testing naively is
of course hopeless, as the state space is often too big. Luckily, we have a
hypothesis at hand, which we can use for model based testing.

One standard framework for model based testing was pioneered by Chow and
Vasilevskii. Briefly, the framework supposes prefix sequences which allow us to
go from the initial state to a given state $s$ (in the model) or even a given
transition $t \to s$, and suffix sequences which test whether the machine
actually is in state $s$. If we have the right suffixes and test every
transition of the model, we can ensure that the SUL is either equivalent to the
model or has strictly more states. Such a test suite can be constructed with a
size polynomial in the number of states of the model. This is in contrast to
exhaustive testing or (naive) random testing, where there are exponentially many
sequences.

For the prefixes we can use any single source shortest path algorithm. In fact,
if we restrict ourselves to the above framework, this is the best we can do.
This gives $n$ sequences of length at most $n-1$ (in fact, the total sum of the
lengths is at most $\frac{1}{2}n(n-1)$).

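The shortest path computation for the prefixes can be sketched as follows (a
minimal Python sketch on a hypothetical toy machine, not part of the paper's
development; breadth-first search suffices here since all transitions have unit
length):

```python
# Sketch: computing a shortest access sequence for every state of a
# hypothetical Mealy machine. Transitions are a dict (state, input) -> state.
from collections import deque

def access_sequences(inputs, delta, start=0):
    """Return a shortest input word reaching each state from `start`."""
    prefixes = {start: ()}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for i in inputs:
            t = delta[(s, i)]
            if t not in prefixes:            # first visit is a shortest one
                prefixes[t] = prefixes[s] + (i,)
                queue.append(t)
    return prefixes

# Toy three-state machine over inputs 'a' and 'b'.
delta = {(0, 'a'): 1, (0, 'b'): 0,
         (1, 'a'): 2, (1, 'b'): 0,
         (2, 'a'): 2, (2, 'b'): 1}
P = access_sequences(['a', 'b'], delta)
```

On this toy machine the computed prefixes are the empty word for state 0,
`('a',)` for state 1 and `('a', 'a')` for state 2.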
For the suffixes, we can use the standard Hopcroft algorithm to generate
separating sequences. If we want to test a given state $s$, we take the set of
suffixes (we allow the set of suffixes to depend on the state we want to test)
to be a set of separating sequences for $s$ and every other state $t$. This set
has at most $n-1$ elements, each of length at most $n$; again the total sum of
the lengths is $\frac{1}{2}n(n-1)$. A natural question arises: can we do better?

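For a single pair of states, a shortest separating sequence can also be found
directly by a breadth-first search over pairs of states. The following Python
sketch (a hypothetical toy machine, and not Hopcroft's algorithm itself)
illustrates the idea:

```python
# Sketch: shortest word x with lambda(s, x) != lambda(t, x), by BFS on pairs.
from collections import deque

def separating_sequence(s, t, inputs, delta, lam):
    """Return a shortest separating sequence for s and t, or None."""
    queue = deque([(s, t, ())])
    seen = {(s, t)}
    while queue:
        u, v, word = queue.popleft()
        for i in inputs:
            if lam[(u, i)] != lam[(v, i)]:
                return word + (i,)           # outputs differ: word i separates
            nu, nv = delta[(u, i)], delta[(v, i)]
            if (nu, nv) not in seen and nu != nv:
                seen.add((nu, nv))
                queue.append((nu, nv, word + (i,)))
    return None                              # s and t are equivalent

# Toy machine: a 3-cycle on input 'a'; only state 2 outputs 'd'.
delta = {(0, 'a'): 1, (1, 'a'): 2, (2, 'a'): 0}
lam = {(0, 'a'): 'o', (1, 'a'): 'o', (2, 'a'): 'd'}
```

Here states 1 and 2 are separated by the single input `('a',)`, while states 0
and 1 need the longer word `('a', 'a')`.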
In the presence of a distinguishing sequence, Lee and Yannakakis prove that one
can take a set of suffixes consisting of just one element of length at most
$\frac{1}{2}n(n-1)$. This does not provide an improvement in the worst case.
Even worse, such a sequence might not exist.

In this paper we propose a testing algorithm which combines the two methods
described above. The distinguishing sequence might not exist, but the tree
constructed during the Lee and Yannakakis algorithm provides a lot of
information, which we can complement with the classical Hopcroft approach.
Although this is not an improvement in the worst case, this hybrid method
enabled us to learn an industrial grade machine which was infeasible to learn
with the standard methods provided by LearnLib.


\section{Preliminaries}
\section{Preliminaries}

We restrict our attention to \Def{Mealy machines}. Let $I$ (resp.\ $O$) denote
the finite set of inputs (resp.\ outputs). A Mealy machine $M$ consists of a
set of states $S$ with an initial state $s_0$, together with a transition
function $\delta : I \times S \to S$ and an output function
$\lambda : I \times S \to O$. Note that we assume machines to be deterministic
and total. We also assume that our system under learning is a Mealy machine.
Both functions $\delta$ and $\lambda$ are extended to words in $I^\ast$.

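The extension of $\delta$ and $\lambda$ to words can be sketched as follows (a
minimal Python sketch with a hypothetical toy machine; the dict-based
representation is ours, not the paper's):

```python
# Sketch: a Mealy machine as two dicts, with delta and lambda extended
# from single inputs to input words.

def run(state, word, delta, lam):
    """Return (delta(state, word), lambda(state, word)) for an input word."""
    outputs = []
    for i in word:
        outputs.append(lam[(state, i)])  # output of the transition taken
        state = delta[(state, i)]        # follow the transition
    return state, tuple(outputs)

# Toy two-state machine over the single input 'a'.
delta = {(0, 'a'): 1, (1, 'a'): 0}
lam = {(0, 'a'): 'p', (1, 'a'): 'q'}
final, out = run(0, ('a', 'a', 'a'), delta, lam)
```

Reading `('a', 'a', 'a')` from state 0 visits states 1, 0 and 1 and produces
the output word `('p', 'q', 'p')`.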
We are in the context of learning, so we will generally denote the hypothesis by
$H$ and the system under learning by $SUL$. Note that we may assume $H$ to be
minimal and reachable.

We assume that the alphabets $I$ and $O$ are fixed in these notes.

\begin{definition}
A \Def{set of words $X$ (over $I$)} is a subset $X \subset I^\ast$.

Given a set of states $S$, a \Def{family of sets $X$ (over $I$)} is a
collection $X = \{X_s\}_{s \in S}$ where each $X_s$ is a set of words.
\end{definition}

All words we consider are over $I$, so we will refer to them simply as words.
The idea of a family of sets was also introduced by Fujiwara. Families are used
to collect sequences which are relevant for a certain state. We define some
operations on sets and families:
\newcommand{\tensor}{\otimes}
\begin{itemize}
	\item Let $X$ and $Y$ be two sets of words over $I$; then $X \cdot Y$ is the
	set of all concatenations: $X \cdot Y = \{ x y \,|\, x \in X, y \in Y \}$.
	\item Let $X^n = X \cdots X$ denote the $n$-fold concatenation and
	$X^{\leq k} = \bigcup_{n \leq k} X^n$ the set of all concatenations of
	length at most $k$. In particular, $I^n$ is the set of all words of length
	precisely $n$ and $I^{\leq k}$ the set of all words of length at most $k$.
	\item Let $X = \{ X_s \}_{s \in S}$ and $Y = \{ Y_s \}_{s \in S}$ be two
	families of sets. We define a new family $X \tensor_H Y$ by
	$(X \tensor_H Y)_s = \{ x y \,|\, x \in X_s, y \in Y_{\delta(s, x)} \}$.
	Note that this depends on the transitions of the machine $H$.
	\item Let $X$ be a family and $Y$ a set of words; then the usual
	concatenation is defined pointwise: $(X \cdot Y)_s = X_s \cdot Y$.
	\item Let $X$ be a family of sets; then the union
	$\bigcup X = \bigcup_{s \in S} X_s$ forms a set of words.
\end{itemize}

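The operations above can be sketched in Python, representing a set of words as
a set of tuples and a family as a dict from states to such sets (the machine
$H$ and all names are hypothetical):

```python
# Sketch: concatenation of sets of words and the tensor operation on families.

def concat(X, Y):
    """Plain concatenation X . Y of two sets of words."""
    return {x + y for x in X for y in Y}

def tensor(X, Y, delta):
    """(X (x)_H Y)_s = { x y | x in X_s, y in Y_{delta(s, x)} }."""
    result = {}
    for s, Xs in X.items():
        result[s] = set()
        for x in Xs:
            t = s
            for i in x:                      # run x from s through H
                t = delta[(t, i)]
            result[s] |= {x + y for y in Y[t]}
    return result

# Toy machine: two states flipping on input 'a'.
delta = {(0, 'a'): 1, (1, 'a'): 0}
X = {0: {('a',)}, 1: {()}}
Y = {0: {('b',)}, 1: {('c',)}}
```

For instance, $(X \otimes Y)_0 = \{ac\}$ here, because reading $a$ from state
$0$ leads to state $1$, whose suffix set is $\{c\}$.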
Let $H$ be a fixed machine and let $\tensor$ denote $\tensor_H$. We define some
useful sets (which depend on $H$):
\begin{itemize}
	\item The sets of prefixes $P_s = \{ x \,|\, x \text{ a shortest word such
	that } \delta(s, x) = t, t \in S \}$. Note that $P_{s_0}$ is particularly
	interesting. These sets can be constructed with any shortest path
	algorithm. Note that $P \cdot I$ covers all transitions of $H$.
	\item The sets $W_s = \{ x \,|\, x \text{ separates } s \text{ and } t,
	t \in S \}$. These can be constructed using Hopcroft's algorithm, or
	Gill's algorithm if one wants minimal separating sequences.
	\item If $x$ is an adaptive distinguishing sequence in the sense of Lee and
	Yannakakis, and $x_s$ denotes the associated UIO for state $s$, we define
	$Z_s = \{ x_s \}$.
\end{itemize}

We obtain different methods (note that all test suites are expressed as families
of sets $X$; the actual test suite is $X_{s_0}$):
\begin{itemize}
	\item The original Chow and Vasilevskii test suite (the W-method) is given by
	$$ P \cdot I^{\leq k+1} \cdot \bigcup W $$
	which distinguishes $H$ from any non-equivalent machine with at
	most $|S| + k$ states.
	\item The Wp-method as described by Fujiwara:
	$$ (P \cdot I^{\leq k} \cdot \bigcup W) \cup (P \cdot I^{\leq k+1} \tensor W) $$
	which is a smaller test suite than the W-method, but just as strong. Note
	that the original description by Fujiwara is more detailed in order to
	reduce redundancy.
	\item The method proposed by Lee and Yannakakis:
	$$ P \cdot I^{\leq k+1} \tensor Z $$
	which is as big as the Wp-method in the worst case (if $Z$ even exists) and
	just as strong.
\end{itemize}

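Concretely, the W-method suite can be assembled as follows (a Python sketch on
a hypothetical one-input instance; $P$ and $W$ are given as plain sets of
words):

```python
# Sketch: the test suite P . I^{<= k+1} . W as a set of input words.
from itertools import product

def words_upto(inputs, k):
    """I^{<= k}: all words over `inputs` of length at most k."""
    return {w for n in range(k + 1) for w in product(inputs, repeat=n)}

def w_method(P, inputs, W, k):
    """Assemble the W-method test suite for k extra states."""
    middle = words_upto(inputs, k + 1)
    return {p + m + w for p in P for m in middle for w in W}

# Toy instance over the single input 'a', with k = 0.
suite = w_method({(), ('a',)}, ['a'], {('a',)}, 0)
```

Even in this toy instance the middle part $I^{\leq k+1}$ is what grows
exponentially in $k$; here, with one input and $k=0$, the suite collapses to
three words.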
An important observation is that the sizes of $P$, $W$ and $Z$ are polynomial
in the number of states of $H$, but that the middle part $I^{\leq k+1}$ is
exponential in $k$. If the number of states of the $SUL$ is known, one can
perform a (big) exhaustive test. In practice this number is not known, or only
a very large bound is known. To mitigate this we can exhaust $I^{\leq 1}$ and
then resort to randomly sampling from $I^\ast$. It is in this sampling phase
that we want $W$ and $Z$ to contain as few elements as possible, as every
element contributes to the exponential blowup.

Note that $W$ can also be constructed in different ways. For example, taking
$W_s = \{ u_s \}$, where $u_s$ is a UIO for state $s$ (assuming they all
exist), gives valid variants of the first two methods. Also, if an adaptive
distinguishing sequence exists, all states have UIOs and we can use the first
two methods. The third method, however, is slightly smaller, as we do not need
$\bigcup W$ in this case: the UIOs constructed from an adaptive distinguishing
sequence share (non-empty) prefixes.

% fix from http://tex.stackexchange.com/questions/103735/list-of-todos-todonotes-is-empty-with-llncs
\setcounter{tocdepth}{1}
\listoftodos

\end{document}
\subsubsection{Augmented DS-method}
\label{sec:randomPrefix}

In order to reduce the number of tests, Chow~\cite{Ch78} and
Vasilevskii~\cite{vasilevskii1973failure} pioneered the so-called W-method. In
their framework a test query consists of a prefix $p$ bringing the SUL to a
specific state, a (random) middle part $m$ and a suffix $s$ assuring that the
SUL is in the appropriate state. This results in a test suite of the form
$P I^{\leq k} W$, where $P$ is a set of (shortest) access sequences,
$I^{\leq k}$ the set of all sequences of length at most $k$, and $W$ is a
characterization set. Classically, this characterization set is constructed by
taking the set of all (pairwise) separating sequences. For $k=1$ this test
suite is complete in the sense that if the SUL passes all tests, then either
the SUL is equivalent to the specification or the SUL has strictly more states
than the specification. By increasing $k$ we can check for additional states.

We tried using the W-method as implemented by LearnLib to find counterexamples.
The generated test suite, however, was still too big in our learning context.
Fujiwara et al.~\cite{FBKAG91} observed that it is possible to let the set $W$
depend on the state the SUL is supposed to be in. This allows us to take only
the subset of $W$ which is relevant for a specific state, yielding a slightly
smaller test suite which is just as powerful as the full one. This method is
known as the Wp-method. More importantly, this observation allows for
generalizations where we can carefully pick the suffixes.

In the presence of an (adaptive) distinguishing sequence one can take $W$ to be
a single suffix, greatly reducing the test suite. Lee and Yannakakis~\cite{LYa94}
describe an algorithm (which we will refer to as the LY algorithm) to
efficiently construct this sequence, if it exists. In our case, unfortunately,
most hypotheses did not admit an adaptive distinguishing sequence. In these
cases the incomplete result of the LY algorithm still contained a lot of
information, which we augmented with pairwise separating sequences.

\begin{figure}
\centering \includegraphics[width=\textwidth]{hyp_20_partial_ds.pdf}
\caption{A small part of an incomplete distinguishing sequence as produced by
the LY algorithm. Leaves contain a set of possible initial states, inner nodes
have input sequences and edges correspond to different output symbols (of
which we only drew some), where Q stands for quiescence.}
\label{fig:distinguishing-sequence}
\end{figure}

As an example we show an incomplete adaptive distinguishing sequence for one of
the hypotheses in Figure~\ref{fig:distinguishing-sequence}. When we apply the
input sequence I46 I6.0 I10 I19 I31.0 I37.3 I9.2 and observe the outputs O9
O3.3 Q ... O28.0, we know for sure that the SUL was in state 788.
Unfortunately, not all paths lead to a singleton set. When, for instance, we
apply the sequence I46 I6.0 I10 and observe the outputs O9 O3.14 Q, we only
know that the SUL was in one of the states 18, 133, 1287 or 1295. In these
cases we have to perform more experiments, and we resort to pairwise separating
sequences.

We note that this augmented DS-method is, in the worst case, no better than the
classical Wp-method. In our case, however, it greatly reduced the size of the
test suites.

Once we have our set of suffixes, which we now call $Z$, our test algorithm
works as follows. The algorithm first exhausts the set $P I^{\leq 1} Z$. If
this does not provide a counterexample, we randomly pick test queries from
$P I^2 I^\ast Z$, where the algorithm samples uniformly from $P$, $I^2$ and $Z$
(if $Z$ contains more than one sequence for the supposed state), and with a
geometric distribution on $I^\ast$.
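The sampling step can be sketched as follows (a Python sketch with hypothetical
names; `state_of` is assumed to compute the state the hypothesis reaches on a
given word, and `p_stop` parameterizes the geometric length distribution):

```python
# Sketch: sampling one test query from P I^2 I^* Z.
import random

def random_test_query(P, inputs, Z, state_of, p_stop=0.5, rng=random):
    """Uniform choices from P, I^2 and Z; geometric number of extra inputs."""
    prefix = rng.choice(sorted(P))
    middle = [rng.choice(inputs), rng.choice(inputs)]   # the I^2 part
    while rng.random() > p_stop:                        # geometric I^* part
        middle.append(rng.choice(inputs))
    word = prefix + tuple(middle)
    state = state_of(word)               # state the hypothesis predicts
    suffix = rng.choice(sorted(Z[state]))
    return word + suffix

# Toy setup: one state, one input; the suffix family maps state 0 to {('a',)}.
q = random_test_query({()}, ['a'], {0: {('a',)}}, lambda w: 0)
```

Each query thus consists of an access sequence, at least two random inputs, and
a suffix chosen for the state the hypothesis predicts after the random walk.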