phd-thesis/content/test-methods.tex

\project thesis
\startcomponent test-methods

\startchapter
  [title={FSM-based Test Methods},
   reference=chap:test-methods]

\startsection
  [title={Preliminaries}]

In this chapter we will discuss some of the theory of test generation methods.
From this theoretical discussion we derive a new algorithm: the \emph{hybrid ads methods}, which is applied on a case study in \in{Chapter}[chap:applying-automata-learning].
A key aspect of such methods is the size of the obtained test suite.
On one hand we want to cover as much as the specification as possible.
On the other hand: testing takes time, so we want to minimise the size of a test suite.

Challenges such as scalability remain an issue in the field.
As we see in the case study of the next chapter, real world software components can consist of thousands of states.
Such amounts of states go beyond the comparative studies of, for example, \citet[DBLP:journals/infsof/DorofeevaEMCY10, DBLP:journals/infsof/EndoS13].


\startsubsection
  [title={Words and Mealy machines}]

We will focus on Mealy machines, as those capture many protocol specifications and reactive systems.
Note that we restrict ourselves to deterministic and complete machines.

We fix finite alphabets $I$ and $O$ of inputs respectively outputs.
We use the usual notation for operations on words:
$uv$ for the concatenation of two words $u, v \in I^{\ast}$ and $|u|$ for the length of $u$.

\startdefinition
A \emph{Mealy machine} $M$ consists of a finite set of \emph{states} $S$, an \emph{initial state} $s_0 \in S$ and two functions:
\startitemize
\item a \emph{transition function} $\delta \colon S \times I \to S$, and
\item an \emph{output function} $\lambda \colon S \times I \to O$.
\stopitemize

Both functions are extended to words as $\delta \colon S \times I^{\ast} \to S$ and $\lambda \colon S \times I^{\ast} \to O^{\ast}$.
By abuse of notation we will often write $s \in M$ instead of $s \in S$ and leave out $S$ altogether.
For a second Mealy machine $M'$ its members are denoted $S', s'_0, \delta'$ and $\lambda'$ by convention.

The \emph{behaviour} of a state $s$ is given by the output function $\lambda(s, -) \colon I^{\ast} \to O^{\ast}$.
Two states $s$ and $t$ are \emph{equivalent} if they have equal behaviours, written $s \sim t$, and two Mealy machines are equivalent if their initial states are equivalent.
\stopdefinition

{\todo{Voorbeeld gebruiken in komende sectie: uios: $0: aa$, $1: a$, $2$ geen uio, $3: v$ en $4: ac$. Een char. set is $\{ aa, c, ac \}$. Geen ADS (want $2$ geen uio).}}
An example Mealy machine is given in \in{Figure}[fig:running-example].

\startplacefigure
  [title={An example specification with input $I=\{a,b,c\}$ and output $O=\{0,1\}$.},
   list={An example specification.},
   reference=fig:running-example]
\hbox{
\starttikzpicture[node distance=0.9cm and 3cm,bend angle=20]
	\node[state] (0) {$s_0$};
	\node[state] (1) [above left = of 0] {$s_1$};
	\node[state] (2) [below left = of 0] {$s_2$};
	\node[state] (3) [above right = of 0] {$s_3$};
	\node[state] (4) [below right = of 0] {$s_4$};
	\node        (5) [above = of 0] {};
	\path[->]
	(5) edge (0)
	(0) edge [bend left=]    node [below] {${a}/1$} (1)
	(0) edge [bend right]    node [below] {${b}/0$} (4)
	(0) edge [loop below]    node [below] {${c}/0$} (0)
	(1) edge                 node [left ] {${a}/0$} (2)
	(1) edge [bend left]     node [above] {${b}/0$, ${c}/0$} (0)
	(2) edge [bend right]    node [below] {${b}/0$, ${c}/0$} (0)
	(2) edge [loop left]     node [left ] {${a}/1$} (1)
	(3) edge [bend left=30]  node [right] {${a}/1$} (4)
	(3) edge [bend right]    node [above] {${b}/0$} (0)
	(3) edge [loop right]    node [right] {${c}/1$} (3)
	(4) edge [bend left]     node [right] {${a}/1$} (3)
	(4) edge [loop right]    node [right] {${b}/1$} (4)
	(4) edge [bend right]    node [above] {${c}/0$} (0);
\stoptikzpicture
}
\stopplacefigure


\stopsubsection
\startsubsection
  [title={Testing}]

In conformance testing we have a specification modelled as a Mealy machine and an implementation (the system under test, or SUT) which we assume to behave as a Mealy machine \cite[DBLP:journals/tc/LeeY94].
Tests are generated from the specification and applied to the implementation. We assume that we can reset the implementation before every test.
If the output is different than the specified output, then we know the implementation is flawed.
The goals is to test as little as possible, while covering as much as possible.
\todo{Aannames in een itemize zetten en in rel/fut work bespreken.}

\startdefinition[reference=test-suite]
A \emph{test suite} is a finite subset $T \subseteq I^{\ast}$.
A test $t \in T$ is called \emph{maximal} if it is not a proper prefix of another $s \in T$.
We denote the set of maximal tests of $T$ with $max(T)$.
\todo{In de voorbeelden geef ik altijd $max(X)$,
	maar in de bewijzen is juist $Pref(X)$ handig.
	Moet dit consistent houden...}
\stopdefinition

The maximal tests are the only tests in $T$ we actually have to apply to our SUT as we can record the intermediate outputs.
Also note that we do not have to consider outputs in the test suite, as those follow from the deterministic specification.
We define the size of a test suite as usual \cite[DBLP:journals/infsof/DorofeevaEMCY10, DBLP:conf/hase/Petrenko0GHO14].
The size of a test suite is measured as the sum of the lengths of all its maximal tests plus one reset per test.

\startdefinition[reference=test-suite-size]
The \emph{size} of a test suite $T$ is defined to be
$||T|| = \sum\limits_{t \in max(T)} (|t| + 1)$.
\stopdefinition


\stopsubsection
\startsubsection
  [title={Completeness of test suites}]
\startexample[reference=incompleteness]
{\bf No test suite is complete.}
Consider the specification in \in{Figure}{a}[fig:incompleteness-example].
This machine will always produce zeroes.
For any test suite we can make a faulty implementation which passes the test suite.
Such an implementation might look like \in{Figure}{b}[fig:incompleteness-example] with $n$ big enough.
This justifies the following definition.

\startplacefigure
  [title={A basic example showing that finite test suites are incomplete. The implementation on the right will pass any test suite if we choose $n$ big enough.},
   list={A basic example showing that finite test suites are incomplete.},
   reference=fig:incompleteness-example]
\startcombination[2*1]
{\hbox{
	\starttikzpicture
	\node[state] (0) {$s_0$};
	\path[->]
	(0) edge [loop] node [below] {${a}/0$} (0);
	\stoptikzpicture
}} {(a)}
{\hbox{
	\starttikzpicture
	\node[state] (0) {$s'_0$};
	\node[state] (1) [right of=0] {$s'_1$};
	\node        (2) [right of=1] {$\cdots$};
	\node[state] (3) [right of=2] {$s'_n$};
	\path[->]
	(0) edge [bend left=20 ] node [below] {${a}/0$} (1)
	(1) edge [bend left=20 ] node [below] {${a}/0$} (2)
	(2) edge [bend left=20 ] node [below] {${a}/0$} (3)
	(3) edge [loop         ] node [below] {${a}/1$} (3);
	\stoptikzpicture
}} {(b)}
\stopcombination
\stopplacefigure
\stopexample


\startdefinition[reference=completeness]
Let $M$ be a Mealy machine and $T$ be a test suite.
We say that $T$ is \emph{$m$-complete (for $M$)} if for all inequivalent machines $M'$ with at most $m$ states there exists a $t \in T$ such that $\lambda(s_0, t) \neq \lambda'(s'_0, t)$.
\stopdefinition

We are often interested in the case of $m$-completeness, where $m = n + k$ for some $k \in \naturalnumbers$ and $n$ is the number of states in the specification.
Here $k$ will stand for the number of \emph{extra states} we can test.
The issue of an unknown bound is addressed later in the paper.


\stopsubsection
\startsubsection
  [title={Separating Sequences}]

Before we construct test suites, we discuss several types of useful sequences.
All the following notions are standard in the literature, and the corresponding references will be given in \in{Section}[sec:methods], where we discuss the test generation methods using these notions.
We fix a Mealy machine $M$.
For convenience we assume $M$ to be minimal, this implies that distinct states are, in fact, inequivalent.
All definitions can be generalised to non-minimal $M$, by replacing distinct (or other) with inequivalent.

\startitemize[after, before]
\item Given two states $s, t$ of $M$ we say that $w$ is a \defn{separating sequence} if $\lambda(s, w) \neq \lambda(t, w)$.
\item For a single state $s$, a sequence $w$ is a \defn{unique input output sequence (UIO)} if for every other state $t$ we have $\lambda(s, w) \neq \lambda(t, w)$.
\item Finally, a \defn{(preset) distinguishing sequence (DS)} is a single sequence $w$ which separates all states, i.e., for every distinct pair $s, t$ of $M$ we have $\lambda(s, w) \neq \lambda(t, w)$.
\stopitemize

\todo{Misschien in een definition env zetten? Of in descriptions.}
The above list is ordered from weaker to stronger notions, i.e., every distinguishing sequence is an UIO sequence for every state.
Similarly, an UIO for a state $s$ is a separating sequence for $s$ and any other $t$.
Separating sequences always exist for inequivalent states and finding them efficiently is the topic of \in{Chapter}[chap:separating-sequences].
On the other hand, UIOs and DSs do not always exist for a machine.
\todo{For example, ...}.

In order to separate multiple states at once, we might need sets of states.
This brings us to the following notions.

\startitemize[after, before]
\item A set of sequences $W$ is a called a \defn{characterisation set} if it contains a separating sequence for each pair of (distinct) states.
\item A \defn{state identifier} for a state $s$ is a set $W$ which contains a separating sequence for every other state $t$.
\item A set of state identifiers $\{ W_s \}_{s}$ is \defn{harmonised} if a separating sequence $w$ for states $s$ and $t$ exists in both $W_s$ and $W_t$.
This is sometimes called a \defn{separating family}.
\todo{Equivalently: $x \sim_\Fam{X} y$ implies $x \sim y$}
\todo{Preciezer zijn over prefixes...}
\item Following the definitions in \cite[DBLP:journals/tc/LeeY94], a separating family where each set is a singleton is an \defn{adaptive distinguishing sequence} (ads).
An ads is of special interest since they can identify a state using a single word.
\stopitemize

These notions are again related.
We obtain a characterisation set by taking the union of state identifiers for each state.
For every machine we can construct a set of harmonized state identifiers as will be shown in \in{Chapter}[chap:separating-sequences] and hence every machine has a characterisation set.
However, an adaptive distinguishing sequence may not exist.
\todo{Voorbeeld}

Besides sequences which separate states, we also need sequences which brings a machine to specified states.

\startdefinition
An \emph{access sequence for state $s$} is a word $w$ such that $\delta(q_0, w) = s$.
A set $P$ consisting of an access sequence for each state is called a \emph{state cover}.
If $P$ is a state cover, $P \cdot I$ is called a \emph{transition cover}.
\stopdefinition


\stopsubsection
\startsubsection
  [title={Constructions on sets of sequences}]

In order to define a test suite modularly, we introduce notation for combining sets of words.
For sets of words $X$ and $Y$, we define:

\startitemize[after]
\item their concatenation $X \cdot Y = \{ xy \mid x \in X, y \in Y \}$,
\item iterated concatenation $X^0 = \{ \epsilon \}$ and $X^{n+1} = X \cdot X^{n}$,
\item bounded concatenation $X^{\leq n} = \bigcup_{i \leq n} X^i$, and
\item prefix closure $\pref(X) = \{ y \mid y \text{ is a prefix of } x, x \in X \}$.
\stopitemize

On families we define:
\startitemize[after]
\item flattening: $\bigcup \Fam{X} = \{ x \mid x \in X_s, s \in S \}$,
\item union: $\Fam{X} \cup \Fam{Y}$ is defined point-wise:
$(\Fam{X} \cup \Fam{Y})_s = X_s \cup Y_s$, and
\item $\Fam{X}$-equivalence:
$x \sim_\Fam{X} y$ if $\lambda(x,w) = \lambda(y,w)$ for all $w \in X_x \cap X_y$.
\todo{Hangt af van $M$, maar is nog niet gefixed}
\stopitemize

Given a specification $M$ (with states $S$) we define two operations.
We omit $M$ from the notation as the specification is always clear from the context.
\startitemize[after]
\item concatenation:
$X \odot \Fam{Y} = \{ xy \mid x \in X, y \in Y_{\delta(s_0, x)} \}$ and
\item refinement: $\Fam{X} ; \Fam{Y}$ defined by
\startformula
(\Fam{X} ; \Fam{Y})_s = X_s \cup \bigcup_{y \sim_\Fam{X} x} D(\Fam{Y}, x, y),
\stopformula
where $D(\Fam{Y}, x, y) = Pref\{w \mid \lambda(x,w) \neq \lambda(y,w), w \in Y_x \cap Y_y\}$.
\stopitemize

The latter construction is new and will be used to define a hybrid test generation method in \in{Section}[sec:hybrid].

\startlemma[reference=lemma:refinement-props]
For all families $\Fam{X}$ and $\Fam{Y}$:
\startitemize
\item $\Fam{X} ; \Fam{X} = \Fam{X}$,
\item $\Fam{X} ; \Fam{Y} = \Fam{X}$, whenever $\Fam{X}$ is a separating family, and
\item $\Fam{X} ; \Fam{Y}$ is a separating family whenever $\Fam{Y}$ is a separating family.
\stopitemize
\stoplemma


\stopsubsection
\stopsection
\startsection
  [title={Test generation methods},
   reference=sec:methods]

In this section, we briefly review the classical conformance testing methods:
the W, Wp, UIO, UIOv, HSI, ADS methods.
Our method described is very similar to some of these methods, so we will relate them by describing them uniformly.

There are many more test generation methods.
Literature shows, however, that not all of them are complete.
For example, the method by \citet[DBLP:journals/tosem/Bernhard94] are falsified by \citet[DBLP:journals/tosem/Petrenko97] and the UIO-method from \citet[DBLP:journals/cn/SabnaniD88] is shown to be incomplete by \citet[DBLP:conf/sigcomm/ChanVO89].
For that reason, completeness of the correct methods is shown in the next section.
We fix a state cover $P$ throughout this section and take the transition cover $Q = P \cdot I$.
\todo{In v1 gebruik ik een andere volgorde: UIOv, Wp, HSI, ADS. Ook voorbeeld uitrekenen.}


\startsubsection
  [title={W-method \cite[DBLP:journals/tse/Chow78, Vasilevskii73]},
   reference=sec:w]

Possibly one of the earliest $m$-complete test methods.
\todo{Opmerking over Moore's vermoeden?}

\startdefinition
  [reference=w-method]
Let $W$ be a characterization set, the \defn{W test suite} is
defined as $(P \cup Q) \cdot I^{\leq k} \cdot W$.
\stopdefinition

If we have a separating family $\Fam{W}$, we can obtain a characterization
set by flattening: take $W = \bigcup \Fam{W}$.


\stopsubsection
\startsubsection
  [title={The Wp-method \cite[DBLP:journals/tse/FujiwaraBKAG91]},
   reference=sec:wp]

The W-method was refined by Fujiwara to use smaller sets when identifying states.
In order to do that he defined state-local sets of words.

\startdefinition
  [reference={state-identifier,wp-method}]
Let $\Fam{W}$ be a family of state identifiers, the \defn{Wp test suite} is
defined as $P \cdot I^{\leq k} \cdot \bigcup \Fam{W} \,\cup\, Q \cdot I^{\leq k}
\odot \Fam{W}$.
\stopdefinition

Note that $\bigcup \Fam{W}$ is a characterization set as defined for the W-method.
It is needed for completeness to test states with the whole set $\bigcup \Fam{W}$.
Once states are tested as such, we can use the smaller sets $W_s$ for testing transitions.


\stopsubsection
\startsubsection
  [title={The HSI-method \cite[LuoPB95, YevtushenkoP90]},
   reference=sec:hsi]

The Wp-method in turn was refined by Yevtushenko and Petrenko.
They make use of so called \emph{harmonized} state identifiers (which are called separating families in \cite[DBLP:journals/tc/LeeY94] and in present paper).
By having this global property of the family, less tests need to be executing when testing a state.

\startdefinition
  [reference=hsi-method]
Let $\Fam{H}$ be a separating family, the \defn{HSI test suite} is defined as
$(P \cup Q) \cdot I^{\leq k} \odot \Fam{H}$.
\stopdefinition

Our newly described test method is an instance of the HSI-method.
However, in \cite[LuoPB95, YevtushenkoP90] they describe the HSI-method together with a specific way of generating the separating families.
Namely, the set obtained by a splitting tree with shortest witnesses.
In present paper that is generalized, allowing for our extension to be an instance.


\stopsubsection
\startsubsection
  [title={The ADS-method \cite[DBLP:journals/tc/LeeY94]},
   reference=sec:ads]

As discussed before, when a Mealy machine admits a adaptive distinguishing sequence, only one test has to be performed for identifying a state.
This is exploited in the ADS-method.

\startdefinition
  [reference=ds-method]
Let $\Fam{Z}$ be a separating family where every set is a singleton, the \defn{ADS test suite} is defined as
$(P \cup Q) \cdot I^{\leq k} \odot \Fam{Z}$.
\stopdefinition


\stopsubsection
\startsubsection
  [title={The UIOv-method \cite[DBLP:conf/sigcomm/ChanVO89]},
   reference=sec:uiov]

Some Mealy machines which do not admit an adaptive distinguishing sequence,
may still admit state identifiers which are singletons.
These are exactly UIO sequences and gives rise to the UIOv-method.
In a way this is a generalization of the ADS-method, since the requirement that state identifiers are harmonized is dropped.

\startdefinition
  [reference={uio, uiov-method}]
Let $\Fam{U} = \{ \text{a single UIO for } s \}_{s \in S}$ be a family of UIO sequences, the \defn{UIOv test suite} is defined as
$P \cdot I^{\leq k} \cdot \bigcup \Fam{U} \,\cup\, Q \cdot I^{\leq k} \odot \Fam{U}$.
\stopdefinition

One might think that using a single UIO sequence instead of the set $\bigcup \Fam{U}$ to verify the state is enough.
In fact, this idea was used for the UIO-method which defines the test suite $(P \cup Q) \cdot I^{\leq k} \odot \Fam{U}$.
The example in \in{Figure}[fig:uio-counterexample] shows that this does not necessarily define a 3-complete test suite (this example is due to \citenp[DBLP:conf/sigcomm/ChanVO89]).
Take for example the UIOs $u_0 = aa, u_1 = a, u_2 = ba$ for the states $s_0, s_1, s_2$ respectively.
The test suite then becomes $\{ aaaa, abba, baaa, bba \}$ and the faulty implementation passes this suite.
This happens because the sequence $u_2$ is not an UIO in the implementation, and the state $s'_2$ simulates both UIOs $u_1$ and $u_2$.
Hence we also want to check that a state does not behave as one of the other states, and therefore we use $\bigcup \Fam{U}$.
With the same UIOs as above, the resulting UIOv test suite for the specification in \in{Figure}[fig:uio-counterexample] is $\{aaaa, aba, abba, baaa, bba\}$ of size $23$.

\startplacefigure
  [title={An example where the UIO method is not complete.},
   reference=fig:uio-counterexample]
\startcombination[2*1]
{\hbox{\starttikzpicture
  \node[state,initial,initial text=] (0) {$s_0$};
  \node[state] (1) [above right of=0] {$s_1$};
  \node[state] (2) [below right of=1] {$s_2$};
  \path[->]
  (0) edge [bend left=20 ] node [left ] {${a}/0$} (1)
  (0) edge [bend right=15] node [below] {${b}/1$} (2)
  (1) edge [bend left=20 ] node [left ] {${a}/1$} (0)
  (1) edge [bend right=20] node [right] {${b}/1$} (2)
  (2) edge [bend right=20] node [below] {${a}/0$} (0)
  (2) edge [bend right=15] node [right] {${b}/1$} (1);
\stoptikzpicture}} {Specification}
{\hbox{\starttikzpicture
  \node[state,initial,initial text=] (0) {$s'_0$};
  \node[state] (1) [above right of=0] {$s'_1$};
  \node[state] (2) [below right of=1] {$s'_2$};
  \path[->]
  (0) edge [bend left=20 ] node [left ] {${a}/0$} (1)
  (0) edge [bend right=15] node [below] {${b}/1$} (2)
  (1) edge [bend left=20 ] node [left ] {${a}/1$} (0)
  (1) edge [loop         ] node [above] {${b}/1$} (1)
  (2) edge [bend right=20] node [below] {${a}/0$} (0)
  (2) edge [bend right=15] node [right] {${b}/1$} (1);
\stoptikzpicture}} {Implementation}
\stopcombination
\stopplacefigure


\stopsubsection
\startsubsection
  [title={Hybrid ADS method},
   reference=sec:hybrid]

\todo{Referentie naar volgend hoofdstuk over Oce}
In this section we describe a new test generation method for Mealy machines.
Its completeness will be proven in a later section, together with completeness for all methods defined in this section.
From a high level perspective, the method uses the algorithm by \cite[authoryears][DBLP:journals/tc/LeeY94] to obtain an ads.
If no ads exists, their algorithm still provides some sequences which separates some inequivalent states.
Our extension is to refine the set of sequences by using pairwise separating sequences.
Hence, this method is a hybrid between the ADS-method and HSI-method.

The reason we do this is the fact that the ADS-method generally constructs small test suites as experiments by \cite[authoryears][DBLP:journals/infsof/DorofeevaEMCY10] suggest.
The test suites are small since an ads can identify a state with a single word, instead of a set of words which is generally needed.
Even if the ads does not exist, using the partial result of Lee and Yannakakis' algorithm can reduce the size of test suites.

Instead of manipulating separating families directly, we use a \emph{splitting tree}.
This is a data structure which is used to construct separating families or adaptive distinguishing sequences.

\startdefinition[reference=splitting-tree]
A \defn{splitting tree (for $M$)} is a rooted tree where each node $u$ has
\startitemize
	\item a set of states $l(u) \subseteq M$, and
	\item if $u$ is not a leaf, a sequence $\sigma(u) \in I^{\ast}$.
\stopitemize
We require that if a node $u$ has children $C(u)$ then
\startitemize
	\item the sets of states of the children of $u$ partition $l(u)$, i.e., the set
	$P(u) = \{l(v) \,|\, v \in C(u)\}$ is a \todo{non-trivial} partition of $l(u)$, and
	\item the sequence $\sigma(u)$ witnesses the partition $P(u)$, meaning that
	$\lambda(s, \sigma(u)) = \lambda(t, \sigma(u))$ iff $p = q$ for all $s \in p, t \in q$ for all $p, q \in P(u)$.
\stopitemize
A splitting tree is called \defn{complete} if all inequivalent states belong to	different leaves.
\stopdefinition

Efficient construction of a splitting tree is described in more detail in \in{Chapter}[chap:minimal-separating-sequences].
Briefly, the splitting tree records the execution of a partition refinement algorithm (such as Moore's or Hopcroft's algorithm).
Each non-leaf node encode a \defn{split} together with a witness, which is a separating sequence for its children.
From such a tree we can construct a state identifier for a state by locating the leaf containing that state and collecting all the sequences you read when traversing to the root.

For adaptive distinguishing sequences an additional requirement is put on the splitting tree:
for each non-leaf node $u$, the sequence $\sigma(u)$ defines an injective map $x \mapsto (\delta(x, \sigma(u)), \lambda(x, \sigma(u)))$ on the set $l(u)$.
\cite[authoryears][DBLP:journals/tc/LeeY94] call such splits \defn{valid}.
\in{Figure}[fig:example-splitting-tree] shows both valid and invalid splits.
Validity precisely ensures that after performing a split, the states are still distinguishable.

\startplacefigure
  [title={A complete splitting tree with shortest witnesses for the specification of \in{Figure}[fig:running-example].
          Only the splits $a$ and $aa$ are valid.
          \todo{Commands maken voor deze plaatjes.}},
   list={Complete splitting tree with shortest witnesses for \in{Figure}[fig:running-example].},
   reference=fig:example-splitting-tree]
\hbox{
\starttikzpicture[node distance=1.5cm]
\node (0) [text width=3cm, align=center, ]                 {$s_0, s_1, s_2, s_3, s_4$ $a$};
\node (1) [text width=1cm, align=center, below left of=0 ] {$s_1$};
\node (2) [text width=2.5cm, align=center, below right of=0] {$s_0, s_2, s_3, s_4$ $b$};
\node (3) [text width=2.0cm, align=center, below left of=2 ] {$s_0, s_2, s_3$ $c$};
\node (4) [text width=1cm, align=center, below right of=2] {$s_4$};
\node (5) [text width=1.8cm, align=center, below left of=3 ] {$s_0, s_2$ $aa$};
\node (6) [text width=1cm, align=center, below right of=3] {$s_3$};
\node (7) [text width=1cm, align=center, below left of=5 ] {$s_0$};
\node (8) [text width=1cm, align=center, below right of=5] {$s_2$};
\path[->]
(0) edge (1)
(0) edge (2)
(2) edge (3)
(2) edge (4)
(3) edge (5)
(3) edge (6)
(5) edge (7)
(5) edge (8);
\stoptikzpicture
}
\stopplacefigure

The following lemma is a result of \cite[authoryears][DBLP:journals/tc/LeeY94].

\startlemma
A complete splitting tree with only valid splits exists if and only if there exists an adaptive distinguishing sequence.
\stoplemma

Our method uses the exact same algorithm as the one by Lee and Yannakakis.
However, we also apply it in the case when the splitting tree with valid splits is not complete (and hence no adaptive distinguishing sequence exists).
Their algorithm still produces a family of sets, but is not necessarily a separating family.

In order to recover separability, we refine that family of sets.
Let $\Fam{Z'}$ be the result of Lee and Yannakakis' algorithm (to distinguish from their notation, we add a prime) and let $\Fam{H}$ be a separating family extracted from an ordinary splitting tree.
The hybrid ADS family is defined as $\Fam{Z'} ; \Fam{H}$, and can be computed as sketched in \in{Algorithm}[alg:hybrid].
\todo{Niet precies wat ik in de code doe (wel equivalent), omdat ik efficienter de boom doorloop. En $\Fam{Z,H}$ gegeven door splitting tree.}
By \in{Lemma}[lemma:refinement-props] we note the following: in the best case this family is an adaptive distinguishing sequence; in the worst case it is equal to $\Fam{H}$; and in general it is a combination of the two families.

\startplacealgorithm
  [title={Obtaining the hybrid separating family $\Fam{Z'} ; \Fam{H}$},
   reference=alg:hybrid]
\startalgorithmic
	\REQUIRE{A Mealy machine $M$}
	\ENSURE{A separating family $Z'$}
	\startlinenumbering
	\STATE $T_1 \gets$ splitting tree for Moore's minimization algorithm
	\STATE $\Fam{H} \gets$ separating family extracted from $T_1$
	\STATE $T_2 \gets$ splitting tree with valid splits (see \cite[DBLP:journals/tc/LeeY94])
	\STATE $\Fam{Z'} \gets$ (incomplete) family as constructed from $T_2$
	\FORALL{inequivalent states $s, t$ in the same leaf of $T_2$}{
		\STATE $u \gets lca(s, t)$
		\STATE $Z'_s \gets Z_s \cup \{ \sigma(u) \}$
		\STATE $Z'_t \gets Z_t \cup \{ \sigma(u) \}$
	}
	\ENDFOR
	\RETURN{$Z'$}
	\stoplinenumbering
\stopalgorithmic
\stopplacealgorithm

With the hybrid family we can define the test suite as follows.
In words, the suite consists of tests of the form $p w s$, where $p$ brings the SUT to a certain state, $w$ is to detect $k$ extra states and $s$ is to identify the state.
Its $m$-completeness is proven in \in{Section}[sec:completeness].
\todo{https://gitlab.science.ru.nl/moerman/hybrid-ads}

\startdefinition
Let $P$ be a state cover,
$\Fam{Z'}$ be a family of sets constructed with the Lee and Yannakakis algorithm and
let $\Fam{H}$ be a separating family.
The \defn{hybrid ADS} test suite is
\startformula
T = P \cdot I^{\leq k+1} \odot (\Fam{Z'} ; \Fam{H}).
\stopformula
\stopdefinition


\stopsubsection
\startsubsection[title={Example}]

\todo{Beter opschrijven, en uiteindelijke test suite geven.
	En vergelijken met andere methode? Misschien later pas.}
In the figure we see the (unique) result of Lee and Yannakakis' algorithm.
We note that the states $s_2, s_3, s_4$ are not split, so we need to refine the family for those states.

\startplacefigure
  [title={(a): Largest splitting tree with only valid splits for \in{Figure}[fig:running-example].
          (b): Its adaptive distinguishing tree in notation of \cite[DBLP:journals/tc/LeeY94].},
   list={Splitting tree and adaptive distinguishing sequence.},
   reference=fig:example-splitting-tree]
\startcombination[2*1]{
\hbox{
\starttikzpicture[node distance=2.0cm]
\node (0) [text width=2cm, align=center, ]                 {$s_0, s_1, s_2, s_3, s_4$ $a$};
\node (1) [text width=1cm, align=center, below left of=0 ] {$s_1$};
\node (2) [text width=2cm, align=center, below right of=0] {$s_0, s_2, s_3, s_4$ $aa$};
\node (3) [text width=1cm, align=center, below left of=2 ] {$s_0$};
\node (4) [text width=1cm, align=center, below right of=2] {$s_2, s_3, s_4$};
\path[->]
(0) edge (1)
(0) edge (2)
(2) edge (3)
(2) edge (4);
\stoptikzpicture
}} {(a)}
{\hbox{
\starttikzpicture[node distance=2.0cm]
\node (0) [text width=2cm, align=center, ]                 {$s_0, s_1, s_2, s_3, s_4$ $s_0, s_1, s_2, s_3, s_4$ $a$};
\node (1) [text width=1cm, align=center, below left of=0 ] {$s_1$ $s_2$};
\node (2) [text width=2cm, align=center, below right of=0] {$s_0, s_2, s_3, s_4$ $s_1, s_2, s_4, s_3$ $a$};
\node (3) [text width=1cm, align=center, below left of=2 ] {$s_0$ $s_2$};
\node (4) [text width=1cm, align=center, below right of=2] {$s_2, s_3, s_4$ $s_2, s_3, s_4$};
\path[->]
(0) edge (1)
(0) edge (2)
(2) edge (3)
(2) edge (4);
\stoptikzpicture
}} {(b)}
\stopcombination
\stopplacefigure

From the splitting tree in \in{Figure}[fig:example-splitting-tree], we obtain the following separating family $\Fam{H}$.
From the figure above we obtain the family $\Fam{Z}$.
These families and the refinement $\Fam{Z'};\Fam{H}$ are given below:

\starttabulate[|l|l|l|]
\NC $H_0 = \{aa,b,c\}$ \NC $Z'_0 = \{aa\}$ \NC $(Z';H)_0 = \{aa\}$     \NR
\NC $H_1 = \{a\}$      \NC $Z'_1 = \{a\}$  \NC $(Z';H)_1 = \{a\}$      \NR
\NC $H_2 = \{aa,b,c\}$ \NC $Z'_2 = \{aa\}$ \NC $(Z';H)_2 = \{aa,b,c\}$ \NR
\NC $H_3 = \{a,b,c\}$  \NC $Z'_3 = \{aa\}$ \NC $(Z';H)_3 = \{a,b,c\}$  \NR
\NC $H_4 = \{a,b\}$    \NC $Z'_4 = \{aa\}$ \NC $(Z';H)_4 = \{aa,b\}$   \NR
\stoptabulate
\todo{In startformula/startalign zetten}


\stopsubsection
\startsubsection
  [title={Overview},
   reference=sec:overview]

We give an overview of the aforementioned test methods.
We classify them in two directions,
\startitemize
\item whether they use harmonized state identifiers or not and
\item whether it used singleton state identifiers or not.
\stopitemize

\starttheorem
  [reference=thm:completeness]
The following test suites are all $n+k$-complete:

\starttabulate[|c|c|c|]
\NC                 \NC Arbitrary \NC Harmonized
\NR \HL %----------------------------------------
\NC Many / pairwise \NC Wp        \NC HSI
\NR
\NC % empty cell
\NC $ P \cdot I^{\leq k} \cdot \bigcup \Fam{W} \,\cup\, Q \cdot I^{\leq k} \odot \Fam{W} $
\NC $ (P \cup Q) \cdot I^{\leq k} \odot \Fam{H} $
\NR \HL %----------------------------------------
\NC Hybrid          \NC           \NC Hybrid ADS
\NR
\NC % empty cell
\NC % empty cell
\NC $ (P \cup Q) \cdot I^{\leq k} \odot (\Fam{Z'} ; \Fam{H}) $
\NR \HL %----------------------------------------
\NC Single / global \NC UIOv      \NC ADS
\NR
\NC % empty cell
\NC $ P \cdot I^{\leq k} \cdot \bigcup \Fam{U} \,\cup\, Q \cdot I^{\leq k} \odot \Fam{U} $
\NC $ (P \cup Q) \cdot I^{\leq k} \odot \Fam{Z} $
\NR \HL %----------------------------------------
\stoptabulate
\stoptheorem

\todo{Iets zeggen over de hybrid UIO method.}
\todo{Geef groottes van suites voor het voorbeeld. Merk op dat ADS and UIOv niet van toepassing zijn (state 2 heeft geen UIO)}

It should be noted that the ADS-method is a specific instance of the HSI-method and similarly the UIOv-method is an instance of the Wp-method.
What is generally meant by the Wp-method and HSI-method is the above formula together with a particular way to obtain the (harmonised) state identifiers.

We are often interested in the size of the test suite.
In the worst case, all methods generate a test suite with a size in $\bigO(pn^3)$ and this bound is tight \cite[Vasilevskii73].
Nevertheless we expect intuitively that the right column performs better, as we are using a more structured set (given a separating family for the HSI-method, we can always forget about the common prefixes and apply the Wp-method, which will never be smaller if constructed in this way).
Also we expect the bottom row to perform better as there is a single test for each state.
Small experimental results confirm this intuition
\cite[DBLP:journals/infsof/DorofeevaEMCY10].


\stopsubsection
\stopsection
\startsection
  [title={Proof of completeness},
   reference=sec:completeness]

\todo{Stukje over bisimulaties?}
We fix a specification $M$ which has a minimal representative with $n$ states and an implementation $M'$ with at most $n+k$ states.
We assume that all states are reachable from the initial state in both machines (i.e., both are \defn{connected}).
We define the following notation:

\startdefinition
Let $W \subseteq I^{\ast}$ be a set of words.
Two states $x, y$ (of possibly different machines) are $W$-equivalent, written $x \sim_W y$, if $\lambda(x, w) = \lambda(y, w)$ for all $w \in W$.
\stopdefinition

We note that for a fixed set $W$ the relation $\sim_W$ is a equivalence relation and that $W \subseteq V$ gives $x \sim_V y \Rightarrow x \sim_W y$.
The next lemma gives sufficient conditions for a test suite of the given anatomy to be
complete.
We then prove that these conditions hold for the test suites in this paper.

\startlemma
Let $\Fam{W}$ and $\Fam{W'}$ be two families of words and $P$ a state cover for $M$.
Let $T = P \cdot I^{\leq k} \odot \Fam{W} \,\cup\, P \cdot I^{\leq k+1} \odot \Fam{W'}$ be a test suite.
If
\startitemize[n]
\item for all $x, y \in M:$ $x \sim_{W_x \cap W_y} y$ implies $x \sim y$,
\item for all $x, y \in M$ and $z \in M'$: $x \sim_{W_x} z$ and $z \sim_{W'_y} y$ implies $x \sim y$, and
\item the machines $M$ and $M'$ agree on $T$,
\stopitemize
then $M$ and $M'$ are equivalent.
\todo{Puntje 2 verdient meer aandacht?}
\stoplemma
\startproof
First, we prove that $P \cdot I^{\leq k}$ reaches all states in $M'$.
For $p, q \in P$ and $x = \delta(s_0, p), y = \delta(s_0, q)$ such that $x \not\sim_{W_x \cap W_y} y$, we also have $\delta'(m'_0, p) \not\sim_{W_x \cap W_y} \delta'(m'_0, q)$ in the implementation $M'$.
By (1) this means that there are at least $n$ different behaviours in $M'$, hence at least $n$ states.

Now $n$ states are reached by the previous argument (using the set $P$).
By assumption $M'$ has at most $k$ extra states and so we can reach all those extra states by using $I^{\leq k}$ after $P$.

Second, we show that the reached states are bisimilar.
Define the relation $R = \{(\delta(q_0, p), \delta'(q_0', p)) \mid p \in P \cdot I^{\leq k}\}$.
Note that for each $(s, i) \in R$ we have $s \sim_{W_s} i$.
For each state $i \in M'$ there is a state $s \in M$ such that $(s, i) \in R$, since we reach all states in both machines by $P \cdot I^{\leq k}$.
We will prove that this relation is in fact a bisimulation up-to equivalence.

For output, we note that $(s, i) \in R$ implies $\lambda(s, a) = \lambda'(i, a)$ for all $a$,
since the machines agree on $P \cdot I^{\leq k+1}$.
For the successors, let $(s, i) \in R$ and $a \in I$ and consider the successors $s_2 = \delta(s, a)$ and $i_2 = \delta'(i, a)$.
We know that there is some $t \in M$ with $(t, i_2) \in R$.
We also know that we tested $i_2$ with the set $W'_t$.
So we have:
\startformula s_2 \sim_{W'_{s_2}} i_2 \sim_{W_t} t. \stopformula
By the second assumption, we conclude that $s_2 \sim t$.
So $s_2 \sim t$ and $(t, i) \in R$, which means that $R$ is a bisimulation up-to equivalence.
Using the theory of up-to techniques \cite[DBLP:journals/cacm/BonchiP15] we know that there exists a bisimulation relation containing $R$.
\todo{Up to is best nieuw. Misschien meer aandacht aan besteden. Ook nog Jurriaan citeren?}
In particular $q_0$ and $q'_0$ are bisimilar.
And so the machines $M$ and $M'$ are equivalent.
\stopproof

Before we show that the conditions hold for the test methods described in this paper, we reflect on the above proof first.
This proof is very similar to the completeness proof in \cite[DBLP:journals/tse/Chow78].
(In fact, it is also similar to Lemma 4 in \cite[DBLP:journals/iandc/Angluin87] which proves termination in the L* learning algorithm. This correspondence was noted in \cite[DBLP:conf/fase/BergGJLRS05].)
\todo{Hoofdstuk over leren van nom. aut. heeft ook deze stelling.}
In the first part we argue that all states are visited by using some sort of counting and reachability argument.
Then in the second part we show the actual equivalence.
To the best of the authors knowledge, this is first $m$-completeness proof which explicitly uses the concept of a bisimulation.
Using a bisimulation allows us to slightly generalize and use bisimulation up-to equivalence, dropping the the often-assumed requirement that $M$ is minimal.

\startlemma
Let $\Fam{W'}$ be a family of state identifiers for $M$.
Define the family $\Fam{W}$ by $W_s = \bigcup \Fam{W'}$.
Then the conditions (1) and (2) in the previous lemma are satisfied.
\stoplemma
\startproof
The first condition we note that $W_x \cap W_y = W_x = W_y$, and so $x \sim_{W_x \cap W_y} y$ implies $x \sim_{W_x} y$, now by definition of state identifier we get $x \sim y$.

For the second condition, let $x \sim_{\bigcup \Fam{W'}} z \sim_{W'_y} y$.
Then we note that $W'_y \subseteq \bigcup{W'}$ and so we get $x \sim_{W'_y} z \sim_{W'_y} y$.
By transitivity we get $x \sim_{W'_y} y$ and so by definition of state identifier we get $x \sim y$.
\stopproof

\startcorollary
The W, Wp, and UIOv test suites are $n+k$-complete.
\stopcorollary

\startlemma
Let $\Fam{H}$ be a separating family and take $\Fam{W} = \Fam{W'} = \Fam{H}$.
Then the conditions (1) and (2) in the previous lemma are satisfied.
\stoplemma
\startproof
Let $x \sim_{H_x \cap H_y} y$, then by definition of separating family $x \sim y$.
For the second condition, let $x \sim_{H_x} z \sim_{H_y} y$.
Then we get $x \sim_{H_x \cap H_y} z \sim_{H_x \cap H_y} y$ and so by transitivity $x \sim_{H_x \cap H_y} y$, hence again $x \sim y$.
\stopproof

\startcorollary
The HSI, ADS and hybrid ADS test suites are $n+k$-complete.
\stopcorollary


\stopsection
\startsection
  [title={Related Work and discussion}]

\todo{Opnieuw lezen, want verouderd. Voeg toe: non-det, no-reset, transition tour, checking sequence. Future work: benchmarks (ref benchmark wiki).}
Comparison of test methods already appeared in the recent papers \cite[DBLP:journals/infsof/DorofeevaEMCY10] and \cite[DBLP:journals/infsof/EndoS13].
Their work is mainly evaluated on randomly generated Mealy machines.
We continue their work by evaluating on many specifications from industry.
In addition, we evaluate in the context of learning, which has been lacking so far.

Many of the methods describe here are benchmarked on small Mealy machines in \cite[DBLP:journals/infsof/DorofeevaEMCY10] and \cite[DBLP:journals/infsof/EndoS13].
Additionally, they included the P, H an SPY methods.
Unfortunately, the P and H methods do not fit naturally in the overview presented in \in{Section}[sec:overview].
The P method is not able to test for extra states, making it less usable.
And the H method \todo{?}
The SPY method builds upon the HSI-method and changes the prefixes in order to minimize the size of a test suite, exploiting overlap in test sequences.
We believe that this technique is independent of the HSI-method and can in fact be applied to all methods presented in this paper.
As such, the SPY method should be considered as an optimization technique, orthogonal to the work in this paper.

More recently, a novel test method was devised which uses incomplete distinguishing sequences \cite[DBLP:journals/cj/HieronsT15].
They use sequences which can be considered to be adaptive distinguishing sequences on a subset of the state space.
With several of those one can cover the whole state space, obtaining a $m$-complete test suite.
This is a bit dual to our approach, as our \quotation{incomplete} adaptive distinguishing sequences define a course partition of the state space.
Our method becomes complete by refining the tests with pairwise separating sequences.

Some work is put into minimizing the adaptive distinguishing sequences themselves.
In \cite[DBLP:journals/fmsd/TurkerY14] the authors describe greedy algorithms which construct small adaptive distinguishing sequences.
\todo{We expect that similar
	heuristics also exist for the other test methods and that it will improve the
	performance.}
However, they show that finding the minimal adaptive distinguishing sequence is NP-complete, even approximation is NP-complete.
We would like to incorporate their greedy algorithms in our implementation.
\todo{Noem minimal sep seqs. Dit is niet genoeg voor een minimal test suite.}


\startsubsection
  [title={When $k$ is not known}]

In many of the applications described in \in{Section}[sec:applications] no bound on the number of states of the SUT was known.
In such cases it is possible to randomly select test cases from an infinite test suite.
Unfortunately, we lose the theoretical guarantees of completeness with random generation.
Still, for the applications in \in{Section}[sec:applications] it has worked well in finding flaws.

We can randomly test cases as follows.
In the above definition for the hybrid ADS test suite we replace $I^{\leq k}$ by $I^{\ast}$ to obtain an infinite test suite.
Then we sample tests as follows:
\startitemize[n]
\item sample an element $p$ from $P$ uniformly,
\item sample a word $w$ from $I^{\ast}$ with a geometric distribution, and
\item sample uniformly from $(\Fam{Z'} ; \Fam{H})_s$ for the state $s = \delta(s_0, pw)$.
\stopitemize
\stopsubsection


\todo{Enkele resultaten bespreken, test-suite-groottes vergelijken}
\todo{Future work? Meer benchmarks? Andere automaat-modellen?}


\stopsection
\referencesifcomponent
\stopchapter
\stopcomponent