Equivalence Checking¶
Stainless can prove equivalence of recursive programs using automated functional induction. Additionaly, it can treat many programs at once, reusing the information obtained throughout the process, to reason about subsequent programs. This paper explains the underlying algorithms.
To run the equivalence checker, use the --equivchk
option of Stainless. The option --comparefuns
specifies the names of candidate functions. The option --models
specifies the names of reference functions.
Example run¶
Consider the following three functions isSortedA
(incorrect), isSortedB
and isSortedC
(both correct), that implement a check whether a given input list is sorted in a non-decreasing order:
def isSortedA(l: List[Int]): Boolean = {
def leq(cur: Int, next: Int): Boolean = cur < next
def iter(l: List[Int]): Boolean =
if (l.isEmpty) true
else if (l.tail.isEmpty) true
else leq(l.head, l.tail.head) && iter(l.tail)
if (l.size < 2) true
else l.head <= l.tail.head && iter(l.tail)
}
def isSortedB(l: List[Int]): Boolean = {
if (l.isEmpty)
true
else if (!l.tail.isEmpty && l.head > l.tail.head)
false
else
isSortedB(l.tail)
}
def isSortedC(l: List[Int]): Boolean = {
def chk(l: List[Int], p: Int, a: Boolean): Boolean = {
if (l.isEmpty) a
else if (l.head < p) false
else chk(l.tail, l.head, a)
}
if (l.isEmpty) true
else chk(l, l.head, true)
}
And the following reference function isSortedR
:
def isSortedR(l: List[Int]): Boolean = {
def loop(p: Int, l: List[Int]): Boolean = l match {
case Nil() => true
case Cons(x, xs) if (p <= x) => loop(x, xs)
case _ => false
}
if (l.isEmpty) true
else loop(l.head, l.tail)
}
The following command invokes the equivalence checking (isSortedA
, isSortedB
, isSortedC
and isSortedR
are defined in frontends/benchmarks/equivalence/isSorted/isSorted.scala
):
stainless frontends/benchmarks/equivalence/isSorted/isSorted.scala --equivchk=true --comparefuns=isSortedA,isSortedB,isSortedC --models=isSortedR --timeout=3
Stainless automatically generates all the equivalence lemmas and reports resulting equivalence clusters. This is done by checking for equivalence of each function from the --comparefuns
list against the model functions from the --models
list, until the proof of equivalence or a counterexample is found for one of the models.
The output of equivalence checking is a classification of candidate functions into the following categories:
Wrong
, if the signature is wrong;
Correct
, if there is a reference program provably equivalent to this program;
Incorrect
, if there is a counterexample that disproves the equivalence;
Unknown
, when there is neither a proof nor a counterexample of equivalence.
For our example run, we get the following output:
[ Info ] Printing equivalence checking results:
[ Info ] List of functions that are equivalent to model IsSorted.isSortedB: IsSorted.isSortedC
[ Info ] List of functions that are equivalent to model IsSorted.isSortedR: IsSorted.isSortedB
[ Info ] List of erroneous functions: IsSorted.isSortedA
[ Info ] List of timed-out functions:
[ Info ] List of wrong functions:
[ Info ] Printing the final state:
[ Info ] Path for the function IsSorted.isSortedB: IsSorted.isSortedR
[ Info ] Path for the function IsSorted.isSortedC: IsSorted.isSortedB, IsSorted.isSortedR
[ Info ] Counterexample for the function IsSorted.isSortedA:
[ Info ] l -> Cons[Int](-1357890535, Cons[Int](-1089455080, Cons[Int](-1089455080, Nil[Int]())))