Wednesday, November 16, 2016

State Reachability

State Reachability is an important problem in computer science, with many applications in engineering. It is the simplest problem in the broader class of Model Checking problems. Principles of Model Checking looks like a good book to learn about model checking. Ken McMillan has been a pioneer in model checking algorithms from the very early days, in the 1980s. Among his many publications, I would refer specifically to his 2003 paper Interpolation and SAT-Based Model Checking which uses SAT-based methods. I will sketch out here another SAT-based method which should complement the interpolation method of McMillan.

State Reachability is an extremely difficult problem, in that for every algorithm there will be some relatively small instances where that algorithm takes a very long time, and a huge amount of memory, to solve that instance. Of course, computer science has still not proved that hard problems actually exist – this is the famous P=NP? theorem, as yet unproved – but if any hard problems exist, State Reachability is hard.

Satisfiability, or SAT, is another hard problem – indeed, it is the prime example of a hard problem. State Reachability is like the big brother of Satisfiability. Satisfiability relates to Boolean functions. A Boolean function is some sort of rule or formula that determines a 1 or 0 result for each possible value of some input. The input of a Boolean function is generally an array of 1/0 values. So a simple Boolean function might map the four input values {000, 011, 101, 110} to the result 0, and {001, 010, 100, 111} to 1. When there are only three bits in the input array, as in this example, it is easy to list all eight values of the inputs, compute the value of the function, and determine whether the result is always 0 or whether there are any input values that have a 1 result. As the number of bits in the input array gets larger, say 30 or 100 or 300 bits, the number of input values grows so quickly that it becomes practically impossible to consider each input value individually. Satisfiability algorithms are some of the oldest algorithms in computer science and continue to be improved. SatLive is a good site to find out about recent work in this field.

The State Reachability problem builds on the Satisfiability problem. State Reachability works with a finite state machine, constructed around a more complex Boolean function, one that maps an input array of 1/0 values to an output array of 1/0 values. Rather than computing a single output value from a single input value, a finite state machine performs a series of computations using the same Boolean function in each step. The output of each step becomes part of the input of the next step. This is essentially how digital electronic circuits, such as computers, work. Other engineering problems can also be abstracted to appear in this guise.

A finite state machine can be pictured like this:

Here the five squares on the left represent the present state of the machine, and the five squares on the right are the next state. The two value on the upper left are the inputs to the machine that steer the machine down one path or another. The grey blob in the middle represents the Boolean function that computes the next state of the machine on the basis of the present state and the machine's input values.

Each value of the output array is a state of the finite state machine. As the machine operates, the machine moves from one state to another. The State Reachability problem asks whether there is any way that the machine can start from a given initial state and eventually reach a given target state.

The behavior of a state machine can be illustrated with a state transition graph:

Here the green dot is the initial state and the red dot is the target state. It is easy to see that there is no path from the initial state to the target state. This kind of inspection gets much harder as the number of states of the machine grows. Each state is a Boolean vector value for the bits of the machine, so every new bit in the machine will double the number of states. Once the number of state bits grows above a few dozen, even a computer will have a hard time recording the complete state graph.

Each step of the machine’s operation involves the evaluation of a Boolean function. To determine whether the target state can be reached in a single step, in one evaluation of the Boolean function, is a Satisfiability problem. The behavior of the machine across two, three, or any fixed number of steps, can again be posed as a Satisfiability problem, just with a larger Boolean function. But as machines grow, the potential number of steps between states also grows. So deciding whether a target state can eventually be reached from an initial state, the State Reachability problem, gets very difficult very quickly as the size of the finite state machine grows.

The first quick and simple approach to checking whether a target state can be reached is to perform random simulation. This allows many states to be explored very quickly. If the target state is easy to reach, random simulation may well find a path to it.

Another approach is to use a Satisfiability checker to see whether there is a reasonably short path from the initial state to the target state. Paths that are too long will be too computationally expensive for the Satisfiability checker. It is simple enough to set up check to see if there is any path to the target that is less than some maximum number of steps. If a path is found, the State Reachability problem is thereby solved! If no path is found, a modern Satisfiability checker will be able to provide a minimized proof. This proof will typically not include every state variable in the machine. The algorithm I am proposing here requires the selection of some relevant state bit to be extracted from this proof.

The algorithm I am proposing here involves the incremental refinement of an abstract state machine. The states of the given finite state machine are partitioned; each block of the partition constitutes a state of the abstract state machine. The abstract state machine is refined incrementally by splitting blocks one at a time. Each block is defined by a partial evaluation of the state bits. A block is split by picking one of the unvalued bits of the block. The two new blocks are formed by given the selected state bit the 0 and the 1 value in turn. The state bit used for splitting will be picked from the various proofs of unsatisfiability generated by the Satisfiability checker as the algorithm proceeds.

The abstract state machine guides the exploration of the concrete state machine that defines the problem. A collection of reached concrete states is maintained. Each of these reached states belongs to a block in the partition that defines the abstract state machine, i.e. maps to an abstract state. The idea is to find ways to reach as many abstract states as possible. If a path cannot be found from an already reached state in one block of the partition to another block In the partition, it may be because there are no paths between any pair of states in those two blocks. If there are no possible paths in the abstract state machine between the block containing the initial state and the block containing the target state, then there are no paths in the concrete machine either. Discovering this would solve this State Reachability problem instance with a negative result.

When some states in a block can reach states in another block, but none of the reached states can, then the block should be refined. The hope is that one of the split off blocks will have all the reached states, while the other block has all the states that have paths to the unreached block. The idea is to work to keep as sparse as possible the state transition graph of the abstract state machine. This sparseness will help guide the Satisfiability checker to focus on abstract paths that have a good likelihood of mapping to concrete paths, or, on the other hand, will result in an abstract state graph with no path at all between the initial block and the target block.

I'll go through the pieces of the algorithm again, in pictures. First, random simulation is used to explore states that are easily reached from the initial state:

The abstract state machine is initially trivial, consisting of a single block, corresponding to a partial evaluation of the state bits that has none of the bits given a value:

The first run of the Satisfiability checker can look for paths from the initial state to the target state. If none are found, a bit can be selected from the proof of unsatisfiability. This bit can then be used to split the block:

Now the Satisfibility checker can be used to look for paths to states in a block that has not yet been reached:

Whenever interesting new states have been reached by using the Satisfiability checker, it can be worthwhile to explore further by running random simulation from these new states.

The Satisfiability checker is used to look for paths to the target or to unreached blocks, starting from reached states that are closest to these goals. When a path cannot be found, the resulting unsatisfiability proof can be used to pick a bit to use in splitting the block:

Before splitting a block, though, one should check whether any of the states in the starting block have a path to a state in the goal block. If not, the abstract state machine can mark as impossible that transition:

Eventually the abstract state graph will chop the path from the initial state to the target state into small enough pieces that a combination of random simulation and Satisfiability will find a path, or enough abstract states will be missing enough transitions that the target block will be unreachable from the initial block in the abstract machine. This, this algorithm will eventually determine an answer to the given State Reachability problem instance.