Friday, July 18, 2014

Computing with Graphs

There are a variety of different models of computation. Already in the 1930s there were several models, for example those of Alonzo Church and Alan Turing. The Church-Turing hypothesis proposes that these models don’t differ in the class of computational problem they can solve, at least if the model is rich enough. The set of problems that a Turing machine can solve is the maximal set that any computing device can tackle. The point of alternate models is not to expand that set of computable problems.

It can still be useful to work with other models. A key aspect of any computational model is the way it holds data. A Turing machine uses a tape, a one dimensional array of symbols that can grow in either direction without bound. This ability to store an unbounded amount of data is essential for any computational model to be equivalent to a Turing machine. How to access large amounts of data is a practical challenge for real computers as well as an interesting theoretical issue. A Turing tape gets the job done in very simple way but not in a very practical way. The Random Access Memory that is ubiquitous in modern computers is very practical, but theoretically not very elegant. This inelegance manifests at the practical level in the difficulty of picking the right size for memory address words. Large address words make it easy to access large amounts of data, but are expensive to use. Small address words are efficient but don’t provide the address space needed for complex computations.

My proposal here is to use graphs, of a particular class, to represent the state of a computation. Just like a Turing machine writes new symbols over old symbols on its tape in order to record the result of a computational step, the graph machine will modify a small part of the graph at each computational step.

We have around 70 years of practical experience with computing machines at this point, almost all with Random Access Memory for storing data. How to represent information in a practical way with this graph model, that is a problem that will take years to refine. The starting point is just to explore what works. Bit by bit we can discover the methods that work best.

Could such a radically different computation model ever develop enough through practical experience that it could make any dent in the ubiquity of the Random Access Memory model? My proposal here is for two uses of this graph model. First, it is an interesting model to explore theoretically. The main value I see here is the potential for a more accurate model of memory access times.

Second, it could provide a powerful processor architecture. New compilers for existing high level languages could insulate users from the change in architecture. Processors with this new architecture could certainly be built with existing Random Access Memory components. The greatest impact would be at the level of cache design and virtual memory mapping. This graph architecture is a bit like a hyper-virtual memory. Addresses are never exposed at the processor architecture level! This gives memory mapping huge flexibility. Since the topology of the data is exposed at the architectural level, caching strategies can be vastly more powerful.

This graph architecture is a radical departure from the conventional Random Access Memory, but it could actually be quite practical!

Let us now dive into the details!

The state of computation will be kept as a connected graph, a set of nodes connected by two types of edges: directed and undirected. Each node appears exactly three times in the edges: once as the source of a directed edge, once as the destination of a directed edge, and once at one end of an undirected edge. The entire graph will always be connected, i.e. given any two nodes there will be some way to walk along the edges to get from one node to the other.

Here is an example of a graph that could represent the state of a computation:

This is just like a string of symbols on a Turing tape. To understand what that string of symbols means, or what a graph like this might mean, that requires detailed understanding of the particular computation that is in process. Right now we are just looking at the ways that a computation could read and write snippets of information.

A graph machine would have a set of registers. Probably two registers is the theoretical minimum but a practical number might be the usual 16 or so that are common in today’s processing units. Each register points to some node or other of the graph. Processor instructions would provide operations like:

  • Move register n forward
  • Move register n backward
  • Hop register n across
  • Check whether registers n and m point to the same node
  • Make register n point to whatever node register m is pointing to
By running registers along the edges in various ways and then checking whether they got to the same place, it is easy to determine the shape of any region of the graph.

The general way that computation proceeds is that, depending on whatever aspects of the shape of the graph are important to the computation at hand, one or another graph rewrite operation can be performed.

The first operation is to swap the directed edges between two nodes that are connected by an undirected edge:

This operation can be applied to the example graph above, to yield:

It can then be applied a second time:

The complete graph can be partitioned by the pattern of directed edges. The directed edges will form one or more cycles in the graph. When directed edges are swapped, these cycles can be split or joined. Note that the swap operation can move in either direction along the sequence of graphs in this example.

The second operation is to swap undirected edges between two nodes that are connected by a directed edge:

This can be applied as a next step in our sequence:

And again:

The third and fourth operations expand or contract the graph:

Expansion operations can be applied anywhere on any graph. But contraction operations can only be applied where the proper four node shape exists in a graph, which some graphs may not have at all.

In our example, the contraction operation can be applied:

Now we can alternate, an undirected swap:

a contraction:

an undirected swap:

a contraction:

one more undirected swap:

and one more contraction:

finally arriving at a minimal graph. Note that we could start with this minimal graph and reverse the operations to move in a sequence back to the graph we started with.

We followed a general strategy here to reduce the original graph to the minimal graph.

  1. Swap directed edges to join the directed cycles into one big cycle
  2. Gradually reduce the size of the graph, by repeatedly
    1. Swapping undirected edges to form a contractable four node subgraph
    2. Contracting that subgraph
This strategy thus sketches a proof that these four operations,
  • Swap directed edges
  • Swap undirected edges
  • Contract
  • Expand
provide the capability to move in a sequence from any one graph to any other graph. We can start from any arbitrary graph and reduce it to this minimal shape. We can reverse those operations to construct a sequence from the minimal shape to that arbitrary graph. So a sequence of operations exists to move from any graph to any other graph. Of course in a practical computation one would not commonly move through the minimal graph, which corresponds to a blank Turing tape.

From a theoretical perspective it is useful to explore the power of this pure graphical architecture. From a practical perspective, probably a hybrid architecture will prove more efficient. One could augment the model by adding to each node a word of data, e.g. 64 bits that could be interpreted as a floating point number etc. Floating point numbers could certainly be encoded as shapes of subgraphs, so this augmentation is not theoretically necessary. But this sort of graph-based computation could well prove itself valuable beyond the theoretical context.

2 comments:

  1. a bit of the history behind this: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=25648

    ReplyDelete
  2. I think the pi calculus of Milner et al. is also very similar to what I have proposed here. I confess that the formalism of the pi calculus is quite beyond me!

    http://en.wikipedia.org/wiki/%CE%A0-calculus

    ReplyDelete