Nondeterministic finite automata


In [1]:
from tock import *

Creating NFAs

NFAs are loaded and created similarly to DFAs:


In [2]:
m = read_csv("examples/sipser-1-27.csv")
m.is_finite(), m.is_deterministic()


Out[2]:
(True, False)

In [3]:
to_table(m)


Out[3]:
ε 0 1
>q1 q1 {q1,q2}
q2 q3 q3
q3 q4
@q4 q4 q4

There are two main differences in this table from a DFA. First, a cell can have more than one state. For example, if the machine is in state q1 and the next input symbol is a 1, then the machine can change to either q1 or q2. Use curly braces for sets of states. (It's an error to omit them.) If a cell has no states, either write {} or or leave the cell blank.

The second difference is that there is a column for the empty string, which can be written either as ε or &. (The ampersand is supposed to look like "et", so I figured it would be a good approximation to ε, which is a Greek "e".) If the machine is in state q2, then it can change to state q3 without reading in any input symbols.

This is what the state transition diagram looks like:


In [4]:
to_graph(m)


%3 _START 2 q1 _START->2 0 q3 3 q4 0->3 1 1 q2 1->0 0 ε 2->1 1 2->2 0 1 3->3 0 1

Now it's a little easier to see that this machine accepts strings that contain either 101 or 11.

Running NFAs

Let's run the automaton on a string:


In [5]:
run(m, "1 0 1 1 0 1")


%3 _START 2 q1 _START->2 0 q2 4 q3 0->4 6 q3 0->6 1 q1 5 q1 1->5 2->0 2->1 3 q1 9 q1 3->9 10 q2 3->10 5->3 7 q2 5->7 8 q4 6->8 18 q3 7->18 19 q4 8->19 20 q1 9->20 21 q3 10->21 22 q3 10->22 11 12 11->12 1 13 12->13 0 14 13->14 1 15 14->15 1 16 15->16 0 17 16->17 1 18->19 23 q4 19->23 24 q1 20->24 25 q2 20->25 26 q4 21->26 23->26 27 q3 25->27

Now the run graph is more interesting than for the DFA. As before, each node indicates a configuration, that is, a state that the machine can be in at a particular time. Nodes in the same column correspond to the same input position. An edge between two configurations means that the automaton can move from one to the other. Note that unlike in a DFA run, a node can have more than one outgoing edge. Since there is a path that ends with a double node, the machine accepted the string.

Previously, we used only_path() to display a sequence of configurations. That won't work here, because there are many paths. Instead, we can use:


In [6]:
run(m, '1 0 1 1 0 1').shortest_path()


Out[6]:
q1[1] 0 1 1 0 1
q1[0] 1 1 0 1
q1[1] 1 0 1
q1[1] 0 1
q2[0] 1
q31
q4ε

Here's another string:


In [7]:
run(m, "0 1 0 0 1 0")


%3 _START 18 q1 _START->18 0 q3 1 q3 2 q2 2->1 19 q3 2->19 3 q1 4 q1 4->3 5 q1 5->2 5->4 6 10 6->10 1 7 7->6 0 8 11 8->11 1 9 9->8 0 10->9 0 12 11->12 0 13 q1 13->5 14 q1 16 q2 14->16 17 q1 14->17 15 q3 16->0 16->15 17->13 18->14

The absence of a double node means that the string was rejected.

In a nondeterministic automaton, it's possible that there are infinitely many runs for a given input string. That's not a problem -- consider the following NFA, which (for some reason) loops through a bunch of epsilon transitions before reading in a single 0.


In [8]:
m = read_csv("examples/nfaloop.csv")
to_graph(m)


%3 _START 1 q1 _START->1 0 q3 0->0 ε 1->0 ε 2 q2 1->2 0

In [9]:
run(m, "0")


%3 _START 2 q1 _START->2 0 q3 0->0 1 q2 2->0 2->1 3 4 3->4 0

This graph represents an infinite number of runs, each of which loops in state q1 for some number of times, then moves on to state q2.