prog2_aut.py
At the beginning of this file, copy the code from the previous programming assignment for:
class Re:
def __init__(self):
self.type = None
self.char = None
self.left = None
self.right = None
def to_string(self):
...
def eps():
...
def symb(c):
...
def union(e1,e2):
...
def conc(e1,e2):
...
def star(e1):
...
class Automata:
def __init__(self,delta,start,accept):
self.delta = delta
self.start = start
self.accept = accept
def to_string(self):
s1 = str(self.delta)
s2 = str(self.start)
s3 = str(self.accept)
return "(" + s1 + ", " + s2 + ", " + s3 + ")"
The starting state self.start
is a particular state (usually an integer). The set of accepting states self.accept
is expected to be a list of states like [4,1,7]
.
The transition relation self.delta
is expected to be a list of transitions, where a single transition is represented as a triple. For example a single transition [1, 'a', 3]
encodes a transition from state 1
to state 3
labelled with symbol 'a'
. A transition relation for an NFA recoginizing the language (ab)*
could be
[ [1,'a',2], [2,'b',3], [3,"eps",1]]
. Notice how the epsilon transitions are labelled with the particular string "eps"
.
To illustrate the format expected for the transition function we provide the function transition_symb(delta,q,a)
that returns the list of states p
such that there exists a transtion [q,a,p]
in the transition relation delta
.
def transition_symb(delta, q, a):
if delta == []:
return []
else:
tr = delta[0]
q1 = tr[0]
a1 = tr[1]
p = tr[2]
if q == q1 and a == a1:
return [ p ] + transition_symb(delta[1:],q,a)
else:
return transition_symb(delta[1:],q,a)
Make sure you understand all the details of this function before proceeding.
Remark that this function can be used with a
being a symbol or an epsilon transition "eps"
. When the input symbol is "eps"
, the function returns all the states that can be reached from q
taking exactly one epsilon-transition.
The epsilon-closure of a state q
is the set of states that are reachable by an arbitrary number of epsilon-transitions.
Implement the epsilon-closure of a state q
according to a transition relation delta
as the function
def eps_closure(delta,q):
...
To implement this function you may only use the function transition_symb()
provided above. Hint: consider having two variables, one for the states you have already visited (initialized to []
), another variable for the states you have to visit (initialized to [q]
), and a while-loop that updates these two variables until there are no states available in states to visit.
Your function is expected to behave as in the following example:
>>> delta1 = [ [1,"eps",2], [2,'a',3], [4, "eps", 3], [4,"b", 5], [2,"eps",4], [3,"eps",2] ]
>>> print(eps_closure(delta1,1))
[3, 4, 2, 1]
The order of states appearing in the list does not matter. Nevertheless, no duplicates are expected in the result. Make sure the states 4
and 1
appear in the list.
q
and a symbol a
, we can compute the set of states that are accessible by an arbitrary number of epsilon-transitions followed by a transition labelled by a
:
def transition_eps_symb(delta, q, a):
eps_reached = eps_closure(delta,q)
res = []
for p in eps_reached:
res = transition_symb(delta,p,a) + res
return res
Make sure you understand all the details of this function before proceeding.
epsilon_closure()
and transition_eps_symb()
, implement the algorithm that decides whether a string w
matches an NFA nfa
, starting from a state q
. The function can be defined recursively on the length of w
and may have the structure given below:
def match_nfa(nfa,w,q):
if w == "":
eps_reached = eps_closure(nfa.delta,q)
for p in eps_reached:
...
return False
else:
a = w[0]
...
def conc_helper(states,s):
res = []
for q in states:
res = [[q,"eps",s]] +res
return res
For example, this function called on the set of states [1,3,5]
and state 7
returns the following transitions:
>>>print( conc_helper([1,3,5],7) )
[[5, 'eps', 7], [3, 'eps', 7], [1, 'eps', 7]]
re_to_nfa(e)
that converts a regexp e
to an NFA, completing the following piece of code filling up the parts indicated with "...
", that is the construction for the epsilon, union and conc:
state_count = 0
def new_state():
global state_count
s = state_count
state_count += 1
return s
def re_to_nfa(e):
if e.type == "eps":
...
elif e.type == "symb":
qstart = new_state()
qaccept = new_state()
return Automata([ [qstart, e.symb, qaccept] ],qstart,[qaccept])
elif e.type == "union":
...
elif e.type == "conc":
...
elif e.type == "star":
aut1 = re_to_nfa(e.left)
s = new_state()
delta = aut1.delta + [ [ s,"eps", aut1.start ] ] + conc_helper(aut1.accept, aut1.start)
return Automata(delta, s, [s]+aut1.accept)
You may have noticed that this code is using a function new_state()
. This is required to avoid having to rename the states by making sure the set of states allocated for 2 regexps are disjoint. To achieve this goal, we define a global variable named state_count
and a function new_state()
that increment the global variable state_count
and returns its previous value. The global variable is invisible from re_to_nfa()
. (Another solution without using a global variable is possible, but at the expense of adding an argument to re_to_nfa()
.)
re_to_nfa()
on different regexp, you may want to reset the global variable state_count
to 0
everytime.
re1 = ... # (aub)(cud)(euf)
re2 = ... # (a*b*)*
re3 = ... # (aub)*aaa((aub)*)
Convert each regexp to an NFA using re_to_nfa()
nfa1 = re_to_nfa(re1)
nfa2 = re_to_nfa(re2)
nfa3 = re_to_nfa(re3)
For each NFA, test the string matching with 4 representative strings. The strings should be choosen to illustrate the correctness of your implementation.
Make sure that if I run your program, it will print clearly all the tests, such that I can quickly evaluate your work, thanks!
csci341
git folder. Add, commit and push your work.