COMP218 (Automata Theory)
This covers how powerful various classes of languages are in comparison to each other.
Unrestricted Grammars
These grammars are made of a string of variables and terminals that lead to a string of variables and terminals like so:
\[\begin{aligned}
S&\rightarrow aBc\\
aB&\rightarrow cA\\
Ac&\rightarrow d
\end{aligned}\]
This is similar to context-free grammars, except that the left-hand side of a production may be any string of variables and terminals rather than a single variable.
Natural Language Processing
A language $L$ is Turing-acceptable if and only if $L$ is generated by an unrestricted grammar:
- Context-free grammars are not sufficient to capture all natural languages.
- Unrestricted grammars are sufficient, but too complex to be used in natural language processing (parsing is undecidable).
Context-Sensitive Grammars
These are made from a string of variables and terminals that lead to a string of variables and terminals, however for:
\[u\rightarrow v\]
The size of $u$ must be less than, or equal to, the size of $v$:
\[\lvert u\rvert\leq\lvert v\rvert\]
For example, the language $\{a^nb^nc^n\}$ is context-sensitive:
\[\begin{aligned}
S&\rightarrow abc\vert aAbc\\
Ab&\rightarrow bA\\
Ac &\rightarrow Bbcc\\
bB &\rightarrow Bb\\
aB &\rightarrow aa\vert aaA
\end{aligned}\]
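As a sanity check, the grammar above can be explored mechanically by breadth-first search over sentential forms. This is a sketch in Python; the length cap is my own assumption to keep the search finite (rule lengths never shrink in a context-sensitive grammar, so the cap is safe):

```python
from collections import deque

# Context-sensitive grammar for {a^n b^n c^n : n >= 1}, rules from the notes
RULES = [("S", "abc"), ("S", "aAbc"), ("Ab", "bA"),
         ("Ac", "Bbcc"), ("bB", "Bb"), ("aB", "aa"), ("aB", "aaA")]

def generate(max_len=9):
    """Breadth-first search over sentential forms; collect terminal strings."""
    seen, queue, words = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        if form.islower():            # no variables left: a terminal string
            words.add(form)
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:        # apply the rule at every occurrence
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return words

print(sorted(generate()))  # ['aaabbbccc', 'aabbcc', 'abc']
```

Up to length 9 the grammar derives exactly $abc$, $aabbcc$ and $aaabbbccc$, matching $a^nb^nc^n$.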
Linear-Bounded Automaton (LBA)
The computational counterpart of context-sensitive grammars is the LBA.
These are Turing machines that can only write to the tape space reserved for the input string:
- There are left and right-hand markers that the input string lies between.
Examples of languages accepted by LBAs are:
\[L=\{a^nb^nc^n\}\]
and
\[L=\{a^{n!}\}\]
- LBAs are more powerful than PDAs
- LBAs are less powerful than Turing machines.
LBAs allow us to detect loops, as there is a limit on the number of configurations an LBA can be in.
Context-Sensitive Languages
A language $L$ is context sensitive if and only if $L$ is accepted by a linear-bounded automaton.
There are languages which are recursive but not context sensitive.
The following language is decidable but not context-sensitive:
\[L_d=\{\langle B\rangle\vert B \text{ is an LBA and }\langle B\rangle \notin L(B)\}\]
This means that $B$ is an LBA whose own description $\langle B\rangle$ is not in the language of $B$.
It is decidable by the following program:
- Check if input $x=\langle B\rangle$ for some LBA $B$, if not reject.
- Run $B$ on input $\langle B\rangle$, if accepts then reject, otherwise accept.
Proof by contradiction:
Suppose an LBA \(B^*\) recognises \(L_d\), i.e. \(L(B^*)=L_d\). Consider running \(B^*\) on its own description \(\langle B^*\rangle\):
- If \(\langle B^*\rangle\notin L(B^*)\) then \(\langle B^*\rangle\in L_d=L(B^*)\).
- If \(\langle B^*\rangle\in L(B^*)\) then \(\langle B^*\rangle\notin L_d=L(B^*)\).
The Chomsky Hierarchy
The languages we have seen so far are organised in the following hierarchy (with the most expressive and complex at the top):
- Recursively Enumerable
- Recursive
- Context-Sensitive
- Context-Free
- Deterministic Context-Free
The set of all possible languages is $2^{\Sigma^*}$, which is uncountable; the languages outside the recursively enumerable class are the unrecognisable languages.
Post Correspondence Problem
Given a collection of tiles (with an unlimited supply of each) of the following form:
graph LR
subgraph tile
bab
cc
end
This is like a domino but the top and bottom could also include the empty string.
Undecidability of PCP
\[PCP = \{\langle T\rangle \vert T \text{ is a collection of tiles that contain a top-bottom match}\}\]
This language is:
- Not Recursive
- Recursively Enumerable:
- A Turing machine can try all tile sequences, accepting when a match is found and running forever if none exists.
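A bounded version of this brute-force search can be sketched in Python. The bound `max_len` is my own addition so the demo terminates; the real semi-decision procedure has no bound and may run forever:

```python
from itertools import product

# Tiles from the notes' example: (top, bottom)
TILES = [("bab", "cc"), ("c", "ab"), ("a", "ab")]

def pcp_search(tiles, max_len=6):
    """Bounded brute-force search for a PCP match.
    The real problem is only semi-decidable: an unbounded search
    may run forever when no match exists."""
    for k in range(1, max_len + 1):
        for seq in product(range(len(tiles)), repeat=k):
            top = "".join(tiles[i][0] for i in seq)
            bottom = "".join(tiles[i][1] for i in seq)
            if top == bottom:
                return [i + 1 for i in seq]  # 1-based tile numbers
    return None  # no match up to the bound (inconclusive in general)

print(pcp_search(TILES))  # None: this instance has no short match
```

Returning `None` only says no match exists up to the bound; that inconclusiveness is exactly why PCP is undecidable.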
Ambiguity of CFGs
\[\text{AMB}=\{G\vert G \text{ is an ambiguous CFG}\}\]
This language is:
- Not Recursive
- Recursively Enumerable
- A TM can enumerate derivations and accept once it finds two distinct derivations of the same string; it may run forever on unambiguous grammars.
We can argue that:
If $\text{AMB}$ is recursive, then so is $\text{PCP}$.
Solving Decidability of Ambiguous CFGs with the PCP
We need to find a way to convert the one problem to the other:
\[T(\text{collection of tiles})\rightarrow G(\text{CFG})\]
- If $T$ can be matched, then $G$ is ambiguous.
- If $T$ cannot be matched, then $G$ is unambiguous.
- Number the tiles:
flowchart LR
subgraph 1
direction LR
bab
cc
end
subgraph 2
direction LR
c
ab
end
subgraph 3
direction LR
a
ab2[ab]
end
1 -.- 2 -.- 3
- Generate the terminals from the tile numbers and the distinct letters on the tiles.
Terminals: a, b, c, 1, 2, 3
- There are always three variables:
- $S$ - Start
- $T$ - Top
- $B$ - Bottom
- For the productions we always have $S\rightarrow T\vert B$, and then for each of the tiles we have productions like so:
- $T\rightarrow \text{bab}T1$
- $B\rightarrow \text{cc}B1$
- $T\rightarrow \text{bab}1$
- $B\rightarrow \text{cc}1$
If the top row and bottom row match then there are multiple derivations for the same string:
- $S\Rightarrow T$ $\Rightarrow\text{bab}T1$ $\Rightarrow\text{babc}T21$ $\Rightarrow\text{babcc}221$
- $S\Rightarrow B$ $\Rightarrow\text{cc}B1$ $\Rightarrow\text{ccab}B21$ $\Rightarrow\text{ccabab}221$
These two derivations do not produce the same string, so this tile sequence is not a match. If no tile sequence matches, then no string has two derivations, and the grammar is unambiguous.
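The match-iff-ambiguous connection can be checked mechanically for the three tiles above by enumerating bounded derivations from $T$ and from $B$ and intersecting them. A sketch; the tile numbering follows the notes and the depth bound is my own assumption:

```python
# Tiles from the notes, numbered 1..3: (top, bottom)
TILES = [("bab", "cc"), ("c", "ab"), ("a", "ab")]

def derivations(side, max_tiles=5):
    """All strings derivable from T (side=0) or B (side=1) using at most
    max_tiles tiles: the tile words followed by the tile numbers reversed."""
    out = set()
    def rec(word, nums, depth):
        if nums:
            out.add(word + "".join(str(n) for n in reversed(nums)))
        if depth == max_tiles:
            return
        for i, tile in enumerate(TILES, start=1):
            rec(word + tile[side], nums + [i], depth + 1)
    rec("", [], 0)
    return out

# S -> T | B, so G is ambiguous iff some string derives from both T and B,
# i.e. iff the tiles have a top-bottom match.
print(derivations(0) & derivations(1))  # set(): no match found up to the bound
```

The number suffix encodes the tile sequence, so a common string forces the same sequence on both sides with equal tops and bottoms, i.e. a PCP match.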
We currently know the decidability of the following problems:
| Decidable | Undecidable |
| --- | --- |
| DFA $M$ accepts $w$ | TM $M$ accepts $w$ |
| CFG $G$ generates $w$ | TM $M$ halts on $w$ |
| DFAs $M$ and $M'$ accept the same inputs | TM accepts some input |
| | TMs $M$ and $M'$ accept the same inputs |
Computation Histories as Strings
If $M$ halts on $w$, the computation history of $(M,w)$ is the sequence of configurations $C_1,\ldots,C_k$ that $M$ goes through on input $w$.
The computation history can be written as a string `hist` over the alphabet $\Gamma\cup Q\cup\{\#\}$:
- Accepting History - $q_\text{acc}$ occurs in `hist`.
- Rejecting History - $q_\text{rej}$ occurs in `hist`.
Undecidable CFG Example
Consider the following language of CFGs:
\[\text{ALL}_\text{CFG}=\{\langle G\rangle\vert G\text{ is a CFG that generates all strings}\}\]
This language is undecidable.
We can argue that if $\text{ALL}_\text{CFG}$ is recursive, then so is $\overline{A_\text{TM}}$.
graph LR
m["(G)"] --> A
A -->|if G generates all strings| accept
A -->|if not| reject
flowchart LR
m["(M,w)"] --> c
subgraph S
c[construct G] -->|"(G)"| A
end
A -->|if M rej/loops w| accept
A -->|if M accepts| reject
…
I’m not too sure about what the method is completing in this example. Ask about slide 137 if it comes up in the tutorials.
This is a common technique that can be used to show how hard a problem is, or to reduce a hard problem to a simpler one.
As we know that $A_\text{TM}$ is not recursive we can use that to prove that a language $L$ is not recursive:
- If some TM $R$ decides $L$ then using $R$ we can build another TM $S$ so that $S$ decides $A_\text{TM}$
This is impossible, as $A_\text{TM}$ is not recursive. This proves the claim by contradiction.
Reducibility Example
Recursively Enumerable Languages
Consider that we have the following language:
\[\text{AEPS}_\text{TM}=\{\langle M\rangle\vert M\text{ is a TM that accepts input }\epsilon\}\]
To know if $M$ accepts $\epsilon$ we will have to simulate it:
- This means that we may end up in a loop as $M$ is a Turing machine.
This is just a guess but it still needs to be proved.
We can complete the proof by completing the reduction:
graph LR
m["(M')"] --> R
R -->|if M' accepts epsilon| accept
R -->|if not| reject
We can make a Turing machine $M’$ that:
- On input $z$ (ignore),
- Simulate $M$ on input $w$:
- If $M$ accepts $w$, accept.
- Otherwise, reject.
We can then make a Turing machine $S$ that has the following actions:
- On input $\langle M,w\rangle$:
- Construct the following TM $M’$:
- On input $z$ (ignore),
- Simulate $M$ on input $w$:
- If $M$ accepts $w$, accept.
- Otherwise, reject.
- Run $R$ on input $\langle M’\rangle$ and output its answer.
flowchart LR
m["(M,w)"] --> c
subgraph S
c[construct M'] -->|"(M')"| R
end
R -->|if M accepts w| accept
R -->|if not| reject
If this machine $S$ exists then $S$ accepts $\langle M,w\rangle$ if and only if $M$ accepts $w$. This means that $S$ decides $A_\text{TM}$, which is impossible.
Therefore this language is undecidable but still recursively enumerable.
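The reduction can be sketched in Python, with plain functions standing in for TM descriptions. This is a loose analogy only (Python functions are not TM encodings), but it shows the shape of the construction:

```python
def make_M_prime(M, w):
    """Reduction step: build M' that ignores its input z and runs M on w.
    Python functions stand in for TM descriptions; like a TM, the
    call M(w) may loop forever."""
    def M_prime(z):          # z is ignored
        return M(w)          # accept iff M accepts w
    return M_prime

# If some TM R decided AEPS_TM, then S(<M, w>) = R(<M'>) would decide A_TM,
# since M' accepts epsilon iff M accepts w.
M = lambda x: x == "ab"      # toy stand-in for a TM
print(make_M_prime(M, "ab")(""))   # True: M' accepts epsilon since M accepts "ab"
```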
Undecidable Languages
Consider that we have the following Turing machine:
\[E_\text{TM}=\{\langle M\rangle\vert M\text{ is a TM that accepts no input}\}\]
We need to show:
- If $E_\text{TM}$ can be decided by some TM $R$
- then $\text{SOME}_\text{TM}$ can be decided by another TM $S$.
Where:
\[\text{SOME}_\text{TM} = \{\langle M\rangle\vert M\text{ is a TM that accepts some input}\}\]
Therefore consider $S$:
- On input $\langle M\rangle$ where $M$ is a TM:
- Run $R$ on input $\langle M\rangle$:
- If $R$ accepts, reject.
- If $R$ rejects, accept.
Then $S$ decides $\text{SOME}_\text{TM}$, which is a contradiction, proving that $E_\text{TM}$ is undecidable.
We know that if $L$ and $\overline L$ are both recursively enumerable then both are recursive:
- $\text{SOME}_\text{TM}$ is recursively enumerable but not recursive, so its complement $E_\text{TM}$ cannot be recursively enumerable, i.e. it is unrecognisable.
We have found that:
The language $A_\text{TM}$ is recursively enumerable but not recursive.
Consider that we have a set of languages that are not recognisable:
\[\begin{aligned}
\overline{A_\text{TM}}=&\{\langle M,w\rangle:M\text{ rejects or loops on input }w\}\\
&\cup\{x:x\text{ does not describe a TM or a valid input}\}
\end{aligned}\]
The language $\overline{A_\text{TM}}$ is not recursively enumerable.
Proof of Unrecognisable Languages
If $L$ and $\overline L$ are both recursively enumerable then $L$ is recursive.
We know that $A_\text{TM}$ is recursively enumerable, so if $\overline{A_\text{TM}}$ was also, then $A_\text{TM}$ would be recursive:
- But Turing’s theorem says that $A_\text{TM}$ is not recursive.
This statement is a contradiction.
Consider the following TM for the statement at the top:
flowchart LR
w --> M & M'
subgraph
M
M'
end
M -->|accept if w in L| accept
M -->|reject/loop if w not in L| reject
M' -->|reject/loop if w in L| accept
M' -->|accept if w not in L| reject
This machine will eventually give us an answer: if one machine gets into a loop, the other one is still able to answer.
Let $M=\text{TM}$ for $L$, $M’=\text{TM}$ for $\overline L$.
On input $w$:
-
Simulate $M$ on input $w$. If it accepts, accept.
If $M$ loops on $w$, we will never get to step two.
-
Simulate $M’$ on input $w$. If it accepts, reject.
Bounded Simulation
To fix the issue where we never get to step two, we can use a bounded simulation.
Let $M=\text{TM}$ for $L$, $M’=\text{TM}$ for $\overline L$.
For $t:=0$ to infinity:
- Do $t$ transitions of $M$ on $w$. If it accepts, accept.
- Do $t$ transitions of $M’$ on $w$. If it accepts, reject.
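The dovetailing loop can be sketched with Python generators standing in for TMs, one `yield` per transition. A toy example under that assumption, not real Turing machines:

```python
from itertools import count

def run_steps(machine, w, t):
    """Do at most t transitions of `machine` on w; True iff it accepted."""
    gen = machine(w)
    for _ in range(t):
        try:
            if next(gen) == "accept":
                return True
        except StopIteration:
            return False
    return False

def decide(M, M_bar, w):
    """Dovetail recognisers for L and its complement; always halts,
    because exactly one of them accepts each w."""
    for t in count(1):
        if run_steps(M, w, t):
            return True          # w in L
        if run_steps(M_bar, w, t):
            return False         # w not in L

# Toy recognisers for L = {even-length strings}; each loops on the other's inputs.
def M(w):
    while len(w) % 2 != 0:
        yield "running"          # loops forever on odd-length input
    yield "accept"

def M_bar(w):
    while len(w) % 2 == 0:
        yield "running"
    yield "accept"

print(decide(M, M_bar, "ab"), decide(M, M_bar, "abc"))  # True False
```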
A TM that decides $A_\text{TM}$ is so powerful that it will destroy itself.
Proof of Turing’s Theorem by Contradiction
Suppose that $A_\text{TM}$ is decidable.
There exists a total TM $H$ that decides $A_\text{TM}$.
It should have the following terminal states:
- Accept:
  - If $M$ accepts $\langle M\rangle$.
- Reject:
  - If $M$ rejects $\langle M\rangle$.
  - If $M$ loops on $\langle M\rangle$.
Consider that the input string $w=\langle M\rangle$.
Also consider that we have a Turing machine $H’$ that has its outputs swapped:
flowchart LR
a[ ] --> H
subgraph H'
H --> acc & rej
end
acc --> r2[rej]
rej --> a2[acc]
Also consider that we have a TM called $D$ that does the following:
graph LR
subgraph D
copy -->|"(M,(M))"| H'
end
M["(M)"] --> copy
H' --> acc & rej
Consider that $M=D$ is fed into $D$:
- If $D$ accepts $\langle D\rangle$ then $D$ rejects $\langle D\rangle$.
- If $D$ rejects $\langle D\rangle$ then $D$ accepts $\langle D\rangle$.
This is a contradiction and therefore $A_\text{TM}$ is undecidable.
There is also an example of this proof in table form in the lecture video.
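The diagonal construction can be mimicked in Python: whatever total function we propose as a decider $H$, the machine $D$ built from it contradicts $H$'s own prediction on input $\langle D\rangle$. Functions stand in for TM descriptions here, which is an analogy rather than a faithful encoding:

```python
def make_D(H):
    """Given a claimed decider H(M, x) -> bool for 'M accepts x',
    build the diagonal machine D that does the opposite on (M, M)."""
    def D(M):
        return not H(M, M)   # do the opposite of what H predicts
    return D

# Whatever total H we pick, D's actual behaviour on <D> disagrees
# with H's claim about it:
H_guess = lambda M, x: True          # a stand-in "decider"
D = make_D(H_guess)
print(D(D), H_guess(D, D))  # False True: D disagrees with H's prediction
```

No choice of total `H_guess` can avoid this disagreement, which is the content of the contradiction above.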
A universal Turing machine defines a formal specification to record a program $M$:
graph LR
p[Program M] & i[Input x for M] --> U --> w[Whatever M does on x]
The universal TM $U$ takes as input a program $M$ and a string $x$ and simulates $M$ on $x$.
Turing Machines as Programs
We can write the following Turing machine:
stateDiagram
direction LR
[*] --> q
state M {
direction LR
q --> q:□/□R
q --> qa:0/0R
q --> qr:1/1R
}
qa --> [*]
as the following program:
<M> = (q, qa, qr) (0, 1) (0, 1, □)
((q, q, □/□R)
(q, qa, 0/0R)
(q, qr, 1/1R))
(q) (qa) (qr)
This relates to the definition of a Turing machine which is:
\[(Q,\Sigma,\Gamma,\delta,q_0,q_\text{acc},q_\text{rej})\]
Universal Turing Machine
Once we have $M$ we can then pass it into the universal Turing machine $U$:
U := On input <M, w>
Simulate M on input w
If M enters accept state, accept.
If M enters reject state, reject.
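A minimal sketch of $U$ in Python, using a transition-table encoding in place of $\langle M\rangle$. The step bound is my own assumption so the demo halts; a real $U$ simply loops whenever $M$ loops:

```python
def simulate(tm, w, max_steps=1000):
    """Minimal universal-TM-style simulator.
    tm = (start, accept, reject, delta) where
    delta[(state, symbol)] = (new_state, write, move in {'L', 'R'})."""
    tape = dict(enumerate(w))               # sparse one-way-infinite tape
    state, head = tm[0], 0
    for _ in range(max_steps):
        if state == tm[1]:
            return "accept"
        if state == tm[2]:
            return "reject"
        sym = tape.get(head, "_")           # '_' is the blank symbol
        state, write, move = tm[3][(state, sym)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "no answer (still running)"

# The machine <M> from the notes: skip blanks right, accept on 0, reject on 1.
M = ("q", "qa", "qr", {
    ("q", "_"): ("q", "_", "R"),
    ("q", "0"): ("qa", "0", "R"),
    ("q", "1"): ("qr", "1", "R"),
})
print(simulate(M, "0"), simulate(M, "1"))  # accept reject
```

Note that `simulate(M, "")` runs out of steps: on the all-blank tape this $M$ moves right forever, which previews the difference between recognising and deciding below.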
Acceptance of Turing Machines
Consider we want to make a Turing machine that decides whether a Turing machine accepts a given string:
\[A_\text{TM}=\{\langle M,w\rangle:M\text{ is a TM that accepts }w\}\]
From this we can have three results:
- $M$ accepts $w$ - $U$ accepts $\langle M,w\rangle$.
- $M$ rejects $w$ - $U$ rejects $\langle M,w\rangle$.
- $M$ loops on $w$ - $U$ loops on $\langle M,w\rangle$.
Therefore $U$ recognises this language but does not decide $A_\text{TM}$:
- Recognises - The language is the set of inputs that make the machine accept (it may loop on other inputs).
- Decides - The machine recognises the language and halts on every input.
Terminology
The following terms are sometimes referred to by different names:
- Recursive Language:
  - Decidable problem.
  - Total Turing machine.
- Recursively Enumerable Language:
  - Semi-decidable problem.
  - Recognisable language.
- Undecidable Problem:
  - A problem whose language is not recursive; recursively enumerable but non-recursive languages are sometimes called undecidable.
Checking Decidability with a TM
To show that the acceptance problem for DFAs is decidable, we can use a Turing machine.
For this we simulate the DFA $D$ with the input $w$ on the Turing machine:
- If the simulation ends in an accepting state then the Turing machine accepts.
To perform the simulation we complete the following:
- Put markers on the start state of $D$ and the first symbol of $w$.
- Until the marker for $w$ passes the last symbol:
  - Move the state marker according to $D$'s transition function on the marked input symbol, and advance the input marker.
- If the state marker ends on an accepting state, accept; otherwise reject.
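The simulation is straightforward to sketch; the even-number-of-1s DFA below is a hypothetical example of my own:

```python
def dfa_accepts(dfa, w):
    """Decider for DFA acceptance: simulate D on w; always halts (|w| steps)."""
    start, accepting, delta = dfa
    state = start
    for symbol in w:               # the marker sweeping across w
        state = delta[(state, symbol)]
    return state in accepting

# Hypothetical DFA over {0,1} accepting strings with an even number of 1s.
D = ("even", {"even"},
     {("even", "0"): "even", ("even", "1"): "odd",
      ("odd", "0"): "odd",   ("odd", "1"): "even"})
print(dfa_accepts(D, "11"), dfa_accepts(D, "1"))  # True False
```

Because the loop runs exactly $\lvert w\rvert$ times, this procedure always halts, which is why $A_\text{DFA}$ is decidable.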
We can also use our intuition to find decidability.
Decidability
- Not every question about CFGs is decidable.
- The Entscheidungsproblem asks for an algorithm to decide whether a given statement is provable from the axioms using the rules of first-order logic; no such algorithm exists.
The running time of a TM $M$ is the function $t_M(n)$ where the function equals:
The maximum number of steps that $M$ takes on any input of length $n$.
Running times are usually stated and compared using big-O notation.
Efficiency & the Church-Turing Thesis
Although different types of Turing machines are equally as powerful:
- They can’t all compute the same problem in the same running time.
Cobham-Edmonds Thesis
A computational problem can be feasibly solved on a computational device only if it can be computed in polynomial time, i.e., in time $O(n^c)$ where $n$ is the size of the input and $c$ is some constant.
The exact polynomial can differ between machine architectures, but polynomial-time solvability does not.
P vs NP
- $P$ is the class of languages that can be decided on a TM with running time polynomial in the input length.
- $NP$ is the class of languages that can be decided on a nondeterministic TM with running time polynomial in the input length.
It is an open question whether $P=NP$.
Our formal model of a computer will use the following instruction set:
| Instruction | Description |
| --- | --- |
| `load x` | Put the value $x$ into $R_0$. |
| `load Rk` | Copy the value of $R_k$ into $R_0$. |
| `store Rk` | Copy the value of $R_0$ into $R_k$. |
| `read Rk` | Copy the value at memory location $R_k$ into $R_0$. |
| `write Rk` | Copy the value of $R_0$ into memory location $R_k$. |
| `add Rk` | Add $R_0$ and $R_k$, and put the result in $R_0$. |
| `jump n` | Set PC to $n$. |
| `jzero n` | Set PC to $n$, if $R_0$ is zero. |
| `jpos n` | Set PC to $n$, if $R_0$ is positive. |
Random Access Machines
Random access machines have the following memory components:
- $PC$ - Program Counter
- $R_x$ - Registers
- $M$ - Memory.
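This RAM model can be sketched in Python. The instruction names follow the table above, except that `load_const` is my own name to distinguish `load x` from `load Rk`; dictionary-based registers and memory are implementation choices:

```python
def run_ram(program, max_steps=10_000):
    """Sketch of the RAM model: PC, registers R, memory M.
    program is a list of (opcode, argument) pairs."""
    R, M, pc = {}, {}, 0
    for _ in range(max_steps):
        if pc >= len(program):          # fell off the program: halt
            break
        op, arg = program[pc]
        pc += 1
        if op == "load_const": R[0] = arg                     # load x
        elif op == "load":     R[0] = R.get(arg, 0)           # load Rk
        elif op == "store":    R[arg] = R.get(0, 0)           # store Rk
        elif op == "read":     R[0] = M.get(R.get(arg, 0), 0) # read Rk
        elif op == "write":    M[R.get(arg, 0)] = R.get(0, 0) # write Rk
        elif op == "add":      R[0] = R.get(0, 0) + R.get(arg, 0)
        elif op == "jump":     pc = arg
        elif op == "jzero":    pc = arg if R.get(0, 0) == 0 else pc
        elif op == "jpos":     pc = arg if R.get(0, 0) > 0 else pc
    return R, M

# Compute 2 + 3 using load/store/add:
prog = [("load_const", 2), ("store", 1), ("load_const", 3), ("add", 1)]
R, _ = run_ram(prog)
print(R[0])  # 5
```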
Simulating a TM on a RAM
- Write the contents of the tape to $M$, where the tape alphabet $\Gamma=\{0,1,2,\ldots,k\}$ with 0 representing a blank.
- Write the index of the head pointer to $R_0$
There is a full example of this in the lecture.
Simulating a RAM on a 2 Tape TM
The configuration of a RAM consists of:
- Program Counter
- Contents of Registers
- Indexes and contents of all non-empty memory cells.
For example, for a RAM with:
- PC - 14
- $R_0$ - 3
- $R_1$ - 17
- $R_2$ - 5
- $M$ - 2, 0, 1, 2, 0, …
The configuration would be:
\[(14, 3, 17, 5, (0, 2), (2, 1), (3, 2))\]
The TM has a simulation tape and a scratch tape. The simulation tape stores the RAM configuration:
- The TM has a set of states corresponding to each program instruction of the RAM.
- The TM tape is updated according to RAM instructions.
There is a full example of simulating a RAM on a 2 tape Turing machine in the lecture video
All variants of Turing machines have the same computational ability.
The Church-Turing Thesis
Every effectively calculable function is a computable function where:
- Effectively Calculable - Produced by any intuitively effective means whatsoever.
- Effectively Computable - Produced by a Turing-machine or equivalent mechanical device.
Multitape Turing Machine
A multitape Turing machine has heads that can access multiple different infinitely long tapes.
- The transition may depend on the contents of all the cells.
- Different tape heads can be moved independently.
This can be more convenient but is not more computationally powerful.
We can use instructions like the following to instruct three heads:
stateDiagram
direction LR
q3 --> q7:0/1R\n□/1R\n0/0L
This would then take the Turing machine from the following state to the next:
flowchart LR
subgraph q3
head1[Head] --> 10 & 22 & 31
subgraph T1
10[0]
11[1]
12[0]
end
subgraph T2
20[0]
21[1]
22[ ]
end
subgraph T3
30[1]
31[0]
32[0]
end
end
subgraph q7
head2[Head] --> 211 & 223 & 230
subgraph T21
210[1]
211[1]
212[0]
end
subgraph T22
220[0]
221[1]
222[1]
223[ ]
end
subgraph T23
230[1]
231[0]
232[0]
end
end
q3 --> q7
Each line is an instruction for each of the three heads.
Simulating a Multitape TM
We can represent a multitape machine on a single tape by writing the tape contents next to each other, separated by the symbol $\#$.
So the following multitapes:
- $x_1,x_2,\ldots,\overset \downarrow {x_a}\ldots,x_i,\ldots$
- $y_1,y_2,\ldots,\overset \downarrow {y_a},\ldots$
- $z_1,z_2,\ldots,\overset \downarrow {z_a},\ldots$
can be represented as the following single tape:
\[\overset \downarrow \#x_1,x_2,\ldots,\dot{x_a},\ldots,x_i\#y_1,y_2,\ldots,\dot{y_a}\#z_1,z_2,\ldots,\dot{z_a}\#\]
Therefore we can use the following method on input $w_1\ldots w_n$:
- Replace the tape contents by $\#\dot{w_1}w_2\ldots w_n\#\dot{\square}\#\dot{\square}\#$ and remember (in the state) that $M$ is in state $q_0$.
-
To simulate a step of $M$, make a pass over the tape to find $\dot{x},\dot{y},\dot{z}$. If $M$ is in state $q_j$ and has transition:
stateDiagram
direction LR
qj --> qi:x/x'A\ny/y'B\nz/z'C
update the state/tape accordingly.
- If $M$ reaches accept/reject state, accept/reject.
Non-Deterministic Turing Machines
This type of Turing machine has no additional expressive power compared to a regular Turing machine.
Consider that we want to make a TM that accepts the following language:
\[L=\{w\#w:w\in\{a,b\}^*\}\]
We can write the following program, that instructs the head:
- Until you reach $\#$
- Read and remember entry ($x\underline bbaa\#xbbaa$)
- Write $x$ ($x\underline xbaa\#xbbaa$)
- Move right past $\#$ and past all $x$s ($xxbaa\#x\underline bbaa$)
- If this entry is different, reject
otherwise:
- Write $x$ ($xxbaa\#x\underline baa$)
- Move left past $\#$ and to right of first $x$ ($xx\underline baa\#xxbaa$)
- If you see only $x$s followed by $\square$, accept.
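The marking algorithm above can be sketched at a high level in Python, with a list standing in for the tape:

```python
def accepts_w_hash_w(s):
    """High-level simulation of the marking algorithm for {w#w}."""
    tape = list(s)
    if tape.count("#") != 1:
        return False
    sep = tape.index("#")
    i = 0
    while i < sep:
        c = tape[i]
        tape[i] = "x"                     # mark the entry on the left
        j = sep + 1                       # find first unmarked symbol after '#'
        while j < len(tape) and tape[j] == "x":
            j += 1
        if j == len(tape) or tape[j] != c:
            return False                  # entries differ: reject
        tape[j] = "x"                     # mark the matching entry
        i += 1
    # accept iff everything right of '#' has been marked
    return all(c == "x" for c in tape[sep + 1:])

print(accepts_w_hash_w("bbaa#bbaa"), accepts_w_hash_w("bbaa#bbab"))  # True False
```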
From this we can make the following state machine:
stateDiagram
direction LR
[*] --> q0
q0 --> qa1:a/xR
q0 --> q1:#/#R
q0 --> qb1:b/xR
qa1 --> qa1:a/aR\nb/bR
qa1 --> qa2:#/#R
q1 --> q1:x/xR
q1 --> qacc:□/□R
qacc --> [*]
qb1 --> qb1:a/aR\nb/bR
qb1 --> qb2:#/#R
qa2 --> qa2:x/xR
qa2 --> q2:a/xL
qb2 --> qb2:x/xR
qb2 --> q2:b/xL
q2 --> q2:a/aL\nb/bL\nx/xL
q2 --> q3:#/#L
q3 --> q3:a/aL\nb/bL
q3 --> q0:x/xR
All other transitions go to $q_\text{rej}$.
We will generally just use the high level description for simplicity.
Examples
There are additional examples of Turing machines that complete:
- Multiplication
- Comparing Strings
- Reading Graphs as Strings $\langle G\rangle$ (Serialisation)
in the lecture video.
What is a Computer
A computer is a machine that manipulates data according to a list of instructions:
graph LR
program --> computer
input --> computer
computer --> output
It is impossible for a computer to predict what any computer program will do.
Turing Machines
graph
control -->|head| a
subgraph tape
1[ ]
2[ ]
a
b
c[b]
3[ ]
4[ ]
end
A Turing machine can:
- Read and write symbols, at the head position, on a tape that is infinitely long in both directions.
- Can move the head left and right.
- Has two special states: accept and reject.
A Turing machine is $(Q,\Sigma,\Gamma,\delta,q_0,q_\text{acc},q_\text{rej})$:
This definition states that Turing machines are deterministic. Deterministic Turing machines can simulate non-deterministic ones.
Turing Machine Notation
The following state diagram:
stateDiagram
direction LR
q1 --> q2:a/bL
means that if we read an $a$:
- Replace with a $b$
- Move left.
The configuration of the Turing machine can be written like so:
\[abq_1a\]
where the machine is in state $q_1$, the tape contains $aba$, and the head is over the symbol immediately after $q_1$ (the final $a$).
Configurations
We say the configuration $C$ yields $C’$ if the TM can go from $C$ to $C’$ in one step:
\[abq_1a \text{ yields } abbq_\text{acc}\]
- The start configuration of the TM on input $w$ is $q_0w$.
- An accepting configuration is one that contains $q_\text{acc}$.
- A rejecting configuration is one that contains $q_\text{rej}$.
The Language of a Turing Machine
We say that $M$ accepts $x$ if there exists a sequence of configurations $C_0,C_1,\ldots,C_k$ where:
- $C_0$ is starting.
- $C_i$ yields $C_{i+1}$.
- $C_k$ is accepting.
Halting
Turing machines can get into a loop where they never halt.
We say $M$ halts on $x$ if there exists a sequence of configurations $C_0,C_1,\ldots,C_k$ where:
- $C_0$ is starting.
- $C_i$ yields $C_{i+1}$.
- $C_k$ is accepting or rejecting.
A TM is total if it halts on every input.
A language $L$ is recursive (decidable) if it is recognised by a total TM.
Intersection of Context-Free & Regular Languages
The intersection of a CFL and a regular language:
\[L_1\cap L_2\]
is context free.
Combining the Languages
To combine the CFL $M_1$ (a PDA) and regular language $M_2$ (a DFA) we can use the following transformations:
-
Transition 1
stateDiagram
state M1 {
direction lr
q1 --> q2:a, b/c
}
state M2 {
direction lr
p1 --> p2:a
}
state M {
direction lr
q1,p1 --> q2,p2:a, b/c
}
M1 --> M
M2 --> M
-
Transition 2
stateDiagram
state M1 {
direction lr
q1 --> q2:epsilon, b/c
}
state M2 {
direction lr
p1
}
state M {
direction lr
q1,p1 --> q2,p1:epsilon, b/c
}
M1 --> M
M2 --> M
-
Initial State
stateDiagram
state M1 {
direction lr
[*] --> q0
}
state M2 {
direction lr
[*] --> p0
}
state M {
direction lr
[*] --> q0,p0
}
M1 --> M
M2 --> M
-
Final States
stateDiagram
state M1 {
direction lr
q1 --> [*]
}
state M2 {
direction lr
p1 --> [*]
p2 --> [*]
}
state M {
direction lr
q1,p1 --> [*]
q1,p2 --> [*]
}
M1 --> M
M2 --> M
General Rules
Applications of Regular Closure
You can use combinations of regular and context-free closures to prove that a language is context-free:
-
If we can make a language as an intersection of context-free and regular languages, then we know that it is context-free.
This allows you to prove that the language is context free without having to build a grammar.
There is an example of this at the end of the lecture video.
Union
The union of two CFLs:
\[L_1\cup L_2\]
is context free.
In general, for two context free languages $L_1$ and $L_2$ with start variables $S_1$ and $S_2$ the grammar of the union can be represented as:
\[S\rightarrow S_1\vert S_2\]
Where all other productions are kept the same.
Concatenation
The concatenation of two CFLs:
\[L_1L_2\]
is context free.
In general, for two context free languages $L_1$ and $L_2$ with start variables $S_1$ and $S_2$ the grammar of the concatenation can be represented as:
\[S\rightarrow S_1S_2\]
Kleene Star
The star operation of a CFL:
\[L^*\]
is context free.
In general the star operation of a language can be represented as:
\[S_1\text{ (new start variable)}\\
S_1\rightarrow SS_1\vert \epsilon\]
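The three closure constructions can be sketched as grammar transformations. Productions are dictionaries mapping a variable to a list of bodies (each a list of symbols); disjoint variable names and a fresh start variable are assumptions of the sketch:

```python
# Closure constructions on CFGs. "S" / "S1" are assumed fresh variables.
def union(g1, s1, g2, s2):
    return {"S": [[s1], [s2]], **g1, **g2}      # S -> S1 | S2

def concat(g1, s1, g2, s2):
    return {"S": [[s1, s2]], **g1, **g2}        # S -> S1 S2

def star(g, s):
    return {"S1": [[s, "S1"], []], **g}         # S1 -> S S1 | epsilon

# e.g. union of {a^n b^n} (start S1) and {c^n d^n} (start S2):
g = union({"S1": [["a", "S1", "b"], []]}, "S1",
          {"S2": [["c", "S2", "d"], []]}, "S2")
print(g["S"])  # [['S1'], ['S2']]
```

All other productions of the input grammars are kept unchanged, exactly as the text describes.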
Intersection
The intersection of two CFLs:
\[L_1\cap L_2\]
is not necessarily context free.
Complement
The complement of a CFL:
\[\bar L\]
is not necessarily context free.
If a derivation is long enough, some variable must appear twice on the same path in a parse tree (pigeonhole principle):
- This means that any sufficiently large derivation will have a middle part that can be repeated indefinitely.
Pumping in General
Consider that you have a CFG that generates the following string, with a production that repeats at least once:
\[uvwxy\]
We can make the production repeat more to create the following string:
\[uv^2wx^2y\]
Without repeating the production at all, the following is also in the same language:
\[uwy\]
Pumping Example
Consider the following language; you'd like to know whether it is context-free:
\[L=\{a^nb^nc^n:n\geq0\}\]
If $L$ is context-free then we should be able to split any long enough string into $uvwxy$ so that every $uv^iwx^iy$ (including $uwy$) stays in the language.
This is not possible for this language, no matter how the string is split.
More examples of non-context free languages are given in the following lecture video.
Pumping Lemma for Context-Free Languages
For every context-free language $L$:
There exists a number $n$ such that every string $z$ in $L$ longer than $n$ can be written as $z=uvwxy$ where:
- $\lvert vwx\rvert\leq n$
- $\lvert vx\rvert\geq 1$
- For every $i\geq0$, the string $uv^iwx^iy$ is in the language $L$.
To prove that this language is not context-free, we show that every possible split falls into one of two failing cases:
-
$v$ or $x$ contains two kinds of symbols:
\[a\underbrace{aa\ldots aa}_vbbb\ldots \underbrace{bbcc}_xc\ldots cc\]
Then $uv^2wx^2y$ is not in $L$ because the pattern is wrong.
-
$v$ or $x$ contains one kind of symbol:
\[a\underbrace{aa\ldots }_vaabbb\ldots \underbrace{bb\ldots bb}_xc\ldots cc\]
Then $uv^2wx^2y$ does not have the same number of $a$s, $b$s and $c$s.
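The $i=2$ pumping step can be checked concretely for one candidate split of $z=aabbcc$. This is a sketch for a single split; any other split fails in the same way:

```python
def pumped(u, v, w, x, y, i):
    """The pumped string u v^i w x^i y."""
    return u + v * i + w + x * i + y

def in_L(s):
    """Membership in L = {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

# Candidate split of z = aabbcc with v = 'a', x = 'b' (so |vx| >= 1):
u, v, w, x, y = "a", "a", "b", "b", "cc"
print(in_L(pumped(u, v, w, x, y, 1)), in_L(pumped(u, v, w, x, y, 2)))  # True False
```

Pumping with $i=2$ yields `aaabbbcc`, which has three $a$s and $b$s but only two $c$s, so it leaves the language.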
Simplified PDA
To convert a PDA to a CFG you first have to convert it to a simplified PDA:
graph LR
CFG --> PDA --> spda[Simplified PDA] --> CFG
To be simplified it must have the following properties:
- Has a single accept state.
- Empties its stack before accepting.
- Each transition is either a push, or a pop, but not both.
Consider the following PDA:
stateDiagram
[*] --> q0
q0 --> q1:epsilon, epsilon/$
q1 --> [*]
q1 --> q1: 0, epsilon/a
q1 --> q2: 1, a/b
q2 --> [*]
q2 --> q2: 1, a/epsilon\n1, b/epsilon
q1 --> q3: epsilon, epsilon/epsilon
q2 --> q3: epsilon, epsilon/epsilon
q3 --> q3: epsilon, a/epsilon\nepsilon, b/epsilon
q3 --> [*]
q3 --> q4: epsilon, $/epsilon
q4 --> [*]
-
Single accept state:
stateDiagram
[*] --> q0
q0 --> q1:epsilon, epsilon/$
q1 --> q1: 0, epsilon/a
q1 --> q2: 1, a/b
q2 --> q2: 1, a/epsilon\n1, b/epsilon
q1 --> q3: epsilon, epsilon/epsilon
q2 --> q3: epsilon, epsilon/epsilon
q3 --> q3: epsilon, a/epsilon\nepsilon, b/epsilon
q3 --> q4: epsilon, $/epsilon
q4 --> [*]
-
Empty stack before accepting:
This is already the case so no change is needed.
-
Each transition is a push, pop, but not both:
stateDiagram
[*] --> q0
q0 --> q1:epsilon, epsilon/$
q1 --> q1: 0, epsilon/a
q1 --> q5: 1, a/epsilon
q5 --> q2: 1, epsilon/b
q2 --> q2: 1, a/epsilon\n1, b/epsilon
q1 --> q6: epsilon, epsilon/x
q6 --> q3: epsilon, x/epsilon
q2 --> q7: epsilon, epsilon/x
q7 --> q3: epsilon, x/epsilon
q3 --> q3: epsilon, a/epsilon\nepsilon, b/epsilon
q3 --> q4: epsilon, $/epsilon
q4 --> [*]
Simplified PDA to CFG
Consider the following run of the PDA on the input `00011`:
| Input | State | Stack (bottom to top) |
| --- | --- | --- |
| | $q_0$ | |
| $\epsilon$ | $q_1$ | `$` |
| $0$ | $q_1$ | `$a` |
| $0$ | $q_1$ | `$aa` |
| $0$ | $q_1$ | `$aaa` |
| $1$ | $q_5$ | `$aa` |
| $\epsilon$ | $q_2$ | `$aab` |
| $1$ | $q_2$ | `$aa` |
| $\epsilon$ | $q_7$ | `$aax` |
| $\epsilon$ | $q_3$ | `$aa` |
| $\epsilon$ | $q_3$ | `$a` |
| $\epsilon$ | $q_3$ | `$` |
| $\epsilon$ | $q_4$ | |
- We can view each column of the same character in the stack as a production.
- For the first column of `a`s we can have the following production:
\[A_{13}=\{x:x\text{ leads from } q_1 \text{ to } q_3\text{ and does not dip under the red line}\}\]
The red line here is the first column of `a`s.
This is called $A_{13}$ as it takes us from $q_1$ to $q_3$. It is taken from the row before the column starts up to the row of the last entry in the column.
- From this we can consider the inputs that surround the column in order to complete the production:
\[A_{13}\rightarrow 0A_{13}\epsilon\]
These are taken from the row where the column starts and the row after the column ends.
There are additional, simpler, examples available in the lecture video.
General Method
The CFG will be made of two types of productions:
- Variables ($A_{ij}$) - Go from $q_i$ to $q_j$.
- Start Variable ($A_{0f}$) - Go from the single start state $q_0$ to the single accept state $q_f$.
We can then cover the following cases with these productions:
-
$A_{ij} \rightarrow aA_{i_1j_1}b$
stateDiagram
direction LR
state intermediate {
direction LR
[*] --> [*]
}
qi --> qi1:a, epsilon/x
qi1 --> intermediate
intermediate --> qj1
qj1 --> qj:b, x/epsilon
-
$A_{ik}\rightarrow A_{ij}A_{jk}$
stateDiagram
direction LR
state intermediate1 {
direction LR
[*] --> [*]
}
state intermediate2 {
direction LR
[*] --> [*]
}
qi --> intermediate1
intermediate1 --> qj
qj --> intermediate2
intermediate2 --> qk
-
$A_{ii}\rightarrow\epsilon$
stateDiagram
qi
Pushdown Automata Convention
When we have a sequence of transitions like:
stateDiagram
direction LR
q0 --> q01: x,a/b
q01 --> q02: epsilon, epsilon/c
q02 --> q1: epsilon, epsilon/d
Pop $a$ then push $b$, $c$ and $d$.
We will abbreviate it like this:
stateDiagram
direction LR
q0 --> q1: x, a/dcb
Replace $a$ by $dcb$ on top of the stack (notice: the reverse order: the first symbol of the word is at the top of the stack)
Converting a CFG to a PDA
The idea is to simulate the derivations.
We will convert the following context free grammar:
\[\begin{aligned}
A&\rightarrow 0A1\\
A&\rightarrow B\\
B&\rightarrow \#
\end{aligned}\]
This would produce the following PDA:
stateDiagram
direction LR
[*] --> q0
q0 --> q1:epsilon, epsilon/A$
q1 --> q2:epsilon, $/epsilon
q2 --> [*]
q1 --> q1:epsilon,A/0A1\nepsilon,A/B\nepsilon,B/#
q1 --> q1:0,0/epsilon\n1,1/epsilon\n#,#/epsilon
- The first transition is the main production.
- The middle is all productions and terminals.
- The end is the empty stack.
We can then use this to match a given input string (`00#11`):
| Stack | Input Processed |
| --- | --- |
| `$A` | |
| `$1A0` | |
| `$1A` | `0` |
| `$11A0` | `0` |
| `$11A` | `00` |
| `$11B` | `00` |
| `$11#` | `00` |
| `$11` | `00#` |
| `$1` | `00#1` |
| `$` | `00#11` |
As only the stack marker `$` remains (which the final transition pops) and the whole input has been read, the input is accepted.
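The derivation-matching behaviour of this PDA can be sketched by a nondeterministic stack simulation, where recursion plays the role of the PDA's nondeterminism (stack top is the end of the list):

```python
# Stack simulation of the CFG-to-PDA construction for
# A -> 0A1 | B,  B -> '#'.
PRODS = {"A": ["0A1", "B"], "B": ["#"]}

def accepts(w):
    def step(stack, i):
        if not stack:
            return i == len(w)                    # empty stack, input consumed
        top = stack[-1]
        if top in PRODS:                          # expand a variable on top
            return any(step(stack[:-1] + list(reversed(body)), i)
                       for body in PRODS[top])
        # terminal on top: it must match the next input symbol
        return i < len(w) and w[i] == top and step(stack[:-1], i + 1)
    return step(["A"], 0)                         # start with A$ (we omit $)

print(accepts("00#11"), accepts("00#1"))  # True False
```

Pushing each production body reversed keeps its first symbol on top of the stack, matching the convention described under "Pushdown Automata Convention".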
General CFG to PDA Conversion
stateDiagram
direction LR
[*] --> q0
q0 --> q1:epsilon, epsilon/S$
q1 --> q2:epsilon, $/epsilon
q2 --> [*]
q1 --> q1:epsilon,A/a1...ak\nfor every production A --> a1...ak
q1 --> q1:a,a/epsilon\nfor every terminal a
For additional examples of PDAs see this lecture video.
Pushdown automata are a computational counterpart (like NFAs/DFAs for regular languages) of context-free grammars. They are equivalent to CFGs.
Pushdown automata are similar to NFAs but they have an infinite stack:
- As the PDA is reading the input, it can push/pop symbols from the top of the stack.
Building a PDA
We can build a PDA for the following context free grammar:
\[L=\{0^n1^n:n\geq1\}\]
stateDiagram
direction LR
[*] --> A:push $
A --> A:read 0\npush x
A --> B:read 1\npop x
B --> B:read 1\npop x
B --> [*]:pop $
- We remember each 0 by pushing $x$ onto the stack.
- When we see a 1, we pop an $x$ from the stack.
- We want to accept when we hit the stack bottom.
We use $ to mark the bottom of the stack.
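A deterministic sketch of this PDA's stack behaviour in Python, with a list standing in for the stack:

```python
def accepts_0n1n(w):
    """Stack simulation of the PDA for {0^n 1^n : n >= 1}."""
    stack = ["$"]                      # mark the bottom of the stack
    i = 0
    while i < len(w) and w[i] == "0":  # state A: push x for each 0
        stack.append("x")
        i += 1
    while i < len(w) and w[i] == "1":  # states A -> B: pop x for each 1
        if stack[-1] != "x":
            return False               # more 1s than 0s
        stack.pop()
        i += 1
    # accept iff the input is consumed, at least one 1 was read,
    # and only the bottom marker $ remains
    return i == len(w) and i > 0 and w[i - 1] == "1" and stack == ["$"]

print(accepts_0n1n("0011"), accepts_0n1n("001"))  # True False
```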
Notation for PDAs
The prior PDA didn't use the correct notation. Below is the naive example from before, and a PDA with correct notation:
stateDiagram
state Naive {
[*] --> A:push $
A --> A:read 0\npush x
A --> B:read 1\npop x
B --> B:read 1\npop x
B --> [*]:pop $
}
state Correct Notation {
[*] --> q1:epsilon, epsilon/$
q1 --> q1:0, epsilon/x
q1 --> q2:1, x/epsilon
q2 --> q2:1, x/epsilon
q2 --> [*]:epsilon, $/epsilon
}
This notation is in the form read, pop/push
Definition of a PDA
A pushdown automaton is ($Q,\Sigma,\Gamma, \delta, q_0,F$) where:
- $Q$ is a finite set of states.
- $\Sigma$ is the input alphabet.
- $\Gamma$ is the stack alphabet.
- $q_0\in Q$ is the initial state.
- $F\subseteq Q$ is a set of final states.
- $\delta$ is the transition function.
\[\delta:\underbrace{Q}_\text{state}\times\underbrace{(\Sigma\cup\{\epsilon\})}_\text{input symbol}\times\underbrace{(\Gamma\cup\{\epsilon\})}_\text{pop symbol}\rightarrow\text{subsets of }\underbrace{Q}_\text{state}\times\underbrace{(\Gamma\cup\{\epsilon\})}_\text{push symbol}\]
The Language of a PDA
The language of a PDA is the set of all strings in $\Sigma^*$ that can lead the PDA to an accepting state after the whole input is read.
- A PDA is nondeterministic.
- Multiple transitions on the same pop/input are allowed.
- Transitions may but do not have to push, pop or read input.
Sometimes acceptance is defined by emptying the stack, or by an empty stack combined with an accepting state. All these variants correspond to the same class of languages.
- A PCFG is a probabilistic version of a CFG where each production has a probability.
- Probabilities of all productions rewriting a given non-terminal must add up to 1, defining a distribution for each non-terminal.
- String generation is now probabilistic: production probabilities are used to randomly select a production for rewriting a given non-terminal.
This allows us to find the most likely meaning of a sentence in ambiguous languages.
Probabilistic Context Free Grammars
Each production is assigned a probability. For each non-terminal, the probabilities of its productions must add up to 1:
| Production | Probability |
|---|---|
| $S\rightarrow NPVP$ | 0.8 |
| $S\rightarrow Aux NPVP$ | 0.1 |
| $S\rightarrow VP$ | 0.1 |
Parse Tree Probability
The probability of a parse tree is the product of all the production probabilities used.
- If multiple parse trees are valid then we interpret the meaning from the parse tree with the greatest probability.
Probabilistic CYK
In addition to storing the non-terminal in each cell, we also need to store the probability:
- Cell $ij$ contains the most probable parse tree of each non-terminal that can generate the part of the input word from $i$ through $j$ together with its associated probability.
When transforming the grammar to CNF, we set the production probabilities to preserve the probability of derivations.
Parsing CYK Tables
- Use $\max$ to combine the probabilities of multiple parse trees yielding the same non-terminal in a cell when finding the most likely parse.
- The probability of generating a given sentence is the sum of the probabilities of all its parse trees, so use $+$ instead of $\max$.
Sentence Probability
For all cells in last row
If there is a production A -> xi with probability p
Put A:p in table cell ii
For cells st in other rows (going bottom to top)
If there is a production A -> BC with probability p
where B:pB is in cell sj and C:pC is in cell (j + 1)t and
A:pA in cell st (if not present assume pA = 0)
Update A:pA + p * pB * pC in cell st
Most Likely Parse Tree
For all cells in last row
If there is a production A -> xi with probability p
Put A:p in table cell ii
For cells st in other rows (going bottom up)
If there is a production A -> BC with probability p
where B:pB is in cell sj and C:pC is in cell (j + 1)t and
A:pA in cell st (if not present assume pA=0)
Update A:max(pA, p * pB * pC) in cell st
To complete this algorithm the grammar must be in Chomsky Normal Form, including no $\epsilon$ or unit productions.
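The most-likely-parse procedure above can be sketched in Python, assuming a CNF grammar encoded as two hypothetical dictionaries: `unit` (terminal to weighted variables) and `binary` (variable pair to weighted variables). The `max` update corresponds to the `Update A:max(pA, p * pB * pC)` step.

```python
# Sketch of probabilistic CYK, keeping the best (max) probability per
# variable in each cell.
def pcyk_best(unit, binary, x):
    n = len(x)
    # best[s][t][A] = probability of the most likely parse of x[s..t] from A
    best = [[{} for _ in range(n)] for _ in range(n)]
    for i, c in enumerate(x):                       # last row: length-1 cells
        for A, p in unit.get(c, []):
            best[i][i][A] = max(best[i][i].get(A, 0.0), p)
    for length in range(2, n + 1):                  # go bottom-up by length
        for s in range(n - length + 1):
            t = s + length - 1
            for j in range(s, t):                   # split point
                for (B, C), heads in binary.items():
                    pB = best[s][j].get(B, 0.0)
                    pC = best[j + 1][t].get(C, 0.0)
                    if pB and pC:
                        for A, p in heads:
                            cand = p * pB * pC
                            if cand > best[s][t].get(A, 0.0):
                                best[s][t][A] = cand
    return best

# Hypothetical toy grammar: S -> AB (0.5), A -> a (1.0), B -> b (1.0)
unit = {"a": [("A", 1.0)], "b": [("B", 1.0)]}
binary = {("A", "B"): [("S", 0.5)]}
print(pcyk_best(unit, binary, "ab")[0][1]["S"])  # 0.5
```

Swapping the `max` update for an accumulating sum gives the sentence-probability variant.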
Method
This method uses a table like so:
| $1k$ |  |  |  |
|---|---|---|---|
| $\vdots$ | $\vdots$ |  |  |
| $12$ | $23$ |  |  |
| $11$ | $22$ | $\cdots$ | $kk$ |
| $x_1$ | $x_2$ | $\cdots$ | $x_k$ |
Where $x_k$ is a symbol from the input string at position $k$.
For all cells in last row
If there is a production A -> xi
Put A in table cell ii
For cells st in other rows (going bottom-up)
If there is a production A -> BC
where B is in cell sj and C is in cell (j + 1)t
Put A in cell st
Cell $st$ remembers all variables able to generate sub-string $x_s\ldots x_t$.
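The method above can be sketched in Python, assuming the CNF grammar is encoded as two hypothetical dictionaries: `unit` (terminal to variables) and `binary` (variable pair to variables). The usage below encodes the grammar from the example that follows.

```python
# Sketch of the CYK membership test for a grammar in Chomsky normal form.
def cyk(unit, binary, x):
    n = len(x)
    # table[s][t] = set of variables generating x[s..t] (0-indexed, inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, c in enumerate(x):                 # last row: length-1 substrings
        table[i][i] = set(unit.get(c, []))
    for length in range(2, n + 1):            # go bottom-up by length
        for s in range(n - length + 1):
            t = s + length - 1
            for j in range(s, t):             # split point
                for (B, C), heads in binary.items():
                    if B in table[s][j] and C in table[j + 1][t]:
                        table[s][t].update(heads)
    return table

# S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a
unit = {"a": ["A", "C"], "b": ["B"]}
binary = {("A", "B"): ["S", "C"], ("B", "C"): ["S"],
          ("B", "A"): ["A"], ("C", "C"): ["B"]}
table = cyk(unit, binary, "baaba")
print("S" in table[0][4])  # True: baaba is generated
```

The run reproduces the example table: for instance `table[0][3]` (the sub-string baab) comes out empty, matching the "-" cell.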
Example
Consider that we have the following language and input string:
\[\begin{aligned}
S&\rightarrow AB\vert BC\\
A&\rightarrow BA\vert a\\
B&\rightarrow CC\vert b\\
C&\rightarrow AB\vert a\\
\hline
x&=\text{baaba}
\end{aligned}\]
Going up the table, we should write all the productions that can generate the previous row (from the bottom up):
| $\underline S,A,C$ (baaba) |  |  |  |  |
|---|---|---|---|---|
| - (baab) | $S,A,C$ (aaba) |  |  |  |
| - (baa) | $B$ (aab) | $\underline B$ (aba) |  |  |
| $S,\underline A$ (ba) | $B$ (aa) | $S,\underline C$ (ab) | $S,A$ (ba) |  |
| $\underline B$ (b) | $\underline A,C$ (a) | $\underline A,C$ (a) | $\underline B$ (b) | $A,\underline C$ (a) |
| b | a | a | b | a |
Key: <variables that can generate the sub-string> (<sub-string>)
The variables used in the parse tree are underlined.
- For each level you go up, increase by one the length of the sub-string being generated:
- The bottom row covers sub-strings of length 1.
- The row above covers sub-strings of length 2, and so on.
- If you are analysing a sub-string you have already done then you can just copy the answer.
Parse Tree Reconstruction
If the input can be generated then the start symbol will be in the top left (in this case $S$).
- To make the parse tree, start with the start symbol and connect edges between productions in $S$ that can create the whole string.
- Keep iterating over the productions until you reach terminals at the leaves of the tree.
graph
1[S] --> 2[A] & 3[B]
2 --> 4[B] & 5[A]
3 --> 6[C] & 7[C]
4 --> 8[b]
5 --> 9[a]
6 --> 10[A] & 11[B]
7 --> 12[a]
10 --> 13[a]
11 --> 14[b]
To parse quicker we can use the Cocke-Younger-Kasami algorithm. To do this we must prepare the CFG and convert it to Chomsky normal form:
- Eliminate $\epsilon$ productions.
- Eliminate unit productions.
- Add rules for terminals and split longer sequences of non-terminals.
A CFG is in Chomsky normal form if every production has the form:
\[A\rightarrow BC\text{ or }A\rightarrow a\]
We allow $S\rightarrow\epsilon$ for the start variable only.
The third step of conversion goes like so. We start with the following production:
\[A\rightarrow BcDE\]
-
Replace Terminals with new variables:
\[\begin{aligned}
A&\rightarrow BCDE\\
C&\rightarrow c
\end{aligned}\]
-
Break up sequences with new variables:
\[\begin{aligned}
A&\rightarrow BX\\
X&\rightarrow CY\\
Y&\rightarrow DE\\
\hline
C&\rightarrow c
\end{aligned}\]
If ORs are used, then treat each alternative as a separate production when simplifying. ORs are allowed as they are just a shorthand.
Parsing
- We can use a brute force method like BFS to search if an input is in the language.
- This is not ideal as it produces an exponentially large tree.
- If the input is not in the language, parsing will never stop.
When to Stop
We could try to aid the previous method by implementing the following:
\[\lvert\text{derived string}\rvert>\lvert\text{input}\rvert\]
This can be problematic as:
- Strings can shrink due to $\epsilon$ productions
- There can be loops in the productions.
Removal of $\epsilon$-Productions
A variable $N$ is nullable if it can derive the empty string:
\[N\overset*\Rightarrow\epsilon\]
-
Identify all nullable variables $N$.
This includes variables whose production bodies consist entirely of nullable variables.
-
Remove all $\epsilon$-Productions carefully (while adding extra productions to compensate for this).
For every nullable $N$:
- If you see $X\rightarrow\alpha N\beta$, add $X\rightarrow\alpha\beta$
- If you see $N\rightarrow\epsilon$, remove it.
-
If start variable $S$ is nullable:
- Add a new start variable $S’$.
- Add special productions $S’\rightarrow S\vert\epsilon$.
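Step 1, finding the nullable variables, is a fixed-point computation. A minimal sketch in Python, with productions as hypothetical (head, body) pairs and a small made-up grammar:

```python
# Sketch: a variable is nullable if some production body is empty or
# consists only of nullable variables; iterate until nothing changes.
def nullable_vars(productions):
    # productions: list of (head, body) pairs, body a tuple of symbols
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable

# S -> 0S1 | T, T -> epsilon | S
prods = [("S", ("0", "S", "1")), ("S", ("T",)),
         ("T", ()), ("T", ("S",))]
print(nullable_vars(prods) == {"S", "T"})  # True
```

Here T is found nullable directly (empty body), and S becomes nullable on the next pass through its production $S\rightarrow T$.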
Removal of Unit Productions
A unit production is a production of the form:
\[A\rightarrow B\]
For example:
\[\begin{aligned}
S&\rightarrow0S1\vert1S01\vert T\\
T&\rightarrow S\vert R\vert\epsilon\\
R&\rightarrow 0SR
\end{aligned}\]
graph LR
S --> T
T --> S & R
-
If there is a cycle of unit productions:
\(A\rightarrow B\rightarrow C \rightarrow A\)
delete it and replace everything with $A$.
This would give the following graph and productions:
graph LR
S --> R
\[\begin{aligned}
S&\rightarrow0S1\vert1S01\vert R\vert\epsilon\\
R&\rightarrow 0SR
\end{aligned}\]
-
Replace every chain:
\[A\rightarrow B\rightarrow C\rightarrow \alpha\]
by $A\rightarrow \alpha, B\rightarrow \alpha,C\rightarrow \alpha$.
This would give the following productions:
\[\begin{aligned}
S&\rightarrow0S1\vert1S01\vert 0SR\vert\epsilon\\
R&\rightarrow 0SR
\end{aligned}\]
Complete Method
- Eliminate $\epsilon$ productions.
- Eliminate unit productions.
-
Try all possible derivations via BFS but stop parsing when:
\[\lvert\text{derived string}\rvert>\lvert\text{input}\rvert\]
The order of these steps is important as eliminating $\epsilon$ productions can introduce unit productions.
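The complete method can be sketched in Python as a BFS over derived strings, pruning any string longer than the input — valid once $\epsilon$ and unit productions are gone. The dict grammar encoding and the example grammar are assumptions.

```python
from collections import deque

# Sketch of the brute-force BFS parser: assumes no epsilon or unit
# productions, so derived strings never shrink and never cycle.
def bfs_parse(productions, start, target):
    # productions: dict variable -> list of replacement strings
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        if len(s) > len(target):
            continue                      # stop expanding: it can only grow
        for i, sym in enumerate(s):       # rewrite each variable occurrence
            for body in productions.get(sym, []):
                t = s[:i] + body + s[i + 1:]
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

# Hypothetical grammar for {0^n 1^n : n >= 1}: S -> 0S1 | 01
print(bfs_parse({"S": ["0S1", "01"]}, "S", "0011"))  # True
print(bfs_parse({"S": ["0S1", "01"]}, "S", "0101"))  # False
```

The length cutoff is exactly what makes the search terminate on inputs outside the language.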
As there could be more than one parse tree from a given input this can cause issues when trying to define the meaning of a string.
Ambiguity
A CFG is ambiguous if some string has more than one parse tree.
Is $S\rightarrow SS\vert x$ ambiguous?
Yes as you could make the following parse trees for the same input:
graph
subgraph B
1[S] --> 2[S] & 3[S]
2 --> 4[S] & 5[S]
4 --> 7[x]
5 --> 6[x]
3 --> 8[x]
end
subgraph A
21[S] --> 23[S] & 22[S]
23 --> 28[x]
22 --> 24[S] & 25[S]
24 --> 27[x]
25 --> 26[x]
end
Disambiguation
Sometimes you can rewrite the grammar to remove ambiguity:
\[S\rightarrow Sx\vert x\]
This gives only the following parse tree for the previous input:
graph
1[S] --> 2[S] & 3[x]
2 --> 4[S] & 5[x]
4 --> 6[x]
Precedence
For some situations, like mathematical operators, precedence should be built into the parse tree.
In situations like this you can create a new production that groups the parse tree accordingly:
\[\underbrace{\underbrace{2}_F*\underbrace{(\underbrace{1}_T+\underbrace{\underbrace{2}_F*\underbrace{2}_F}_T)}_F}_E\]
Here the expression is split into terms $T$ and factors $F$.
This can be written as the following productions:
\[\begin{aligned}
E&\rightarrow T\vert E+T\\
T&\rightarrow F\vert T*F\\
F&\rightarrow(E)\vert1\vert2
\end{aligned}\]
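These productions translate directly into a recursive-descent evaluator, sketched here in Python (function names are my own, and $F$ is generalised to any single digit). Each rule becomes one function, so the precedence falls out of the grammar's structure.

```python
# Sketch: recursive descent over the E/T/F grammar, evaluating as it parses.
def evaluate(s):
    pos = 0
    def peek():
        return s[pos] if pos < len(s) else None
    def expr():                      # E -> T | E + T  (written iteratively)
        nonlocal pos
        value = term()
        while peek() == '+':
            pos += 1
            value += term()
        return value
    def term():                      # T -> F | T * F
        nonlocal pos
        value = factor()
        while peek() == '*':
            pos += 1
            value *= factor()
        return value
    def factor():                    # F -> (E) | digit
        nonlocal pos
        if peek() == '(':
            pos += 1
            value = expr()
            pos += 1                 # skip ')'
            return value
        value = int(s[pos])
        pos += 1
        return value
    return expr()

print(evaluate("2*(1+2*2)"))  # 10
print(evaluate("2+3*5"))      # 17
```

Because `term` only calls `factor`, a `*` can never capture a whole `+` expression unless parentheses force it to.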
Completeness
Disambiguation is not always possible because:
- There exists inherently ambiguous languages.
- There is no general procedure for disambiguation.
In programming languages, ambiguity comes from precedence rules, and we can deal with it like in the previous example.
- Every regular language is context free.
- Not all context free languages are regular.
Regular to Context-Free Conversion
| Regular Expression | Context Free Grammar |
|---|---|
| $\emptyset$ | Grammar with no rules. |
| $\epsilon$ | $S\rightarrow\epsilon$ |
| $a$ (alphabet symbol) | $S\rightarrow a$ |
| $E_1+E_2$ | $S\rightarrow S_1\vert S_2$ |
| $E_1E_2$ | $S\rightarrow S_1S_2$ |
| $E_1^*$ | $S\rightarrow SS_1\vert\epsilon$ |
$S$ becomes the new start symbol.
For examples of many context free grammars see this lecture
Definition
A context-free grammar is given by ($V,\Sigma,R,S$) where:
- $V$ is a finite set of variables.
- $\Sigma$ is a finite set of terminals.
- $R$ is a finite set of productions.
- $S\in V$ is the start variable.
Notation & Conventions
For the following set of productions:
- $E\rightarrow E+E$
- $E\rightarrow (E)$
-
$E\rightarrow N$
- $N\rightarrow 0N$
- $N\rightarrow 1N$
- $N\rightarrow 0$
- $N\rightarrow 1$
The variables are the capital letters. Terminals are any symbol that is not a capital letter. The start variable is taken from the first production.
You can write the above in the following short-hand:
- $E\rightarrow E+E\vert(E)\vert N$
- $N\rightarrow 0N\vert1N\vert0\vert1$
Derivation
Derivation is a sequential application of productions. We don’t care what order we use the productions as long as the string is valid.
We apply the rules until all symbols are terminal.
When applying a single production use the notation:
\[\alpha\Rightarrow\beta\]
When the full derivation is reached use:
\[\alpha\overset*\Rightarrow\beta\]
Two derivations are different if they use two different rules at any point.
Context-Free Languages
The language generated by CFG $G$ is the set of all strings at the end of a derivation:
\[L(G)=\{w:w\in\Sigma^*\text{ and }S\overset*\Rightarrow w\}\]
Parse Trees
Instead of writing out the productions by hand you can show the derivation as a parse tree:
\[S\rightarrow SS\vert (S)\vert\epsilon\]
To write a derivation by hand we might write the following:
\[\begin{aligned}
S&\Rightarrow(S)\\
&\Rightarrow(SS)\\
&\Rightarrow((S)S)\\
&\Rightarrow((S)(S))\\
&\Rightarrow(()(S))\\
&\overset*\Rightarrow(()())
\end{aligned}\]
We can then represent this as a parse tree:
graph
1[S] --> 2["("] & 3[S] & 4[")"]
3 --> 5[S] & 6[S]
5 --> 7["("] & 8[S] & 9[")"]
6 --> 10["("] & 11[S] & 12[")"]
8 --> 13[epsilon]
11 --> 14[epsilon]
To read the tree go from left to right and read just the leaves.
One parse tree can represent many different derivations. No matter the order in which the same rules are applied, you will get the same parse tree.
Precedence in Arithmetic Expressions
Consider that we run 2 + 3 * 5 in the Python shell; it evaluates to 17.
This could be represented as the following trees:
graph
subgraph 25
25*[*] --- 25+[+] & 255[5]
25+ --- 252[2] & 253[3]
end
subgraph 17
17*[*] --- 17+[+] & 175[5]
17+ --- 172[2] & 173[3]
end
We know that the 17 tree has the correct precedence.
Grammars
Consider that we have the following rules for a grammar:
EXPR -> EXPR + TERM
EXPR -> TERM
TERM -> TERM * TERM
TERM -> NUM
NUM -> 0-9
We can parse the same 2 + 3 * 5 as before like so:
graph
1[EXPR] --- 2[EXPR]
1 --- +
1 --- T1[TERM]
2 --- T2[TERM] --- N1[NUM] --- 22[2]
T1 --- T3[TERM] --- N2[NUM] --- 33[3]
T1 --- *
T1 --- N4[NUM] --- 5
By applying the rules we always get the correct precedence.
Simplistic English Grammar
Consider that we can use the following rules in English:
SENTENCE -> NOUN-PHRASE VERB-PHRASE
NOUN-PHRASE -> A-NOUN
NOUN-PHRASE -> A-NOUN PREP-PHRASE
PREP-PHRASE -> PREP NOUN-PHRASE
With these rules we can make a sentence like the one below:
\[\underbrace{\underbrace{\text{a girl }}_\text{A-NOUN}\underbrace{\underbrace{\text{with }}_\text{PREP}\underbrace{\text{a flower}}_\text{NOUN-PHRASE}}_\text{PREP-PHRASE}}_\text{NOUN-PHRASE}\]
You can go further to define all parts of a language in order to make a grammar.
- Convert from $\epsilon$-NFA to NFA:
-
First keep track of the closures (the set of states reachable via $\epsilon$-transitions):
- $CL(B)=(B,D)$
- $CL(E)=(E,B,C,D)$
Accept states include those that have an accept state in their closure.
Then make a table of the transitions including epsilon transitions:
|  | 0 | 1 |
|---|---|---|
| $A$ | $E,B,C,D$ | $B,D$ |
| $B$ | $\emptyset$ | $C$ |
| $C$ | $\emptyset$ | $D$ |
| $D$ | $\emptyset$ | $\emptyset$ |
| $E$ | $F$ | $C,D$ |
| $F$ | $D$ | $\emptyset$ |
This can then be drawn as a graph:
stateDiagram-v2
direction LR
[*] --> A
A --> E:0
A --> B:0
A --> C:0
A --> D:0
A --> B:1
A --> D:1
B --> C:1
C --> D:1
E --> F:0
E --> C:1
E --> D:1
F --> D:0
D --> [*]
B --> [*]
E --> [*]
-
Another one:
Then the table of transitions:
|  | a | b |
|---|---|---|
| 0 | 1,2 | 2 |
| 1 | 1,2 | 0 |
| 2 | 1,2 | 0 |
and then the associated graph:
stateDiagram-v2
direction LR
[*] --> 0
2 --> 1:a
0 --> 1:a
0 --> 2:a
0 --> 2:b
1 --> 0:b
1 --> 1:a
1 --> 2:a
2 --> 2:a
2 --> 0:b
2 --> [*]
1 --> [*]
- Convert the following to GNFAs
Why Are All Distinguishable Pairs Found?
This is because we work backwards from the accept states.
There are $n^2$ pairs of states for each DFA where $n$ is the number of states. Therefore, the number of pairs of states to be searched is finite.
Why Are There No Inconsistencies When We Merge?
This would be the scenario where the same transition goes to multiple state groups.
This is not an issue as we only merge indistinguishable states.
Why is There no Smaller DFA?
If there were a smaller DFA you would be able to merge two states and transitions like so:
stateDiagram-v2
state M{
direction LR
[*] --> q0
q0 --> q:v
q0 --> q':v'
}
state M'{
direction LR
[*] --> q0'
q0' --> q'':v,v'
}
However, as every pair of states is distinguishable (goes to an accept or a reject state), the above cannot happen:
stateDiagram-v2
state M{
direction LR
[*] --> q0
q0 --> q:v
q0 --> q':v'
q --> accept:w
accept --> [*]
q' --> reject:w
}
state M'{
direction LR
[*] --> q0'
q0' --> q'':v,v'
q'' --> ?:w
}
$M’$ cannot exist using this method.
Theorem: If a DFA has no inaccessible states, and no indistinguishable states, then it has a minimal number of states.
Proof: Suppose $M$ has $n$ states that are accessible and distinguishable.
Suppose $M’$ is equivalent to $M$ but has only $n-1$ states. We can find $n$ words that reach each of the $n$ states of $M$ (use accessibility).
Two of these words must reach the same states of $M’$.
We can find a suffix for these two words such that $M$ would accept one but not the other (use distinguishability).
But both words, with any suffix attached, must reach the same state in $M’$. So $M’$ cannot accept the same set of words. $\blacksquare$
Finding Indistinguishable States
-
If $q$ is accepting and $q’$ is rejecting, mark $(q,q’)$ as distinguishable:
stateDiagram-v2
direction LR
[*] --> q(x)
q'(x) --> [*]
-
If $(q_1,q_1’)$ are marked, mark $(q_2,q_2’)$ as distinguishable:
stateDiagram-v2
direction LR
q2(x) --> q1(x):a
q2'(x) --> q1'(x):a
-
Unmarked pairs are indistinguishable, merge them into groups.
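The marking procedure can be sketched in Python as a fixed-point computation over state pairs (the table-filling algorithm). The DFA encoding is an assumption; the data below is the DFA from the minimisation example that follows.

```python
from itertools import combinations

# Sketch: mark accepting/rejecting pairs (rule 1), then propagate marks
# backwards along transitions (rule 2) until nothing changes.
def distinguishable(states, accepting, delta, alphabet):
    marked = {frozenset((q, r)) for q, r in combinations(states, 2)
              if (q in accepting) != (r in accepting)}
    changed = True
    while changed:
        changed = False
        for q, r in combinations(states, 2):
            pair = frozenset((q, r))
            if pair in marked:
                continue
            for c in alphabet:
                succ = frozenset((delta[(q, c)], delta[(r, c)]))
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

states = ["qe", "q0", "q1", "q00", "q01", "q10", "q11"]
delta = {("qe", "0"): "q0", ("qe", "1"): "q1",
         ("q0", "0"): "q00", ("q0", "1"): "q01",
         ("q1", "0"): "q10", ("q1", "1"): "q11",
         ("q00", "0"): "q00", ("q00", "1"): "q01",
         ("q01", "0"): "q10", ("q01", "1"): "q11",
         ("q10", "0"): "q00", ("q10", "1"): "q01",
         ("q11", "0"): "q10", ("q11", "1"): "q11"}
marked = distinguishable(states, {"q11"}, delta, "01")
print(frozenset(("q1", "q01")) in marked)  # False: q1 and q01 can merge
print(len(marked))                         # 14 of the 21 pairs are marked
```

The 7 unmarked pairs give exactly the groups $\{q_\epsilon,q_0,q_{00},q_{10}\}$, $\{q_1,q_{01}\}$ and $\{q_{11}\}$ found in the example below.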
Example of DFA Minimisation
Consider the following DFA to be minimised:
stateDiagram-v2
direction LR
[*] --> qe
qe --> q0:0
qe --> q1:1
q0 --> q00:0
q00 --> q01:1
q0 --> q01:1
q1 --> q10:0
q1 --> q11:1
q11 --> [*]
q00 --> q00:0
q01 --> q10:0
q01 --> q11:1
q10 --> q00:0
q10 --> q01:1
q11 --> q10:0
q11 --> q11:1
This would produce the following table:
|  | $q_\epsilon$ | $q_0$ | $q_1$ | $q_{00}$ | $q_{01}$ | $q_{10}$ |
|---|---|---|---|---|---|---|
| $q_0$ |  |  |  |  |  |  |
| $q_1$ | × | × |  |  |  |  |
| $q_{00}$ |  |  | × |  |  |  |
| $q_{01}$ | × | × |  | × |  |  |
| $q_{10}$ |  |  | × |  | × |  |
| $q_{11}$ | × | × | × | × | × | × |
The top-right diagonal is unused.
- To begin mark all accept states ($q_{11}$) as distinguishable.
-
Go through each state pair and mark it as distinguishable if it transitions to a distinguishable state pair.
For each symbol (0 and 1), transition both states of the pair via that symbol; the two resulting states form the successor pair to check.
- Repeat (2.) for all states until there is no change.
- Apply rule 3 and remove indistinguishable states from the state graph.
Grouping Unmarked states
We can group unmarked states by drawing lines between each unmarked pair:
flowchart
subgraph A
qe --- q0 & q00 & q10
q0 --- q00 & q10
end
subgraph B
q1 --- q01
end
q00 --- q10
subgraph C
q11
end
Each group becomes a state. Analyse the transitions between states in order to determine the transitions between groups:
stateDiagram-v2
direction LR
[*] --> qe
state A{
qe
q0
q00
q10
}
state B{
q1
q01
}
state C{
q11
}
qe --> q0:0
qe --> q1:1
q0 --> q00:0
q00 --> q01:1
q0 --> q01:1
q1 --> q10:0
q1 --> q11:1
q11 --> [*]
q00 --> q00:0
q01 --> q10:0
q01 --> q11:1
q10 --> q00:0
q10 --> q01:1
q11 --> q10:0
q11 --> q11:1
Remember to include self loops.
This gives the following minimal state diagram:
stateDiagram-v2
direction LR
[*] --> A
A --> A:0
A --> B:1
B --> A:0
B --> C:1
C --> A:0
C --> C:1
C --> [*]
We know it is minimal as all the states are distinguishable.
Proving Minimalism
Consider the following minimal DFA for the language $L=\{w:w\text{ ends in }111\}$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:0
q0 --> q1:1
q1 --> q2:1
q2 --> q3:1
q3 --> q3:1
q3 --> q0:0
q2 --> q0:0
q1 --> q0:0
Is it possible to do this in three states (one less)?
-
The shortest string in this language is $111$. There are four prefixes on the way to this string:
- $\epsilon$
- $1$
- $11$
- $111$
As there are four inputs and we want to reduce to three states, two inputs must end up in the same state (pigeonhole principle).
-
Suppose $x=1,y=11$ lead to the same state.
- Then after reading one more 1:
- The state of $x1=11$ should be rejecting.
- The state of $y1=111$ should be accepting.
This is a contradiction so these two states can’t be reduced together.
-
Suppose $x=\epsilon,y=1$ lead to the same state.
- Then after reading 11:
- The state of $x11=11$ should be rejecting.
- The state of $y11=111$ should be accepting.
This is a contradiction so these two states can’t be reduced together.
-
All other pairs should be tested. If all pairs can be brought to a contradiction, from any input, then the DFA is minimal.
DFA Minimisation
Minimal DFAs have the following properties:
- Every pair of states is distinguishable.
- Every state is accessible.
Accessible States
States with no path from the start state aren’t accessible.
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1
q00 --> q000
q000 --> q1
q1 --> [*]
In this example $q_{00}$ and $q_{000}$ aren’t accessible, so can be removed.
Distinguishable States
Two states $q$ and $q’$ are distinguishable if on the same continuation string $w_1w_2\ldots w_k$, one accepts but the other rejects.
Example
Consider the following state diagram:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q2:0
q0 --> q3:1
q2 --> q3:0,1
q3 --> q1:0,1
q3 --> [*]
q1 --> q3:1
q1 --> q2:0
| Pair of States | Result |
|---|---|
| $(q_0,q_3)$ | Distinguishable by $\epsilon$ |
| $(q_1,q_3)$ | Distinguishable by $\epsilon$ |
| $(q_2,q_3)$ | Distinguishable by $\epsilon$ |
| $(q_1,q_2)$ | Distinguishable by $0$ |
| $(q_0,q_2)$ | Distinguishable by $0$ |
| $(q_0,q_1)$ | Indistinguishable |
Indistinguishable pairs can then be merged:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q2:0
q0 --> q3:1
q2 --> q3:0,1
q3 --> q0:0,1
q3 --> [*]
Proving Non-Regularity
For all $n$ there exists $z\in L$ longer than $n$, such that for all ways of writing $z=uvw$ where:
- $\vert uv\vert\leq n$
- $\vert v\vert\geq 1$
- there exists some $i\geq0$ such that the string $uv^iw$ is not in $L$.
This can be seen as an Ehrenfeucht-Fraïssé game between two players:
- Adam (for all)
- Eve (exists)
| Round | Adam | Eve |
|---|---|---|
| 1 | Chooses $n$ | Chooses $z\in L$ $(\vert z\vert >n)$ |
| 2 | Writes $z=uvw$ $(\vert uv\vert \leq n,\vert v\vert\geq1)$ | Chooses $i$ |
Eve wins if $uv^iw\notin L$ otherwise Adam wins.
Eve needs to give a strategy that, regardless of what Adam does, she always wins the game.
Example
We will use our non-regular language as an example:
\[L_0=\{0^n1^n\vert n\geq0\}\]
| Round | Adam | Eve |
|---|---|---|
| 1 | Chooses $n$ | Chooses $z=0^n1^n\in L_0$ |
| 2 | Writes $z=uvw$ as below | Chooses $i=2$ |
Adam then writes $z=uvw$; since $\vert uv\vert\leq n$, the part $v$ contains only zeros:
\[\underbrace{00000000}_u\underbrace{00}_v\underbrace{1111111111}_w\]
Eve’s choice of $i=2$ duplicates $v$, producing a string with more zeros than ones, which is not in the language, so Eve wins:
\[\underbrace{00000000}_u\underbrace{00}_v\underbrace{00}_v\underbrace{1111111111}_w\]
- $uv^2w=0^{n+k}1^n\notin L_0$ (where $k=\vert v\vert\geq1$)
Eve having a winning strategy means that the language is not regular.
There are several more examples in this video.
Non-Regular Language Example
Consider the following language:
\[L_0=\{0^n1^n\vert n\geq0\}\]
This is the language of any number of zeros followed by an equal number of ones.
To show that this is non-regular we reason by contradiction:
- Suppose we manage to construct a DFA $M$ for $L_0$ with $n$ states.
- We argue that something must be wrong with this DFA.
- In particular $M$ must accept some strings outside $L_0$.
Consider we have this imaginary automaton:
stateDiagram-v2
state M {
direction LR
[*] --> a
a --> b:various transitions and states
b --> [*]
}
when we run it on the string $x$ where:
\[x = 0^{n+1}1^{n+1}\]
it should accept as $x\in L_0$:
-
Since $M$ has $n$ states, it must revisit some state while reading the $n+1$ zeros at the start of $x$.
This is the pigeonhole principle: a DFA with $n$ states reading $n+1$ letters must visit some state twice.
-
This revisit creates a loop of zeros: by repeating the loop, $M$ would also accept strings with more zeros than ones, which are outside $L_0$.
General Method for Showing Non-Regularity
Every regular language $L$ has the following property:
stateDiagram-v2
direction LR
[*] --> 1
1 --> ...:a1
... --> 2:ak
2 --> ....:a(k+1)
.... --> 3:a(n-1)
3 --> 2:an
2 --> 4:a(n+1)...am
4 --> [*]
For every sufficiently long input $z$ in $L$, there is a middle part in $z$ that, even if repeated any number of times, keeps the input inside $L$.
Pumping Lemma
Every regular language $L$ has the property of pumping lemma:
- There exists a number $n$ such that for all strings $z$ in $L$ longer than $n$, we can write $z=uvw$ where:
- $\vert uv\vert\leq n$
- $\vert v\vert\geq 1$
- For all $i\geq0$, the string $uv^iw$ is in $L$.
stateDiagram-v2
state Z{
direction LR
[*] --> 1
1 --> 2:u
2 --> 3:v
3 --> 2:v
2 --> 4:w
4 --> [*]
}
Arguing Non-Regularity
If $L$ is regular then $L$ has the pumping lemma property.
So to prove $L$ is not regular it is enough to show that $L$ does not have the pumping lemma property.
Language Classes
A language class is a set of languages. So far we have seen one language class which is the regular languages.
Language classes have two important properties:
- Decision Properties
- Closure Properties.
Decision Properties
A decision property for a class of languages is an algorithm that takes a formal description of a language (such as a DFA) and tells whether or not some property holds.
You could have several properties:
- Is the language empty?
- Are all the words in the language finite?
- Do all the words end with a given letter?
Decision Property Example
We can use a DFA to represent the behaviour of a communication protocol. We can check whether the communication is valid by examining the language of the DFA.
We can examine the following properties:
- Can the protocol succeed?
- Is the language non-empty?
- Does it always end with an acknowledgement signal?
- Do all words end with “ACK”?
Closure Properties
A closure property of a language class says that, given languages in the class, an operator (such as union) produces another language in the same class.
- Closure properties give us a better understanding of a given language class and its limitations.
- They also allow us to know that a language is in a given class even if we don’t know how to represent it directly.
Union
To prove closure under union we need to show that the union of two regular languages is also regular. Therefore we want to prove:
If $L$ and $M$ are regular languages, so is $L\cup M$.
Proof: Let $L$ and $M$ be the language of regular expressions $R$ and $S$, respectively.
Then $R+S$ is a regular expression whose language is $L\cup M$.
$\blacksquare$
Complement
The complement $\bar L$ of a language $L$ contains those strings that are not in $L$.
For example ($\Sigma=\{0,1\}$):
- $L_1 =$ all strings that end in $101$
- $\bar L_1=$ all strings that do not end in $101$:
- All strings that end in $000,\ldots,111$ or have length $0,1$.
Both of these are part of the language class of regular languages but the second is much more complicated.
Closure Under Complement
If $L$ is a regular language, so is $\bar L$.
To argue this we can use any definition of regular language (NFA, DFA, regex).
For any DFA, the DFA with the accepting and rejecting states swapped accepts its complement.
Proof: For every input $x\in\Sigma^*$:
- $M$ accepts $x$.
- After processing $x$, $M$ ends in an accepting state.
- After processing $x$, $M’$ ends in a rejecting state.
- $M’$ rejects $x$.
All of these statements are equivalent for DFAs.
Therefore the language of $M’$ is $\bar L$ meaning that $\bar L$ is regular.
$\blacksquare$
This is not true for NFAs.
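The swap construction for DFAs can be sketched in Python. The tuple-based DFA encoding and the example DFA (strings ending in 1) are assumptions.

```python
# Sketch: complementing a DFA by swapping accepting and rejecting states.
def complement(states, start, accepting, delta):
    return states, start, states - accepting, delta

def accepts(dfa, w):
    _, q, accepting, delta = dfa
    for c in w:
        q = delta[(q, c)]
    return q in accepting

# Hypothetical DFA over {0,1} accepting strings that end in 1
states = {"q0", "q1"}
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}
M = (states, "q0", {"q1"}, delta)
Mbar = complement(*M)
print(accepts(M, "101"), accepts(Mbar, "101"))  # True False
print(accepts(M, "10"), accepts(Mbar, "10"))    # False True
```

The construction relies on the transition function being total and deterministic, which is exactly why it fails for NFAs.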
Intersection
The intersection $L\cap L’$ is the set of strings that are both in $L$ and $L’$.
Closure Under Intersection
If $L$ and $L’$ are regular languages, so is $L\cap L’$.
To argue this we can use DFAs
Consider that we have the following DFAs ($M$ and $M’$):
We can combine the above diagrams like so:
stateDiagram-v2
direction LR
[*] --> r0,s0
r0,s0 --> r1,s0:0
r0,s1 --> r1,s1:0
r1,s0 --> r0,s0:0
r0,s0 --> r0,s1:1
r0,s1 --> r0,s0:1
r1,s1 --> r0,s1:0
r1,s1 --> r1,s0:1
r1,s0 --> r1,s1:1
r0,s1 --> [*]
This is the product construction as it simulates the behaviour of both at the same time. The states of the product DFA are the Cartesian product of the states of the parent DFAs.
Accepting states have to be accepting in both parent graphs.
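The product construction can be sketched in Python. The DFA encoding and the two example DFAs (strings ending in 1, and strings with an odd number of 1s) are assumptions.

```python
# Sketch of the product construction for the intersection of two DFAs,
# each given as (states, start, accepting, delta).
def product(d1, d2, alphabet):
    (Q1, s1, F1, t1), (Q2, s2, F2, t2) = d1, d2
    Q = {(a, b) for a in Q1 for b in Q2}                  # Cartesian product
    delta = {((a, b), c): (t1[(a, c)], t2[(b, c)])
             for (a, b) in Q for c in alphabet}
    F = {(a, b) for (a, b) in Q if a in F1 and b in F2}   # accept in both
    return Q, (s1, s2), F, delta

def accepts(dfa, w):
    _, q, accepting, delta = dfa
    for c in w:
        q = delta[(q, c)]
    return q in accepting

M1 = ({"r0", "r1"}, "r0", {"r1"},          # ends in 1
      {("r0", "0"): "r0", ("r0", "1"): "r1",
       ("r1", "0"): "r0", ("r1", "1"): "r1"})
M2 = ({"s0", "s1"}, "s0", {"s1"},          # odd number of 1s
      {("s0", "0"): "s0", ("s0", "1"): "s1",
       ("s1", "0"): "s1", ("s1", "1"): "s0"})
P = product(M1, M2, "01")
print(accepts(P, "01"), accepts(P, "011"))  # True False
```

Changing the accepting-set rule (accept in exactly one, or accept in the first but not the second) gives the equivalence and containment tests described later.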
Reversal
The reversal $w^R$ of a string is $w$ written backwards:
- $w=\text{dog}$
- $w^R=\text{god}$
The reversal $L^R$ of a language $L$ is the language obtained by reversing all its strings:
- $L=\{\text{cat, dog}\}$
- $L^R=\{\text{tac, god}\}$
Proof of Closure Under Reversal
We can use regular expressions to prove this closure. We can define a regular expression as having the following types:
- The special symbols $\emptyset$ and $\epsilon$.
- Alphabet symbols like $a,b$.
- The union, concatenation or star of simpler expressions.
| Regular Expression $E$ | Reversal $E^R$ |
|---|---|
| $\emptyset$ | $\emptyset$ |
| $\epsilon$ | $\epsilon$ |
| $a$ (alphabet symbol) | $a$ |
| $E_1+E_2$ | $E_1^R+E_2^R$ |
| $E_1E_2$ | $E_2^RE_1^R$ |
| $E_1^*$ | $(E_1^R)^*$ |
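The reversal table translates directly into a recursion over a regex syntax tree. A sketch in Python, using a hypothetical nested-tuple representation for the AST:

```python
# Sketch: AST nodes are ('sym', a), ('eps',), ('empty',),
# ('union', e1, e2), ('concat', e1, e2), ('star', e1).
def reverse(e):
    tag = e[0]
    if tag in ('empty', 'eps', 'sym'):
        return e                                         # base cases unchanged
    if tag == 'union':
        return ('union', reverse(e[1]), reverse(e[2]))
    if tag == 'concat':
        return ('concat', reverse(e[2]), reverse(e[1]))  # swap the factors
    return ('star', reverse(e[1]))

# (ab)* reversed is (ba)*
e = ('star', ('concat', ('sym', 'a'), ('sym', 'b')))
print(reverse(e))  # ('star', ('concat', ('sym', 'b'), ('sym', 'a')))
```

Only the concatenation case does real work: the two factors swap, and everything else recurses structurally.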
The Emptiness Problem
There are some languages that can’t be represented in the regular language class. One example is the language of duplicates:
\[L^\text{DUP}=\{ww:w\in L\}\]
- $L=\{\text{cat, dog}\}$
- $L^\text{DUP}=\{\text{catcat, dogdog}\}$
If you try to make an NFA for this language then it would need infinitely many states.
You aren’t able to compute the full set of reachable states.
The emptiness problem is as follows:
- Given a regular language, does the language contain any string at all?
- Assuming the representation is a DFA construct a transition graph.
- Compute the set of states reachable from the start state.
- If any final state is reachable, then yes; else no.
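The reachability check above can be sketched in Python; the transition-table encoding and the example DFA are assumptions.

```python
# Sketch of the emptiness test: BFS over the DFA's transition graph from
# the start state, then check whether any accepting state was reached.
def is_empty(start, accepting, delta, alphabet):
    seen, frontier = {start}, [start]
    while frontier:
        q = frontier.pop()
        for c in alphabet:
            r = delta[(q, c)]
            if r not in seen:
                seen.add(r)
                frontier.append(r)
    return not (seen & accepting)   # empty iff no accepting state is reachable

delta = {("a", "0"): "a", ("a", "1"): "b",
         ("b", "0"): "a", ("b", "1"): "b",
         ("c", "0"): "c", ("c", "1"): "c"}   # "c" is unreachable from "a"
print(is_empty("a", {"b"}, delta, "01"))  # False: "b" is reachable
print(is_empty("a", {"c"}, delta, "01"))  # True: only "c" accepts
```

The same reachable-set computation is what the equivalence and containment tests below run on their product DFAs.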
Equivalence
Given regular languages $L$ and $M$, is $L=M$?
This algorithm involves constructing the product DFA, like above.
Consider that we want to find if the following DFAs are the same:
-
stateDiagram-v2
state L {
direction LR
[*] --> A
A --> A:0
A --> B:1
B --> A:0,1
B --> [*]
}
-
stateDiagram-v2
state M {
direction LR
[*] --> C
C --> C:1
C --> D:0
D --> D:0
D --> C:1
C --> [*]
}
We can combine the above diagrams like so:
stateDiagram-v2
direction LR
[*] --> A,C
A,C --> [*]
B,D --> [*]
A,C --> A,D:0
A,D --> A,D:0
B,D --> A,D:0
B,D --> A,C:1
B,C --> A,D:0
B,C --> A,C:1
A,C --> B,C:1
A,D --> B,C:1
The accept states are those which are accepting in exactly one of $L$ and $M$, but not both.
The product DFA’s language will be empty if $L=M$. If any accept states are reachable then the languages are different.
Containment
Given regular languages $L$ and $M$ is $L\subseteq M$?
This algorithm also uses the product automaton, like above.
Consider that we want to find if $L$ is a subset of $M$:
-
stateDiagram-v2
state L {
direction LR
[*] --> A
A --> A:0
A --> B:1
B --> A:0,1
B --> [*]
}
-
stateDiagram-v2
state M {
direction LR
[*] --> C
C --> C:1
C --> D:0
D --> D:0
D --> C:1
C --> [*]
}
We can combine the above diagrams like so:
stateDiagram-v2
direction LR
[*] --> A,C
A,C --> A,D:0
A,D --> A,D:0
B,C --> A,D:0
B,C --> A,C:1
A,C --> B,C:1
A,D --> B,C:1
B,D --> A,D:0
B,D --> A,C:1
B,D --> [*]
A state is accepting if it is accepting in $M$ and not $L$.
If no accepting state is reachable then the containment holds.
In order to make the translation simpler we take intermediary steps before reaching a regular expression:
graph LR
eNFA --> GNFA --> 2GNFA[2-State NFA] --> Regex --> eNFA
A GNFA is a generalised non-deterministic finite automata.
Simplifying $\epsilon$-NFAs
To simplify an $\epsilon$-NFA we must ensure:
- It has only one accept state.
- No arrows come into the start state.
- No arrows go out of the accept state.
In order to do this we can “wrap” an $\epsilon$-NFA ($R_1$) like so:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> R1:epsilon
state R1 {
[*] --> p1
p3 --> p1
p5 --> p1
p3 --> [*]
p5 --> [*]
}
R1 --> q1:epsilon
q1 --> [*]
Example
Simplify the following NFA:
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q1:0
q1 --> q2:1
q2 --> q1:0
q2 --> q2:1
q2 --> [*]
This diagram is not simplified as it has:
- Arrows going into the start state.
- Arrows going out of the accept state
When simplified you get the following diagram:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:epsilon
q1 --> q1:0
q1 --> q2:1
q2 --> q1:0
q2 --> q2:1
q2 --> q3:epsilon
q3 --> [*]
Generalised NFAs
A generalised NFA is an $\epsilon$-NFA whose transitions are labelled by regular expressions.
GNFA State Reduction
This is the process of removing every state except the start and accept states. When we are finished we have a two-state GNFA.
-
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:epsilon + 10*
q1 --> q1:0*1
q1 --> q2:0*11
q2 --> [*]
q0 --> q2:01
-
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q2:(epsilon + 10*)(0*1)*(0*11)
q2 --> [*]
q0 --> q2:01
-
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q2:((epsilon + 10*)(0*1)*(0*11)) + (01)
q2 --> [*]
General State Elimination Method
To eliminate state $q$, for every pair of states $(u,v)$ replace the label $R_{uv}$ on the arrow $u\rightarrow v$ with:
\[R_{uv} + R_{uq}R_{qq}^*R_{qv}\]
where $R_{uq}$, $R_{qq}$ and $R_{qv}$ are the labels on $u\rightarrow q$, the loop on $q$ (if any), and $q\rightarrow v$ respectively.
Remember to do this even when $u=v$.
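One elimination step can be sketched in Python, representing GNFA edge labels as regex strings (the dictionary encoding is an assumption for illustration, not the lecture's notation):

```python
def eliminate_state(edges, q):
    """Remove state q, patching every pair (u, v) with R_uv + R_uq (R_qq)* R_qv."""
    loop = edges.pop((q, q), None)                # R_qq, if there is a loop on q
    star = f"({loop})*" if loop else ""
    ins = {u: r for (u, v), r in edges.items() if v == q}    # edges into q
    outs = {v: r for (u, v), r in edges.items() if u == q}   # edges out of q
    for u in ins:
        del edges[(u, q)]
    for v in outs:
        del edges[(q, v)]
    for u, r_in in ins.items():
        for v, r_out in outs.items():
            new = r_in + star + r_out             # R_uq (R_qq)* R_qv
            edges[(u, v)] = f"{edges[(u, v)]} + {new}" if (u, v) in edges else new
    return edges

# Eliminating q from s --a--> q (with loop b) --c--> t leaves s --a(b)*c--> t.
edges = eliminate_state({("s", "q"): "a", ("q", "q"): "b", ("q", "t"): "c"}, "q")
assert edges == {("s", "t"): "a(b)*c"}
```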
Conversion Example
Consider the following $\epsilon$-NFA
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:epsilon
q1 --> q1:0
q1 --> q2:1
q2 --> q1:0
q2 --> q2:1
q2 --> q3:epsilon
q3 --> [*]
-
To begin we eliminate $q_1$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q2:0*1
q2 --> q2:00*1 + 1
q2 --> q3:epsilon
q3 --> [*]
Observe how the loop on $q_2$ includes the loop that you could take by going back to $q_1$.
-
Then eliminate $q_2$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q3:0*1(00*1 + 1)*
q3 --> [*]
The order is important. Always go from start to end.
For additional examples see the lecture Examples of $\epsilon$-NFA to Regex Translation.
Complex Example
Consider the following NFA to be converted to regex:
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q2:a
q1 --> q3:b
q2 --> q3:a
q3 --> q3:a
q3 --> q4:c
q3 --> q5:d
q4 --> q3:b
q4 --> q5:d
q5 --> [*]
- First check whether the NFA needs to be simplified. Here no changes are required, as it already satisfies all the conditions.
-
Eliminate up to $q_2$ (this has no effect as there are no loops on $q_2$ or before):
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q2:a
q1 --> q3:b
q2 --> q3:a
q3 --> q3:a
q3 --> q4:c
q3 --> q5:d
q4 --> q3:b
q4 --> q5:d
q5 --> [*]
-
Eliminate up to $q_3$:
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q3:aa + b
q3 --> q3:a + cb
q3 --> q4:c
q3 --> q5:d
q4 --> q5:d
q5 --> [*]
Ensure to eliminate loops connecting to $q_3$ as noted in the general state elimination method.
-
Eliminate up to $q_4$:
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q4:(aa + b)(a + cb)*c
q1 --> q5:(aa + b)(a + cb)*d
q4 --> q5:d
q5 --> [*]
-
Eliminate $q_4$, leaving the final regex:
stateDiagram-v2
direction LR
[*] --> q1
q1 --> q5:(aa + b)(a + cb)*(cd + d)
q5 --> [*]
- Give regular expressions for the following:
-
The set of strings over alphabet $A=\{0,1\}$ that contains no two adjacent 1s:
\[(1+\epsilon)(0^*01)^*0^*\]
-
For the following state diagram:
stateDiagram-v2
direction LR
[*] --> 1
1 --> 1:a,b
1 --> 2:b
2 --> 3:a,b
3 --> 4:a,b
3 --> [*]
4 --> [*]
\[(a+b)^*b(a+b)(\epsilon +a+b)\]
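Both answers can be sanity-checked with Python's `re` module, translating $+$ into alternation and $\epsilon$ into an optional group. The brute-force comparison below is a quick check over short strings, not a proof:

```python
import re
from itertools import product

no_adjacent_ones = re.compile(r"1?(?:0*01)*0*")   # (1+eps)(0*01)*0*
second = re.compile(r"[ab]*b[ab][ab]?")            # (a+b)*b(a+b)(eps+a+b)

# Compare the first regex against its defining property on all short strings.
for n in range(7):
    for p in product("01", repeat=n):
        w = "".join(p)
        assert bool(no_adjacent_ones.fullmatch(w)) == ("11" not in w)

# The second accepts exactly words with a b in the 2nd or 3rd position from the end.
assert second.fullmatch("ba") and second.fullmatch("bab")
assert not second.fullmatch("ab") and not second.fullmatch("aa")
```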
- Consider the following regular expressions which are given along with three words - for each word say whether it belongs to the language of the regular expression:
- $a^*b^*c^*d^*$
- $abcd$ - Yes
- $abbc$ - Yes
- $aba$ - No
- $(ab)^*(abb)^*$
- $ababab$ - Yes
- $ababba$ - No
- $ababbab$ - No
For each pair of regular expressions below, say whether they represent the same
language. If they correspond to different languages, give an example of a word
that belongs to one language but not the other.
- $(a^*)\cup(b^*)$ and $\{a,b\}^*$ - No ($ab$)
- $\{a,ab,abc\}\{d,cd\}$ and $\{a,ab\}\{c,cd,d\}$ - No ($ac$)
- $(ab)^*\cup(aba)^*a$ and $a((ba)^*\cup(baa)^*)$ - No ($ab$)
The following formalisms:
- DFA
- NFA
- $\epsilon$-NFA
- Regular Expressions
are all equally powerful: each defines exactly the class of regular languages, so called because they can be expressed as regular expressions.
In order to convert from one to the other you can use the following flowchart:
graph LR
Regex --> eNFA --> NFA --> DFA --> Regex
Examples
Here are some increasingly complex regular expressions and their associated $\epsilon$-NFAs:
-
$R_1=0$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:0
q1 --> [*]
-
$R_2=01$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:0
q1 --> q2:1
q2 --> [*]
-
$R_3=0+01$
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:epsilon
q0 --> q3:epsilon
q1 --> q2:0
q3 --> q4:0
q4 --> q5:1
q2 --> q6:epsilon
q5 --> q6:epsilon
q6 --> [*]
-
$R_4=(0+01)^*$
stateDiagram-v2
direction LR
[*] --> q0
q0 --> R3:epsilon
state R3 {
direction LR
[*] --> p0
p0 --> p1:epsilon
p0 --> p3:epsilon
p1 --> p2:0
p3 --> p4:0
p4 --> p5:1
p2 --> p6:epsilon
p5 --> p6:epsilon
p6 --> [*]
}
q0 --> q1:epsilon
q1 --> q0:epsilon
R3 --> q1:epsilon
q1 --> [*]
This allows for any number of repeats, including an empty word.
General Method of Conversion
The following regular expressions are equivalent to their associated $\epsilon$-NFAs:
-
$\emptyset$:
stateDiagram-v2
direction LR
[*] --> q0
-
$\epsilon$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> [*]
-
$a\in\Sigma$:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:a
q1 --> [*]
This accepts the single symbol $a$.
-
$RS$
stateDiagram-v2
direction LR
[*] --> q0
q0 --> R:epsilon
state R{
direction LR
[*] --> [*]
}
R --> S:epsilon
state S {
direction LR
[*] --> [*]
}
S --> q1:epsilon
q1 --> [*]
This is for concatenation of regular expressions.
-
$R+S$:
stateDiagram-v2
direction LR
state R{
direction LR
[*] --> [*]
}
state S {
direction LR
[*] --> [*]
}
[*] --> q0
q0 --> R:epsilon
q0 --> S:epsilon
R --> q1:epsilon
S --> q1:epsilon
q1 --> [*]
-
$R^*$
stateDiagram-v2
direction LR
[*] --> q0
q0 --> R:epsilon
state R{
direction LR
[*] --> [*]
}
q0 --> q1:epsilon
R --> q1:epsilon
q1 --> q0:epsilon
q1 --> [*]
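The templates above compose mechanically (this is Thompson's construction). Below is a sketch using an assumed fragment encoding (integer states, edge lists), built and checked on $(0+01)^*$ from the earlier example:

```python
from itertools import count

fresh = count()          # fresh integer state names
EPS = ""                 # label for epsilon-transitions

def symbol(a):
    """Template for a single symbol a: one transition q0 --a--> q1."""
    s, t = next(fresh), next(fresh)
    return {"start": s, "accept": t, "edges": [(s, a, t)]}

def concat(r, s):
    """RS: link R's accept state to S's start state by epsilon."""
    return {"start": r["start"], "accept": s["accept"],
            "edges": r["edges"] + s["edges"] + [(r["accept"], EPS, s["start"])]}

def union(r, s):
    """R + S: branch by epsilon into R or S, then merge."""
    q0, q1 = next(fresh), next(fresh)
    return {"start": q0, "accept": q1,
            "edges": r["edges"] + s["edges"] + [
                (q0, EPS, r["start"]), (q0, EPS, s["start"]),
                (r["accept"], EPS, q1), (s["accept"], EPS, q1)]}

def star(r):
    """R*: allow skipping R entirely and looping back for repeats."""
    q0, q1 = next(fresh), next(fresh)
    return {"start": q0, "accept": q1,
            "edges": r["edges"] + [
                (q0, EPS, r["start"]), (r["accept"], EPS, q1),
                (q0, EPS, q1), (q1, EPS, q0)]}

def accepts(nfa, word):
    """Simulate the epsilon-NFA by tracking the set of reachable states."""
    def closure(states):
        seen, todo = set(states), list(states)
        while todo:
            q = todo.pop()
            for (u, a, v) in nfa["edges"]:
                if u == q and a == EPS and v not in seen:
                    seen.add(v)
                    todo.append(v)
        return seen
    current = closure({nfa["start"]})
    for c in word:
        current = closure({v for (u, a, v) in nfa["edges"]
                           if u in current and a == c})
    return nfa["accept"] in current

# R4 = (0 + 01)* built from the templates
r4 = star(union(symbol("0"), concat(symbol("0"), symbol("1"))))
assert accepts(r4, "") and accepts(r4, "001") and accepts(r4, "0100")
assert not accepts(r4, "1") and not accepts(r4, "011")
```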
-
Consider the following automaton:
stateDiagram-v2
direction LR
[*] --> 1
1 --> 2:a
1 --> 4:b
2 --> 3:b
2 --> 4:b
3 --> 6:a
3 --> 5:c
4 --> 1:a
4 --> 3:b
4 --> 5:a
5 --> 6:c
4 --> [*]
6 --> [*]
- Some states have multiple outgoing transitions with the same label.
-
You can take the following route:
flowchart LR
1 -->|1| 2 -->|2| 4 -->|3| 1 -->|4| 4 -->|5| 5 -->|6| 6
- There is no route to an accepting state. (Go through the path to prove that you’re in a reject state.)
-
Consider the following automaton:
stateDiagram-v2
direction LR
[*] --> 1
1 --> 2:a
2:2
2 --> 3:b
3 --> 1:b
1 --> 3:a
-
You can take the following route:
flowchart LR
1 -->|1| 3 -->|2| 1 -->|3| 2 -->|4| 3 -->|5|2 -->|6| 1
-
The conversion can be completed with the following table:
|             | $a$         | $b$         |
| ----------- | ----------- | ----------- |
| $1$         | $\{2,3\}$   | $\emptyset$ |
| $\{2,3\}$   | $\emptyset$ | $\{3,1\}$   |
| $\emptyset$ | $\emptyset$ | $\emptyset$ |
| $\{1,3\}$   | $\{2,3\}$   | $1$         |
stateDiagram-v2
direction LR
[*] --> 1
1 --> 2,3:a
2,3 --> 1,3:b
1 --> emptyset:b
emptyset --> emptyset:a, b
2,3 --> emptyset:a
1,3 --> 2,3:a
1,3 --> 1:b
2,3:2,3
- Language is $\{0,1\}$. End with $00$:
-
NFA diagram:
stateDiagram-v2
direction LR
[*] --> a
a --> a:0,1
a --> b:0
b --> c:0
c: c
-
DFA diagram:
stateDiagram-v2
direction LR
[*] --> a
a --> a,b:0
a --> a:1
a,b --> a,b,c:0
a,b --> a:1
a,b,c:a,b,c
a,b,c --> a:1
a,b,c --> a,b,c:0
- Language is $\{0,1\}$. Contains $01$ as a sub-string:
-
NFA diagram:
stateDiagram-v2
direction LR
[*] --> a
a --> a:0,1
a --> b:0
b --> c:1
c --> c:0,1
c:c
-
DFA diagram:
stateDiagram-v2
direction LR
[*] --> a
a --> a,b:0
a --> a:1
a,b --> a,b:0
a,b --> a,c:1
a,c --> a,b,c:0
a,c --> a,c:1
a,b,c --> a,b,c:0
a,b,c --> a,c:1
a,b,c:a,b,c
a,c:a,c
-
Consider the following NFA:
stateDiagram-v2
direction LR
[*] --> 1
1 --> 2:a
1 --> 3:a
2 --> [*]
2 --> 3:b
3 --> 1:b
This is an alternate notation where an accepting state is marked by an arrow going off it.
Only the first couple of examples are included in these notes. See the lecture video for additional examples of regular expressions and their languages.
Examples
For the alphabet:
\[\Sigma=\\{0,1\\}\]
-
0 followed by any number of 1s:
\[01^* = 0(1^*) = \{0, 01, 011, 0111, \ldots\}\]
-
0 followed by any number of 1s then 01:
\[(01^*)(01) = \{ 001, 0101,01101,011101,\ldots\}\]
-
Strings of length 1:
\[0+1=\{0,1\}\]
-
All strings in the alphabet:
\[(0+1)^*=\{\epsilon,0,1,00,01,10,11,\ldots\}\]
-
Any string that ends with 010:
\[(0+1)^*010\]
-
Any string that contains that pattern 01:
\[(0+1)^*01(0+1)^*\]
Complex Example
Consider the following regular expression:
\[((0+1)(0+1))^*+((0+1)(0+1)(0+1))^*\]
We can split it into the following fundamental parts:
-
Strings of even length:
\[((0+1)(0+1))^*\]
This is because it is any number of strings of length 2.
-
Strings of length a multiple of 3:
\[((0+1)(0+1)(0+1))^*\]
This is because it is any number of strings of length 3.
The $+$ operator combines these two languages and as such the definition is:
All strings whose length is even or a multiple of 3.
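This combined expression can be checked mechanically with Python's `re` module, writing $+$ as alternation (a quick check over strings of each length, not a proof):

```python
import re

# ((0+1)(0+1))* + ((0+1)(0+1)(0+1))* in Python regex syntax
pattern = re.compile(r"(?:[01][01])*|(?:[01][01][01])*")

for n in range(12):
    word = "01" * (n // 2) + "0" * (n % 2)   # some word of length n
    assert bool(pattern.fullmatch(word)) == (n % 2 == 0 or n % 3 == 0)
```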
String Concatenation
This is the product of two words. It is obtained by appending them together to form one long word.
s = abb
t = bab
st = abbbab
ts = bababb
It is non-commutative.
These strings can be defined formally:
\[s=x_1\ldots x_n , t=y_1\ldots y_m\]
\[st=x_1\ldots x_ny_1\ldots y_m\]
There is also some notation:
- $s^k$ - The string $s$ repeated $k$ times.
- $\vert s\vert$ - The length of the string.
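These operations map directly onto Python strings, which makes the notation easy to experiment with:

```python
s, t = "abb", "bab"

assert s + t == "abbbab"      # st
assert t + s == "bababb"      # ts (concatenation is non-commutative)
assert s * 3 == s + s + s     # s^3: the string repeated 3 times
assert len(s) == 3            # |s|: the length of the string
```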
Concatenation of Languages
This operation can be extended to languages:
-
The concatenation of languages $L_1$ and $L_2$ is:
\[L_1L_2=\{st\vert s\in L_1,t\in L_2\}\]
-
The $n^{\text{th}}$ power of $L^n$ is:
\[L^n=\{s_1s_2\ldots s_n\vert s_1,s_2,\ldots,s_n\in L\}\]
-
The union of $L_1$ and $L_2$ is:
\[L_1\cup L_2=\{s\vert s\in L_1 \text{ or } s\in L_2\}\]
Example
For the following languages:
- $L_1=\{0,01\}$
- $L_2=\{\epsilon,1,11,111,\ldots\}$
you would get the following results:
- $L_1L_2=\{0,01,011,0111,\ldots\}$
-
$L_1^2=\{00,001,010,0101\}$
This is all the combinations of two strings taken from $\{0,01\}$.
- $L_1\cup L_2=\{0,01,\epsilon,1,11,111,\ldots\}$
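With finite (truncated) languages as Python sets, these operations are one-line set comprehensions — the truncation of $L_2$ is an illustrative assumption:

```python
L1 = {"0", "01"}
L2 = {"", "1", "11", "111"}   # a finite slice of {epsilon, 1, 11, 111, ...}

concat = {s + t for s in L1 for t in L2}   # L1 L2
square = {s + t for s in L1 for t in L1}   # L1^2
union = L1 | L2                            # L1 ∪ L2

assert square == {"00", "001", "010", "0101"}
assert "0111" in concat and "10" not in concat
assert union == {"0", "01", "", "1", "11", "111"}
```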
Operations on Languages
See Set Theory, Kleene Star & DNF for additional examples.
- $L^*=L^0\cup L^1 \cup L^2\cup\ldots$
-
This is all the strings made up of all combinations of $L$, including the empty string.
The number of words in each length of expansion follows the Fibonacci sequence.
Combining Languages
We can construct languages by starting with simple ones, like $\{0\},\{1\}$ and combining them. Here are two languages and their equivalent regular expressions:
The order of operations, from highest to lowest precedence, is: $*$, then concatenation, then $+$.
Regular Expressions
A regular expression over $\Sigma$ is an expression formed using the following rules:
- The symbols $\emptyset$ and $\epsilon$ are regular expressions.
- Every $a$ in $\Sigma$ is a regular expression.
- If $R$ and $S$ are regular expressions, so are $R+S$, $RS$ and $R^*$.
Any language corresponding to a regular expression is called regular. The set of all of them define the class of regular languages.
In this lecture we will be doing step 1 from the previous lecture NFA to DFA Conversion (Determinisation).
Eliminating $\epsilon$-Transitions
Consider the following $\epsilon$-NFA:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:Epsilon, 1
q1 --> q2:Epsilon
q1 --> q1:0
q1 --> q0:0
q2:q2
To convert this to an NFA we can keep track of the accessible states in a table:
|       | 0                 | 1               |
| ----- | ----------------- | --------------- |
| $q_0$ | $\{q_0,q_1,q_2\}$ | $\{q_1,q_2\}$   |
| $q_1$ | $\{q_0,q_1,q_2\}$ | $\emptyset$     |
| $q_2$ | $\emptyset$       | $\emptyset$     |
To calculate the accept states, we first include all prior accept states and then all states which can get to the accept state via $\epsilon$-transitions. For this diagram that includes all states:
\[\{q_0, q_1, q_2\}\]
We can then put this into a diagram:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:0
q0 --> q1:0, 1
q0 --> q2:0, 1
q1 --> q0:0
q1 --> q1:0
q1 --> q2:0
q0:q0
q1:q1
q2:q2
Elimination Rules
-
Paths with several $\epsilon$s are replaced by a single transition:
flowchart
subgraph eNFA
1 -->|Epsilon| 2
2 -->|a| 3
3 -->|Epsilon| 4
4 -->|Epsilon| 5
end
eNFA --> NFA
subgraph NFA
12[1] -->|a| 15[5]
end
flowchart
subgraph eNFA
3 -->|Epsilon| 4
4 -->|a| 5
5 -->|Epsilon| 3
end
eNFA --> NFA
subgraph NFA
13[3] -->|a| 13
end
-
States that can reach final state by $\epsilon$ are all accepting.
$\epsilon$-Closure
This is part of the formal definition of the elimination rules above.
\[\text{CL}(A)=\{q\in Q:q\text{ can be reached from some state in } A \text{ using only } \epsilon\text{-transitions}\}\]
Note that always $A\subseteq \text{CL}(A)$ and $\text{CL}(Q)=Q$.
Therefore, for the following $\epsilon$-NFA:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:Epsilon, 1
q1 --> q2:Epsilon
q1 --> q1:0
q1 --> q0:0
q2:q2
- $\text{CL}(\{q_0\})=\{q_0,q_1,q_2\}$
- $\text{CL}(\{q_1\})=\{q_1,q_2\}$
- $\text{CL}(\{q_2\})=\{q_2\}$
- $\text{CL}(\{q_0,q_1\})=\{q_0,q_1,q_2\}$
- $\text{CL}(\{q_0,q_1,q_2\})=\{q_0,q_1,q_2\}$
- $\text{CL}(\emptyset)=\emptyset$
Remembering that:
\[\text{CL}(A)=\{q\in Q:q\text{ can be reached from some state in } A \text{ using 0 or more } \epsilon\text{-transitions}\}\]
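Computing $\text{CL}$ is just a graph search along the $\epsilon$-edges. A sketch for the $\epsilon$-NFA above (the dictionary of $\epsilon$-edges is an illustrative encoding):

```python
# epsilon-edges of the NFA above: q0 --eps--> q1, q1 --eps--> q2
eps = {"q0": {"q1"}, "q1": {"q2"}, "q2": set()}

def CL(A):
    """Closure of A: every state reachable via 0 or more epsilon-transitions."""
    seen, todo = set(A), list(A)
    while todo:
        q = todo.pop()
        for r in eps.get(q, set()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

assert CL({"q0"}) == {"q0", "q1", "q2"}
assert CL({"q1"}) == {"q1", "q2"}
assert CL({"q2"}) == {"q2"}
assert CL(set()) == set()        # CL preserves the empty set
```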
|                  | $\epsilon$-NFA       | NFA |
| ---------------- | -------------------- | --- |
| States           | $q_0,q_1,\ldots,q_n$ | $q_0,q_1,\ldots,q_n$ |
| Initial State    | $q_0$                | $q_0$ |
| Transitions      | $\delta$             | $\delta'(q,a)=\text{CL}(\delta(\text{CL}(\{q\}),a))=\text{CL}(\cup_{q'\in\text{CL}(\{q\})}\delta(q',a))$ |
| Accepting States | $F\subseteq Q$       | $F'=\{q:\text{CL}(\{q\})\cap F\neq\emptyset\}$ |
For the following example:
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:1
q0 --> q1:0
q1 --> q1:1
q1 --> q2:Epsilon
q2 --> q2:0
q2:q2
- $\text{CL}(\{q_0\})$ $=\{q_0\}$
- $\text{CL}(\{q_1\})$ $=\{q_1, q_2\}$
-
$\text{CL}(\{q_2\})$ $=\{q_2\}$
This is how far you can get from each state using only $\epsilon$-transitions (zero or more hops).
- $\text{CL}(\delta(\text{CL}(\{q_0\}),0))$ $=\text{CL}(\delta(q_0,0))$ $=\text{CL}(\{q_1\})$ $=\{q_1,q_2\}$
- $\text{CL}(\delta(\text{CL}(\{q_1\}),0))$ $=\text{CL}(\delta(\{q_1,q_2\},0))$ $=\text{CL}(\{q_2\})$ $=\{q_2\}$
-
$\text{CL}(\delta(\text{CL}(\{q_2\}),0))$ $=\text{CL}(\delta(q_2,0))$ $=\text{CL}(\{q_2\})$ $=\{q_2\}$
This is how far you can get by reading a 0.
- $\text{CL}(\delta(\text{CL}(\{q_0\}),1))$ $=\text{CL}(\delta(q_0,1))$ $=\text{CL}(\{q_0\})$ $=\{q_0\}$
- $\text{CL}(\delta(\text{CL}(\{q_1\}),1))$ $=\text{CL}(\delta(\{q_1,q_2\},1))$ $=\text{CL}(\{q_1\})$ $=\{q_1,q_2\}$
-
$\text{CL}(\delta(\text{CL}(\{q_2\}),1))$ $=\text{CL}(\delta(q_2,1))$ $=\text{CL}(\emptyset)$ $=\emptyset$
This is how far you can get by reading a 1.
To calculate the accepting states we look at all the states which have existing accepting states in their closure. For this example they are:
- $\text{CL}(\{q_1\})$ $=\{q_1, \bf q_2\}$
- $\text{CL}(\{q_2\})$ $=\{\bf q_2\}$
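These calculations can be applied mechanically. A sketch for this example, with the transition dictionaries as an illustrative encoding:

```python
# The epsilon-NFA above: q0 --1--> q0, q0 --0--> q1, q1 --1--> q1,
# q1 --eps--> q2, q2 --0--> q2; accepting state q2.
delta = {("q0", "1"): {"q0"}, ("q0", "0"): {"q1"},
         ("q1", "1"): {"q1"}, ("q2", "0"): {"q2"}}
eps = {"q1": {"q2"}}                      # the only epsilon-transition
F = {"q2"}

def CL(A):
    """Closure of A along epsilon-transitions."""
    seen, todo = set(A), list(A)
    while todo:
        q = todo.pop()
        for r in eps.get(q, set()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def delta_prime(q, a):                    # CL(delta(CL({q}), a))
    step = set().union(*(delta.get((p, a), set()) for p in CL({q})))
    return CL(step)

assert delta_prime("q0", "0") == {"q1", "q2"}
assert delta_prime("q1", "1") == {"q1", "q2"}
assert delta_prime("q2", "1") == set()

# F' = states whose closure contains an existing accepting state
F_prime = {q for q in ("q0", "q1", "q2") if CL({q}) & F}
assert F_prime == {"q1", "q2"}
```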
An ($\epsilon$-)NFA can do everything a DFA can do and vice-versa.
Every ($\epsilon$-)NFA can be converted into a DFA for the same language.
Method
- Eliminate $\epsilon$-transitions ($\epsilon$-NFA to NFA).
- Convert NFA to DFA.
NFA to DFA (Intuition)
To do this you split out the possible states into sets of states. The sets of states containing an accepting state become the new accepting states.
-
NFA
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:0,1
q0 --> q1:1
q1 --> q2:0
q2:q2
-
DFA
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:0
q0 --> q0,q1:1
q0,q1 --> q0,q1:1
q0,q1 --> q0,q2:0
q0,q2: q0,q2
q0,q2 --> q0,q1:1
q0,q2 --> q0:0
This explores the powerset of states. We can then graph all transitions and prune states that are unreachable.
General Method
|                  | NFA                  | DFA |
| ---------------- | -------------------- | --- |
| States           | $q_0,q_1,\ldots,q_n$ | $\emptyset,\{q_0\},\{q_1\},\{q_0,q_1\},\ldots,\{q_0,\ldots,q_n\}$ (one for each subset of states in the NFA) |
| Initial State    | $q_0$                | $\{q_0\}$ |
| Transitions      | $\delta$             | $\delta'(\{q_{i_1},\ldots,q_{i_k}\},a)=\delta(q_{i_1},a)\cup\cdots\cup\delta(q_{i_k},a)$ |
| Accepting States | $F\subseteq Q$       | $F'=\{S:S\text{ contains some state in } F\}$ |
This technique is called the subset construction and is also used in AI.
This technique can produce a DFA with up to $2^n$ states, where $n$ is the number of states in the NFA. Hence the intuitive method, which only constructs reachable subsets, is preferred.
Additional Example
-
NFA
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:1
q0 --> q1:0
q0 --> q2:0
q1 --> q1:1
q1 --> q2:0,1
q2 --> q2:0
q1:q1
q2:q2
-
DFA
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0:1
q0 --> q12:0
q12 --> q12:1
q12 --> q2:0
q2: q2
q2 --> q2:0
q2 --> emptyset:1
emptyset --> emptyset:0, 1
q12: {q1, q2}
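The reachable-subsets construction can be sketched in Python and checked against this example (the dictionary encoding of the NFA is an illustrative choice):

```python
from collections import deque

def determinise(delta, start, finals, alphabet):
    """Subset construction, exploring only reachable subsets."""
    start_set = frozenset({start})
    trans, seen, todo = {}, {start_set}, deque([start_set])
    while todo:
        S = todo.popleft()
        for a in alphabet:
            # delta'(S, a) = union of delta(q, a) over q in S
            T = frozenset().union(*(delta.get((q, a), set()) for q in S))
            trans[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    accepting = {S for S in seen if S & finals}
    return trans, seen, accepting

# The NFA from the additional example above
delta = {("q0", "1"): {"q0"}, ("q0", "0"): {"q1", "q2"},
         ("q1", "1"): {"q1", "q2"}, ("q1", "0"): {"q2"},
         ("q2", "0"): {"q2"}}
trans, states, accepting = determinise(delta, "q0", {"q1", "q2"}, "01")

assert states == {frozenset({"q0"}), frozenset({"q1", "q2"}),
                  frozenset({"q2"}), frozenset()}
assert trans[(frozenset({"q1", "q2"}), "0")] == frozenset({"q2"})
assert accepting == {frozenset({"q1", "q2"}), frozenset({"q2"})}
```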
The questions for this tutorial are available here.
Powersets
The powerset is the set of all subsets. A set with $n$ elements has a powerset with $2^n$ elements.
- $\emptyset, \{a\}, \{b\},\{a,b\}$
- $\emptyset, \{0\},$ $\{1\},\{2\},$ $\{0,1\},\{0,2\},$ $\{1,2\},\{0,1,2\}$
- $\emptyset, \{z\}$
- $\emptyset, \{1\},\{3\},\{1,3\}$
- $\emptyset, \{0\},\{2\},\{0,2\}$
- $\emptyset$
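A sketch of powerset enumeration with `itertools`, which generates subsets in order of size:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, from the empty set up to s itself."""
    items = list(s)
    return [set(c) for c in
            chain.from_iterable(combinations(items, r)
                                for r in range(len(items) + 1))]

assert len(powerset({"a", "b"})) == 2 ** 2
assert set() in powerset({"a", "b"}) and {"a", "b"} in powerset({"a", "b"})
assert powerset(set()) == [set()]     # the powerset of the empty set is {∅}
```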
Kleene Star Operation
This is all the different strings that can be produced with the characters in the alphabet. This includes the empty-string.
- $\{0,1\}^*$ - This is all binary strings including the empty-string.
- $\{a\}^*$ - This is any number of the letter a including the empty-string.
- $\emptyset^*=\{\epsilon\}$ - This is only the empty-string.
The $+$ Operation
If we have the option to choose an element we must choose at least one. This is the same as the Kleene star operation but without the empty-string.
For $\{0,1\}^+$ this would be formally written as:
\[\\{0,1\\}\\{0,1\\}^*\]
One element followed by any number of elements.
Alphabets
- $\{0,1\}$
- $\{a\}$
- $\{\epsilon\}$
Complement Languages
To form the complement, take $\{0,1\}^*$ and remove every string of the language:
- $\{0,1\}^*-\{010,101,11\}$
- $\{110\}$
- $\{\epsilon\}$
Following DFAs
stateDiagram-v2
direction LR
[*] --> A
A --> B:0
B --> D:1
D:D
D --> B:1
A --> C:1
C --> D:0
D --> A:0
- 0111010 - Accept
- 010011 - Reject
- 100 - Reject
- 11001 - Reject
If we are given a transition that doesn’t exist, this leads to a “dead” state. Even if it is not drawn, acknowledge its existence.
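Running this DFA is a single dictionary lookup per symbol; missing transitions fall through to the implicit dead state. A sketch verifying the four strings above:

```python
# The DFA above: accepting state D; undefined transitions go to a dead state.
delta = {("A", "0"): "B", ("A", "1"): "C", ("B", "1"): "D",
         ("C", "0"): "D", ("D", "1"): "B", ("D", "0"): "A"}

def run(word, state="A"):
    for c in word:
        state = delta.get((state, c), "dead")   # missing transition: dead state
    return state == "D"

assert run("0111010") is True     # Accept
assert run("010011") is False     # Reject
assert run("100") is False        # Reject
assert run("11001") is False      # Reject (dies on the second 1)
```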
Creating DFAs
-
$L1$ is a set of all words containing exactly three occurrences of $a$.
stateDiagram-v2
direction LR
[*] --> A
A --> B:a
A --> A:b
B --> C:a
B --> B:b
C --> D:a
C --> C:b
D --> D:b
D:D
-
$L2$ is a set of all words containing at least three occurrences of $a$.
stateDiagram-v2
direction LR
[*] --> A
A --> B:a
A --> A:b
B --> C:a
B --> B:b
C --> D:a
C --> C:b
D --> D:a, b
D:D
-
$L3$ is a set of words containing the sub-string $aaa$.
stateDiagram-v2
direction LR
[*] --> A
A --> B:a
A --> A:b
B --> A:b
C --> A:b
B --> C:a
C --> D:a
D --> D:a,b
D:D
Nondeterministic finite automata allow us to make guesses. This can enable abilities such as look-ahead. They have the following properties:
- Each state can have zero, one, or more outgoing transitions labelled by the same symbol.
- The input is accepted if at least one computational path accepts.
Example
This is an automaton that accepts words that end in 101
.
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q0: 0,1
q0 --> q1: 1
q1 --> q2:0
q2 --> q3:1
q3: q3
This reads as:
- State $q_1$ has no transition labelled 1.
- Upon reading 1 in $q_1$, die; upon reading 0, continue to $q_2$.
- State $q_3$ has no transition going out.
- Upon reading 0 or 1 in $q_3$, die.
and performs the following actions:
- Guess if we are three symbols away from the end of the input.
- If so, guess if we will see the pattern
101
.
- Check that we are at the end of the input.
All unlabelled paths are rejected, or go to the $q_\text{die}$ state.
Running an NFA
As guessing is involved, an NFA can reach several different states after reading a given word.
To calculate the result we must keep track of all the possible states and then record if any states result in an accepting state.
A word is accepted by the example NFA if at least one of the states are final.
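Tracking the set of possible states directly gives a simple simulator. A sketch for the ends-in-101 NFA above (the dictionary encoding is an illustrative choice):

```python
# The ends-in-101 NFA: q0 loops on 0,1 and guesses when to branch towards q3.
delta = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"},
         ("q1", "0"): {"q2"}, ("q2", "1"): {"q3"}}

def accepts(word):
    states = {"q0"}
    for c in word:
        # follow every possible transition from every current state
        states = set().union(*(delta.get((q, c), set()) for q in states)) \
            if states else set()
    return "q3" in states          # accept if any path reached the final state

assert accepts("101")
assert accepts("01101")
assert not accepts("100")
assert not accepts("")
```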
Computing Procedure
In order to calculate from an NFA you can back-track through the possible states. For the input 01101
on the previous example you would run through the following states:
flowchart LR
0[q0] --> 1[q0] --> 2[q0] --> 3[q1] --> q2 --> q3
1 --> 4[q1]
2 --> 5[q0] --> 6[q0] --> 7[q1] & 8[q0]
You do this by reading the last symbol from the input and following back through the symbols to find the possible states.
Definition
A nondeterministic finite automaton (NFA) is a 5-tuple ($Q,\Sigma, \delta,q_0,F$) where:
- $Q$ is a finite set of states.
- $\Sigma$ is an alphabet
- $\delta:Q\times\Sigma\rightarrow \text{subsets of }Q$ is a transition function.
- $q_0\in Q$ is the initial state.
- $F\subseteq Q$ is the set of accepting states (or final states).
The difference from DFAs is that the transition function $\delta$ can go into several states.
$\epsilon$-Transitions
These can be taken for free.
This means you can change state without reading anything.
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1: Epsilon
q0 --> q2:a
q2:q2
q1 --> q2:b
q2 --> q1:a
This accepts:
a, b, aab, baba, aabab, ...
and rejects:
Example
This example is from the lecture Examples of $\epsilon$-NFAs:
Construct an $\epsilon$-NFA that accepts all strings with an even number of 0s or an odd number of 1s.
stateDiagram-v2
direction LR
[*] --> r0:Epsilon
[*] --> s0:Epsilon
r0: r0
r0 --> r0:1
r0 --> r1:0
s0 --> s0:0
s0 --> s1:1
s1:s1
r1 --> r1:1
r1 --> r0:0
s1 --> s1:0
s1 --> s0:1
To do this we construct DFAs for each of the conditions and then use $\epsilon$-transitions to choose between them.
Definition
An $\epsilon$-nondeterministic finite Automaton ($\epsilon$NFA) is a 5 tuple ($Q,\Sigma, \delta,q_0,F$) where:
- $Q$ is a finite set of states.
- $\Sigma$ is an alphabet
- $\delta:Q\times(\Sigma\cup\{\epsilon\})\rightarrow\text{ subsets of }Q$ is a transition function.
- $q_0\in Q$ is the initial state.
- $F\subseteq Q$ is the set of accepting states (or final states).
The difference from NFAs is that it allows $\epsilon$-transitions.
Language of an NFA or $\epsilon$-NFA
The NFA/$\epsilon$-NFA accepts string $x\in\Sigma^*$ if there is some path that, starting from $q_0$, leads to an accepting state as the string is read from left to right.
The language of an NFA/$\epsilon$-NFA is the set of all strings that the NFA/$\epsilon$-NFA accepts.
Here is an example of a finite automaton:
stateDiagram-v2
direction LR
[*] --> 0p
5p --> go: +10
0p --> 5p: +5
5p --> 10p: +5
10p --> go: +5, +10
go --> 0p: R
0p --> 0p: R
0p --> 10p: +10
5p --> 5p: R
10p --> 10p: R
go --> go: +5, +10
go: go
- There are states and the start state is $\text{0p}$.
- The automaton takes inputs from $\{+5,+10,R\}$.
- The state $\text{go}$ is an accepting state.
- There are transitions saying what to do for every state and every alphabet symbol.
Defining DFAs
A finite automaton (DFA) is a 5-tuple ($Q,\Sigma,\delta,q_0,F$) where:
- $Q$ is a finite set of states.
- $\Sigma$ is an alphabet.
- $\delta:Q\times\Sigma\rightarrow Q$ is a transition function.
- $q_0\in Q$ is the initial state.
- $F\subseteq Q$ is the set of accepting states (or final states).
Example
stateDiagram-v2
direction LR
[*] --> q0
q0 --> q1:1
q1 --> q2:0
q0 --> q0:0
q1 --> q1:1
q2 --> q2:0,1
q0: q0
q1: q1
Language of a DFA
The language of a DFA ($Q,\Sigma,\delta,q_0,F$) is the set of all strings over $\Sigma$ that, starting from $q_0$ and following the transitions as the string is read from left to right, will reach some accepting state.
This basically means that the language of a DFA is a valid path on a DFAs state diagram that ends at an accepting state.
For additional examples for DFA languages see this video.
Ends With… Example
For examples on DFAs that accept strings that end with … see this video.
One example is the language of strings over $\{0,1\}$ that end with 01. For this you’d need the following graph:
stateDiagram-v2
direction LR
[*] --> q0:0
[*] --> q1:1
q0 --> q00:0
q0 --> q01:1
q1 --> q10:0
q1 --> q11:1
q00 --> q00:0
q00 --> q01:1
q01 --> q10:0
q01 --> q11:1
q10 --> q01:1
q10 --> q00:0
q11 --> q10:0
q11 --> q11:1
q01:q01
To do this you make a tree of options up to the length of the given string and then loop back around.
Set Theory Recap
- $\emptyset, \{\emptyset\}$
- The empty set, and a set containing the empty set (these are not the same thing).
- $\{a_1,a_2,\ldots,a_n\}$
- $a\in A,B\subseteq A$
- $a$ is in $A$ and $B$ is a subset of $A$.
- $A\cup B,A\cap B,A\backslash B$
- Union, intersection and difference.
- $2^A=\{B\vert B\subseteq A \}=\{B:B\subseteq A\}$
- Denotes all the subsets of a given set.
- $(a_1,a_2,\ldots,a_n)$
- This is a sequence, not a set, due to the round brackets.
- Order and repetition matter in sequences, but not in sets.
Problems
Problems which have binary answers are called decision problems. We will be focusing on decision problems rather than other types of problem.
We can reduce a problem to a decision problem to make it easier to solve.
Example of Reducing a Problem
Consider the following boolean formula:
\[(\neg x\vee \neg y \vee z)\wedge(\neg x \vee y \vee \neg z)\]
We want to find which values of true $\top$ and false $\bot$ give an output of true.
We can use the following methods:
- Guess and then reduce the formula.
Alphabets and Strings
Strings are a list of characters from a finite set. They can also be called words. To define strings, we start with an alphabet.
An alphabet is a finite set of symbols.
You could have the following alphabets:
- $\Sigma_1=\{a,b,\ldots,z\}$ - The set of letters in English.
- $\Sigma_2=\{0,1,\ldots,9\}$ - The set of base 10 digits.
Strings
A string over alphabet $\Sigma$ is a finite sequence of symbols in $\Sigma$.
For example:
abfbz
is a string over $\Sigma_1$.
9023
is a string over $\Sigma_2$.
The empty string is denoted by $\varepsilon$
We write $\Sigma^*$ for the set of all strings over $\Sigma$.
We write $\Sigma^+$ for the set of all non-empty strings.
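A finite prefix of $\Sigma^*$ can be enumerated with `itertools.product`, generating all strings length by length:

```python
from itertools import product

sigma = ["0", "1"]

# All strings over sigma of length at most 2 -- a finite prefix of Sigma*
strings = ["".join(p) for n in range(3) for p in product(sigma, repeat=n)]
assert strings == ["", "0", "1", "00", "01", "10", "11"]

# Sigma+ excludes only the empty string
assert [s for s in strings if s != ""] == strings[1:]
```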
Languages
A language is a set of strings over the same alphabet. Languages define “yes/no” problems.
Consider the following example:
- $L_1=$ all strings that contain the sub-string to.
- $\Sigma_1=\{a,b,\ldots,z\}$
stop, to and toe are in $L_1$.
$\varepsilon$ and oyster are not in $L_1$.
Therefore:
\[L_1=\{x\in \Sigma^*_1 \vert x \text{ contains the substring } \mathtt{to}\}\]
Another language could be:
\[L_2=\{s\# s \vert s\in \{a,b,\ldots,z\}^*\}\]
This is a string of letters, then a hash, then the same string again: e.g. ajdj#ajdj.
The main points of study in this module are:
- Finite Automata:
- Grammars:
- Extracting the meaning from a program.
- Turing Machines:
- The general model of computing.
Some Kinds of Machines
| Type | Description |
| ---- | ----------- |
| Finite Automata | Devices with a small amount of memory. Used to model very simple things. |
| Push-down Automata | Devices with infinite memory that can be accessed in a very limited way. Used to parse grammars. |
| Turing Machines | Devices with infinite memory. These are at least as powerful as computers. |
| Resource-bounded Turing Machines | Infinite memory, but bounded by running time/memory. These are computers that run reasonably fast/efficiently. |