• Quizzes - 10%
    • frequent
  • Homework - 20%
    • challenging
  • Exam - 70%
    • expected four (breaking midterm and final)
  • Office Hours:
    • Professor Tal, Thurs 8:40-9:50am; 3:30-4:30pm
  • Webpage:
    • http://www.cs.columbia.edu/~tal/3261/fall20/

Week 1 - Introduction

Related Reading:


  • an Alphabet $\Sigma$
    • a non-empty, finite set of symbols
    • e.g.
      • $\{0,1\}$, the ASCII table, …
  • a String over alphabet $\Sigma$
    • a finite sequence of symbols from $\Sigma$
  • Length of string $s$, $\vert s\vert$
    • number of symbols in a string
    • e.g.
      • $\vert 1001\vert =4$
  • empty string $\epsilon$
    • string of length 0
    • you will see that $\empty^{*}=\{\epsilon\}$ and $\empty^{0} = \{\epsilon\}$ (see the Kleene Star definition in Week 3)
  • concatenation is denoted with $\circ$
  • $\Sigma^k$ means the set of strings over $\Sigma$ with length $k$
    • e.g., let $\Sigma = \{0,1\}$, $k=2$
      • then you get $\Sigma^2 = \{00,01,10,11\}$
    • e.g., let $k=0$,
      • then you get $\Sigma^0 = \{\epsilon\}$
  • $\Sigma^*$ means a set of all strings over $\Sigma$ with length varying from $0 \to \infty$
    • therefore, it is infinite
  • a language is a subset of $\Sigma^*$
    • basically some set of strings
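The definitions of $\Sigma^k$ and $\Sigma^0$ can be sanity-checked with a short script (a sketch in Python; the helper name `sigma_k` is my own):

```python
from itertools import product

def sigma_k(sigma, k):
    """Sigma^k: the set of all strings of length k over the alphabet sigma."""
    return {"".join(p) for p in product(sigma, repeat=k)}

print(sigma_k("01", 2))  # the four strings 00, 01, 10, 11
print(sigma_k("01", 0))  # {''}  i.e. Sigma^0 = {epsilon}
```

Note that `repeat=0` yields exactly one empty tuple, matching $\Sigma^0 = \{\epsilon\}$.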


  • a language $L$ can be thought of as a yes/no problem solver:
    • e.g.
      • given input string $s$, does $s \in L$?

This means that you can think of a language $L$ as a (yes/no) function $f$ such that:

\[f: \Sigma^* \to \{1,0\}\]


  • $f(x)=1$ if $x \in L$
  • $f(x)=0$ if $x \notin L$


  • You might think that a yes/no function would be quite limited in its capacity, but later in the course you will see that it is actually quite capable.
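As a tiny illustration of the language-as-function view (a Python sketch; the example language, binary strings containing $11$, is my own choice):

```python
# A language as a yes/no function f: Sigma^* -> {1, 0}.
# Example language (my own choice): binary strings containing "11".
def f(x: str) -> int:
    return 1 if "11" in x else 0

print(f("0110"), f("0101"))  # 1 0
```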

Deterministic Finite Automata (DFA)

For example, a DFA might look like



  • circles represent states
    • so in this case, we have four states
    • a start state always exists, marked with an additional incoming arrow
    • an accepting state is denoted by a double circle (the bottom right)
      • an accepting state is the only state in which the output would be true/yes (see sample usage below)
      • though it is called an accepting state, it does not have to be a sink that always accepts; the computation can leave it again
  • arrows represent transitions
    • a transition is uniquely defined/denoted by its source state plus a symbol of the alphabet (in this case, $1$ or $0$)
    • for every state, there has to be a transition for every symbol in the alphabet
      • again, in this case the alphabet would be $\{0,1\}$

A sample usage would be:

  • Consider the string $101$:

By putting the string into the above DFA, you get:



  • you end up in a normal state, which means the output of $101$ on this DFA is false/no
  • the only state where you could get a true/yes is the accepting state

DFA Def:

  • A DFA $M$ is a 5-tuple

    \[M=(Q,\Sigma, \delta, q_0, F)\]


    • $Q$ is a finite set of states
    • $\Sigma$ is a finite alphabet
    • $\delta$ is the transition function $Q \times \Sigma \to Q$
      • takes a state and a symbol to another state
    • $q_0 \in Q$ is called the start state
    • $F \subseteq Q$ is called the accepting states

For example, we can identify the previous diagram with:

  • $Q = \{q_0, q_1, q_2, q_3\}$

    • we see that there are four states (including the accepting state) in total
  • $\Sigma = \{0,1\}$

  • $\delta$ is a function, which here can be represented by a transition table


  • $q_0$ is the start state at top left

  • $F=\{q_3\}$ contains the accepting state at the bottom right

Computation Def:

  • If there is a string $w=w_1 w_2 \dots w_n$ where $w_i \in \Sigma$, then a DFA $M$ accepts $w$ iff

    \[\exists \, r_0,r_1,r_2,...,r_n \in Q\]

    such that

    1. $r_0 = q_0$
      • $r_i$ can be seen as the state after the $i$th computation step
    2. $\delta(r_{i-1},w_i)=r_i$, for $\forall \, i\in \{1,2,3,…,n\}$
      • the $i-1$ comes from the notion that you take the transition from $r_{i-1}$ on $w_i$
      • $\delta$ drives the computation on $w$
    3. $r_n \in F$
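The 5-tuple definition and the computation rule translate almost directly into code (a sketch in Python; the example DFA, which accepts strings ending in $1$, is hypothetical):

```python
def dfa_accepts(delta, q0, F, w):
    """Computation: r_0 = q0, r_i = delta(r_{i-1}, w_i); accept iff r_n in F."""
    r = q0
    for symbol in w:
        r = delta[(r, symbol)]
    return r in F

# Hypothetical DFA over {0,1} that accepts strings ending in 1.
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}
print(dfa_accepts(delta, "q0", {"q1"}, "101"))  # True
print(dfa_accepts(delta, "q0", {"q1"}, "110"))  # False
```

Representing $\delta$ as a dict keyed by (state, symbol) mirrors the transition-table view of the function.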

Language of a DFA Def:

  • for a DFA $M$, we define:

    \[L(M)=\{ w \in \Sigma^* | M \,\,accepts \,\,w \}\]

    to be the language of a DFA $M$, or the language that $M$ computes/recognizes.

For example, a DFA that accepts all strings, i.e. a DFA whose language is the set of all strings:


another DFA that accepts all strings would be:


Regular Language

Regular Language Def

  • A language $L$ is regular if:

    \[\exists \, M,\, s.t.\, L=L(M)\]

    this says that a language is regular if there exists a DFA that recognizes it. This means being “regular” is just a property of a language.

Now, the question becomes:

  • Which languages are regular? How could you prove whether a language is regular or not?

For example:

  • Question:

    What is the language of this DFA?


  • Solution:

    Realize that whenever you have $111$ in a row, you will get to the accepting state.

    • e.g. $0011101$, $01110$, etc.


    \[L(M)=\{w\in\{0,1\}^*\,|\, w\,\,contains\,\,substring\,\,111\}\]

    Prove by induction on length of $w$, $\vert w\vert$

The general procedure would be:

  1. Trying manually some strings that will get accepted
  2. Try finding a pattern of the above and propose
  3. Prove the proposition

However, the hard part is the below:

For example:

  • Question

    Is such a language $L=\{w\in\{0,1\}^*\,\vert \,w\,\,has\,\,an\,\,even\,\,number\,\,of\,\,0s\}$ regular?

  • Solution:

    First, examples such as $0110$ and $\epsilon$ pass, so if we find a DFA, it has to recognize those.

    Second, you need to come up with a finite-memory algorithm to do the job.

    • For instance, the algorithm (bad for DFA) could be:
      1. go through the string, count number of $0$
      2. if even, then accept. else reject
    • however, a problem occurs: since a DFA has a fixed number of states, you cannot simply use this algorithm, as the count would potentially take an unbounded number of states

    A better algorithm would be:

    • start with the start state being accepting
    • once you read a $0$, switch to the normal (non-accepting) state
    • once you read another $0$, switch back ($1$s never change the state)
      • notice that this algorithm does not track how many zeros you have seen, but only whether the count is even or odd. This is what makes it fit in finite memory.

    Then convert the algorithm to a DFA; the corresponding DFA looks like:


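The two-state algorithm above can be written out as a concrete transition table and simulated (a Python sketch; the state names `even`/`odd` are my own):

```python
def dfa_accepts(delta, q0, F, w):
    r = q0
    for symbol in w:
        r = delta[(r, symbol)]
    return r in F

# Two states: "even" (start, accepting) and "odd"; each 0 toggles the
# state, and 1s leave it unchanged.
delta = {("even", "0"): "odd",  ("even", "1"): "even",
         ("odd",  "0"): "even", ("odd",  "1"): "odd"}
accepts_even_zeros = lambda w: dfa_accepts(delta, "even", {"even"}, w)
print(accepts_even_zeros("0110"))  # True  (two 0s)
print(accepts_even_zeros("0"))     # False (one 0)
```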

  • the general difference/limitation of a DFA as compared to an arbitrary algorithm is that:
    • it has a fixed finite memory
      • as it has a fixed amount of states
    • its input is read only once

The last example for thinking is:

  • Question:

    Is the following language regular?



    • $xy$ means concatenating a string $x$ and a string $y$.
  • My Solution:

    The problem with this statement is that we cannot define, up front, where to break the string into its two parts.

    • to solve this, you can try to find another equivalent way to state this problem
    • if the first part satisfied the condition, then the second part fails iff it has an odd number of $0$s left over and ends with $01^n$, where $n$ is also odd
      • ?

    Using a NFA

    • see the section NFA
    • we can change the question into: given any first part of the string which contains even 0, check if the second part contains even 1


  • Solution:

    Using a DFA

    • TODO

    Using NFA

    • Professor had the same solution



  • If $L$ is finite, then $L$ is regular

Proof Sketch:

  • One can construct a DFA with a state for every possible string of length $\le$ the maximum length of any word in $L$, and mark the appropriate states as accepting.

In fact, the converse is false.


  • If $L$ is regular, $L$ can be infinite.

Remember the example of the DFA for an even number of $0$’s: the language has infinitely many elements, yet a DFA can be built to decide whether an input belongs to the language or not.

More Practice Questions


My Solution:

  1. (This is actually simple. Since $L$ is finite, we can just construct a state for every word in the dictionary.)

Week 2

Non-deterministic Finite Automata (NFA)

An example of a NFA looks like


where, as compared to a DFA, we see:

  • not every symbol needs a corresponding transition at every state
    • therefore, if there is no matching transition for a symbol at a state, that computation branch “dies”/“disappears”
  • there is always the option of having an $\epsilon$-transition, which gives the possibility to spontaneously jump to another state without consuming a symbol
    • though this is not included in the example above
  • the $1$ in the first state has two possible transitions
    • therefore, you need to compute all the possible routes

For example, for the word $1010010$:



  • the string is accepted iff at least one “token” finishes the input and lands in an accepting state

Another way to represent the process traversing a NFA looks like:


NFA Definition:

  • An NFA $N$ is a 5-tuple

    \[N=(Q,\Sigma, \delta, q_0, F)\]


    • $Q$ is a finite set of states
    • $\Sigma$ is a finite alphabet
    • $\delta$ is the transition function $Q \times \Sigma_{\epsilon} \to P(Q)$
      • $\Sigma_{\epsilon}=\Sigma \cup \{\epsilon\}$
      • $P(Q)$ is the power set of $Q$/the set containing all subsets of $Q$
        • it returns a set because you can have more than one transition for the same symbol (hence landing in more than one state)
    • $q_0 \in Q$ is called the start state
    • $F \subseteq Q$ is called the accepting states


  • A power set is basically a set containing all subsets:



  • An NFA $N$ accepts a string $w$ if $w$ can be written as $w=w_1 \circ w_2 \circ \dots \circ w_n$ where $w_i \in \Sigma_{\epsilon}$ and there is a sequence of states $r_0, r_1, r_2,\dots,r_n \in Q$ such that:
    • $r_0 = q_0$
    • for all $i=1,2,3,…n$, $r_i \in \delta(r_{i-1}, w_i)$
      • the DFA version is $\delta(r_{i-1},w_i)=r_i$; the difference is again that you can have multiple transitions for a single symbol
    • $r_n \in F$

Interestingly, we used $\circ$ explicitly in the NFA definition, $w=w_1 \circ w_2 \circ \dots \circ w_n$, instead of $w=w_1, w_2, …w_n$, because technically $\epsilon$ is not a symbol of the alphabet, so it is not legal to have:

  • $100\epsilon1$

but legal to have:

  • $1\circ 0\circ 0\circ \epsilon\circ 1=1001$


  • For a NFA $N$, the language recognized by $N$ is

    \[L(N) =\{ w\in \Sigma^*\,|\, N \,\,accepts \,\,w \}\]
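The acceptance rule can be simulated by tracking the set of all live “tokens” at once (a Python sketch; `""` stands for $\epsilon$, and the example NFA, which guesses that the current $1$ is the last symbol, is my own):

```python
def eps_closure(delta, states):
    """All states reachable from `states` via epsilon-transitions ("" = epsilon)."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def nfa_accepts(delta, q0, F, w):
    """Track the set of all live "tokens"; accept iff one ends in F."""
    current = eps_closure(delta, {q0})
    for symbol in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, symbol), set())  # missing entry => branch dies
        current = eps_closure(delta, moved)
    return bool(current & F)

# Hypothetical NFA over {0,1}: accepts strings whose last symbol is 1.
delta = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}}
print(nfa_accepts(delta, "q0", {"q1"}, "01"))       # True
print(nfa_accepts(delta, "q0", {"q1"}, "1010010"))  # False
```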

Regular Language


  • $L$ is a regular language iff $\exists$ NFA that recognizes L.
    • notice the other version with DFA is the same

Simple Proof:

  • If $L$ is regular, it is recognized by a DFA. Since a DFA is a special case of an NFA, this direction of the theorem holds.
    • an NFA is just a DFA with more optional power.

Difficult Proof

  • Here we try to prove the reverse: that an NFA can also be converted to a DFA.
  • Let
\[N=(Q_N,\Sigma_N, \delta_N, q_{0_N}, F_N)\]

We need to show that for every token/intermediate state in the NFA, there exists a state that corresponds to it in the DFA.

One approach is to construct a DFA directly from the NFA (the subset construction).

Since every set of states that can be reached by the same substring in the NFA must be a single state in the DFA:

\[Q_D = P(Q_N)=\{S\,|\,S \subseteq Q_N\}\]

​ and

\[q_{0_D}= \{q_{0_N}\}\cup\{q\in Q_N\,|\,q\,is\,reachable\,via\,\epsilon\,from\,q_{0_N} \}\]
  • where:
    • this means $q_{0_D}$ corresponds to a set of states in $Q_N$


\[F_D=\{S\subseteq Q_N\,|\,S \cap F_N \neq \empty\}\]
  • where
    • $F_D$ does not have to be same as as $F_N$
\[\delta_D(S,a)= S_1 \cup \{q\,|\,q\,is\,reachable\,via\,some\,\epsilon\,transition\,from\,some\,q^{\prime}\,in\,S_1\}\]
  • where:
    • $\delta_D(S,a)$ means transition from a state $S$ (which contains a set of states) with a symbol $a$
    • $S_1=\{q^{\prime} \in Q_N\, \vert \, q^{\prime}\in\delta_N(q,a)\,for\,some\,q\in S\}$
      • basically the states reachable in the NFA, from the states in $S$, on the same symbol $a$
    • You basically get a set of states to be transitioned to, and together they form a single state in the DFA
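The construction above can be sketched in Python. One simplification relative to the definition: instead of materializing all of $P(Q_N)$, the sketch only builds the subset-states reachable from the start state, which recognizes the same language:

```python
def eps_closure(delta_n, states):
    """States reachable from `states` via epsilon-transitions ("" = epsilon)."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta_n.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def nfa_to_dfa(delta_n, q0_n, F_n, sigma):
    """Subset construction, building only the reachable subset-states."""
    start = eps_closure(delta_n, {q0_n})
    delta_d, Q_d, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for a in sigma:
            moved = set()
            for q in S:
                moved |= delta_n.get((q, a), set())
            T = eps_closure(delta_n, moved)
            delta_d[(S, a)] = T
            if T not in Q_d:
                Q_d.add(T)
                todo.append(T)
    F_d = {S for S in Q_d if S & F_n}  # accept iff the subset meets F_N
    return Q_d, delta_d, start, F_d

# Hypothetical NFA: q0 loops on 0/1 and on 1 also guesses "last symbol".
delta_n = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}}
Q_d, delta_d, start, F_d = nfa_to_dfa(delta_n, "q0", {"q1"}, "01")
print(len(Q_d))  # number of reachable DFA states
```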

For example:

  • If we need to convert the following NFA to a DFA


  • we first construct the following result



    • accepting states are just states that include state $q_2$, which is an accepting state in the NFA
    • the size of the DFA becomes bigger than that of the NFA (though this is not the only possible DFA)


Above we have seen that you can convert between DFA and NFA, and notice the size changed.


  • $\forall n \in \mathbb{N}$, there is a language $L_n$ such that there is an $(n+1)$-state NFA recognizing $L_n$ but every DFA recognizing $L_n$ has at least $2^n$ states.
    • the variable $n$ could be arbitrary, but here for showing the exponential blow up
    • obviously, there are also other languages that DFA does not blow up

Proof (Sketch):

  • Define $L_n = \{w \in \{0,1\}^*\, \vert \,the\,\,nth\,\,symbol\,\,from\,\,the\,\,end\,\,is\,\,0\}$


    • Claim: $L_n$ has an NFA with $n+1$ states
    • Proof: (try it yourself)


    • Claim: Any DFA for $L_n$ has at least $2^n$ states

    • Proof: Assume $\exists$ a DFA for $L_n$ with fewer than $2^n$ states. By the pigeonhole principle (p.h.p.), there $\exists$ two different $n$-bit strings that end on the same state.

      (continue by yourself)

Week 3

Operations on Languages and Closure Properties

Though a language is basically just a set, there are additional operations specific to languages.

This section here is to show that a class of regular languages is closed under a variety of specific operations.

  • if you take regular languages from the class and apply one of these operations, the result is still a member of the class.

Operation Definitions

Assume that the languages below are all over some alphabet $\Sigma$:

  • Complement

  • $\bar{L_1}=\{w \in \Sigma^*\,\vert\, w\notin L_1\}$

  • Union

    • $L_1 \cup L_2 =\{ w \in \Sigma^* \,\vert\, w\in L_1 \,\,or\,\, w\in L_2 \}$
  • Intersection

    • $L_1 \cap L_2 =\{ w \in \Sigma^* \,\vert\, w\in L_1 \,\,and\,\, w\in L_2 \}$
  • Concatenation

    • $L_1 \circ L_2=\{ w \in \Sigma^* \,\vert\, w=xy\,\,for\,\,some\,\,x\in L_1\,\,and\,\,y\in L_2 \}$
    • order matters, only $xy$ is allowed
      • like a cartesian product, but with concatenation


  • For any $L_1,L_2$:

    \[L_2 \subseteq L_1 \circ L_2 \iff \epsilon \in L_1\,or\, L_2 = \empty\]
  • Power

    • \[\begin{align*} L^0&=\{\epsilon\}\\ L^{i+1}&=L^i\circ L\,\,for\,all\,i\ge0 \end{align*}\]

      basically a recursive definition of concatenating a language like a cartesian product

  • Kleene Star

    • \[L^*=\cup_{k=0}^{\infty}L^k \\ L^*=\{\epsilon\}\cup L^{1}\cup L^{2}\cup L^{3}...\]
    • notice that this means $\empty^* = \{\epsilon\}$, since the $k=0$ term gives $\empty^{0} = \{\epsilon\}$
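For finite languages, the power and (truncated) Kleene star definitions can be computed directly (a Python sketch; `star_up_to` is my own finite approximation of $L^*$):

```python
def concat(L1, L2):
    """L1 ∘ L2 for finite sets of strings."""
    return {x + y for x in L1 for y in L2}

def power(L, k):
    """L^0 = {ε}, L^{i+1} = L^i ∘ L."""
    result = {""}
    for _ in range(k):
        result = concat(result, L)
    return result

def star_up_to(L, max_k):
    """Finite approximation of L^*: the union of L^0 .. L^max_k."""
    out = set()
    for k in range(max_k + 1):
        out |= power(L, k)
    return out

print(power({"a", "bb"}, 2))  # aa, abb, bba, bbbb
print(star_up_to(set(), 5))   # {''} -- the empty set's star is {epsilon}
```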

There are also other operations, but those are the ones we are interested in.

Closure Property

We need to show that the class of regular languages is closed under all the above operations.

  • in other words, any language produced by those operations with a regular language is still regular.

Proof for Complement Operation

  • suppose $L$ is regular, let $M=(Q,\Sigma,\delta, q_0, F)$ be a DFA for $L$.

    we will construct a new DFA $M’$ recognizing $\bar{L}$ such that:

    • $M’=(Q,\Sigma,\delta,q_0,Q-F)$

    and we need to show that it works:

    • $x\in \bar{L}$ iff $M’ \,\,accepts\,\,x$
      • trivially show that $x \in \bar{L}$ implies $M’$ accepts $x$
      • trivially show that $x \notin \bar{L}$ implies $M’$ rejects $x$

    This completes the Proof that complement is closed.
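The construction $M'=(Q,\Sigma,\delta,q_0,Q-F)$ is a one-liner in code (a Python sketch reusing a small DFA simulator; the even-number-of-$0$s DFA is the earlier example):

```python
def dfa_accepts(delta, q0, F, w):
    r = q0
    for symbol in w:
        r = delta[(r, symbol)]
    return r in F

def complement(Q, delta, q0, F):
    """M' = (Q, Sigma, delta, q0, Q - F) recognizes the complement language."""
    return Q, delta, q0, Q - F

# Example DFA: even number of 0s; its complement accepts odd counts.
delta = {("even", "0"): "odd",  ("even", "1"): "even",
         ("odd",  "0"): "even", ("odd",  "1"): "odd"}
Qc, deltac, q0c, Fc = complement({"even", "odd"}, delta, "even", {"even"})
print(dfa_accepts(deltac, q0c, Fc, "010"))  # False: two 0s, so in L, not in the complement
```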


  • If you use an NFA for the same construction above, it will not work, because an input/string can be at multiple states at once: after flipping the non-accepting states to accepting and the accepting ones to non-accepting, a string previously accepted can still be accepted (it only takes one token landing in an accepting state)
  • However, the closure obviously still holds for NFAs; it’s just that we need another construction (e.g. convert the NFA to a DFA first).

Proof for Union Operation

  • suppose $L_1$ and $L_2$ are regular, and we use NFA here, which is easier
    • we have $N_1=(Q_1,\Sigma_1, \delta_{1}, q_{0_1}, F_1)$ and $N_2=(Q_2,\Sigma_2, \delta_{2}, q_{0_2}, F_2)$
    • this is trivial as we just need to have a new start state $q_0$ that has two $\epsilon$-transitions to the start states of $N_1$ and $N_2$.

    Then we have the new $N$ to be defined as:

    \[N=(Q_1 \cup Q_2 \cup \{q_0\}, \Sigma_1 \cup \Sigma_2, \delta, q_0, F_1 \cup F_2)\]


    • we redefined some transitions $\delta$ to be


    And then we need to show (argue that it works):

    \[w\in L_1 \cup L_2 \iff N\,\,accepts\,\, w\]

    which is quite easy:


Alternative Proof for Union Operation

  • Instead of using an NFA, we use a DFA for the proof. The idea would be to run a string “in parallel” through the two DFAs:

    \[D_1=(Q_1,\Sigma, \delta_1, q_1, F_1);\,D_2=(Q_2,\Sigma, \delta_2, q_2, F_2)\]

    and we construct a new DFA $D$:

    \[D=(Q,\Sigma, \delta, q_0, F)\]


    • $Q=Q_1 \times Q_2=\{(r_1,r_2)\,\vert \,r_1\in Q_1,\,r_2 \in Q_2\}$
      • so basically you can run “in-parallel”
    • $q_0=(q_1,q_2)$
    • $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,or\,r_2\in F_2\}=F_1\times Q_2 \,\cup\, Q_1\times F_2$
    • $\delta((r_1,r_2),a)=(\delta_1(r_1,a), \delta_2(r_2,a))$

With the above proof, we can easily prove the intersection operation

  • the only difference is defining $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,and\,r_2\in F_2\}=F_1\times F_2$

DFA Proof for Intersection Operation

  • The idea would be the same: to run a string in “parallel” in the two DFAs:

    \[D_1=(Q_1,\Sigma, \delta_1, q_1, F_1);\,D_2=(Q_2,\Sigma, \delta_2, q_2, F_2)\]

    and we construct a new DFA $D$:

    \[D=(Q,\Sigma, \delta, q_0, F)\]


    • $Q=Q_1 \times Q_2=\{(r_1,r_2)\,\vert \,r_1\in Q_1,\,r_2 \in Q_2\}$
      • so basically you can run “in-parallel”
    • $q_0=(q_1,q_2)$
    • $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,and\,r_2\in F_2\}=F_1\times F_2$
    • $\delta((r_1,r_2),a)=(\delta_1(r_1,a), \delta_2(r_2,a))$
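The product construction for union and intersection can be sketched as follows (Python; the two example DFAs, for an even number of $0$s and an even number of $1$s, are my own):

```python
from itertools import product

def product_dfa(Q1, d1, q1, F1, Q2, d2, q2, F2, sigma, mode="and"):
    """Run two DFAs in parallel; accept per `mode` ("and" = intersection, "or" = union)."""
    Q = set(product(Q1, Q2))
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in Q for a in sigma}
    if mode == "and":
        F = {(r1, r2) for (r1, r2) in Q if r1 in F1 and r2 in F2}
    else:
        F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}
    return Q, delta, (q1, q2), F

def dfa_accepts(delta, q0, F, w):
    r = q0
    for a in w:
        r = delta[(r, a)]
    return r in F

# Hypothetical component DFAs: even number of 0s, and even number of 1s.
d1 = {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"}
d2 = {("e", "1"): "o", ("e", "0"): "e", ("o", "1"): "e", ("o", "0"): "o"}
Q, delta, q0, F = product_dfa({"e", "o"}, d1, "e", {"e"},
                              {"e", "o"}, d2, "e", {"e"}, "01", mode="and")
print(dfa_accepts(delta, q0, F, "0101"))  # True: two 0s and two 1s
```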

Alternative Proof for Intersection Operation

  • This basically uses De Morgan’s Law:

    • if $L_1$ and $L_2$ are regular, then we know $\bar{L_1}$ and $\bar{L_2}$ are also regular. Then this means that $\bar{L_1}\cup \bar{L_2}$ is still regular. Therefore, since:
    \[L_1 \cap L_2 = \overline{(\overline{L}_1 \cup \overline{L}_2)}\]

    Hence $L_1 \cap L_2$ is still regular.

NFA Proof for Concatenation Operation

  • Basically, we can imagine starting with $N_1$ (or in fact, $D_1$ with a DFA would work as well), making an $\epsilon$-transition from every accepting state in $N_1$ to the start state of $N_2$ (again, this could be $D_2$), and keeping only $N_2$’s accepting states as accepting.
  • TODO: the formal proof


  • For these kinds of closure proofs, we generally find it easy to start with a strict DFA and construct an NFA from it, which is looser in terms of transitions/states; this works because exhibiting an NFA suffices to prove a language regular.


Proof for Kleene Star Operation

  • Let $N$ be an NFA recognizing $L$, then we should be able to construct an NFA recognizing $L^*$ (hence making $L^*$ still regular, which is what we meant by closed)

  • Since we now need to deal with accepted strings concatenated with each other (i.e. $L^k$), we just need to add an $\epsilon$-transition from every accepting state back to the initial state (hence allowing concatenation)
  • However, the above does not take into account the only case left, $L^0=\{\epsilon\}$, which can be properly fixed by adding a new accepting start state with an $\epsilon$-transition to the old start state:


  • TODO: Properly prove it


  • Suppose that, instead of adding a new state with only an $\epsilon$-transition, you made the old start state accepting. This would cause a problem: if there is a loop that goes through the start state, then the NFA would accept the corresponding string as well, even though that string might not be in the language $L^*$. Therefore, that approach would be wrong.

Regular Operations and Expressions

We define regular operations being:

  • union $\cup$
  • concatenation $\circ$

  • Kleene star $*$

This means that a regular expression is defined as:

Regular Expression Definition

  • An expression is a regular expression $R$ over an alphabet $\Sigma$ if:
    • $R=a$ for $a\in \Sigma$
    • $R=\epsilon$
    • $R=\empty$
    • $R=(R_1 \cup R_2)$
    • $R=(R_1 \circ R_2)$
    • $R=(R_1)^*$
      • for $R_1$ and $R_2$ being regular expressions
      • also notice that the definitions are recursive
  • In other words, a regular expression is a string together with additional symbols allowed:

    \[\Sigma \cup \{\cup, *,\circ,(,)\}\]

And regular expressions are special strings, because:

  • $L(a)=\{a\}$ for $a\in \Sigma$
  • $L(\epsilon)=\{\epsilon\}$
  • $L(\empty)=\{\}$
    • interestingly, this means $L(R\circ \empty)=L(\empty\circ R)=\empty$
  • $L(R_1 \cup R_2)=L(R_1)\cup L(R_2)$
  • $L(R_1 \circ R_2)=L(R_1)\circ L(R_2)$
  • $L(R^*)=L(R)^*$

Therefore, besides constructing an NFA/DFA, you can also show that a language is regular by writing it as a regular expression.


Consider regular expressions over $\Sigma=\{a,b\}$

  • Question

    then think about the language of the following regular expressions:

    • $(a\cup bb)^*$
    • $(abb \cup bbb)\circ (baa \cup b)$
    • $aa^*$
    • $(a \cup b)^*a(a\cup b)(a\cup b)(a\cup b)$
  • Solution

    • the language of strings in which every contiguous block of $b$s has even length
    • $\{abbbaa,abbb,bbbbaa,bbbb\}$
    • the language $\{a^i \,\vert\, i\ge 1\}$
      • sometimes also referred to as $a^+$
    • the language of strings whose 4th-to-last symbol is $a$
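These examples can be checked with Python's `re` module, whose syntax writes $\cup$ as `|`, concatenation as juxtaposition, and the Kleene star as `*` (a sketch; `re.fullmatch` requires the whole string to match):

```python
import re

# The four expressions above, rewritten in Python regex syntax, each with
# a few strings that should be in its language.
patterns = {
    "(a|bb)*":                ["", "abb", "bba"],
    "(abb|bbb)(baa|b)":       ["abbbaa", "bbbb"],
    "aa*":                    ["a", "aaa"],
    "(a|b)*a(a|b)(a|b)(a|b)": ["aabb", "babbb"],
}
for pat, examples in patterns.items():
    for w in examples:
        assert re.fullmatch(pat, w), (pat, w)
print("all examples match")
```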

Regular Language


  • A language $L$ can be generated by a Regular Expression if and only if that language is regular.


  • In the first direction, we need to show that for any regular expression $R$, $L(R)$ is regular (recognized by a NFA).

    Usually those proofs use induction:

    • Base Cases:




      where we used the closure property, which are already proven.

  • In the other direction, we show that if $L$ is regular/has an NFA, then $L=L(R)$ for some regular expression $R$.

    So that for any NFA $N$, there is a regular expression $R$ such that $L(N)=L(R)$ (basically, the whole language can be expressed as a single regular expression)

    • Overview:


    where a GNFA is defined in the section Generalized NFA.

    • First, we know that a string $w$ is accepted by a GNFA if there $\exists$ a path from the start state to the accept state s.t. the concatenation of the regular expressions along the path gives a regular expression $R$ such that $w \in L(R)$.

    • Next, we construct a GNFA with one fewer state:

      • choose a state to remove $q_{rip}$ (obviously not a start/accept state)

      • then, for every other pair of states $q_i,q_j$ ($q_i,q_j \neq q_{rip}$), do the following:


      ​ where:

      • it is all about repeatedly replacing any two-step path through $q_{rip}$ by a direct transition:
        • $q_i \to q_{rip} \to q_j \Rightarrow q_i \to q_j$
        • $q_j \to q_{rip} \to q_i \Rightarrow q_j \to q_i$
        • $q_i \to q_{rip} \to q_i \Rightarrow q_i \to q_i$
        • $q_j \to q_{rip} \to q_j \Rightarrow q_j \to q_j$
      • basically compute, as regular expressions, all the possible paths going from $q_i$ to $q_j$
    • Now, we repeat this recursively, until the GNFA has only the start state and the accept state


      Then we arrive at a single regular expression. Therefore, $L(R)=$ language recognized by the original NFA.

For example:

  • Question

    Consider the language that has strings with even number of $0$, and turn it into a regular expression using the GNFA procedure.


  • Solution

    First, we build the GNFA with



    • technically we also need to add all the other transitions between any two states, labelled with $\empty$, but since they are $\empty$ transitions, they do not affect the computation below.

    Now, we rip $q_1$:


    Lastly, we rip $q_0$:


    and the resulting regular expression is $(1 \cup 01^*0)^*$

    • for example, this works:


Generalized NFA

It is a special case of a NFA, such that:

  • it has exactly one start state, and one accept state
  • transitions are labeled with regular expressions
  • no incoming transition into start state, and no outgoing transition from accept state
  • all transitions are present (like the requirement for a DFA)

Transform NFA to GNFA

Given an NFA, can transform to equivalent GNFA:

  • add new start state $q_{start}$, with $\epsilon$ transition to the old start state
  • add a new accept state $q_{accept}$, add $\epsilon$-transitions from any old accept states to $q_{accept}$, and make $q_{accept}$ the only accept state
  • if a transition was labeled by multiple symbols (e.g. ${a,b}$), label it as a union of symbols (e.g. $a \cup b$)
  • label any previously non-existing transitions with $\empty$

Transform GNFA to Regex

Whenever you rip one state off, you need to compute, recursively, the states that are affected by that removal.

For example:


Week 4

Up to now, we have:


However, we need to also discuss the case of proving that a language is not regular.

For example:

  • Question:

    Prove that the following language is not regular:

    \[L=\{ a^i b^i \,|\,i\ge 0\}\]
  • Solution

    First, we need to understand the language. Some examples would be:

    \[\{\epsilon, ab, aabb, aaabbb, ...\}\]

    The intuition would be that you need an unbounded number of states to track whether you have seen the same number of $a$s and $b$s.

    We do this by contradiction:

    • Assume $L$ is regular. Let $D$ be a DFA that recognizes $L$. Let $p$ be the number of states in $D$.

      Consider the string $a^pb^p \in L$.

      Consider the computation of $D$ on $a^pb^p$

      \[r_0 \xrightarrow{a} r_1 \xrightarrow{a} r_2 \xrightarrow{a} \dots \xrightarrow{a} r_p \xrightarrow{b} r_{p+1} \xrightarrow{b} r_{p+2} \xrightarrow{b} \dots \xrightarrow{b} r_{2p}\]


      • $r_{2p}$ is an accepting state
      • and those states do not have to be unique

      Yet, since we said there are only $p$ states, there must be a repetition among the states.

      • Let $r_k = r_l$ where $k <l \le p$



      • $a^pb^p = a^ka^{l-k}a^{p-l}b^p = xyz$

        • where $x=a^k, y=a^{l-k}, z=a^{p-l}b^p$

        But this means that even if we dropped $y$ from the string, $xz = a^k a^{p-l}b^p=a^{p-(l-k)}b^p$ is still accepted, since $r_k = r_l$.

        Therefore, as soon as you have a cycle, other strings not in the language will also get accepted. This contradicts the assumption that $D$ recognizes $L$.

Proving Irregular Language - Part 1

Now, we generalize the above proof.

  1. First we want to show that all regular languages satisfy a property called the “pumping lemma”
    • any long enough string in the language has a cycle that can be pumped
  2. Use the above to show that $L$ is not regular:
    • Assume towards contradiction that $L$ is regular; then the pumping lemma holds. Then choose a long enough word and pump it in a way that produces a string $\notin L$ (basically using the cycle we found)

Pumping Lemma:

  • For any regular language $L$, there exists a number $p$ (“pumping number”), such that for $\forall w \in L$, $\vert w\vert \ge p$,:

    \[\exists \,\,a\,\,way\,\,to\,\,parse\,\,w=xyz\]

    such that:

    • $\vert xy\vert \le p$
    • $\vert y\vert > 0$
    • $\forall i,\,\,xy^iz\in L$
      • the key part
      • basically, the $y$ part of the parse traverses a cycle, and a regular language MUST also accept any number of trips around it

Now, we need to first show that all regular languages fulfill this lemma.


  • Let $L$ be regular, and let $D$ be some DFA recognizing $L$, and let $p$ be the number of states in $D$.

    \[\forall w \in L, |w|\ge p,\,\,write\,w=w_1w_2...w_p...w_m\]

    Now, consider a computation of $D$ on $w$:


    Since there are only $p$ states, there must be a repetition/cycle among the $p+1$ states $r_0,r_1,…,r_p$.

    Therefore, $\exists i<j\le p,\,\,r_i=r_j$.

    • notice it is $\exists$, so we cannot freely choose where that $i$ is, but only know that $i<p$.

    Then we have:


    Then it follows that we can break down

    • $w=xyz$


      • $x=w_1..w_i$
      • $y=w_{i+1}..w_j$
      • $z=w_{j+1}..w_m$

    As a result, in this construction, we have shown that all the required properties hold:

    • $\vert xy\vert =j\le p$
    • $\vert y\vert =j-i > 0$
      • note that the parse is dictated by where the cycle falls, so you actually cannot choose $y$.
    • $\forall\,i=0,1,2,…\,:\,\,xy^iz\in L$


  • In general, you cannot use a specific parsing of $y$ in $w=xyz$. To reach a contradiction to the pumping lemma, you need to be able to handle (find a contradiction for) any possible parsing – you cannot choose one specific one.
  • When using the pumping lemma, all you can choose/make up is the parameter $w$ and the $i$. All the other parameters have to fall off from your choice of $w$ and $i$.

Therefore, we can use the pumping lemma to prove that some $L$ is not regular:

  1. Assume towards contradiction $L$ is regular.
  2. Then there must be $p$ being the pumping number.
  3. I choose some string $w\,\vert \,w\in L \,\,and \,\,\vert w\vert \ge p$
    • this is the tricky part that you need to figure out yourself
  4. Then, whenever you parse the string $w$, there will be a cycle due to the lemma with the pumping number.
  5. Therefore, we know $w=xyz$ where
    • $\vert xy\vert \le p$
    • $\vert y\vert >0$
    • $\exists \,i,\,\,s.t.\,\,xy^iz \notin L$
  6. Therefore, this contradicts the pumping lemma, and $L$ is not regular.
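The contradiction template can be checked by brute force for $L=\{a^ib^i\}$: for a hypothetical pumping number $p$, no parse of $w=a^pb^p$ survives pumping (a Python sketch; the value $p=5$ is an arbitrary choice):

```python
def in_L(w):
    """Membership in L = { a^i b^i | i >= 0 }."""
    i = w.count("a")
    return w == "a" * i + "b" * i

p = 5                  # a hypothetical pumping number
w = "a" * p + "b" * p  # the chosen string, |w| >= p
# Check every parse w = xyz with |xy| <= p and |y| > 0: pumping with
# i = 2 must produce a string outside L (here y consists only of a's).
for end_x in range(p + 1):
    for end_y in range(end_x + 1, p + 1):
        x, y, z = w[:end_x], w[end_x:end_y], w[end_y:]
        assert not in_L(x + y * 2 + z)
print("no parse of", w, "survives pumping")
```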


  • If we have a language:

    \[L=\{w \in \{ a,b\}^*\,|\,w\,\,is\,\,a\,\,palindrome\}\]

    Then $L$ is not regular.


  • Try 1:
    • Assume towards contradiction that it is regular
    • Then we choose $w=a^p$, where $\vert w\vert \ge p$ and we want to find a $w=xyz$ and by pumping, it will produce a word that is not in $L$.
    • But it seems all the words are still in $L$. Hence it does not work.
  • Try 2:
    • Assume towards contradiction that it is regular
    • Then we choose $w=a^pba^p$, where $\vert w\vert \ge p$
    • we want to find a $w=xyz$, such that $\vert xy\vert \le p$, and $\vert y\vert >0$.
      • this means that $y$ must consist only of $a$s. The $w$ has to be constructed nicely so that this result falls out.
    • Then, we can construct a string, for example: $xyyz=a^{p+\vert y\vert }ba^p \notin L$.
      • In general $\exists i,\,\,xy^iz=a^{p+i}ba^p \notin L$
    • Hence we reach a contradiction. $L$ is not regular.

Another Example:

  • Question

    Show that the language:

    \[L = \{1^n\,|\,n\,\,is\,\,prime\}\]

    is not regular.
  • Solution



    • basically, all you can choose is $i$.

Week 5

Proving Irregular Language - Part 2

There are also other methods to prove that a language is not regular.

Using Closure Properties

Let’s start by looking at one example:

  • Question:

    Show that $L=\{w\in\{a,b\}^*\,\vert \,\#a=\#b\}$ is not regular.

  • Solution:

    Basically we need to know something else being non-regular (the ones we have seen before).

    Assume $L$ is regular, then take $L_1 = L(a^*b^*)$, which is regular.

    Now, since regular languages are closed under operations $\cap$, consider:

    \[L \cap L_1 = \{a^pb^p\,\vert\,p \ge 0\}\]

    which we have proved to be non-regular.

    This contradicts the fact that regular languages are closed under intersection.

    Therefore, $L$ is also non-regular.

In general, the template for the proof would be:

  1. Assume $L$ is regular. Then take another regular language $L_1$ (e.g. built from a regular language).
  2. Then, the language $L \cap L_1 = L_2$, or $L \circ L_1 = L_2$, or the result of any other closed operation, should still be regular.
  3. But $L_2$ is known to be irregular.
  4. Therefore, a contradiction, as those operations should be closed. So $L$ must also be irregular.

Myhill-Nerode Theorem


  • Let $L \subseteq \Sigma^*$ be some language over $\Sigma$, and $x,y\in \Sigma^*$ be strings. We say $z \in \Sigma^*$ is a distinguishing extension of $x,y$ w.r.t. $L$ if exactly one of $xz$, $yz$ is in $L$ and the other is $\notin L$.

    • notice that $z$ can be $\epsilon$
  • If such $z$ exists, then we say that $x,y$ are distinguishable by $L$.

  • If no such string exists, then we say $x,y$ are indistinguishable by $L$. (So for every extension $z$, $xz$ and $yz$ are either both in $L$ or both not.) Denote it by

    \[x \sim_L y\]

Basically, $x$ and $y$ are indistinguishable if both of them landed at the same state in a DFA/NFA, so that any ending you add to them will affect both the same way.

  • if $x$ and $y$ obviously already behave differently (e.g. $x$ is accepted and $y$ is not), then you simply have $z=\epsilon$


  • Then $\sim_L$ is an equivalence relation.

    \[[x] = \{y \in \Sigma^*\,|\,x\sim_Ly\}\]
    • where $[x]$, the equivalence class of $x$, is the set of strings indistinguishable from $x$ (in the same manner)
  • This means that, if $L \subseteq \Sigma^*$ is a regular language, and $D$ is a DFA recognizing it, then $x$ and $y$ ending in the same state of $D$ implies $x \sim_L y$.

    • so if $x$ and $y$ are distinguishable, then automatically they end at different states.


  • If $L$ is regular, and $D$ is a DFA for $L$, then it has to be that the $\#$ of equivalence classes induced by $\sim_L$ is $\le$ the $\#$ of states in $D$.
    • kind of obvious, because all strings landing on the same state fall into the same equivalence class

Myhill-Nerode Theorem:

  • For any $L \subseteq \Sigma^*$, $L$ is regular $\iff$ the $\#$ of equivalence classes for $\sim_L$ is finite.
    • one direction is clear, since a regular language has a DFA with finitely many states.
  • If $L$ is regular, then the $\#$ of equivalence classes for $\sim_L$ is exactly the $\#$ of states in the minimal DFA recognizing $L$.


  • This means that we can prove how many states the minimal DFA for a language has.
    • In fact, there is an efficient algorithm, s.t. given a DFA, it minimizes it to be the smallest DFA for $L$.
  • This also means that we can prove a language is not regular: exhibit infinitely many strings, no two of which are equivalent under $\sim_L$. Then there must be infinitely many equivalence classes, hence $L$ is not regular.

For example:

  • Question:

    Consider the palindrome example:

    \[L_{pali} = \{ w\in\{a,b\}^*\,|\,w=w^R \}\]
  • Solution:



    • remember that all we need to show is that there exists an infinite number of “palindromes” that are not similar.
      • the proof above works because $a^n$ is a palindrome itself

For example

  • Question:

    Consider the language:


    is not regular.

  • Solution:


In general, the template would be:

  1. Think about two words $w(n),w(n+k)$, with $k\neq 0$
  2. Find a distinguishing extension that makes exactly one of them land in the language and the other not
    • so that the two words are not equivalent
  3. Let $n\to \infty$. You have found an infinite number of equivalence classes, because no two of $w(0), w(1), w(2), w(3), …$ are equivalent
    • we don’t care whether $w(n)$ is accepted or not at this point, we just need to show that they are all pairwise inequivalent (land in different states)
  4. Finished: there are infinitely many equivalence classes, hence infinitely many states needed.
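
The template can be played with empirically. A small Python sketch (my own illustration, not from the course) that searches for a distinguishing extension for the palindrome language:

```python
from itertools import product

# Brute-force search for a distinguishing extension z of two strings x, y
# w.r.t. L_pali = { w in {a,b}* | w = w^R }. Illustration only: failing to
# find a z up to the length bound proves nothing, but finding one proves
# that x and y are distinguishable (hence need different DFA states).
def in_L(s):
    return s == s[::-1]          # membership in the palindrome language

def distinguishing_extension(x, y, max_len=6):
    for k in range(max_len + 1):
        for z in map("".join, product("ab", repeat=k)):
            if in_L(x + z) != in_L(y + z):
                return z         # exactly one of xz, yz is in L
    return None                  # none found up to the bound

# a^2 vs a^3: z = "baa" distinguishes them (aabaa is a palindrome, aaabaa
# is not). Since every pair a^m, a^n with m != n is distinguishable this
# way, ~_L has infinitely many classes and L_pali is not regular.
print(distinguishing_extension("aa", "aaa"))   # -> "baa"
```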

Other Usage:

  • Question:

    Show that there are exactly $2^n$ equivalence classes (states) for the language:

    \[L = \{w \in \{0,1\}^*\,|\,\text{the $n$th last symbol of $w$ is}\,\,0\}\]
  • Solution:

    First, we can show that there are at least $2^n$ equivalence classes:


    Then, we show that there are at most $2^n$ equivalence classes, so the count is exactly $2^n$.


Context Free Language

Basically, all we have discussed so far were the regular languages, but there are other, wider/richer classes of languages out there:


Like regular languages, you can also define context free languages in two ways:

  • by what generates the language (like regex); this would be the most common for context free languages
  • by what machine recognizes it (like DFA/NFA)

One example of a context-free language is:

\[\Sigma = \{0,1\};\,\,\\ V=\{S\};\\ \begin{cases} S\to \epsilon\\ S\to 0S1 \end{cases}\]

Then, a word in the language would be:

\[S \Rightarrow 0S1 \Rightarrow 00S11 \Rightarrow 0011 \in L(G)\]


  • you started with a start variable $S$, and then generated to $0S1$, and etc.

And, in general, you have:

\[L(G) = \{ 0^n1^n \,|\,n\ge0 \}\]

and you get a sense that context free languages are basically production rules:

  • strings you can produce from those rules are in the language.
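
To get a feel for how rules produce a language, here is a small breadth-first enumerator (my own sketch, assuming uppercase letters are variables and everything else is a terminal) for the grammar $S\to\epsilon\mid 0S1$:

```python
from collections import deque

# Enumerate all words of L(G) up to a length bound, by repeatedly
# expanding the leftmost variable of each sentential form.
RULES = {"S": ["", "0S1"]}       # S -> epsilon | 0S1

def generate(start="S", max_len=6):
    seen, out = set(), set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        i = next((j for j, c in enumerate(form) if c in RULES), None)
        if i is None:
            out.add(form)                        # all terminals: a word of L(G)
            continue
        for rhs in RULES[form[i]]:
            nxt = form[:i] + rhs + form[i + 1:]
            # prune once the terminal content alone exceeds the bound
            if len(nxt.replace("S", "")) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(out, key=len)

print(generate())   # ['', '01', '0011', '000111']
```

As expected, only the words $0^n1^n$ come out.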


Definition of Context Free Grammar:

  • A context-free grammar is a tuple $(V, \Sigma, R, S)$,


    • $V$ is a finite set of variables

      • it is “context free” exactly because each rule has a single variable on its left-hand side, so a variable can be rewritten regardless of its surrounding context
    • $\Sigma$ is an alphabet (finite)

    • $R$ is a finite set of rules of the form:

      \[A \to w;\,\,A\in V;\,\,w\in(V\cup\Sigma)^*\]

      so basically, you can have more than one variable, and each variable points to a string of terminals and/or variables.

      • in general, for generating a non-empty language, you need to have at least one rule whose right-hand side contains only terminal symbols.
    • $S \in V$ is the start variable

This means the following:


  • If two strings $u,v \in (V \cup \Sigma)^*$, and there is a rule $A \to w \in R$, then we can write:

    \[uAv \Rightarrow uwv\]

    which reads: $uAv$ yields $uwv$.

  • If two strings $u,v\in (V \cup \Sigma)^*$, we say that $u \Rightarrow ^* v$ if for some $u_1, u_2, …,u_n \in (V \cup \Sigma)^*$ we have:

    \[u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow ... \Rightarrow v\]


    • $u_1,u_2,…,u_n$ are just strings, each reachable from the previous one in a single step.

    This basically means that $u$ generates $v$ if there is some finite number of steps that starts with $u$ and ends in $v$.

    • e.g. $S \Rightarrow^* 0011$ in the example above

Notice that:

  • when you specify rules, you use $\to$, when you show derivations, you use $\Rightarrow$

And correspondingly:

Context Free Language Def:

  • A language $L$ is a context-free language if there exists a context free grammar $G$ such that $L(G)=L$.



  • We say a context free grammar $G = (V, \Sigma, R, S)$ has the language $L(G)$ if:

    \[L(G) = \{w \in \Sigma^* \,|\,S\Rightarrow ^* w\}\]


For example:

  • Question:

    \[G=(\{S\},\{a,b\},R,S)\\ R:\begin{cases} S \to \epsilon \\ S \to aSa\,|\,bSb \end{cases}\]
  • Solution:

    We see that it is the language of:

    \[L(G) = \{w \in \{a,b\}^*\,|\,\text{$w$ is an even-length palindrome}\}\]

    To prove it, we need to show the $\iff$ relationship:

    1. That all $S \Rightarrow^* w$ is in fact an even length palindrome
    2. All even length palindromes are captured by $L$/can be derived in $L$.

    TODO: look at the proof in the book

For example:

  • Question:

    \[G=(\{E\},\{a,b,+,*,(,)\},R,E)\\ R: \begin{cases} E \to E+E\,|\,E*E\,|\,(E)\,|\,a\,|\,b \end{cases}\]

    Derive the string $a+b*a$

  • Solution:

    The solution is trivial:

    \[E \Rightarrow E+E \Rightarrow E+E*E \Rightarrow a+E*E\Rightarrow a+b*E \Rightarrow a+b*a\]

    in fact, its corresponding derivation tree looks like:


    or, another way is:

    \[E \Rightarrow E*E \Rightarrow E+E*E \Rightarrow a+E*E\Rightarrow a+b*E \Rightarrow a+b*a\]

    and its derivation tree looks like:


    which means that you can interpret as both:

    • $a+(b*a)$
    • $(a+b)*a$


In fact, the above two examples lead to two different types of derivations as well:

Rightmost Derivation

  • A derivation where, in each step, the rightmost variable gets replaced.
    • the first derivation tree above is not a rightmost derivation tree

Leftmost Derivation

  • A derivation where, in each step, the leftmost variable gets replaced.
    • the second derivation tree above is also not a leftmost derivation tree
    • this is also a left most derivation: $E \Rightarrow E+E \Rightarrow a+E\Rightarrow a+E*E \Rightarrow …$
  • There exists a 1:1 correspondence between a derivation tree and a leftmost derivation.


  • A context free grammar is ambiguous if $\exists\, w\in L(G)$ s.t. the same string $w$ has two different derivation trees (not derivations)
  • Equivalently, it is also ambiguous if it has two or more leftmost derivations (not derivation trees).
    • an application to think about it is how compiler compiles this, there should be only one way to compile/parse it.
    • the example above is ambiguous.
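
The ambiguity of the grammar above can be checked mechanically. A brute-force sketch (my own, not a course algorithm) that counts leftmost derivations of a target string:

```python
# Count leftmost derivations of a target string in the grammar
# E -> E+E | E*E | (E) | a | b. Two or more leftmost derivations of the
# same string means the grammar is ambiguous. Every rule keeps or grows
# the sentential-form length, so pruning on length guarantees termination.
RULES = {"E": ["E+E", "E*E", "(E)", "a", "b"]}

def count_leftmost_derivations(form, target):
    i = 0
    while i < len(form) and form[i] not in RULES:
        i += 1                            # find the leftmost variable
    prefix, rest = form[:i], form[i:]
    if not target.startswith(prefix):
        return 0                          # terminal prefix already disagrees
    if not rest:
        return 1 if form == target else 0
    if len(form) > len(target):
        return 0                          # can only stay equal or grow
    var, tail = rest[0], rest[1:]
    return sum(count_leftmost_derivations(prefix + rhs + tail, target)
               for rhs in RULES[var])

print(count_leftmost_derivations("E", "a+b*a"))   # -> 2
```

The two derivations found correspond exactly to the two parse trees $a+(b*a)$ and $(a+b)*a$.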

An unambiguous context-free grammar for the same language would be:

\[G'=(\{E,F,G\},\{a,b,+,*,(,)\},R,E)\\ R:\begin{cases} E \to E+F \,|\, F\\ F \to F*G \,|\, G\\ G \to (E) \,|\, a\,|\,b \end{cases}\]


  • the strings generated by this $G'$ are the same as those of $G$ defined above, but $G'$ is unambiguous

  • in general, it is difficult to prove that a grammar is unambiguous
  • an unambiguous grammar might not exist for a specific context-free language


  • A context free language $L$ is inherently ambiguous, if every context free grammar that generates $L$ is ambiguous.

An example would be:

\[L =\{a^nb^nc^md^m\,|\,n,m\ge0\} \cup \{a^nb^mc^md^n\,|\,n,m\ge0\}\]


  • again, this would be hard to prove, but it is inherently ambiguous.
  • in fact, the strings $a^nb^nc^nd^n$ always have two derivation trees, no matter what $G$ you pick that generates $L$.

Proving the Language of CFG

In general, the proof has to go in both directions, and usually induction will be helpful.

For example:

  • Question:


​ and we want to prove the claim.

  • Solution






Week 6

Regular and Context Free Language

Consider the set of all valid regular expressions over $\Sigma = \{a,b\}$.

That is, take the alphabet $\{a,b,(,),\cup,\circ,*,\empty, \epsilon\}$ and the language $REGEX=\{w \in \{a,b,(,),\cup,\circ,*,\empty, \epsilon\}^*\,\vert\, \text{$w$ is a valid regular expression}\}$

  • e.g. $w=((a\circ*$ is not a valid regular expression, so it is not in $REGEX$, but $w=((a))$ is.


  • $REGEX$ defined above is not a regular language, but it is context free.
  • basically, there is no DFA/NFA that tells whether a string is a valid regular expression.

Proof: (Using pumping lemma)

  • Assume it was (so have a DFA/NFA for it), let $p$ be the pumping number.

  • Choose $w=(^p\, a\, )^*)^*\cdots)^*$, i.e. $p$ opening parentheses, then $a$, then $p$ copies of $)^*$. This is a valid regular expression (e.g. $((a)^*)^*$ for $p=2$), whose first $p$ symbols are all $($.

    • so we have $\vert w\vert > p$
  • For any possible parsing $w=xyz$, where $\vert xy\vert \le p$, $\vert y\vert >0$.

  • Then it has to be the case that $y$ contains only $($.

  • Now, we can simply choose any $i \neq 1$, for example $i=2$

    • $xyyz=(^{p+\vert y\vert }\, a\, )^*)^*\cdots)^*$ has more opening than closing parentheses, so it is not a valid regular expression, i.e. $xyyz \notin REGEX$.
  • Now, we show that it is context-free:

    • \[G=(\{S\},\{a,b,(,),\cup,\circ,*,\empty, \epsilon\},R,S)\\ R: \begin{cases} S \to a|b|\epsilon|\empty \\ S\to (S \cup S) | (S \circ S)|(S^*) \end{cases}\]
    • and from here, we can prove that this generates exactly the set of valid regular expressions over $\{a,b\}$

Therefore, the set of regular expressions are context-free.

  • The class of context-free languages is closed under $\circ,\cup,*$
  • every regular language is context free

Proof for Context Free Grammar Closure


  • The class of context-free languages is closed under $\circ,\cup,*$
    • actually other operations such as complement is not closed

Proof for Union:

  • Start by constructing the grammars


    Now, we need to show that it works.


    • $L(G) = L(G_1) \cup L(G_2)$

      so we need to show that $S \Rightarrow ^* w \iff S_1 \Rightarrow^* w$ or $S_2 \Rightarrow^* w$

      • this will be trivial as if you start with $S$, it must go either first to $S_1$ or $S_2$. Then the rest follows.
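
The grammar construction for the union (whose figure is omitted above) is the standard one: rename variables so that $V_1 \cap V_2 = \empty$, then add a fresh start variable.

```latex
G_1=(V_1,\Sigma,R_1,S_1),\quad G_2=(V_2,\Sigma,R_2,S_2),\quad V_1\cap V_2=\empty
\]
\[
\Longrightarrow\quad
G=\bigl(V_1\cup V_2\cup\{S\},\ \Sigma,\ R_1\cup R_2\cup\{S\to S_1\,|\,S_2\},\ S\bigr)
```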

Proof for Concatenation

  • Again, start by the definition of the grammars:


    Now, we need to show that it works:


    • $L(G) = L(G_1) \circ L(G_2)$



  • usually, the trick is to see that a starting variable $S_i$ of a grammar $G_i$ is representative of the entire context-free language that it can reach

Proof for Kleene Star

  • Same thing, start with


    Now, we need to show that it work:


    • $L(G)=L(G_1)^*$

    seems to be true by definition
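
For reference, the omitted constructions for concatenation and star follow the same pattern (standard textbook form, with variables renamed to be disjoint and a fresh start variable $S$):

```latex
\text{Concatenation:}\quad
G=\bigl(V_1\cup V_2\cup\{S\},\ \Sigma,\ R_1\cup R_2\cup\{S\to S_1S_2\},\ S\bigr)
\]
\[
\text{Star:}\quad
G=\bigl(V_1\cup\{S\},\ \Sigma,\ R_1\cup\{S\to \epsilon\,|\,S_1S\},\ S\bigr)
```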


Regular and Context-Free Language


  • Every regular language is context free

There are also other ways to prove it, for example using a DFA. Look at the book to see more.


  • Let $L$ be regular, and generated from a regular expression $R$. We want to show that $L(R)$ is also generated by some CFG $G$.

  • basically, we want to show that all words in $L(R)$ are generated by a grammar $G$.

  • We prove this by induction:



    • this is also the same as mentioned in the tips section

    Lastly, it works out by the closure property, because for each:

    • $R_1,R_2$ involved above, we can use a $G_1$ and $G_2$ which is true by induction.
    • now, since $G_1$ and $G_2$ using the operations are closed, hence this completes the proof that you can have a $G$.

Not Context-Free Grammar

There is a lemma similar to the pumping lemma, called the “tandem pumping lemma”

Tandem Pumping Lemma:

  • If $L$ is a CFL, then there exists a constant $p$, s.t. for all $w\in L$ with $\vert w\vert \ge p$, there exists a way to parse $w$ into:

    \[w = uvxyz\]

    • this is the different part from the normal pumping lemma (five pieces instead of three)
    • $\vert vxy\vert \le p$
      • different from the normal pumping lemma as well
    • $\vert vy\vert >0$
      • again, this and the above has to come out by themselves. You cannot choose them.
      • this is the part where we can pump
        • notice that $v$ or $y$ themselves could be empty
    • $uv^i xy^i z \in L,\,\,\forall i$
      • notice that here you can pump two of them at the same time

Proof Idea:

  • Let $A$ be a CFL and let $G$ be a CFG that generates it. We must show that any sufficiently long string $s$ in $A$ can be pumped and remain in $A$.

  • Let $s$ be a very long string in $A$. (We make clear later what we mean by “very long.”) Because $s$ is in $A$, it is derivable from $G$ and so has a parse tree. The parse tree for $s$ must be very tall because $s$ is very long. That is, the parse tree must contain some long path from the start variable at the root of the tree to one of the terminal symbols at a leaf. On this long path, some variable symbol $R$ must repeat because of the pigeonhole principle.

    • e.g.
    • image-20201021004016727
  • As the following figure shows, this repetition allows us to replace the subtree under the second occurrence of $R$ with the subtree under the first occurrence of R and still get a legal parse tree (and once it repeated once, you can repeat infinite number of times). Therefore, we may cut s into five pieces $uvxyz$ as the figure indicates, and we may repeat the second and fourth pieces and obtain a string still in the language. In other words, $uv^ixy^iz$ is in $A$ for any $i\ge0$.


    • where $v$ or $y$ could be empty, but at least one of them is not.
    • we don’t care what $u,z$ are, we just know that you can repeat $R$ many times and get infinite $R\to vRy$, and finish with $R\to x$.

Template for proving non-context-free grammar:

  1. Assume CF, then you get a pumping number $p$.

  2. Choose a string $w \in L$, such that $\vert w\vert$ is long enough hence $\vert w\vert \ge p$

  3. Think about all possible parsings $w=uvxyz$, such that:

    • $\vert vxy\vert \le p$
      • essentially, since $x$ can be $\epsilon$, we need to make sure $\vert vy\vert \le p$
    • and $\vert vy\vert > 0$
      • as otherwise, you wouldn’t have a really long word as shown in the proof sketch above
  4. Show that for all possible parsing, you can choose an $i$ such that:

    \[uv^i xy^i z \notin L\]

Example: Tandem Pumping Lemma

The following two languages are not context-free (hence not regular as well):

For example:

  • Question:

    Show the following language is not CF:

    \[L=\{a^i b^i c^i | i \ge 0\}\]
  • Solution:



    • so since $vxy$ can only hold up to two different characters, $v,y$ to be pumped can be at most two different characters. Therefore, since $\vert vy\vert >0$, you will have at least one character unchanged.
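
The omitted solution follows the template above; in symbols (with the standard choice of witness):

```latex
w = a^p b^p c^p,\qquad w = uvxyz,\ |vxy|\le p,\ |vy|>0
\]
\[
\Longrightarrow\; vxy \text{ spans at most two of the blocks } a^p, b^p, c^p
\]
\[
\Longrightarrow\; uv^2xy^2z \text{ increases at most two of } \#_a,\#_b,\#_c
\;\Longrightarrow\; uv^2xy^2z \notin L.
```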

For example:

  • Question:

    Show that the following language is not context-free:

    \[L=\{ ww \ | \ w \in\{a,b\}^* \}\]
  • Solution:

    The draft attempts are:


    • the correct way is that we need to show that for every parsing of $vxy$, it is not in the language

    The attempt is:


    • and here, you will see that in all three cases/parsing, the pumping will work:


Week 7

Other Closure Properties on CFL


  • The class of context-free languages is not closed under intersection.


  • We will show two specific languages, $L_1,L_2$ being context free, but $L_1 \cap L_2$ is not. This is already enough to prove the theorem.

    • this means that the intersection of two CFLs is sometimes context free, but not always.
  • Define:

    \[L_1 =\{ a^ib^ic^j \,|\, i,j\ge 0 \}\\ L_2 =\{ a^ib^jc^j \,|\, i,j\ge 0 \}\]
    • Claim 1: $L_1,L_2$ are context free.

      • Quite simple, for $L_1$ you just need:

        \[S\to Sc\,|\,T \\ T \to \epsilon \,|\,aTb\]
      • For $L_2$, you need:

        \[S\to aS\,|\,T \\ T \to \epsilon \,|\,bTc\]
    • Claim 2: $L_1 \cap L_2 = \{ a^ib^ic^i \,\vert \, i\ge 0 \}$

      • which is not context free.

    This completes the proof, since we just need to show one “counter”-example.


  • The class of context-free languages is not closed under complement.


  • Assume towards contradiction, that it was closed.

  • Now, using De Morgan’s law:

    \[L_1 \cap L_2 = \overline{(\overline{L_1} \cup \overline{L_2})}\]

    However, we know that:

    • intersection is NOT closed (shown above)
    • the union operation IS closed
    • so if complement were also closed, the RHS (and hence every such intersection) would be context free. Therefore, it cannot be that complement is closed.

    This completes the proof.

Push-Down Automata of CFL

(This part is not on the syllabus, will not be tested.)

  • A pushdown automaton is like an NFA, but also has a stack memory.

This part is interesting, because it can be shown that:


  • $L$ is a $CFL$ $\iff$ it is recognized by a $PDA$.
    • then we can also show that every regular language is context-free, since an NFA is just a special case of a PDA without a stack.


  • If we use 0 stacks, we have an NFA
  • If we use 1 stack, we have a PDA.
  • If we have 2 stacks, or if we use a queue, we get something equivalent to a Turing Machine.

There will be no exam on CFL

Exam 2 Starts Here

Turing Machine

Basically, this is the model that captures what computers can do.

The definition of a Turing Machine is as follows:

Definition Sketch:

  • image-20201023012210964


    • input is on a tape, which is infinite (s.t. after the finite input, it is filled with “blanks”)
    • at each step of computation:
      • depending on:
        • the current state
        • symbol on the tape
      • can do:
        • switch to another state
        • write on the tape at the current head position
          • (different from DFA)
        • move the head to the left or to the right
          • (different from DFA)
        • e.g. from state $q_1$ with input $a$, you could write a $b$, go to $q_2$, and move the head to the left.
    • there are two special states $q_{acc}, q_{rej}$ (different from DFA):
      • $q_{acc}$: accept and halt/exit
      • $q_{rej}$: reject and halt/exit

Hence, the summary of differences between a Turing Machine and a DFA is:

  • reading input can go left or right
  • infinite memory
    • yet still finite amount of states in the state control
  • can write as well
  • immediately halts upon entering $q_{acc}$ or $q_{rej}$

Formal Definition of a TM

  • A Turing Machine is a 7-tuple:

    \[M=(Q,\Sigma, \Gamma, \delta, q_0, q_{acc},q_{rej})\]


    • $Q$: a finite set of states
    • $\Sigma$: a finite alphabet (for input)
    • $\Gamma$: a finite alphabet (for output/writing)
      • this means $\Sigma \subseteq \Gamma$, and blank $\textvisiblespace \in \Gamma$ (but $\textvisiblespace \notin \Sigma$)
    • $q_0, q_{acc},q_{rej} \in Q$
      • $q_0$ is the starting state
      • $q_{acc}$ is the accepting state
      • $q_{rej}$ is the rejecting state
    • $\delta: Q \times \Gamma \to Q \times \Gamma \times \{L,R\}$
      • i.e. it takes a state and a symbol on the tape, and does a state transition, writes a symbol, and moves the head left or right
      • e.g. $\delta(q_2,a) = (q_4,b,R)$

Computation Def Sketch:

  • A computation of a TM on input $w \in \Sigma^*$ (the tape content at the start, before any writes) will:
    • start with $w$ followed by infinitely many $\textvisiblespace$ on the tape, with reading head pointing to first character
    • apply $\delta$ repeatedly
      • read, transition, move, and write
    • if ever enter $q_{acc}$ or $q_{rej}$, halt


  • If the head is pointing at the leftmost location, and $\delta$ says to move left, we assume that it stays in place.

For example:

  • Consider a TM with $\Sigma=\{a,b\}$:


  • This recognizes $L=\{w \in \{a,b\}^*\,\vert \, w \,\text{contains a $b$} \}$
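
Since the course diagram is not reproduced here, the following is my own reconstruction of that machine as a transition table, together with a minimal TM simulator (a sketch of the formalism, not the course’s code):

```python
# A minimal deterministic TM simulator. The machine below recognizes
# L = { w in {a,b}* | w contains a b }: scan right, accept on the first
# 'b', reject when the blank after the input is reached.
BLANK = "_"

def run_tm(delta, q0, q_acc, q_rej, w, max_steps=10_000):
    tape = dict(enumerate(w))              # sparse tape, blank elsewhere
    q, head = q0, 0
    for _ in range(max_steps):
        if q == q_acc:
            return True
        if q == q_rej:
            return False
        sym = tape.get(head, BLANK)
        q, write, move = delta[(q, sym)]   # read -> (state, write, move)
        tape[head] = write
        head += 1 if move == "R" else -1
        head = max(head, 0)                # stay in place at the left end
    raise RuntimeError("no halt within the step bound")

delta = {
    ("q0", "a"): ("q0", "a", "R"),         # skip a's
    ("q0", "b"): ("q_acc", "b", "R"),      # saw a b: accept
    ("q0", BLANK): ("q_rej", BLANK, "R"),  # end of input, no b: reject
}

print(run_tm(delta, "q0", "q_acc", "q_rej", "aab"))   # True
print(run_tm(delta, "q0", "q_acc", "q_rej", "aaa"))   # False
```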

For example:

  • Question:

    Is the following TM the same as before?


  • Solution:

    It is the same language, but not the same program (e.g. this one reaches the end, meets $\textvisiblespace$, and terminates, but the other one, in black, will hang forever).


  • Since a TM $M$, on input $w$, may be:
    • accepted - reaches the accept state
    • not accepted - reaches the reject state or runs forever
  • Then the language of a TM $M$ is defined as:

    \[L(M)=\{w \,|\,\text{$M$ accepts $w$}\}\]

Configuration Def:

  • A configuration represents the status of computation at a certain point. The configuration is:



    • $u,v \in \Gamma^*$
    • $q\in Q$

    and it means:

    • the tape currently has $uv$ followed by infinitely many $\textvisiblespace$
      • $u,v$ could also contain $\textvisiblespace$
    • the current state is $q$
    • reading head is pointing to the first symbol of $v$
      • the head has not processed that symbol yet; it is only pointing at it

Therefore, it means:

Initial Configuration:

  • the initial configuration would always be:



    • $w$ is the input string

Yielding Configuration:

  • $C_1 \vdash C_2$ ($C_1$ configuration yields $C_2$ configuration) if applying a $\delta$ on $C_1$ gives $C_2$

    • e.g. $C_1 =aq_1bab$, and $\delta(q_1,b)=(q_0,a,R)$, then $C_2=aaq_0ab$

    • schematically:


Accepting and Rejecting Configuration:

  • Any $C=uq_{acc}v$ is an accepting configuration, $u,v\in \Gamma^*$
  • Any $C=uq_{rej}v$ is a rejecting configuration, $u,v\in \Gamma^*$
    • and there will be no further configurations once either of the above two is reached

Therefore, this means that:

Accepted String Def:

  • For a TM $M=(Q,\Sigma, \Gamma, \delta, q_0, q_{acc},q_{rej})$, a (input) string $w \in \Sigma^*$ is accepted by $M$ if there exists a sequence of configurations, $C_1,C_2,…C_n$, such that:
    • $C_1=q_0 w$
    • $C_i \vdash C_{i+1},\, \forall\,i=1,2,…n-1$
    • $C_n=uq_{acc}v,$ and $u,v\in\Gamma^*$

TM Recognizable Language:

  • A language $L$ is TM-recognizable if there exists a Turing Machine $M$ such that $L(M)=L$.
    • sometimes, people also call this $L$ recursively-enumerable
    • $x\in L \to$ $M$ accepts $x$
    • $x\notin L \to$ $M$ rejects $x$ or runs forever

For example:

  • Consider the non-regular but context-free language $L=\{0^n1^n\,\vert\,n\ge0\}$:

  • The TM for it would be:

    • the idea is to write a $X$ for a $0$, and move to the right
    • to write a $Y$ for a $1$, and move to the left




  • Some transitions are omitted in the above diagram, and those transitions will be assumed to go to $q_{rej}$.

For example:

  • Question:

    Construct the TM for:

  • My Solution:

    • the idea is to cut off every pair of $1$ at left and right end, recursively
      • if the length is a power of 2, then on the last iteration you should end up cutting off everything exactly

Week 8

Turing Machine (Continued)


  • A TM $M$ is a decider if $M$ halts for every $w \in \Sigma^*$
    • i.e. every word can either be accepted, or rejected (cannot have an infinite loop)

This means that:


  • A language $L$ is TM-decidable if there exists a TM $M$ such that:
    • $M$ recognizes $L$
    • $M$ is a decider
      • $x\in L \to$ $M$ accepts $x$
      • $x\notin L \to$ $M$ rejects $x$

So the relationship look like:



  • one important implication of the above (which is true) is that some of these containments (regular and CF) are actually proper:
\[\{ L \subseteq \Sigma^* | L\,\text{is regular}\} \subsetneq\\ \{ L \subseteq \Sigma^* | L\,\text{is CFL}\}\subsetneq\\ \{ L \subseteq \Sigma^* | L\,\text{is TM-Decidable}\}\subseteq\\ \{ L \subseteq \Sigma^* | L\,\text{is TM-Recognizable}\}\subseteq\\ \{L \,|\, L\subseteq \Sigma^*\}\]
  • this will be proven later in the course

Input-Output TM


  • An input-output TM:

    \[(Q, \Sigma, \Gamma, \delta, q_{start}, q_{halt})\]

    starts with input written on tape ($C_1=q_{start}w$). If/when it changes to state $q_{halt}$, then the machine stops, and the string written on tape (w/o the trailing infinite blanks) is the output.

    • in general, those TM could also go into infinite loop

Computable Function Definition

  • A function:

    \[f: \Sigma^* \to \Sigma^*\]

    is computable if there exists a TM $M$, such that $\forall w \in \Sigma^*$ (notice the for all here) , $M$ halts with output $f(w)$ written on the tape.

    • this also means it does not go into loop

For example:

  • A computable function could be: insert a $\#$ symbol at the beginning of the input (i.e. $w\to \#w$)
    • TODO: implement this
  • another one would be: append the first block of the string to the end of the second block (i.e. $w_1\#w_2\to w_1\#w_2w_1$)
    • TODO: implement this
  • a more useful example would be unary addition: $1^n \# 1^m \to 1^{n+m}$
    • TODO: implement this
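
As a sketch of the unary-addition machine (my own construction, not the course’s): replace the $\#$ with a $1$, walk to the right end, and erase one surplus $1$.

```python
# Input-output TM computing 1^n # 1^m -> 1^(n+m), plus a tiny simulator
# that runs until q_halt and returns the tape content (trailing blanks
# stripped) as the output.
BLANK = "_"

delta = {
    ("find#", "1"): ("find#", "1", "R"),     # scan the first block
    ("find#", "#"): ("to_end", "1", "R"),    # fill the gap with a 1
    ("to_end", "1"): ("to_end", "1", "R"),   # scan the second block
    ("to_end", BLANK): ("erase", BLANK, "L"),
    ("erase", "1"): ("halt", BLANK, "L"),    # remove the one surplus 1
}

def run_io_tm(delta, q0, q_halt, w):
    tape, q, head = dict(enumerate(w)), q0, 0
    while q != q_halt:
        sym = tape.get(head, BLANK)
        q, write, move = delta[(q, sym)]
        tape[head] = write
        head = max(0, head + (1 if move == "R" else -1))
    out = "".join(tape.get(i, BLANK) for i in range(max(tape) + 1))
    return out.rstrip(BLANK)                 # drop the trailing blanks

print(run_io_tm(delta, "find#", "halt", "111#11"))   # -> "11111"
```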

In fact: anything you can implement in your computer programming language can be expressed via such a function.

Variants of TM

Essentially, they are all equivalent to the basic TM, but they can be useful for solving problems.


  • Two TM are equivalent, if for $\forall x \in \Sigma^*$, both TMs do the same for each string, s.t. it either:
    • halt + accept for both
    • halt + reject for both
    • run forever for both
  • TM with Doubly Infinite Tape


    • where left side of the head in the beginning will just be blanks

    we claim that this is the same as a one-sided tape.

    Proof Idea:

    • First, we need to have a 1-sided TM, and show that we can construct an equivalent two-sided one

      • this will be easy
    • Second, we need to show that if we have a 2-sided TM, we can construct an equivalent one-sided one

      • this is trickier, but the idea will be folding the tape:


  • TM with command of not moving:

    So you have:

    \[\delta : Q\times \Gamma \to Q\times \Gamma \times \{L,R,S\}\]


    • $S$ means not to move/stay

    Proof Idea:

    • Easy to simulate staying at the same place by moving right and then moving left
  • Multi-tape TM

    This will be very useful. Basically it is a TM with a fixed number of tapes (i.e. you can have 3 tapes, or 6 tapes, etc.). This means the following changes:

    • you will have one reading head per tape (e.g. three heads for three tapes):


      in the beginning, all heads will point to the left most character

    • transition/movement

      • \[\delta: Q \times \Gamma^k \to Q \times \Gamma^k\times \{L,R\}^k,\,\,\text{where $k$ is the number of tapes}\]
    • and at the beginning of the computation, input $w$ is on the first tape, everything else is $\textvisiblespace$ (blank)

    • accept state and reject state is the same as the one-tape definition

    Proof Sketch:

    • We simulate one-tape using the $k$ tape TM

      • this is trivial, done by just ignoring the other tapes.
    • Conversely, we can simulate a $k$ tape TM using a 1-tape TM as follows:


      where, in the end it looks like:

      • keep the contents of all tapes on the single tape, separated by $\#$
      • keep a specially marked version of a symbol (e.g. using a star, so $w_1 \to w_1^*$), representing each head position
      • transition function:
        • scan input from left to right on the above, recording the state of each special marked symbol (basically the head).
          • for example, in the above example, the marked symbols are $w_2^*, x_1^*$
        • then update the tape based on the transition function of multi-tape TM, since now you know exactly current state of the multi-tape, and the input (which you need to come back to the first part of the tape)

    In the end, you will have a quadratic overhead in runtime, since you will need to run back and forth to know what is the current state and what is the next input symbol.

Non-Deterministic TM


  • Basically, it is the same as a TM, except for the transition function:

    \[\delta : Q \times \Gamma \to \mathbb{P}(Q \times \Gamma \times \{L,R\})\]

    so that:

    • a string is accepted by an NTM $N$ if there exists at least one accepting computation.

This means, given a string $w$ for a NTM, you can draw a configuration tree:



  • each node is a configuration


  • For every non-deterministic TM, $N$, there exists an equivalent (deterministic) TM, $M$.
    • i.e. they are the same


  • the direction to simulate a TM using a NTM is simple.

  • The reverse would be: given $N$, construct a $M$ as follows:

    • the idea is to simulate a $N$ using $M$, by trying to simulate the tree (so we simulate all computations of $N$).
    • If the word $w$ ever gets accepted on the tree, we accept.
      • this will be done using a BFS manner. Because if you use DFS, then you might get into an infinite loop due to the possible existence of an infinitely deep branch. (Notice that one challenge here is to find a way to “backtrack” in the tree).
  • The solution is to use a 3-Tape TM (which is the same as a normal TM $M$ as proven before):

    • the first tape has the input (we won’t write on this tape)

    • the second tape has the content of the tape of $N$ in the current execution point

      • i.e. contains what is written
    • the third tape keeps track of the path of the node on the tree we are simulating.

    • and the detailed steps are as follows:

      • initially, only tape 1 is non-empty, and tape 3 is initialized with the empty address $\epsilon$
        • the $\epsilon$ represents exploring the root node, as you will see in the behavior of tape 3 below
      1. then, we copy tape 1 to tape 2
        • erase whatever we had before on tape 2
        • basically, tape 1 would be intact, and will be our backup
      2. simulate the computation of $N$ on tape 2, so that:
        • each time we meet a non-deterministic choice, we look at tape 3 to determine which one to take.
        • if choice is not valid from tape 3, or the content of the tape 3 ran out (or rejected already), it means we are done with this path. So we replace string on tape 3 with next string telling the next path to explore (see tape 3 behavior below)
      3. go back to step 1)
      • now, for tape 3:
        • Since the number of non-deterministic choices at each step is finite, we can let $b$ be the maximum branching factor (i.e. the maximum number of children a node can have).
        • This means tape 3 uses the alphabet $\Gamma_b=\{1,2,…,b\}$
          • then, at each level of the tree, tape 3 basically contains an address, so that:
            • for the first level, you have addresses $1,2,…,b$
              • where $2$ would mean: start at the root, go to its $2$nd child, and finish
              • so if you finished with $2$, your next string is $3$.
            • for the second level, you have addresses $11,12,…,bb$
              • where $12$ would mean: start at the root, go to its $1$st child, then go to that node’s $2$nd child, then finish
            • in general, at the $d$th level you have addresses $11…1,11…2,…,bb…b$ (each of length $d$)
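
The breadth-first simulation above can be sketched directly in code (my own compression of the 3-tape construction into a Python queue; the toy machine and its state names are assumptions, not from the course):

```python
from collections import deque

# Deterministic simulation of a non-deterministic TM by exploring its
# configuration tree level by level (BFS). The toy NTM below, while
# scanning, nondeterministically "guesses" to accept at a 'b'; it accepts
# a string iff the string contains a 'b'.
BLANK = "_"

def ntm_accepts(delta, q0, q_acc, q_rej, w, max_depth=100):
    frontier = deque([(q0, 0, tuple(w))])        # (state, head, tape)
    for _ in range(max_depth):                   # one tree level per pass
        next_frontier = deque()
        while frontier:
            q, head, tape = frontier.popleft()
            if q == q_acc:
                return True                      # one accepting branch suffices
            if q == q_rej:
                continue                         # this branch dies
            sym = tape[head] if head < len(tape) else BLANK
            for q2, write, move in delta.get((q, sym), []):
                t = list(tape)
                while head >= len(t):
                    t.append(BLANK)              # extend tape on demand
                t[head] = write
                h2 = max(0, head + (1 if move == "R" else -1))
                next_frontier.append((q2, h2, tuple(t)))
        frontier = next_frontier
    return False                                 # no accepting branch in bound

delta = {
    ("scan", "a"): [("scan", "a", "R")],
    ("scan", "b"): [("scan", "b", "R"), ("q_acc", "b", "R")],  # the guess
    ("scan", BLANK): [("q_rej", BLANK, "R")],
}
print(ntm_accepts(delta, "scan", "q_acc", "q_rej", "aba"))  # True
print(ntm_accepts(delta, "scan", "q_acc", "q_rej", "aaa"))  # False
```

Unlike the address-string construction, this version stores whole configurations in the queue instead of re-deriving them from the root, trading memory for the recomputation overhead.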


  • Though the above algorithm would work, the overhead in the running time is exponential.
    • this is because we backtrack and recompute every step from the root to the next node/choice, by our construction.
    • however, we don’t know, to this day, whether such an overhead is inherent (necessary) or not. (P vs NP)

Historical Background of TM

Turing wanted to show that there exist mathematical problems that are not computable.

  • then, he found that at the time, there is no definition of a computation (there were no computers even)
  • therefore, in the end he came up with his model of a TM defining a computation

Later on, more models of computation appeared, but it turns out all of them are equivalent to a TM:

  • NFA + 2 stacks
  • NFA + 1 queue
  • general grammar
  • lambda calculus
    • invented by Church
  • random access machines
  • any programming language
    • either equivalent in power, or weaker

Church-Turing Thesis

Church-Turing Thesis Definition:

  • Every reasonable model of computation (i.e. algorithm) can be simulated by a TM.
    • note that this is not a theorem, because we don’t know what will happen in the future about our computer/computation.
    • however, up to today, pretty much everyone takes the thesis as correct/sensible.

Therefore, from now on, we will specify every algorithm in “pseudo code”, such that:

  • each line needs to be computable/implementable by a TM
    • equivalently, each instruction can be implemented in a programming language
  • if you want a TM decider, then you need to make sure that each step of computation is finite, and your description is finite
    • i.e. you cannot run to infinity


  • Every algorithm (TM) can be encoded as a string/finite symbol description. We will denote: $$

    \langle M \rangle

    $$ being the string representing the TM, $M$.

    • this also implies that we can feed a DFA/NFA/TM/Graph/etc. itself as an input (via its encoding)

Week 9

Closure Property for TM


  • The class of TM-Decidable language is closed under complement.


  • This works only with a TM-Decidable language:
    • Set $M’$ to be the same as $M$, but with the $q_{acc}$ and $q_{rej}$ states swapped.
    • Then $w \in L(M) \iff \text{$M$ accepts $w$} \iff \text{$M’$ rejects $w$}\iff w \notin L(M’)$
    • and conversely $w \notin L(M) \iff \text{$M$ rejects $w$} \iff \text{$M’$ accepts $w$}\iff w \in L(M’)$
      • note, the line above only works if $M$ is a decider. If it is not, we would also need to handle inputs on which $M$ runs forever (and flipping states does not change that behavior).


  • The language $L$ is TM Decidable $\iff$ both $L$ and $\bar{L}$ are TM recognizable.


  • The forward direction is easy:
    • Assume $L$ is TM decidable
      • then $\exists M$ s.t. it is a decider.
      • then $L$ is recognizable because $L$ is decidable
      • and by the above theorem, we can make a decidable $\bar{L}$ from flipping the states in $M$. So $\bar{L}$ is recognizable.
  • The reverse direction is harder:
    • Assume $L$ and $\bar{L}$ are both TM recognizable.
      • then there $\exists M_1, M_2$ s.t.
        • $M_1$ recognizes $L$: $\begin{cases}x\in L \Rarr \text{$M_1$ accepts $x$}\\x\notin L \Rarr \text{$M_1$ rejects or runs forever on $x$}\end{cases}$
        • $M_2$ recognizes $\bar{L}$: $\begin{cases}x\in \bar{L} \Rarr \text{$M_2$ accepts $x$}\\x\notin \bar{L} \Rarr \text{$M_2$ rejects or runs forever on $x$}\end{cases}$
      • Use $M_1, M_2$ to construct a decider for $L$:
        • on input $x$, run $M_1$ on $x$, and run $M_2$ on $x$ in parallel
          • e.g. run them on two tapes simultaneously. (Copy input on the second tape, and run on both)
        • Now, we construct $M$ such that:
          • if $M_1$ accepts $x$, return accept
          • if $M_2$ accepts $x$, return reject
        • Therefore, we have $\begin{cases}x\in L\Rarr \text{$M_1$ will accept} \Rarr \text{$M$ accepts}\\x\notin L\Rarr x\in \bar{L}\Rarr\text{$M_2$ will accept} \Rarr \text{$M$ rejects}\end{cases}$
      • Hence we have found a decider for $L$ using $M$ above. So $L$ is TM Decidable.
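The "run $M_1$ and $M_2$ in parallel" step can be sketched in Python, modeling each recognizer as a generator that yields once per simulated step; `decide_by_dovetailing` and the toy machines `rec_even`/`rec_odd` are illustrative names, not part of the notes:

```python
def decide_by_dovetailing(rec_L, rec_Lbar, x):
    """Alternate one step of each recognizer; answer with whichever accepts first.
    rec_L recognizes L, rec_Lbar recognizes the complement of L."""
    runs = [(rec_L(x), True), (rec_Lbar(x), False)]
    while True:
        for gen, verdict in runs:
            try:
                next(gen)                 # advance this machine by one step
            except StopIteration as done:
                if done.value:            # this machine accepted
                    return verdict

# Toy recognizers (illustrative, not TMs): L = strings of even length.
def rec_even(x):
    while len(x) % 2 == 1:
        yield                             # "runs forever" on strings not in L
    return True

def rec_odd(x):
    while len(x) % 2 == 0:
        yield
    return True

# decide_by_dovetailing(rec_even, rec_odd, "ab")  -> True
# decide_by_dovetailing(rec_even, rec_odd, "abc") -> False
```

Exactly one of the two recognizers must eventually accept, so the interleaved simulation always halts even though each recognizer alone may not.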

TODO: Union of TM

Application of TM Decidable Languages

Now, we can use a decider TM for simulating the output of a DFA/NFA/etc.


  • The acceptance problem of performing the DFA task is decidable: $$

    A_{DFA}={ \langle D,w \rangle \vert \text{$D$ is a DFA and $D$ accepts $w$}}

    $$ where:

    • now, the “input” is a tuple of DFA $D$ and a word $w$.
      • notice this means $w$ could be a well-formed string that $D$ simply rejects, or the input could be a malformed encoding of $\langle D,w \rangle$, which we also reject.
  • The construction of the TM would be:



    • given the encoding of a DFA, $\langle D\rangle$, it is easy to check that the rest of the input is a valid string $w$ over $D$’s alphabet.
    • this also means we have a way to check whether $\langle D\rangle$ encodes a valid DFA (by checking the definition).

High Level Proof

  • image-20201106134714599
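The simulation step of the $A_{DFA}$ decider can be sketched directly, under the assumption that the decoded DFA is a tuple of states, start state, accepting states, and transition table (this representation and the example DFA are made up for the sketch):

```python
def accepts(dfa, w):
    """Simulate a DFA on w, one transition per symbol (always finitely many steps).
    dfa = (states, start, accepting, delta), with delta[(state, symbol)]."""
    states, start, accepting, delta = dfa
    state = start
    for symbol in w:
        if (state, symbol) not in delta:
            return False              # malformed input / missing transition: reject
        state = delta[(state, symbol)]
    return state in accepting

# example DFA over {0,1} accepting strings with an even number of 1s
even_ones = ({"e", "o"}, "e", {"e"},
             {("e", "0"): "e", ("e", "1"): "o",
              ("o", "0"): "o", ("o", "1"): "e"})
# accepts(even_ones, "1001") -> True ; accepts(even_ones, "10") -> False
```

Since the loop runs exactly $\vert w\vert$ times, this always halts, which is why $A_{DFA}$ is decidable rather than merely recognizable.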


  • Similarly, doing the job of a NFA: $$

    A_{NFA}={ \langle N,w \rangle \vert \text{$N$ is a NFA and $N$ accepts $w$}}

    $$ is also decidable.

  • Construction of decidable TM:



    • “on $\langle N,w\rangle$” just means the same thing as “check that the input $x$ is a valid encoding of $\langle N,w\rangle$; if not, reject.”
    • and this will be a decider for $A_{NFA}$

Now, a more interesting example would be:


  • We can use a TM decider to check whether a DFA accepts no strings at all: $$

    E_{DFA}={\langle D\rangle \vert \text{$D$ is a DFA and $L(D)=\empty$}}

    $$


  • Construction of decidable TM:

    • the idea is to check whether or not a DFA $D$ has any reachable accept states.
      • took care of the edge cases that you might have an accept state that is not reachable
    • basically, this will be done backwards: we start with the accepting states:



    • remember, when you write statements like “repeat …”, make sure they terminate in finite time. (In this case, they do)
    • and this algorithm is a decider.
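The reachability idea can be sketched as a graph search. The notes describe a backward search from the accepting states; the sketch below uses an equivalent forward BFS from the start state (`language_is_empty` is a name invented here, and the `states` argument is kept only for interface symmetry):

```python
from collections import deque

def language_is_empty(states, start, accepting, delta):
    """L(D) is empty iff no accepting state is reachable from the start state."""
    seen, queue = {start}, deque([start])
    while queue:                      # finite: each state is enqueued at most once
        q = queue.popleft()
        if q in accepting:
            return False              # reachable accepting state => L(D) nonempty
        for (p, _symbol), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                queue.append(r)
    return True
```

The `seen` set guarantees termination, which is what makes this a decider: unreachable accepting states (the edge case above) are simply never visited.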

One take home message from the above is that the below construction will not work:



  • the step of checking all $w \in \Sigma^*$ will run forever, which means it is a recognizer but not a decider.


  • We can use a TM decider to check whether two DFAs recognize the same language: $$

    EQ_{DFA}={\langle D_1, D_2\rangle \vert \text{$D_1,D_2$ are DFAs and $L(D_1)=L(D_2)$}}

    $$


  • Construction:
  • Using the idea of symmetric difference, construct $L(C) = (L(D_1)\cap \overline{L(D_2)}) \cup (\overline{L(D_1)} \cap L(D_2))$ as follows.
    • since we have DFAs, by the closure properties (union, intersection, complement) we can construct a DFA $C$ with this $L(C)$.
    • to see if $L(C)$ is empty, we use the above $E_{DFA}$ decider
      • if $L(C)=\empty$, then $L(D_1)=L(D_2)$ and we accept; otherwise, reject
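The symmetric-difference DFA can be built with a product construction. This is a sketch assuming total transition functions; `symmetric_difference_dfa` is a name made up here:

```python
def symmetric_difference_dfa(d1, d2, alphabet):
    """Product DFA C with L(C) = symmetric difference of L(D1) and L(D2).
    Each DFA is (states, start, accepting, delta) with a total delta."""
    s1, q1, f1, t1 = d1
    s2, q2, f2, t2 = d2
    states = {(a, b) for a in s1 for b in s2}
    start = (q1, q2)
    # accept exactly when one DFA accepts and the other does not
    accepting = {(a, b) for (a, b) in states if (a in f1) != (b in f2)}
    delta = {((a, b), c): (t1[(a, c)], t2[(b, c)])
             for (a, b) in states for c in alphabet}
    return states, start, accepting, delta
```

Feeding the result to the $E_{DFA}$ emptiness check then decides whether $L(D_1)=L(D_2)$.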



  • We can also use a decider to do the job of a CFG: $$

    A_{CFG}={ \langle G,w \rangle \vert \text{$G$ is a CFG and $G$ generates $w$}}


  • Construction skipped.


Therefore, this means we can also construct a decider for:

  • $A_{\text{Regular}}$, since every regular language is also context-free, so $A_{CFG}$ covers it

Not TM decidable Languages

  • $EQ_{CFG} = { \langle G_1,G_2 \rangle \vert \text{$G_1,G_2$ are CFGs, $L(G_1)=L(G_2)$} }$
  • $AMB_{G} = { \langle G\rangle \vert \text{$G$ is a CFG that is ambiguous} }$

Week 10

TM Recognizable Languages

Now, a language being recognizable means that there exists a TM that accepts exactly the strings in that language.

  • note that now, we don’t need every computation to halt, as before, because this is about recognizers.
  • hence we could have infinite loops on some inputs, as long as every string in the language is eventually accepted (accept = halt in the accept state)

Examples of Recognizable Languages

  • The acceptance problem for TMs: given $\langle M,w\rangle$, does $M$ accept $w$ (where $M$ may also run forever): $A_{TM}$ $$

    A_{TM}= { \langle M,w\rangle \vert \text{$M$ is a TM, and $M$ accepts $w$} }

    $$ where:

    • note that a recognizer for $A_{TM}$ could run forever. That is fine, since we are dealing with recognizers, and acceptance depends purely on halting in the accept state.
    • $A_{TM}$ can basically be understood as a general input-output TM: output $1$ if $M$ accepts, $0$ if $M$ rejects, and an infinite loop if $M$ loops forever.
  • Proof/Construction




    • notice that if $M$ runs forever, then $U$ also runs forever. However, this is fine because $U$ is a recognizer, and for a recognizer, running forever counts as not accepting.
      • $U$ is not a decider, but a recognizer.
      • this $U$ is also called a Universal Turing Machine, because it is general in the sense that this TM can run any other algorithm

Interestingly, we can also prove that the above is not decidable:

Examples of TM Non-decidable Languages

  • The same acceptance problem of TM, $A_{TM}$, is not decidable $$

    A_{TM}= { \langle M,w\rangle \vert \text{$M$ is a TM, and $M$ accepts $w$} }



  • Assume towards contradiction, that it is decidable. Then we have a decider $D$ for $A_{TM}$, such that it is a TM decider and hence always halts.

  • image-20201113011316669


    • we see that $T$ is a decider, and this is because $D$ is a decider.
    • Now, consider the output of $T$ on $\lang T\rang$:
      • If $T$ accepts $\lang T\rang$:
        • then by definition of $D$ for $A_{TM}$, it means $D$ accepts $\lang T, \lang T\rang\rang$.
        • however, by construction of $T$, it means $D$ rejects. Hence contradiction
      • If $T$ rejects $\lang T\rang$:
        • then by definition of $D$ for $A_{TM}$, it means $D$ rejects $\lang T, \lang T\rang\rang$.
        • however, by construction of $T$, it means $D$ accepts. Hence contradiction

    Therefore, $D$ as a decider cannot exist.

  • In fact, this is the same as doing a diagonalization:


Examples of Recognizable TM

  • A recognizer that tells us whether the language of a given TM is non-empty $$

    NE_{TM} = { \langle M \rangle \vert \text{$M$ is a TM and $L(M)\neq \empty$} }


  • Proof/Construction

    • first attempt

      • on input $\langle M\rangle$, where $M$ is a TM:
        • For all strings $s \in \Sigma^*$ (this infinite loop is fine, as it is implementable for a recognizer)
        • Run $M$ on $s$
          • if accepts, accept
          • if rejects, go on to the next string (might be stuck here)

    where this does not work because:

    • it is implementable for recognizers, because our recognizer can run forever
      • but the problem is that some accepted strings may never be reached: an earlier string can make $M$ loop forever, so we get stuck before trying them
    • therefore, it does not recognize $NE_{TM}$

    • second attempt

      • on input $\langle M\rangle$:
      • for $i = 1,2,3…$
        • for all $s \in \Sigma^*$, $\vert s\vert \le i$
        • Run $M$ on $s$ for $i$ steps (hence this step is always finite)
        • if accepts, accept (so this step can always be reached)

    where this works because:

    • the idea is that now, everything inside the loop is always finite, so if some string $s$ is accepted by $M$ within $j$ steps, it will be found once $i \ge \max(j, \vert s\vert)$
      • remember, the fact that the outer loop is infinite does not matter for a recognizer

      This will work because:

      • first, on input $\langle M\rangle$

      • image-20201110223621127

        $\Rarr$ then $R$ does not accept
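The second attempt can be sketched with toy machines (real TM encodings are out of scope here; `recognize_nonempty` and the `max_i` safety cap are additions made solely so the sketch can be run):

```python
from itertools import product

def strings_up_to(alphabet, n):
    """All strings over the alphabet of length <= n, shortest first."""
    for length in range(n + 1):
        for s in product(alphabet, repeat=length):
            yield "".join(s)

def recognize_nonempty(run, alphabet=("0", "1"), max_i=None):
    """run(s, i) -> True iff the simulated machine accepts s within i steps.
    Loops forever when the language is empty; max_i is an added safety cap
    (not in the notes) so this sketch can be tested."""
    i = 1
    while max_i is None or i <= max_i:
        for s in strings_up_to(alphabet, i):
            if run(s, i):
                return True           # some string accepted => L(M) nonempty
        i += 1
    return None                       # cap reached without a verdict

# toy "machine": accepts "01" but only after 3 steps; never halts otherwise
toy = lambda s, i: s == "01" and i >= 3
# recognize_nonempty(toy) -> True (found in round i = 3)
```

Each round of the outer loop does a bounded amount of work, so no single non-halting string can block progress, which was the flaw of the first attempt.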

Undecidability and Unrecognizability

In fact, many languages are not recognizable.

  • here, we show some examples of language that are not decidable/not recognizable
  • and that many natural examples (in real life) are not decidable/not recognizable



  • Two sets $A,B$ have the same cardinality ($\vert A\vert =\vert B\vert$) if there is a bijection function (one-to-one and onto function) $$

    f: A \to B



  • A set $A$ is countable, if $A$ is either finite, or there is a bijection function: $$

    f:\N \to A

    $$ where:

    • $\N$ is the set of natural numbers (which is infinite)


  • If $A \subseteq B$, then $\vert A\vert \le \vert B\vert$
  • If $A \subseteq B$, and $B$ is countable, then $A$ is also countable.
  • A countable or finite union of countable sets is countable.

For example,

  • a countable set could be all even positive integers
  • for a finite alphabet $\Sigma$, $\Sigma^*$ is countable
    • we could use the lexicographical order, and order strings into length $i=0,1,2,3…$
    • hence, we can map them to the set of natural numbers
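The ordering described above (by length, then lexicographically, often called shortlex) can be sketched as a generator; `shortlex` is a name chosen for the sketch:

```python
from itertools import count, product

def shortlex(alphabet):
    """Enumerate Sigma* by length, then lexicographically: a bijection N -> Sigma*."""
    for n in count(0):
        for s in product(alphabet, repeat=n):
            yield "".join(s)

gen = shortlex("01")
first_seven = [next(gen) for _ in range(7)]
# first_seven == ["", "0", "1", "00", "01", "10", "11"]
```

Every string of length $k$ appears after finitely many steps, which is exactly what the bijection with $\N$ requires.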

Countability with TM

  • The set of all TM is countable, because each TM can be encoded into a string, which is countable
  • The set of all TM recognizable languages is countable.
    • obviously, same for decidable
  • basically, if you can go through everything in some defined order, then you can map it to the natural numbers.

Uncountable Languages

However, there are many things that are uncountable.

Using Diagonalization for TM Proofs


  • For all the reals between $(0,1)$, it is not countable.


  • Assume towards contradiction that it is countable.

  • Then we can list all its members, so that we get a one-to-one correspondence with the natural numbers: $$

    {r_1, r_2,r_3, … }\text{, with one $r_i$ for each $i \in \N$}


    • Then suppose we have the following order for numbers:



      • it does not matter what number we pick for each $r_i$, they can be any number
    • Now, we construct a real number $r$ that is $100\%$ not in the set because it is different from every one of them by diagonalization

      • picking $r=0.d_1d_2d_3…$, such that $d_i \neq r_{i,i}$, where $r_{i,i}$ means the $ith$ digit of $r_i$.

        • one edge case to care is to pick $d_i \in {1,2,…,8}$, so we take care of the case that $0.0999… = 0.1000…$, even if the digit is different
      • so the $r$ basically picks a different number than all the highlighted ones:


        (e.g. you can have $r=0.253…$)

    • Therefore, we obtain a $r$ that is different from all the members of $r_i$, because $r$ is different from each one of them as at least one digit is different ($d_i \neq r_{i,i}$).
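The diagonal construction can be rendered on a finite prefix of the list; `diagonal` and the example rows are made up for this toy sketch (a real diagonal argument works over the whole infinite list):

```python
def diagonal(digit_rows):
    """digit_rows[i] holds the decimal digits of r_{i+1} after the point."""
    d = []
    for i, row in enumerate(digit_rows):
        # choose a digit in {1,...,8} different from row[i]; avoiding 0 and 9
        # sidesteps the 0.0999... = 0.1000... edge case
        d.append("1" if row[i] != "1" else "2")
    return "0." + "".join(d)

rows = ["1415", "7182", "0000", "9999"]
# diagonal(rows) -> "0.2211", which differs from row i in digit i
```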

The idea of diagonalization works often when:

  • the representation of each member of the set is infinite.
    • As a result, you can always construct a member that is different from each of them by not picking the diagonal members

Remember, the idea is that if it is countable, then there is a one-to-one and onto MAPPING (ORDERING) to $\N$ (a bijection), or to whatever set is countable.


  • For finite $\Sigma$, the set of all languages $$

    { L \subseteq \Sigma^* } = \mathbb{P}(\Sigma^*)

    $$ is not countable

  • Therefore, it means a lot of languages are not TM-recognizable (if all are recognizable, then this set would also be countable since a TM-recognizable language is countable)


  • Again, assume towards contradiction that it is countable, so you can write ${L_1, L_2, L_3,…}$ in a way that covers all languages.

  • Since any language is essentially telling us whether each string is accepted, we construct:

    • break $\Sigma^*$ into $\Sigma^*={ s_1, s_2, s_3, … }$

      • remember, that $\Sigma^*$ is countable, since each member ($s_i$) is finite and we can put them in lexicographical order
    • Then a language is just telling you the T/F output of each string $s_i$:

      • notice that the representation of each $L_i$ is infinite, hence hints us of diagonalization.


      and the diagonal way to construct $L$ is evident from above, as:

      • Construct $L = { s_i \in \Sigma^* \vert s_i \notin L_i }$
      • Then, for every $i$, there is at least one string ($s_i$) such that either:
        • $s_i$ was accepted by $L_i$ but not accepted by $L$
        • or $s_i$ was not accepted by $L_i$ but accepted by $L$
    • Therefore, we have found an $L$ that differs from every one of the $L_i$, contradicting that the list covered all languages.

    Therefore, the set of all languages is not countable.

    • in particular, if the $L_i$ enumerate all TM-recognizable languages, then the $L$ constructed above is a concrete language that is not TM recognizable


  • Since we know:
    • the set of all languages is not countable
    • the set of all TM-recognizable language is countable
  • Therefore, there is at least one language (actually uncountable many of them) that is not TM recognizable.


  • We say that the problem: $$

    \overline{A_{TM}} \text{ is not recognizable}



  • This is purely logical

  • Since we know if $L$ is decidable, then $L$ and $\overline{L}$ are both recognizable.

    • We know that $A_{TM}$ is recognizable, and $A_{TM}$ is not decidable
    • Therefore, if $\overline{A_{TM}}$ is recognizable, then there is a contradiction as it means $A_{TM}$ is decidable
      • since it would mean both $A_{TM}$ and $\overline{A_{TM}}$ being recognizable
    • Therefore, we have a contradiction

    Hence $\overline{A_{TM}}$ is not recognizable.

One consequence of the above is that:


  • The class of TM-recognizable languages is not closed under complement.

Using Reduction for TM Proofs

Basically, the idea of reduction is that, we can construct (towards contradiction) a TM $R$, and use it to build up a TM that contradicts what we have known so far:

  • $A_{TM}$ is not decidable
  • $\overline{A_{TM}}$ is not recognizable
  • etc.


  • The halting problem: $$

    Halt_{TM} = { \lang M,w\rang \vert \text{$M$ is a TM, $M$ halts on $w$} }

    $$ is undecidable


  • Again, assume towards contradiction that $Halt_{TM}$ is decidable, so has a decider $R$.

  • Now, the reduction method is to build up towards something we know would contradict

    • we use $R$ to construct a decider $S$, such that



      • each step is a deciding (finite) step, so $S$ is a decider
    • Now, notice we have constructed a TM $S$ that decides $A_{TM}$.

      • but we know $A_{TM}$ is not decidable. Hence a contradiction.
  • In more formal detailed words:


Week 11

Using Reduction for TM Proofs (Continued)

Now, we come to the formal definition of reduction:


  • We say a language $A$ is Turing-reducible to a language $B$, written $A \le_T B$, if given a subroutine/decider/oracle that decides $B$, there exists a decider for $A$.
    • i.e. if we know how to solve $B$ (harder part), then we also know how to solve $A$ (easier part).
    • i.e. $A$ would be easier to decide than $B$, such that when we decide $B$, we have already decided $A$.


  • If $A \le_T B$ and $B$ is decidable, then $A$ is decidable.
    • i.e. if we can solve the hard part $B$, then we can decide the easy part $A$ using it.


  • image-20201117221019640

Equivalently, the opposite is also useful for proofs:

Equivalent Theorem:

  • If $A \le_T B$ and $A$ is undecidable, then $B$ is undecidable.
    • intuitively, if $A$ is at most as hard as $B$, and $A$ is undecidable already, then $B$ must also be hard/undecidable.
    • logically, if we can solve $B$, then we should be able to solve $A$. But since $A$ cannot be solved, then $B$ cannot be solved.

Therefore, the above would be used as follows:

  • To prove $B$ is undecidable, we can prove using contradiction:

    1. Assume $B$ is decidable (harder problem), then we get a decider for $B$.

    2. Then we can use $B$ to solve the problem $A$, such that $A$ would be decidable as well

      • obviously, we need to know $A$ to be undecidable for contradiction

      • the construction of using $B$ to solve $A$ takes work

    3. Since $A$ is undecidable, this contradicts the assumption that $B$ is decidable.

    4. Hence $B$ is undecidable.

  • The idea is that if you can solve $B$, the harder problem, then you can use $B$ to easily solve $A$.

    • e.g. if we can solve $B=Halt_{TM}$, then we can use it to solve $A=A_{TM}$ (look at the proof in the previous section). But $A_{TM}$ is not decidable, so $Halt_{TM}$ is not decidable.

For example:

  • Claim: $$

    E_{TM}={ \lang M\rang \vert \text{$M$ is a TM, and $L(M)= \empty$} }

    $$ is not decidable.

  • Proof using reduction:

    This will be a proof by reduction from $A_{TM}$, namely, we will show that $A_{TM}$ is Turing reducible to $E_{TM}$, i.e. $A_{TM} \le_T E_{TM}$.

    • i.e. if we can decide $E_{TM}$, then we can also decide $A_{TM}$ (contradiction)

    Assume towards contradiction that we have a decider $O$ that solves $E_{TM}$. Then we construct a $R$ that uses $O$ as follows:



    • the general format for $O$ (assumed decider) would always be:
      • if accepts, (here you need to implement)
      • if rejects, (here you need to implement)
      • think of this as: we are given the library of $O$, such that we can use it right away

    This $R$ would be a decider for $A_{TM}$:


    So we showed that we can use $E_{TM}$ to decide $A_{TM}$, such that $A_{TM} \le_T E_{TM}$. But $A_{TM}$ is not decidable, this means $E_{TM}$ cannot be decidable.

For example:

  • Claim: $$

    EQ_{TM}={ \lang M_1, M_2 \rang \vert \text{$M_1, M_2$ are TMs and $L(M_1)=L(M_2)$} }

    $$ is not decidable.

    • this means that, in general, we cannot have a way to decide whether two programs compute the same language or not.
  • Proof Using Reduction:

    We do it with a reduction from $E_{TM}$.

    1. We assume $O$ being a decider for the language $EQ_{TM}$.
    2. We want to use $O$ to decide $E_{TM}$, which we know is not decidable.
    3. image-20201117225446342

As a result, we have used $O$ to make a decider $R$ for the language $E_{TM}$, because:


so we have shown that $E_{TM} \le_T EQ_{TM}$; since $E_{TM}$ is not decidable, $EQ_{TM}$ is also not decidable.

For example:

  • Claim: $$

    REG_{TM}={ \lang M\rang \vert \text{$M$ is a TM, $L(M)$ is regular} }

    $$ is not decidable

  • Proof using Reduction

    Assume that we have a decider $O$ for $REG_{TM}$, then we can construct a decider for $A_{TM}$ using the algorithm below:


    and we see this $R$ being the decider for $A_{TM}$ because:

    • recall that $0^n1^n$ is not regular due to pumping lemma


One definition not explicitly addressed is as follows:

TM Implementable:

  • An iteration “for $s$ in $S$, do…” is implementable if and only if:
    • the set $S$ is countable
    • each member of the set $S$ is computable (can be found using a decider TM/halting TM)

For example:

  • for every $w$ in $\Sigma^*$, it works trivially and is implementable
  • for every $\lang M\rang$ for all TM, where $M$ is a decider, this is not implementable
    • the set of all TM is a countable set, but
    • $\lang M\rang$ being a decider is not computable, because it is not decidable

In summary, we have the following UNDECIDABLE languages:

  • $A_{TM}$
  • $Halt_{TM}$
  • $E_{TM}$
  • $EQ_{TM}$

  • $REG_{TM}$

Rice’s Theorem

(Informal) Rice’s Theorem:

  • Any non-trivial language of the form: $$

    { \lang M\rang \vert \text{$M$ is a TM and $L(M)$ satisfies…}}

    $$ is not decidable

    • i.e. we cannot decide non-trivial properties of the language of a TM.


  • $P$ is a non-trivial property of recognizable languages:
    • $P \subseteq {\lang M\rang \vert \text{$M$ is a TM}}$ (defining property)
      • then if $M_1, M_2$ are TMs and $L(M_1)=L(M_2)$, then either:
        • both have the property $\lang M_1 \rang, \lang M_2 \rang \in P$
        • or neither has the property $\lang M_1 \rang, \lang M_2 \rang \notin P$
      • so the property is a property of the LANGUAGE of the TM, not of the implementation of TM
    • and $P \neq \empty$, $P \neq { \lang M \rang \vert \text{$M$ is a TM} }$ (defining non-trivial)
      • i.e. $P$ is a proper, non-empty subset of all TM $\lang M\rang$
      • i.e. there exists at least one $\lang M \rang \in P$, and at least one $\lang M \rang \notin P$

Rice Theorem:

  • Let $P$ be a non-trivial property of recognizable languages. Then $P$ is not decidable.
    • e.g. a non-trivial property could be $L(M)$ being regular

Therefore, it means:

  1. once you know a property $P$ of a TM language is non-trivial
  2. then that language is not decidable by Rice’s Theorem


  • We will use reduction from $A_{TM}$.

    • the idea is that all those proofs use the same logic/construction.
  • Assume $O$ is a decider for (Turing machine of a language with) non-trivial property $P$. We will construct a decider for $A_{TM}$.

  • image-20201125143044644

  • Now, we construct the decider $R$:


  • Now, we prove that this is a decider for $A_{TM}$:


  • Therefore, since $A_{TM}$ is undecidable, then $P$ is also undecidable (by contradiction).

So we see that the general idea for reduction to $A_{TM}$ is that:

  • We can construct the $M’$ such that it:
    • has $L(M’)=\empty$ and $\lang M’\rang \notin P$, if $M$ rejects or runs forever on $w$
    • has $L(M’)$ non-empty and $\lang M’\rang \in P$, if $M$ accepts $w$
    • or vice versa for $P$ in the above
  • Then, we can run $O(\lang M’\rang)$ to decide $A_{TM}$


  • The proof above for Rice’s Theorem, which showed $A_{TM} \le_T P$ can be rephrased to show that $$

    \begin{cases} \overline{A_{TM}} \le_M P & \text{if $\lang M_{\empty}\rang \in P$} \\
    \overline{A_{TM}} \le_M \overline{P} & \text{if $\lang M_{\empty}\rang \notin P$} \end{cases}

    $$ which is mapping reduction, which shows the refined Rice’s Theorem:

    • If $P$ is a non-trivial property of a recognizable language, then:
      • if $\lang M_{\empty}\rang \in P$, $P$ is not recognizable (because $\overline{A_{TM}}$ is not recognizable)
      • if $\lang M_{\empty}\rang \notin P$, $\overline{P}$ is not recognizable (because $\overline{A_{TM}}$ is not recognizable)

For example:

  • Question:

    The language: $$

    P={\lang M\rang \vert \text{$M$ is a TM, and every string accepted by $M$ starts with $0$} }

    $$ is undecidable by Rice’s Theorem

  • Solution

    We just need to prove that this $P$ is a property, and is non-trivial:

    • This $P$ is a property because:
      • if $M_1, M_2$ are TMs such that $L(M_1)=L(M_2)$, then it means $\lang M_1\rang \in P \iff \lang M_2\rang \in P$
    • This $P$ is non-trivial because:
      • we can find an $M$ such that $\lang M\rang \in P$
        • e.g. a TM that accepts exactly the strings whose first character is $0$
      • and we can find an $M$ such that $\lang M\rang \notin P$
        • e.g. a TM that accepts exactly the strings whose first character is $1$

    Therefore, by Rice’s Theorem, $P$ is not decidable.

When does Rice’s Theorem not work?

  • If we have a property, but:
    • it is a property of a DFA, graph, integers, etc. such that it cannot be rephrased as a TM property
    • it is a property that depends on implementation
      • e.g. ${ \lang M\rang \vert \text{$M$ is a TM, $M$ accepts $\epsilon$, and $M$ has at least 17 states} }$

In summary, to show that a language is undecidable:

  • Diagonalization
  • Turing reduction
  • Rice’s Theorem

Next, we move on to proving whether $L$ is recognizable or not.

Week 12

Unrecognizable Languages

One way to prove that $L$ is not recognizable would be:

  • if $L$ is not decidable but $\bar{L}$ is recognizable, then $L$ cannot be recognizable
    • this we already know, because if both $L$ and $\bar{L}$ are recognizable, then $L$ is decidable.

One application of the above would be:


  • $E_{TM}$ is not recognizable


  • We have seen that $E_{TM}$ is not decidable, but $\overline{E_{TM}}$ is recognizable.
    • in fact, for $\overline{E_{TM}}$, we have already proved with $NE_{TM}$, which is recognizable.
  • Therefore, $E_{TM}$ is not recognizable

Other ways include:

  • Diagonalization (possible for some languages)
    • not on exam
  • Refined Rice’s Theorem
    • not on exam
  • Mapping Reduction
    • on exam

In fact, just TM reduction would not work for recognizability, because:

  • if $A \le_T B$, and $B$ is recognizable, we cannot conclude that $A$ would always have a working recognizer:
    • the idea is that the reduction may call $B$’s recognizer multiple times, and if $w$ makes one of those calls hang forever, we have trouble producing a correct recognizer.
      • basically our program could not reach the end/hangs in the middle
  • however, the above behavior would work, if and only if:
    • $B$’s recognizer is run exactly once, as the last step
    • and $A$ outputs the same as $B$ (i.e. if $w$ hangs on $B$, then it should also hang on $A$)
      • i.e. if we hang, we need to hang at the end

Formal Definition of Mapping Reduction

  • A language $A$ is mapping reducible to a language $B$, denoted as $A \le_{M} B$, if there exists a computable function $$

    f:\Sigma^* \to \Sigma^*

    $$ such that:

    • $w \in A \iff f(w) \in B$

    • i.e. every $w \in A$ maps to $f(w) \in B$, and every $w \notin A$ maps to $f(w) \notin B$.
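The definition can be illustrated with two simple decidable languages (the languages and names here are chosen for the toy example, not taken from the notes): $A$ = strings of even length, $B$ = strings beginning with "1".

```python
def f(w):
    """Computable map witnessing A <=_M B: w in A  <=>  f(w) in B."""
    return "1" if len(w) % 2 == 0 else "0"

in_A = lambda w: len(w) % 2 == 0
in_B = lambda w: w.startswith("1")

# a decider for A built from a decider for B and the map f, as in the
# theorem "A <=_M B and B decidable => A decidable": run B once, at the end
def decide_A(w):
    return in_B(f(w))
# decide_A("ab") -> True ; decide_A("abc") -> False
```

Note that `decide_A` calls the decider for $B$ exactly once, as its last step, and returns its answer unchanged, which is exactly the restriction that makes mapping reduction (unlike general Turing reduction) preserve recognizability.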


For example:

  • Question: $$

    \overline{A_{TM}} \le_M REG_{TM}

    $$ where:

    • recall that $REG_{TM}={ \lang M\rang \vert \text{$M$ is a TM, $L(M)$ is regular} }$
  • Solution:

    We first figure out the mapping function $f$:



    • this $f$ is computable

    then, we need to prove that:

    • $w \in A \iff f(w) \in B$:



    • we basically mapped $\lang M,w\rang$ to $\lang M’\rang$


  • If $A \le_M B$, then $A \le_T B$
    • note that the reverse might not be true
  • if $A \le_M B$, then $\bar{A} \le_M \bar{B}$
    • follows trivially from the definition of mapping reducible


  • Proving: if $A \le_M B$, then $A \le_T B$

  • Suppose $A \le_M B$. Then we get a computable mapping function $f$, such that $w \in A \iff f(w) \in B$

  • Now, using that $f$, we need to show $A \le_T B$

    • we basically construct decider for $A$ if we know $B$ is a decider and $f$ maps the inputs of $A$ to $B$:


    • and the above would be a decider for $A$, because:



  • Proving: if $A \le_M B$, then $\bar{A} \le_M \bar{B}$
  • image-20201124223306787

The more relevant theorems are the follows:

Using Mapping Reduction for Recognizability:

  • If $A \le_M B$, and $B$ is recognizable, then $A$ is recognizable
  • If $A \le_M B$, and $A$ is not recognizable, then $B$ is not recognizable.
    • i.e. if $A$ called $B$ exactly once in the subroutine, and $A$ outputs the same as $B$:
      • then if $A$ is not recognizable, $B$ cannot be recognizable. Otherwise, we would have solved $A$ to be recognizable.


  • If $A \le_M B$, and $B$ is recognizable, then $A$ is recognizable

  • Suppose $A \le_M B$. Then we get a computable mapping function $f$, such that $w \in A \iff f(w) \in B$

  • If $B$ is recognizable, then we get a TM recognizer $M$ for $B$. We use $M$ and $f$ to construct a recognizer for $A$:

    Then simply:


    And $R$ would be a recognizer for $A$, because:

    • image-20201124223656292

Proving “If $A \le_M B$, and $A$ is not recognizable, then $B$ is not recognizable.” would just work the same way as above.

  • For instance, prove by contradiction.


  • $L$ is co-recognizable if $\bar{L}$ is recognizable.
    • there is no separate notion of “co-decidable”, because if $L$ is decidable, then $\bar{L}$ is decidable automatically (just flip the answer).

For example:

  • Theorem: $$

    ALL_{TM} = { \lang M\rang \vert \text{$M$ is a TM, and $L(M)=\Sigma^*$} }

    $$ is neither recognizable nor co-recognizable

  • Proof:

    First, we show that $ALL_{TM}$ is not co-recognizable, namely, $\overline{ALL_{TM}}$ is not recognizable.


    In particular:



    • we would like to map:
      • if $M$ accepts $w$, then $M’$ accepts all possible strings
      • if $M$ does not accept $w$, then $M’$ rejects at least one string

    Now, we construct the mapping as follows:


    and this mapping works as we wanted, because:




    Next, we will show that $ALL_{TM}$ is not recognizable. This means we need to construct a mapping $\overline{A_{TM}} \le_M ALL_{TM}$:

    Again, the idea is to map (the reverse):

    • if $M$ accepts $w$, then $M’$ rejects at least one string
    • if $M$ does not accept $w$, then $M’$ accepts all possible strings

    and the mapping function is as follows:



    • notice we cannot write: “if $M$ does not accept, accept $x$”. This is because we might deal with the infinite loop.

    Now, we claim that this mapping function works:




    • notice that we just needed to construct $M’$ that rejects at least one string if $\lang M,w\rang \in A_{TM}$

In summary, the unrecognizable languages we know so far are:

  • $\overline{A_{TM}}$
    • but $A_{TM}$ is recognizable
  • $ALL_{TM}$

In general:

  • If we have a TM reducibility, then: $$

    A \le_T B \iff \bar{A} \le_T B \iff A \le_T \bar{B} \iff \bar{A} \le_T \bar{B}


    • since we have decidability, then we can easily flip them to still be decidable
  • If we have Mapping reducibility, we only have: $$

    A \le_M B \iff \bar{A} \le_M \bar{B}


    \bar{A} \le_M B \iff A \le_M \bar{B}


    • since now, we are talking about mapping function $f$

Week 13

Computation History (Extension)

This will be another technique we can use to prove:

  • undecidability

This computation history technique will be useful if

  • The language you need to prove to be recognizable/unrecognizable/decidable/undecidable might not have an obvious TM.
  • Then, you cannot use what we have done before: TODO: finish this
    • need to use this technique

Another technique to prove unrecognizability.


Using this, we can prove the following are not decidable


Introduction to Complexity Theory

Now, our question comes:

How efficiently can we solve problems that we know are decidable?

In the context of complexity, we will only look at decidable problems, for computing their running time/complexity.


  • Given a TM $M$, the running time/time complexity of $M$ is a function: $$

    f: \N \to \N

    $$ where:

    • $f(n)$ is the max number of steps taken by $M$ on any input of length $n$.
      • e.g. $f(12)$ is the maximum number of steps that a decider $M$ can run on all input of length $12$
    • a step is one application of the transition function $\delta$


  • Running time is a function of input length
  • It takes the worst case over all inputs of each length

However, counting exact steps requires working with the formal, low-level definition of a TM, which takes quite a lot of work.

  • so we need a way to still reason about running time even with high-level descriptions.


  • We define a set of languages: $$

    \text{TIME}(t(n)) = \{ L \vert L \text{ is decidable by a TM that runs in time } O(t(n)) \}

    $$

    • note that:
      • e.g. $L \in \text{TIME}(n^3+n)$ if we can find one TM deciding $L$ with running time $O(n^3+n)=O(n^3)$.


  • For functions $f,g: \N \to \R^+$, we say that: $$

    f(n) = O(g(n))

    $$ if there exist positive constants $c, n_0$ such that:

    • $\forall n \ge n_0$, $f(n) \le c \cdot g(n)$

    • basically, $g(n)$ grows at least as fast as $f(n)$ for large $n$
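The definition can be spot-checked numerically; the functions and the witnesses $c, n_0$ below are illustrative choices, not from the notes:

```python
# Numerically spot-check f(n) = O(g(n)) with candidate witnesses c and n0.
def is_big_o_witness(f, g, c, n0, up_to=10_000):
    """Check f(n) <= c*g(n) for all n0 <= n <= up_to (a finite sanity
    check, not a proof -- the definition quantifies over all n >= n0)."""
    return all(f(n) <= c * g(n) for n in range(n0, up_to + 1))

f = lambda n: 3 * n**2 + 5 * n   # illustrative f
g = lambda n: n**2               # illustrative g

# 3n^2 + 5n <= 4n^2 exactly when 5n <= n^2, i.e. n >= 5
print(is_big_o_witness(f, g, c=4, n0=5))  # True
```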

As a result:

From the definition of $\text{TIME}(t(n))$:

  • it follows that if $t(n)=O(t’(n))$, then: $$

    \text{TIME}(t(n)) \subseteq \text{TIME}(t’(n))

    $$

    • i.e. obviously, if a TM $M$ runs in time $O(t(n))$, then it also runs in $O(t’(n))$


  • The class of poly-time computable languages is defined as: $$

    P = \bigcup_{k=1}^\infty \text{TIME}(n^k)

    $$

    • note that this means:

      • e.g. a language $L \in P$ if it can be decided in time $O(n^{100})$

      • since $\log(n) = O(n)$, we would also have $\text{TIME}(\log(n)) \subseteq P$

Now, this means:

  • if we have different models of computation/implementation of TM, we might have different running time.
    • therefore $\text{TIME}(t(n))$ is sensitive to our implementation.
    • e.g. RAM (random access TM, being able to jump with its head) being transformed into an equivalent single tape, standard TM requires cubic overhead.
  • However, the notion of poly-time is robust to the above example, and many other variations.
    • therefore, we will identify $P$ (poly-time computable) with “efficient computation”.
      • this might sound too generous, because $n^{100}$ does not sound very efficient
      • however, it has the broad use that if $L \notin P$, then it is definitely not efficient
      • and even with a high exponent like $n^{100}$, we at least know the problem is solvable in polynomial time, and over time a better algorithm may be found.

In fact, one important property of $P$ would be:

  • $P$ is invariant for all models of computation that are polynomially equivalent to the deterministic single-tape Turing machine
    • this covers most, if not all, reasonable models of computation we will meet, due to the strong Church-Turing Thesis.

Strong Church-Turing Thesis:

  • Every reasonable/natural notion of algorithm (model of computation) can be simulated by a TM with polynomial overhead.
    • i.e. all reasonable deterministic computational models are polynomially equivalent.
      • That is, any one of them can simulate another with only a polynomial increase in running time
    • this is more controversial than the original Church-Turing thesis
    • the original Church-Turing thesis was that: every reasonable notion of algorithm can be simulated by a TM.
      • this is quite widely accepted


  • An algorithm is poly-time if its running time is polynomial in its input length $n$: $$

    O(n^k), \text{ for some constant } k

    $$ however:

    • this means that the representation of an input matters:
      • it turns out that the following representations differ only by a constant factor in length, so switching between them is harmless:
        • binary
        • hex
        • decimal
      • except for unary representation, which is exponentially longer


  • If you have an algorithm that takes an integer as input, it runs in poly-time of the length of the input, i.e. the number of digits.

  • This means that if the input is a number $N$, its representation takes only about $\log(N)$ digits, so “poly-time” means polynomial in $\log(N)$, not in $N$.

    • proof sketch:

      the running time is: $$

      O(n^k), \text{where $k$ is some constant and $n$ is the input length}

      $$ and to represent a number $N$ in base $b$, with $b \neq 1$, we need $n$ digits where: $$

      N \approx b^n, \text{ i.e. } n = O(\log_b N)

      $$

    • $b$ is the base (e.g. base 2 for binary)
    • therefore, with $n = O(\log N)$, a running time of $O(n^k)=O(\log^k N)$ is polynomial in the input length
    • however, if we use unary, then $n = N$, so time polynomial in the magnitude $N$ itself would count as “poly-time”; since the unary encoding is exponentially longer than any base-$b$ encoding, it is excluded
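The gap between representations can be seen directly (the specific $N$ here is just an illustration):

```python
# A number N needs about log2(N) symbols in binary but N symbols in unary,
# so "poly-time in input length" means very different things for the two.
N = 1_000_000
binary_len = N.bit_length()   # digits needed in base 2
unary_len = N                 # symbols needed in unary

print(binary_len)  # 20, since 2^19 <= 1_000_000 < 2^20
print(unary_len)   # 1000000
```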

For example:

  • Question

    For a grade-school algorithm for checking if a given input is prime, is it a poly-time algorithm?

    • assuming that checking whether a number $a$ divides another number $b$ can be done in poly-time, which is true.
    • assuming the input format is not unary
  • Solution

    A simple algorithm would be:



    • first, prove that the above is implementable (each step is computable), which it is in this case

    Running time in $\vert x\vert$:

    • Each iteration of the loop is poly-time in $\vert x\vert$ (since division is polytime)

    • the number of iterations is about $\sqrt{x}$

      • since $x \le 2^{\vert x\vert }$ in binary

      • or $x \le k^{\vert x\vert }=2^{\log_2(k)\cdot \vert x\vert }=2^{O(\vert x\vert )}$, for $k$ being the base (except unary)

      • hence: \(\sqrt{x}=2^{O(\vert x\vert )/2}=2^{O(\vert x\vert )}\) Therefore, the runtime for this algorithm is in fact exponential: \(2^{O(\vert x\vert )}\cdot O(\vert x\vert ^k)\), which is not poly-time

However, for this problem of $\text{PRIME}$, it turns out that there is actually a poly-time algorithm, which is very sophisticated so not shown here.

  • so in fact, $\text{PRIME} \in P$, even though the simple construction above is not poly-time.
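The grade-school algorithm from the question can be sketched as follows; note the loop bound, which is exactly what makes it exponential in the bit-length:

```python
import math

def is_prime_trial_division(x: int) -> bool:
    """Trial division: try every divisor d with 2 <= d <= sqrt(x).
    Each division is poly-time in |x| (the bit-length of x), but the loop
    runs about sqrt(x) = 2^(|x|/2) times, so the algorithm is exponential
    in the input length."""
    if x < 2:
        return False
    for d in range(2, math.isqrt(x) + 1):
        if x % d == 0:
            return False
    return True

print(is_prime_trial_division(97))  # True
print(is_prime_trial_division(91))  # False, since 91 = 7 * 13
```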

Examples of Poly-Computable

Basically, from now on we are:

  • only talking about decidable languages
    • i.e. by default, our algorithm will be boolean outputs
    • however, in the end it does not matter, as long as it halts
  • algorithms are efficient if they run within poly-time

P vs NP

  • P: intuitively, language that can be decided in polytime
  • NP: intuitively, language that can be verified in polytime

Verifier Definition

  • A verifier for a language $A$ is a TM (algorithm) $V$, s.t. $V$ gets input $(x,c)$ and outputs accept/reject, where: $$

    A=\{ x \vert \text{$V(x,c)$ accepts for some $c$} \}

    $$

    • i.e. $V$ is a verifier for $A$ if $A$ consists of all strings $x$ where there exists some $c$ (certificate/“solution”), such that $V(x,c)$ accepts.
    • so: $$

      \begin{aligned}
      x \in A &\Rarr \exists c, \; V(x,c)\ \text{accepts}\\
      x \notin A &\Rarr \forall c, \; V(x,c)\ \text{rejects}
      \end{aligned}

      $$

Polytime Verifier

  • A polytime verifier for a language $A$ is a verifier $V$, such that $V$ runs in time $O(\vert x\vert ^k)$, i.e. polytime in length


Polytime Verifiable

  • A language $L$ is polytime verifiable if $\exists$ a polytime verifier $V$ for $L$.

Hence the formal definition of P and NP are:

Formal Definition of P and NP $$

NP = \{ L \vert \text{$L$ is polytime verifiable in length of input} \}\\
P = \{ L \vert \text{$L$ is polytime computable in length of input} \}

$$

For example:

  • Consider the language: $$

    Composite=\{ \lang x\rang \vert \text{$x$ is not prime} \}

    $$ where:

    • I claim that this language is in NP
  • Proof:

    There is a simple polytime verification algorithm:

    • recall that the verifier will get $\lang x, c\rang$
    • where $c$ would be some “proof/solution”, which might or might not be correct
    • the verifier works because:
      • if $x$ is not prime, then there will be some “solution” $c$ that works
      • if $x$ is prime, then all “solutions” $c$ will fail


    Now, we need to show 2 things:

    • $V$ runs in time $poly(\vert x\vert )$

      • every step is in polytime
    • and prove that such a construction works
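A minimal sketch of such a verifier, under the natural assumption (consistent with the discussion above) that the certificate $c$ is a candidate divisor of $x$:

```python
def composite_verifier(x: int, c: int) -> bool:
    """V(x, c): accept iff c is a nontrivial divisor of x.
    - If x is composite, some certificate 1 < c < x divides x, so V accepts.
    - If x is prime, every certificate fails, so V always rejects.
    The divisibility check runs in poly-time in the bit-length of x."""
    return 1 < c < x and x % c == 0

print(composite_verifier(15, 3))  # True: 3 witnesses that 15 is composite
print(composite_verifier(13, 5))  # False: 13 is prime, no c can work
```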


For example:

  • Question

    Is the language $PRIME$ also in NP?

  • Solution

    It turns out yes, because we know that $PRIME$ can be solved in polytime, so we just need to use that solution wired in:



    • we basically ignored $c$, and solved it directly.
    • but this is like a “trick” $NP$ problem, because every language in $P$ is also an $NP$ problem.

Claim: $$

P \subseteq NP

$$
  • where:
    • we basically see this in the example above


  • If $L \in P$, there is a polytime decider for $L$



  • However, this is not known: $$

    NP \subseteq P \quad\text{or}\quad NP \nsubseteq P\;?

    $$ where:

    • if $NP \subseteq P$, then $P=NP$
      • one consequence would be that cryptography is dead
    • if $NP \nsubseteq P$, then $P \neq NP$


  • exponential time is defined as:
\[\text{EXP} = \bigcup_{k=1}^\infty \text{TIME}(2^{n^k})\]

Claim: $$

P \subseteq NP \subseteq \text{EXP}

$$

  • We have already seen that $P \subseteq NP$

  • We then just need to show $NP \subseteq \text{EXP}$

    • If $L \in NP$, then there exists a polytime verifier $V$ for $L$, such that $V(x,c)$ runs in time $O(\vert x\vert ^k)​$
      • note that, since $V(x,c)$ runs in time $O(\vert x\vert ^k)$, this immediately forces $\vert c\vert \le O(\vert x\vert ^k)$ for any useful certificate: if $c$ were longer, simply reading it would already exceed the time bound, contradicting that $V$ is a polytime verifier
    • Let us fix a constant $a$, such that $V$ runs in $a\vert x\vert ^k$ time, for some constant $a,k$.
    • Then, we just check all $c$ up to $\vert c\vert \le a\vert x\vert ^k$



    • this loops over all certificate strings $c$ with $\vert c\vert \le a\cdot \vert x\vert ^k$, of which there are $2^{O(\vert x\vert ^k)}$

    where the total running time is $2^{O(\vert x\vert ^k)} \cdot O(\vert x\vert ^k)=2^{O(\vert x\vert ^k)}$, i.e. exponential


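The brute-force decider above can be sketched as follows; the toy verifier is an illustrative stand-in (certificates over $\{0,1\}$), not a construction from the notes:

```python
from itertools import product

def decide_by_brute_force(x: str, verifier, max_cert_len: int) -> bool:
    """Exponential-time decider built from a poly-time verifier: accept iff
    some certificate c over {0,1} of length <= max_cert_len makes the
    verifier accept. There are 2^(O(|x|^k)) candidates, hence EXP time."""
    for length in range(max_cert_len + 1):
        for bits in product("01", repeat=length):
            if verifier(x, "".join(bits)):
                return True
    return False

# Toy verifier: accept iff the certificate equals the input reversed.
v = lambda x, c: c == x[::-1]
print(decide_by_brute_force("011", v, max_cert_len=3))  # True (c = "110")
```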

Claim: $$

P \subsetneq \text{EXP}

$$

  • this is provable, but we will not prove it here

Consequently, we now do not know where $NP$ fits:

  • the following cases could all be possible $$

    \begin{cases} P = NP \subsetneq EXP \\
    P \subsetneq NP \subsetneq EXP \\
    P \subsetneq NP = EXP \\
    … \end{cases}

    $$

Week 14

NP Complete Class


  • The NP-Complete class consists of languages with the property: if $L$ is NP-Complete and $L$ can be solved in polynomial time, then $P = NP$

The NP Complete Class plays such a role:



  • $\text{NP-Complete}$ is of course in $NP$, and we have only two possible cases
  • if we show that $L \in \text{NP-Complete}$ is also $L \in P$ (can be solved in polytime) then every other language in $NP$ would also be in $P$
    • intuition: those $L \in \text{NP-Complete}$ would be languages that are “hardest” in $NP$. If you can solve the hardest one, then you can use it to solve the easier ones/all in $NP$.

Using Reductions


  • A language $A$ is polytime reducible to $B$, ($A \le_P B$), if $\exist$ a polytime computable function $f: \Sigma^* \to \Sigma^*$, such that $$

    x \in A \iff f(x) \in B

    $$ where again we are just mapping inputs from one into another:


    • i.e. this is basically the same as mapping reduction


  • If $A \le_P B$ and $B \in P$, then $A \in P$.
    • i.e. then by definition, we can map $A$ to $B$, and since $B$ is easy (polytime solvable), so is $A$
  • If $A \le_P B$, and $B \in NP$, then $A \in NP$


  • For the claim: if $A \le_P B$ and $B \in P$, then $A \in P$.

  • Suppose $A \le_P B$, then there is a polytime algorithm $F$ computing a function $f$ such that $$

    x \in A \iff f(x) \in B

    $$ Suppose $B \in P$, then there is a polytime algorithm $M_B$ that decides $B$.

    I use the two elements to construct an algorithm that decides $A$

    • then the construction would be simply using $M_B$ and the function $f$


    Now, we need to show that:

    • the construction is in polynomial time
      • since every step above is polytime, and there is no loop
    • show that the construction decides $A$:
      • $x \in A \iff f(x) \in B \iff M_B \text{ accepts } f(x)$, so the construction accepts exactly the strings in $A$

    This completes the proof.
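The two steps of the construction amount to a function composition; a toy sketch (the languages and reduction here are illustrative, not from the notes):

```python
def make_decider_for_A(f, decide_B):
    """Build a decider for A from a reduction f (x in A <=> f(x) in B)
    and a decider M_B for B. If both run in poly-time, so does the
    composition, since there is no loop."""
    def decide_A(x):
        return decide_B(f(x))  # step 1: compute f(x); step 2: run M_B on it
    return decide_A

# Toy instance: A = even numbers, B = multiples of 10, f(x) = 5x.
# x is even <=> 5x is a multiple of 10, so f is a valid reduction.
decide_A = make_decider_for_A(lambda x: 5 * x, lambda y: y % 10 == 0)
print(decide_A(4))  # True
print(decide_A(7))  # False
```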


  • If $A \le_P B$, and $B \in NP$, then $A \in NP$

  • This is simple, in fact similar to above

  • Suppose $A \le_P B$, then I get a polytime computable function $f$ such that $$

    w \in A \iff f(w) \in B

    $$

  • Suppose $B \in NP$, then I get a polytime verifier $V_B(x,c)$ that verifies $B$.

  • I construct a verifier for $A$:

    $V_A:$ On input $\lang w, c\rang$,

    1. Compute $f(w)$
    2. Run $V_B(f(w), c)$
      • output same
  • Now, I prove that this is polytime and this works.

    • this is skipped
  • Note:

    • all that matters for a verifier is that, for an input $w \in L$, there $\exists c$ (no matter what format) that works

What is the use of this:

  • This means that if a language $L$ is at least as hard as every other language in $NP$, and we show that $L$ can be solved in polytime, then $P = NP$.


  • A language $L$ is $\text{NP-Hard}$ if $\forall A \in NP$, $A \le_P L$.
    • so that if we can solve $L$ in polytime, then all the $A$ can be reduced to solve in polytime

Definition of NP Complete

  • A language $L$ is $\text{NP-Complete}$ if:
    1. $L \in NP$
    2. $L \in \text{NP-Hard}$


  • If $A, B$ are $\text{NP-Complete}$, then $A \le_P B$ and $B \le_P A$
    • this follows from the definition of NP-Completeness


  • If $A,B$ are both NP-Complete, then
    • $A,B \in NP$
    • $A,B$ both NP-Hard
  • I show first $A \le_P B$:
    • by definition of NP-Hard, since $B$ is NP-Hard, every language in $NP$ reduces to $B$; in particular, $A \in NP$ gives $A \le_P B$
  • similarly, show $B \le_P A$:
    • since $A$ is NP-Hard and $B \in NP$, we get $B \le_P A$


  • If $A \le_P B$, and $A$ is $\text{NP-Hard}$, then $B$ is $\text{NP-Hard}$
    • i.e. if $B$ is at least as hard as $A$, which is already $\text{NP-Hard}$, then $B$ is $\text{NP-Hard}$
    • this will be mostly used for proving NP-Complete problems
      • so that we first show $B \in NP$, then use this for the reduction


  • Suppose \(A \le_P B\). Then I have a polytime computable function $f$ such that: $$

    x \in A \iff f(x) \in B

    $$

  • Suppose $A$ is $\text{NP-Hard}$

  • Now, to prove $B$ is $\text{NP-Hard}$:

    • Let any arbitrary language $C \in NP$
    • need to show $C \le_P B$

    Because $A$ is $\text{NP-Hard}$, we have $C \le_P A$. This means that I get a polytime computable function $g$: $$

    w \in C \iff g(w) \in A

    $$ Now, since I know $C \le_P A$ and $A \le_P B$, I can show $C \le_P B$ by constructing the mapping (using transitivity): $$

    w \in C \iff f(g(w)) \in B

    $$


  • this completes the proof

This means that, to show another language $L$ is $\text{NP-Hard}$:

  • reduce from a known $\text{NP-Hard}$ language (e.g. $Ham_{cycle}$) to $L$
  • then $L$ is also $\text{NP-Hard}$

But the hard part is:

  • how to prove the first language (e.g. $Ham_{cycle}$ here) is $\text{NP-Hard}$

In general, to prove a language is $\text{NP-Complete}$

  1. Show that the language $L \in NP$, by showing a polytime verifier
  2. Show that, for some $A \in \text{NP-Hard}$, I can reduce $A\le_P L$
    • i.e. $L$ is at least as hard as the known $\text{NP-Hard}$ problem


  • $Ham_{cycle}$ is $\text{NP-Complete}$
    • proof skipped.

Closure Properties


  • $P$ is closed under complement


  • If $L \in P$, then there is a polytime decider $M_L$ for $L$.
  • I just flip the accept/reject of the decider to get a polytime decider for $\overline{L}$


  • we do not know whether $NP$ is closed under complement
    • i.e. don’t know whether $NP=Co-NP$
    • e.g. no one knows a polytime verifier that certifies a graph $G$ does not have a Hamiltonian cycle

Examples of NP-Complete

For example:

  • Question

    Given an undirected graph $G$, tell whether $G$ has a Hamiltonian Cycle (a cycle that goes through each node exactly once): $$

    Ham_{cycle}=\{ \lang G\rang \vert \text{$G$ is an undirected graph and $G$ has a Hamiltonian Cycle} \}

    $$ this problem is in NP

  • Proof

    Consider the following algorithm $V$:


    • where basically it needs $c$ to contain all vertices
      • assuming $c$ contains the “solution” as an ordered list of vertices


    • where now, we just need to check if we can follow the sequence of vertices given the $E$ edge configuration


    Therefore, we have shown that this language is in NP.

    (In fact, $Ham_{cycle}$ is NP-Complete)


  • We do not know if $Ham_{cycle} \in P$, but we do know that $Ham_{cycle} \in EXP$

Proof Sketch for $Ham_{cycle} \in EXP$

  • In short, we need to give an algorithm that solves it in $EXP$
  • The idea is simple: try all permutations of the vertices, of which there are $\vert V\vert !$ (exponentially many), and run the verifier $V$ we had above on each
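That brute-force idea can be sketched directly; encoding the graph as a vertex list plus edge pairs is my own choice here:

```python
from itertools import permutations

def has_hamiltonian_cycle(vertices, edges):
    """EXP-time decider: try all |V|! vertex orderings; for each, run the
    poly-time check that consecutive vertices (wrapping around) are
    adjacent. The inner check plays the role of the verifier V above."""
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    n = len(vertices)
    for order in permutations(vertices):
        if all(frozenset((order[i], order[(i + 1) % n])) in edge_set
               for i in range(n)):
            return True
    return False

# A 4-cycle has a Hamiltonian cycle; a star graph does not.
print(has_hamiltonian_cycle([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)]))  # True
print(has_hamiltonian_cycle([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)]))          # False
```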

Another Example

  • Question

    The sudoku problem $$

    sudoku=\{ P \vert \text{$P$ is a $n^2 \times n^2$ grid, partially filled with integers from $1$ to $n^2$, such that there exists a valid solution} \}

    $$ is NP-Complete.

  • Solution


Another Example

  • Question

    The problem $$

    \text{SUBSET-SUM}= \{ (x_1, x_2, …, x_n, t) \vert x_1, …,x_n,t \text{ are integers such that there exists non-empty $S \subseteq {1,2,…,n}$ such that $\sum_{i\in S} x_i = t$} \}

    $$ where:

    • since $S$ is a set, the index inside cannot be repeated.
    • However, the numbers could be repeated, such that $x_2=x_5$, for example

    this problem is NP-Complete

  • Solution
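The NP-Completeness proof is not given here; as a small illustration of the language itself, here is a brute-force membership check (exponential in $n$, so it says nothing about poly-time solvability):

```python
from itertools import combinations

def in_subset_sum(xs, t):
    """Check membership in SUBSET-SUM by trying every non-empty subset of
    indices (2^n - 1 of them). Indices are never repeated, but the values
    x_i may repeat, matching the definition above."""
    for r in range(1, len(xs) + 1):
        for idx in combinations(range(len(xs)), r):
            if sum(xs[i] for i in idx) == t:
                return True
    return False

print(in_subset_sum([4, 11, 16, 21, 27], 25))  # True: 4 + 21 = 25
print(in_subset_sum([4, 11, 16, 21, 27], 10))  # False
```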

Another Example

  • Question

    The problem $$

    SAT=\{ \varphi \vert \varphi \text{ is a satisfiable Boolean formula} \}

    $$ where:

    • a boolean formula includes
      • variables $(x_1, x_2, …x_n)$
      • operators $(\land, \lor, \neg)$
    • $\varphi$ is satisfiable if there exists a $T/F$ assignment to each variable, such that the output $\varphi$ will be $True$

    this problem is NP-Complete
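As an illustration of the language (not of its NP-Completeness), a brute-force satisfiability check over all $2^n$ assignments; representing $\varphi$ as a Python callable is my own encoding choice:

```python
from itertools import product

def is_satisfiable(formula, num_vars):
    """Try all 2^n truth assignments; accept iff some assignment makes the
    formula output True. Exponential time."""
    return any(formula(a) for a in product([False, True], repeat=num_vars))

# phi = (x1 OR NOT x2) AND (NOT x1 OR x2): satisfiable (x1 = x2 = True)
phi = lambda a: (a[0] or not a[1]) and (not a[0] or a[1])
print(is_satisfiable(phi, 2))  # True

# psi = x1 AND NOT x1: unsatisfiable
psi = lambda a: a[0] and not a[0]
print(is_satisfiable(psi, 1))  # False
```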

Another Example

  • Question

    The variation of $SAT$ $$

    \text{CNF-SAT}=\{ \varphi \vert \varphi \text{ is a CNF Boolean formula that is satisfiable} \}

    $$ where:

    • a CNF (conjunctive normal form) has:
      • variables $(x_1, x_2, …x_n)$
      • literals $(x_i, \overline{x_i})$
      • clauses: composed of literals $l_i$
        • takes the form $(l_1 \lor l_2 \lor l_3 …)$
        • only $\lor$ (OR) are included
    • $\varphi$ is in CNF form if it is composed of clauses $C_i$
      • $\varphi = C_1 \land C_2 \land C_3 …$
      • only $\land$ (AND) are included between clauses
    • for example:
      • $(x_1 \lor \overline{x_2}) \land (x_2 \lor \overline{x_3} \lor x_4)$ is a CNF

    this problem is also NP-Complete

Another Example

  • Question:

    The problem/variation of $SAT$: $$

    3SAT = \{ \varphi \vert \varphi \text{ is a 3-CNF Boolean formula that is satisfiable} \}

    $$ where:

    • $3CNF$ is basically the CNF of a boolean formula, except that:
      • each clause of $\varphi = C_1 \land C_2 \land …$ consists of exactly $3$ literals

    is also NP-Complete

Examples using NP-Complete Reductions


  • $SAT$ is NP-Complete
  • $SAT \le_P \text{CNF-SAT} \le_P \text{3SAT}$
    • as a result, all of them are NP-Hard
    • in fact, proving them to be $\in NP$ is easy (it is easy to build a polytime verifier for each)
      • therefore, from the above, all of them can be proved to be NP-Complete


  • $\text{CNF-SAT} \le_P \text{3SAT}$

  • Remember, we are essentially doing a mapping reduction, so all we need to show is that

    • given a $\varphi$ in CNF, I can convert/map it using a polytime computable $f$ to a 3-CNF formula $\varphi’$, such that
      • $\varphi$ satisfiable $\iff$ $f(\varphi)=\varphi’$ satisfiable
  • Now, the function would be:



    Therefore, since this function is polytime computable, and it works, we have completed the proof.
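A sketch of such a mapping $f$, using the standard clause-splitting construction (padding short clauses and chaining long ones with fresh variables); the DIMACS-style integer encoding of literals is my own choice:

```python
def cnf_to_3cnf(clauses, num_vars):
    """Map a CNF formula to an equisatisfiable 3-CNF formula in poly-time.
    A literal is a nonzero int: v means x_v, -v means NOT x_v.
    - Clauses with <= 3 literals are padded by repeating a literal.
    - A clause (l1 v ... v lk), k > 3, is split into a chain of 3-clauses
      linked by fresh variables z: (l1 v l2 v z1) ^ (-z1 v l3 v z2) ^ ..."""
    out, next_var = [], num_vars + 1
    for clause in clauses:
        lits = list(clause)
        if len(lits) <= 3:
            while len(lits) < 3:          # pad: (l1) -> (l1 v l1 v l1)
                lits.append(lits[-1])
            out.append(tuple(lits))
        else:
            z = next_var; next_var += 1
            out.append((lits[0], lits[1], z))
            for lit in lits[2:-2]:
                z2 = next_var; next_var += 1
                out.append((-z, lit, z2))
                z = z2
            out.append((-z, lits[-2], lits[-1]))
    return out

# (x1 v x2 v x3 v x4) becomes two chained 3-clauses with fresh variable x5
print(cnf_to_3cnf([(1, 2, 3, 4)], num_vars=4))  # [(1, 2, 5), (-5, 3, 4)]
```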

Non-Deterministic TM with Complexity


  • A non-deterministic TM runs in time $t(n)$ if, for every input $x$ of length $n$, all possible computation branches on $x$ take at most $t(n)$ steps.

    • i.e. even the longest/worst branch takes at most $t(n)$ steps, so the machine always halts within $t(n)$ steps



  • A non-deterministic time: $$

    \text{NTIME}(t(n))=\{ L \vert \exists \text{ a NTM $N$ that decides $L$, and runs in time } O(t(n)) \}


Theorem $$

NP = \bigcup_{k=1}^{\infty}\text{NTIME}(n^k)

$$
  • i.e. to show that a language is in $NP$, instead of giving a verifier, we could show by giving a non-deterministic TM that actually solves it, with longest path being the order of $O(n^k)$.


  • this only works for proving that a language is in $NP$.

Proof Sketch

  • forward direction $NP\implies \bigcup_{k=1}^{\infty}\text{NTIME}(n^k)$

    • The idea is that if $L \in NP$, then I can construct a NTM that runs in polynomial time:

      • If $L \in NP$, then I get a verifier $V$ that verifiers $L$ in polynomial time

      • Construct:

        N = “On input $w$ of length $n$:

        1. Non-deterministically select string $c$ of length at most $O(n^k)$
        2. Run $V$ on input $\lang w, c \rang$
          • this is in fact same as testing all possible $c$ one by one, then running $V$ for a deterministic TM
        3. If V accepts, accept ; otherwise, reject .”


      • basically you are just trying different $c$ simultaneously, on the premise that $\exist c$ that works
      • by definition of $\text{NTIME}(n^k)$, this works because the longest branch runs in the same time as $V$, i.e. $O(n^k)$
        • however, if we convert it to a deterministic TM, then the total run time would be $EXP$, which makes sense since $NP \subseteq EXP$
  • backward direction $\bigcup_{k=1}^{\infty}\text{NTIME}(n^k) \implies NP$

    • Assume you have a NTM $N$ deciding $L$, then construct verifier $V$ as follows

      $V$ = “On input $\lang w, c \rang$, where $w$ and $c$ are strings:

      1. Simulate $N$ on input $w$, treating each symbol of $c$ as a description of the nondeterministic choice to make at each step (as we did for the proof of simulating multi-tape TM to single-tape TM).
        • i.e. $c$ tells us which branch to take
      2. If this branch of $N$’s computation accepts, accept ; otherwise, reject .”

Examples of NP using NTM

Though building a verifier $V$ to prove membership in $NP$ is usually easier, this construction of a NTM is sometimes useful. The overall idea is:

  • the NTM $N$ will non-deterministically try all $c$
  • if there is a certificate $c$ that works, then $\vert c\vert \in O(n^k)$
  • hence we are just testing all branches up to depth $O(n^k)$, which would be $\in \text{NTIME}(n^k)$

For example

  • Question

    The problem $\text{SUBSET-SUM}$ is in NP, using a NTM.

  • Solution

    $N$ = “On input $\lang S, t \rang$:

    1. Non-deterministically select a subset $c$ of the numbers in $S$.
    2. Test whether $c$ is a collection of numbers that sum to $t$.
    3. If the test passes, accept ; otherwise, reject .”

    In fact, the verifier would also be simple, and perhaps more intuitive

    $V$ = “On input$\lang \lang S, t \rang, c\rang$

    1. Test whether $c$ is a collection of numbers that sum to $t$.
    2. Test whether $S$ contains all the numbers in $c$.
    3. If both pass, accept ; otherwise, reject .”

However, this NTM technique, when developed in more depth, can be used to prove NP-Completeness without a reduction

Example of NP-Complete using NTM

Cook-Levin Theorem

  • $SAT$ is NP-Complete

Proof Sketch

  • This uses the non-deterministic TM
  • And they proved that $\forall L \in NP$, $L \le_P SAT$
    • if $L \in NP$, then there is a NTM $N$ deciding $L$, and $x \in L \iff$ $\exists$ an accepting computation of $N$ on $x$
    • let $\varphi$ be a formula that is satisfiable $\iff$ $\exist$ an accepting sequence of configurations (an accepting branch in the configuration tree)

More detailed Proof Sketch

  • Showing that $SAT$ is in $NP$ is easy, and we do so shortly. The hard part of the proof is showing that any language in $NP$ is polynomial time reducible to $SAT$.
  • To do so, we construct a polynomial time reduction for each language $A$ in $NP$ to $SAT$. The reduction for $A$ takes a string $w$ and produces a Boolean formula $\varphi$ that simulates the NP machine for $A$ on input $w$. If the machine accepts, $\varphi$ has a satisfying assignment that corresponds to the accepting computation. If the machine doesn’t accept, no assignment satisfies $\varphi$.
    • Therefore, $w$ is in $A$ if and only if $\varphi$ is satisfiable.
  • Actually constructing the reduction to work in this way is a conceptually simple task, though we must cope with many details. A Boolean formula may contain the Boolean operations AND, OR, and NOT, and these operations form the basis for the circuitry used in electronic computers. Hence the fact that we can design a Boolean formula to simulate a Turing machine isn’t surprising. The details are in the implementation of this idea.


General Tips:

  • During proofs, the length of a string/word could be a useful parameter
    • e.g. assuming $w$ is a word accepted having the smallest length in $L$.

Tips on Proving Regularity with DFA/NFA

If you are given a regular language, and asked to prove that some modification of that language is also regular:

  1. List out the constraints/rules you have to fulfill
    • e.g. at the second state, I should have had stored the information/string of $a\circ b$.
  2. Try to sum them up in one line
  3. Using the modification summarized above, draw the modified DFA/NFA
    • one thing that is usually a correct step is to make sure the original DFA/NFA is intact
      • this is to take care of potential loops/cycles
      • more than one connection to a specific state
  4. Write out formally

Tips on Proving If and Only If

If proving the claim, $\text{Claim 1} \iff \text{Claim/Condition 2}$

  1. Show that, if I have $\text{Claim/Condition 2}$, then putting it into $\text{Claim 1}$, it is true.
    • this part usually is trivial, and sometimes can be proved by words/logics
  2. Conversely, if I have $\text{Claim 1}$, then it implies/derives to $\text{Claim 2}$.
    • this part usually needs the help of variables, such as a word $w$, and splitting it up into $w_1 \circ w_2$, and etc.

Tips on Converting Regex to NFA

If given a regular expression, and need to convert to NFA:

  1. For example, if the alphabet is ${a,b}$
  2. Start with drawing NFAs $L(a)$ and $L(b)$ for each symbol in the alphabet as the base case
  3. Draw a NFA with regular expression only using one of $\circ, \cup,*$. Such as $(a \cup b)$
  4. Doing the above recursively
    • here, you could either choose to process it in one go, straight from left to right
    • or, you could split the expression into smaller ones such as $R_1 \circ R_2$, and then assemble them

Tips on Writing Regular Expressions

If given a language in words, you need to write a regular expression for it:

  1. Break the conditions down to smaller conditions.
    • for example, no two consecutive symbols are the same $=$ alternating + all 1s + all 0s
  2. Write out the smaller conditions in diagram/figure out the expression
  3. Assemble the expressions together

Tips on Proving Regular Expressions

If given a regular expression $\alpha$, and we need to prove some properties of it:

  1. Usually, using Induction will work
  2. Show that it is true for $\vert \alpha\vert =1$
    • $L(a)$
    • $L(\epsilon)$
    • $L(\empty)$
  3. Assume that the property is true for $\vert \alpha\vert = k$
  4. Show that, for $\vert \alpha\vert =k+1$, we can have:
    • $\alpha = (\alpha_1 \circ \alpha_2)$, $L(\alpha) = L(\alpha_1)\circ L(\alpha_2)$ still has that property
      • notice that $\vert \alpha_1\vert ,\vert \alpha_2\vert$ are now smaller than $k+1$, so the (strong) induction assumption from (3) can be used
    • $\alpha = (\alpha_1 \cup \alpha_2)$, $L(\alpha) = L(\alpha_1)\cup L(\alpha_2)$ still has that property
    • $\alpha = (\alpha_1)^*$, $L(\alpha) = L(\alpha_1)^*$ still has that property
      • for this, you can split the word into $w = w_1 \circ w_2 \circ…\circ w_n$, where each $w_i \in L(\alpha_1)$, and $n$ can grow up to infinity by definition

If given a regular expression $\alpha$, and we need to prove that it is the same as $\beta$

  1. Unless you can manipulate the expression directly, you will need to do the following
  2. Prove that $\alpha \subseteq \beta$ (strings generated from $\alpha$ is a subset of strings generated from $\beta$)
    • sometimes, one of the two will be straightforward and you can say by words
  3. Prove that $\beta \subseteq \alpha$
    • usually, along the way you will need to create several other $\subseteq$ sets for reaching the conclusion

Tips on Constructing Context-Free Grammar

If given a language that has some requirement on the length of the string or its sub-substrings, we can think about the following:

  1. Split the question into its essential, smaller parts
    • e.g. if we needed the string $0^i1^{i+j}0^j$, then we could use the idea:
      • first part is to get $0^i1^i$, then get $1^j0^j$,
      • so you have $\begin{cases}S \to UT\\ U\to 0U1 \vert \epsilon \\ T\to 1T0 \vert \epsilon\end{cases}$
  2. Since each variable is context-free, what can a variable know?
    • e.g. $w$ starts with a $1$ and has a middle element $0$
      • this implies the $1$ and $0$ needs to be hardcoded, since the variable has no idea if it is at the front or in the middle of the string
  3. What must happen when you add exactly one more variable? (Think about it in terms of symmetry, what part does not matter (could be any symbol), and what part matters)
    • e.g. $w$ starts with a $1$ and has a middle element $0$ and is of odd length
      • when one more variable is added, it must exactly add one symbol to the front and back:
      • $\begin{cases}S \to 1T0 \vert 1T1\\ T\to UTU \vert 0 \\ U \to 1\vert 0\end{cases}$
      • where $T\to UTU\vert 0$ populates the string to have $UUU…0UUU…$

Tips on Constructing TM Reduction

If we need to reduce a language $B$ to a language $A$:

  • assume, towards contradiction, that a decider for $B$ exists.
  • know the input formats of $A$ and $B$
  • show that you can then construct a decider for $A$, which is known to be undecidable, a contradiction
    • i.e. having a decider for $B$ means too much power

Remember, the construction will look like:

R: on <input of A>:
	1. Construct <input of B>:
		- ...
	2. Run decider for B on <input of B>:
		- if B accept, ...
		- if B reject, ...

  1. First think about the input of <input of A>​
    • e.g. reduce $E_{TM}$ to $A_{TM}$, (construct a decider for $A_{TM}$) then input is for $A_{TM}$
  2. Think about input of <input of B>, which is the assumed decider
  3. Think about what we need to map:
    • e.g. if we need to reduce $E_{TM}$ to $A_{TM}$
      • map $\lang M, w\rang \in A_{TM}$ to
        • construct $\lang M’\rang \in E_{TM}$ or
        • construct $\lang M’\rang \notin E_{TM}$ (easier in this case)
      • map $\lang M, w\rang \notin A_{TM}$ to
        • construct $\lang M’\rang \notin E_{TM}$ or
        • construct $\lang M’\rang \in E_{TM}$ (easier in this case)

Tips on Diagonalization for Undecidability

It will only work nicely on some problems, and the idea is just a few steps:

  1. Assume that a decider exists. Then we can get a table.
  2. Formalize the input/output in to a table
  3. Flip the diagonal, and verify this is not in the table
  4. prove that what you produced is constructable/exists

For example:

  • assume that such an oracle exists, then we have a decider and a table

  • To prove $A_{TM}$ is not recognizable, we first constructed a table

    • where instead of putting $w$, we used $\lang M\rang$
  • Then we flipped it, and verify it is not in the table


  • We show that the new program exists, by constructing it


Tips on Constructing Mapping Reduction

Think in the reverse. If we have $L_1 \le_M L_2$:

  1. We need to construct a $f$ that:
    • $input \in L_1$ $\implies$ $output \in L_2$
    • $input \notin L_1$ $\implies$ $output \notin L_2$


List of Known TM Languages

So far, we have encountered the following:

TM recognizable

  • $A_{TM} = { \langle M,w\rangle \vert \text{$M$ is a TM, and $M$ accepts $w$} }$
  • $NE_{TM}$

TM decidable

  • $A_{DFA}$
  • $A_{NFA}$
  • $E_{DFA}$
  • $EQ_{DFA}$
  • $A_{CFG}$
  • $A_{regular}$

TM not recognizable

  • $\overline{A_{TM}}$
  • $E_{TM}$
  • $ALL_{TM}$
  • $\overline{ALL_{TM}}$

TM not decidable

  • $A_{TM}$
  • $Halt_{TM} = {\lang M,w\rang\vert M \text{ halts on input } w }$
  • $E_{TM}$
  • $EQ_{TM}$
  • $REG_{TM}$
  • $EQ_{CFG}$
  • Languages defined by a non-trivial property of the language recognized by a $TM$