COMS3261 CS Theory
 Logistics
 Week 1  Introduction
 Week 2
 Week 3
 Week 4
 Week 5
 Week 6
 Week 7
 Week 8
 Week 9
 Week 10
 Week 11
 Week 12
 Week 13
 Week 14
 Tips
 Tips on Proving Regularity with DFA/NFA
 Tips on Proving If and Only If
 Tips on Converting Regex to NFA
 Tips on Writing Regular Expressions
 Tips on Proving Regular Expressions
 Tips on Constructing Context-Free Grammar
 Tips on Constructing TM Reduction
 Tips on Diagonalization for Undecidability
 Tips on Constructing Mapping Reduction
 Appendix
Logistics
 Quizzes  10%
 frequent
 Homework  20%
 challenging
 Exam  70%
 four expected (split across midterm and final)
 Office Hours:
 Professor Tal, Thurs 8:40-9:50am; 3:30-4:30pm
 Webpage:
 http://www.cs.columbia.edu/~tal/3261/fall20/
Week 1  Introduction
Related Reading:
Definitions
 an Alphabet $\Sigma$
 a nonempty, finite set of symbols
 e.g.
 $\{0,1\}$, the ASCII table, …
 a String over alphabet $\Sigma$
 a finite sequence of symbols from $\Sigma$
 Length of string $s$, $\vert s\vert$
 number of symbols in a string
 e.g.
 $\vert 1001\vert =4$
 empty string $\epsilon$
 string of length 0
 you will see that $\emptyset^{*}=\{\epsilon\}$ and $\emptyset^{0} = \{\epsilon\}$ (see the Kleene Star definition in Week 3)
 concatenation is denoted with $\circ$
 $\Sigma^k$ denotes the set of all strings over $\Sigma$ of length $k$
 e.g., let $\Sigma = \{0,1\}$, $k=2$
 then you get $\Sigma^2 = \{00,01,10,11\}$
 e.g., let $k=0$,
 then you get $\Sigma^0 = \{\epsilon\}$
 $\Sigma^*$ denotes the set of all strings over $\Sigma$, of any length from $0 \to \infty$
 therefore, it is an infinite set
 a language is a subset of $\Sigma^*$
 basically some set of strings
Usages
 a language $L$ can be thought of as a yes/no problem solver:
 e.g.
 given input string $s$, does $s \in L$?
 e.g.
This means that you can think of a language $L$ as a (yes/no) function $f$ such that:
\[f: x \in \Sigma^* \to \{1,0\}\]where:
 $f(x\in \Sigma^*)=1$ if $x \in L$
 $f(x\in \Sigma^*)=0$ if $x \notin L$
Notice:
 You might think that a yes/no function would be quite limited in its capacity, but in a later part of the course, you will see that it is actually quite capable.
Deterministic Finite Automata (DFA)
For example, a DFA might look like
where:
 circles represent states
 so in this case, we have four states
 a start state always exists, with an additional arrow
 an accepting state is denoted by a double circle (the bottom right)
 an accepting state is the only kind of state where the output would be true/yes (see sample usage below)
 though it is called an accepting state, it does not have to be a sink/the computation does not have to stay there once it arrives
 arrows represent transitions
 a transition is uniquely defined/denoted by the specific state it comes from + a symbol of the alphabet (in this case, $1$ and $0$)
 for every state, there has to be a transition for every symbol in the alphabet
 again, in this case the alphabet would be $\{0,1\}$
A sample usage would be:
 Consider the string $101$:
By putting the string into the above DFA, you get:
where:
 you end up in a normal (non-accepting) state, which means the output of $101$ on this DFA would be a false/no
 the accepting state is the only state where you could reach a true/yes
DFA Def:
A DFA $M$ is a 5-tuple
\[M=(Q,\Sigma, \delta, q_0, F)\]where:
 $Q$ is a finite set of states
 $\Sigma$ is a finite alphabet
 $\delta$ is the transition function $Q \times \Sigma \to Q$
 takes a state and a symbol to another state
 $q_0 \in Q$ is called the start state
 $F \subseteq Q$ is called the accepting states
For example, we can identify the previous diagram with:

$Q = \{q_0, q_1, q_2, q_3\}$
 we see that there are four states (including the accepting state) in total

$\Sigma = \{0,1\}$

$\delta$ is a function, but here it can be represented by a transition table:

$q_0$ is the start state at top left

$F=\{q_3\}$ is the accepting state at bottom right
Computation Def:
If there is a string $w=w_1 w_2 \cdots w_n$ where $w_i \in \Sigma$, then a DFA $M$ accepts $w$ iff
\[\exists \, r_0,r_1,r_2,...,r_n \in Q\]such that
 $r_0 = q_0$
 $r_i$ you can see as the $i$th computation/step you had
 $\delta(r_{i-1},w_i)=r_i$, for $\forall \, i\in \{1,2,3,…,n\}$
 the $i-1$ comes from the notion that you take the transition from $r_{i-1}$ with $w_i$.
 this sequence of applications of $\delta$ is the computation on $w$
 $r_n \in F$
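The acceptance definition above can be sketched as a short simulator. This is a hypothetical sketch: I am assuming the diagram is the usual "contains substring $111$" DFA (which matches the later discussion of it), and the state names and transition table here are my own assumptions:

```python
# Hypothetical DFA assumed to match the diagram: accepts strings containing "111".
# q0..q2 track the length of the current run of 1s; q3 is reached once 111 is seen.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q2",
    ("q2", "0"): "q0", ("q2", "1"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q3",  # q3 happens to be a sink here
}
start, accepting = "q0", {"q3"}

def dfa_accepts(w):
    """Run the computation r_0, r_1, ..., r_n and check whether r_n is in F."""
    r = start                      # r_0 = q_0
    for symbol in w:
        r = delta[(r, symbol)]     # r_i = delta(r_{i-1}, w_i)
    return r in accepting          # r_n in F
```

For instance, `dfa_accepts("101")` is false, matching the sample run above.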
Language of a DFA Def:
for a DFA $M$, we define:
\[L(M)=\{ w \in \Sigma^* \,\vert\, M \,\,\text{accepts} \,\,w \}\]to be the language of the DFA $M$, or the language that $M$ computes/recognizes.
For example, here is a DFA that accepts all strings, i.e. the language it recognizes is the set of all strings:
another DFA that accepts all strings would be:
Regular Language
Regular Language Def
A language $L$ is regular if:
\[\exists \, M\,\,s.t.\,\, L=L(M)\]this says that a language is regular if there exists a DFA that recognizes it. This means being “regular” is just a property of a language.
Now, the question becomes:
 Which languages are regular? How could you prove whether a language is regular or not?
For example:

Question:
What is the language of this DFA?

Solution:
Realize that whenever you have $111$ in a row, you will get to the accepting state.
 e.g. $0011101$, $01110$, etc.
Propose:
\[L(M)=\{w\in\{0,1\}^*\,\vert\, w\,\,\text{contains substring}\,\,111\}\]Prove by induction on the length of $w$, $\vert w\vert$
The general procedure would be:
 Trying manually some strings that will get accepted
 Try finding a pattern of the above and propose
 Prove the proposition
However, the hard part is the below:
For example:

Question
Is the language $L(M)=\{w\in\{0,1\}^*\,\vert \,w\,\,\text{has an even number of}\,\,0\text{s}\}$ regular?

Solution:
First, examples such as $0110$ and $\epsilon$ are in the language, so if we find a DFA, it has to accept those.
Second, you need to come up with a finite-memory algorithm to do the job.
 For instance, the algorithm (bad for DFA) could be:
 go through the string, count number of $0$
 if even, then accept. else reject
 however, a problem occurs: since a DFA has a fixed number of states, you cannot simply use this algorithm, as the count could require an unbounded number of states
A better algorithm would be:
 start with the start state being accepting
 once you read a $0$, switch to the normal (non-accepting) state
 once you read another $0$, switch back
 notice that this algorithm does not care about how many zeros you have seen, but just whether the count so far is even or odd. This makes it fit in finite memory.
Then convert the algorithm to a DFA; the corresponding DFA looks like:
Note:
 the general difference/problem with a DFA as compared to an arbitrary algorithm is that:
 it has a fixed finite memory
 as it has a fixed amount of states
 its input is read only once
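The even/odd algorithm above can be sketched directly in code; this is a minimal sketch, with the state names being my own choice:

```python
# Two-state DFA for "even number of 0s": the state is the parity of 0s seen so far.
def even_zeros(w):
    state = "even"            # start state, which is also accepting
    for symbol in w:
        if symbol == "0":     # only a 0 flips the parity; a 1 leaves the state unchanged
            state = "odd" if state == "even" else "even"
    return state == "even"    # accept iff we end in the accepting state
```

Note how the function stores only one of two states, never the actual count, which is exactly what makes it realizable as a DFA.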
The last example for thinking is:

Question:
Is the following language regular?
where:
 $xy$ means concatenating a string $x$ and a string $y$.

My Solution:
The problem with this statement is that we cannot define a way to parse the string, deciding where to break it.
 to solve this, you can try to find another equivalent way to state this problem
 if the first part satisfied the condition, then the second part fails iff it has an odd number of 0 left over and ended with $01^n$, where $n$ is also odd
 ?
Using a NFA
 see the section NFA
 we can change the question into: given any first part of the string which contains even 0, check if the second part contains even 1

Solution:
Using a DFA
 a
Using NFA
 Professor had the same solution
Theorems
Theorem:
 If $L$ is finite, then $L$ is regular
Proof Sketch:
 Can construct a DFA with a state for any possible string of length $\le$ maxlength of any word in $L$, and mark appropriate states as accepting.
In fact, the converse is false.
Theorem:
 If $L$ is regular, $L$ can be infinite.
Remember the example of the DFA for strings with an even number of $0$’s: the language has an infinite number of elements, yet a DFA can be built to decide whether an input belongs to the language or not.
More Practice Questions
My Solution:

(This is actually simple. Since $L$ is finite, then we can just construct a state for every word in the dictionary)
Week 2
Nondeterministic Finite Automata (NFA)
An example of an NFA looks like
where, as compared to a DFA, we see:
 not every symbol of the alphabet needs a corresponding transition at every state
 therefore, if there is no matching transition for the next symbol at a state, that branch of the computation “dies”/“disappears”
 there is always the option of having an $\epsilon$ transition, which makes it possible to spontaneously jump to another state without consuming a symbol
 though this is not included in the example above
 the $1$ in the first state has two possible transitions
 therefore, you need to consider all the possible routes
For example, for the word $1010010$:
where:
 the string is accepted iff at least one “token” finishes the input and lands in an accepting state
Another way to represent the process of traversing an NFA looks like:
NFA Definition:
An NFA is a 5-tuple
\[N=(Q,\Sigma, \delta, q_0, F)\]where:
 $Q$ is a finite set of states
 $\Sigma$ is a finite alphabet
 $\delta$ is the transition function $Q \times (\Sigma \cup \{\epsilon\}) = Q \times \Sigma_{\epsilon} \to P(Q)$
 $\Sigma_{\epsilon}=\Sigma \cup \{\epsilon\}$
 $P(Q)$ is the power set/the set containing all subsets of $Q$
 the output is a set because you can have more than one transition for the same symbol (hence landing in more than one state)
 $q_0 \in Q$ is called the start state
 $F \subseteq Q$ is called the accepting states
Note:
A power set is basically a set containing all subsets:
Definition:
 An NFA $N$ accepts a string $w$ if $w$ can be written as $w=w_1 \circ w_2 \circ \cdots \circ w_n$ where $w_i \in \Sigma_{\epsilon}$ and there is a sequence of states $r_0, r_1, r_2,…,r_n \in Q$ such that:
 $r_0 = q_0$
 for all $i=1,2,3,…,n$, $r_i \in \delta(r_{i-1}, w_i)$
 the DFA version is $\delta(r_{i-1},w_i)=r_i$, and the difference is again that you can have multiple transitions for a single symbol
 $r_n \in F$
Interestingly, we used $\circ$ explicitly in the NFA definition, $w=w_1 \circ w_2 \circ \cdots \circ w_n$, instead of $w=w_1, w_2, …w_n$, because technically $\epsilon$ is not a symbol of the alphabet, so it is not legal to have:
 $100\epsilon1$
but legal to have:
 $1\circ 0\circ 0\circ \epsilon\circ 1=1001$
Definition
For a NFA $N$, the language recognized by $N$ is
\[L(N) =\{ w\in \Sigma^*\,\vert\, N \,\,\text{accepts} \,\,w \}\]
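The “token” picture above can be simulated by tracking the set of currently alive states, taking $\epsilon$-closures as we go. A sketch, where the example NFA (strings over $\{0,1\}$ ending in $01$) is made up for illustration:

```python
def eps_closure(states, delta):
    """All states reachable from `states` via epsilon transitions (keyed by symbol "")."""
    stack, closed = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closed:
                closed.add(r)
                stack.append(r)
    return closed

def nfa_accepts(w, delta, start, accepting):
    current = eps_closure({start}, delta)        # tokens alive before reading input
    for symbol in w:
        moved = set()
        for q in current:                        # each token follows every matching arrow,
            moved |= delta.get((q, symbol), set())  # or dies if there is none
        current = eps_closure(moved, delta)
    return bool(current & accepting)             # >= 1 token in an accepting state

# Hypothetical example NFA: accepts strings ending in 01.
delta = {
    ("s", ""): {"q0"},              # an epsilon transition from a separate start state
    ("q0", "0"): {"q0", "q1"},      # nondeterministic guess: is this 0 part of the final 01?
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
```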
Regular Language
Theorem:
 $L$ is a regular language iff $\exists$ an NFA that recognizes $L$.
 notice the analogous statement with a DFA is the same (it is the definition of regularity)
Simple Proof:
 If $L$ is regular, it is recognized by a DFA. Since a DFA is a special case of an NFA, this direction of the theorem holds.
 since an NFA is just a DFA with more optional power.
Difficult Proof
 Here we prove the reverse direction: an NFA can also be converted to an equivalent DFA.
 Let
\[N=(Q_N,\Sigma_N, \delta_N, q_{0_N}, F_N)\]
We need to show that for every set of tokens/intermediate states the NFA can be in, there exists a state corresponding to it in the DFA.
One approach is to construct a DFA from $N$ as follows.
Since every set of states that is reached by the same substring in the NFA must be a state in the DFA:
\[Q_D = P(Q_N)=\{S\,\vert\,S \subseteq Q_N\}\] and
\[q_{0_D}= \{q_{0_N}\}\cup\{q\in Q_N\,\vert\,q\,\,\text{is reachable via}\,\,\epsilon\,\,\text{from}\,\,q_{0_N} \}\]\[F_D=\{S\subseteq Q_N\,\vert\,S \cap F_N \neq \emptyset\}\]
 where:
 this means $q_{0_D}$ corresponds to a set of states in $Q_N$
and
\[\delta_D(S,a)= S_1 \cup \{q\,\vert\,q\,\,\text{is reachable via some}\,\,\epsilon\,\,\text{transition from some}\,\,q^{\prime}\,\,\text{in}\,\,S_1\}\]
 where
 $F_D$ does not have to be same as as $F_N$
 where:
 $\delta_D(S,a)$ means transition from a state $S$ (which contains a set of states) with a symbol $a$
 $S_1=\{q^{\prime} \in Q_N\, \vert \, q^{\prime}\in\delta_N(q,a)\,\,\text{for some}\,\,q\in S\}$
 basically the states reachable in the NFA, from the states in $S$, on the same symbol $a$
 You basically get a set of states to be transitioned to, and they together forms a single state in DFA
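The construction above can be sketched directly: a breadth-first search over subsets of $Q_N$, computing $\delta_D$ on the fly. The example NFA at the bottom (strings ending in $01$) is my own, not the one from the lecture:

```python
from collections import deque

def eps_closure(states, delta):
    """All NFA states reachable from `states` via epsilon transitions (symbol "")."""
    stack, closed = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closed:
                closed.add(r)
                stack.append(r)
    return closed

def nfa_to_dfa(delta, start, accepting, alphabet):
    """Subset construction: each reachable DFA state is a frozenset of NFA states."""
    q0 = frozenset(eps_closure({start}, delta))       # q_{0_D}
    dfa_delta, dfa_accepting = {}, set()
    queue, seen = deque([q0]), {q0}
    while queue:
        S = queue.popleft()
        if S & accepting:                             # F_D: subsets meeting F_N
            dfa_accepting.add(S)
        for a in alphabet:
            moved = set()
            for q in S:                               # S_1 for this (S, a)
                moved |= delta.get((q, a), set())
            T = frozenset(eps_closure(moved, delta))  # add epsilon-reachable states
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return dfa_delta, q0, dfa_accepting

def dfa_accepts(w, dfa_delta, q0, dfa_accepting):
    S = q0
    for a in w:
        S = dfa_delta[(S, a)]
    return S in dfa_accepting

# Hypothetical example NFA (strings ending in 01), converted:
example_delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
dfa_delta, dfa_q0, dfa_F = nfa_to_dfa(example_delta, "q0", {"q2"}, "01")
```

Only the reachable subsets are generated here, which is why the result is often smaller than the full $P(Q_N)$.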
For example:
 If we need to convert the following NFA to a DFA

we first construct the following result
where
 accepting states are just states that include state $q_2$, which is an accepting state in the NFA
 the size of the DFA becomes bigger than the NFA (though this is not the only possible construction)
DFA and NFA
Above we have seen that you can convert between DFA and NFA, and notice the size changed.
Theorem:
 $\forall n \in \mathbb{N}$, there is a language $L_n$ such that there is an $(n+1)$-state NFA recognizing $L_n$, but every DFA recognizing $L_n$ has at least $2^n$ states.
 the variable $n$ is arbitrary; the point is to show the exponential blow-up
 obviously, there are also other languages for which the DFA does not blow up
Proof (Sketch):

Define $L_n = \{w \in \{0,1\}^* \,\vert \,\text{the}\,\,n\text{th symbol from the end is}\,\,0\}$
Then:
 Claim: $L_n$ has an NFA with $n+1$ states
 Proof: (try it yourself)
And:

Claim: Any DFA for $L_n$ has at least $2^n$ states

Proof: Assume $\exists$ a DFA for $L_n$ with fewer than $2^n$ states. By the pigeonhole principle (p.h.p.), there $\exists$ two different $n$-bit strings that end in the same state.
(continue by yourself)
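For a small $n$, the claimed $(n+1)$-state NFA can be checked exhaustively. The machine below is my guess at the intended construction (it nondeterministically guesses which $0$ is the $n$th symbol from the end), verified by brute force against the definition of $L_n$:

```python
from itertools import product

n = 3
# States 0..n (n+1 states): loop in state 0 on any symbol; on a 0, also guess
# that it is the nth symbol from the end and move to state 1; then advance one
# state per remaining symbol. State n is the only accepting state.
delta = {(0, "0"): {0, 1}, (0, "1"): {0}}
for i in range(1, n):
    delta[(i, "0")] = delta[(i, "1")] = {i + 1}

def nfa_accepts(w):
    current = {0}
    for symbol in w:
        nxt = set()
        for q in current:
            nxt |= delta.get((q, symbol), set())   # branches with no transition die
        current = nxt
    return n in current

# Exhaustively compare against the definition of L_n for all strings up to length 7.
ok = all(
    nfa_accepts(w) == (len(w) >= n and w[-n] == "0")
    for length in range(8)
    for w in ("".join(t) for t in product("01", repeat=length))
)
```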
Week 3
Operations on Languages and Closure Properties
Though a language is basically just a set, there are additional operations defined on languages.
This section here is to show that a class of regular languages is closed under a variety of specific operations.
 if you take a regular language and apply one of these operations, the result is still a member of the class (i.e. still regular).
Operation Definitions
Assume all the languages below are over some alphabet $\Sigma$

Complement

$\bar{L_1}=\{w \in \Sigma^*\,\vert\, w\notin L_1\}$

Union
 $L_1 \cup L_2 =\{ w \in \Sigma^* \,\vert\, w\in L_1 \,\,\text{or}\,\, w\in L_2 \}$

Intersection
 $L_1 \cap L_2 =\{ w \in \Sigma^* \,\vert\, w\in L_1 \,\,\text{and}\,\, w\in L_2 \}$

Concatenation
 $L_1 \circ L_2=\{ w \in \Sigma^* \,\vert\, w=xy\,\,\text{s.t.}\,\,x\in L_1\,\,\text{and}\,\,y\in L_2 \}$
 order matters, only $xy$ is allowed
 like a Cartesian product, but with concatenation
Theorem/Facts:
For any $L_1,L_2$:
\[L_2 \subseteq L_1 \circ L_2 \iff \epsilon \in L_1\,\,\text{or}\,\, L_2 = \emptyset\]

Power

\[\begin{align*}
L^0&=\{\epsilon\}\\
L^{i+1}&=L^i\circ L\,\,for\,all\,i\ge0
\end{align*}\]
basically a recursive definition: concatenating the language with itself $i$ times


Kleene Star
 \[L^*=\bigcup_{k=0}^{\infty}L^k = \{\epsilon\}\cup L^{1}\cup L^{2}\cup L^{3}\cup...\]
 notice that this means $\emptyset^* = \{\epsilon\}$, since $\emptyset^{0} = \{\epsilon\}$
There are also other operations, but those are the ones we are interested in.
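For finite languages, these operations can be computed literally as set operations; a minimal sketch (the length bound in `star_up_to` is my own device, since $L^*$ is infinite in general):

```python
def concat(L1, L2):
    """L1 concatenated with L2: { x + y : x in L1, y in L2 }."""
    return {x + y for x in L1 for y in L2}

def power(L, i):
    """L^0 = {epsilon}; L^(i+1) = L^i concatenated with L."""
    result = {""}                  # "" plays the role of epsilon
    for _ in range(i):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """All strings of L* up to length max_len (a finite slice of the Kleene star)."""
    result, frontier = {""}, {""}
    while frontier:
        # extend by one more factor of L, keeping only strings short enough and new
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result
```

Note that `star_up_to(set(), k)` returns `{""}`, matching the fact that $\emptyset^* = \{\epsilon\}$.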
Closed Property
We need to show that the class of regular languages is closed under all the above operations.
 in other words, any language produced by applying these operations to regular languages is still regular.
Proof for Complement Operation
suppose $L$ is regular, and let $M=(Q,\Sigma,\delta, q_0, F)$ be a DFA for $L$.
we will construct a new DFA $M’$ recognizing $\bar{L}$ such that:
 $M’=(Q,\Sigma,\delta,q_0,Q \setminus F)$
and we need show that it works:
 $x\in \bar{L}$ iff $M’ \,\,\text{accepts}\,\,x$
 trivially show that $x \in \bar{L}$ implies $M’$ accepts $x$
 trivially show that if $x\notin \bar{L}$ then $M’$ rejects $x$
This completes the Proof that complement is closed.
Note:
 If you use an NFA for the same construction above, it will not work, because an input string can be at multiple states at once; after flipping the non-accepting states to accepting and the accepting states to non-accepting, a string that had both an accepting and a non-accepting token would be accepted by both machines (since at least one token being accepted suffices)
 However, closure under complement obviously still holds when $L$ is given by an NFA; it’s just that we need another construction (e.g. first convert the NFA to a DFA).
Proof for Union Operation
 suppose $L_1$ and $L_2$ are regular, and we use NFA here, which is easier
 we have $N_1=(Q_1,\Sigma_1, \delta_{1}, q_{0_1}, F_1)$ and $N_2=(Q_2,\Sigma_2, \delta_{2}, q_{0_2}, F_2)$
 this is easy, as we just need a new start state $q_0$ with two $\epsilon$ transitions to the start states of $N_1$ and $N_2$.
Then we have the new $N$ to be defined as:
\[N=(Q_1 \cup Q_2 \cup \{q_0\}, \Sigma_1 \cup \Sigma_2, \delta, q_0, F_1 \cup F_2)\]where:
we redefined some transitions $\delta$ to be
And then we need to show (argue that it works):
\[w\in L_1 \cup L_2 \iff N\,\,accepts\,\, w\]which is quite easy:
Alternative Proof for Union Operation
Instead of using NFA, we use a DFA for the proof. The idea would be to somehow to run a string in “parallel” in the two DFAs:
\[D_1=(Q_1,\Sigma, \delta_1, q_1, F_1);\,D_2=(Q_2,\Sigma, \delta_2, q_2, F_2)\]and we construct a new DFA $D$:
\[D=(Q,\Sigma, \delta, q_0, F)\]where:
 $Q=Q_1 \times Q_2=\{(r_1,r_2)\,\vert \, r_1\in Q_1,\, r_2 \in Q_2\}$
 so basically you can run the two DFAs “in parallel”
 $q_0=(q_1,q_2)$
 $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,\,\text{or}\,\,r_2\in F_2\}=(F_1\times Q_2) \cup (Q_1 \times F_2)$
 $\delta((r_1,r_2),a)=(\delta_1(r_1,a), \delta_2(r_2,a))$
With the above proof, we can easily prove the intersection operation
 the only difference is defining $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,\,\text{and}\,\,r_2\in F_2\}=F_1\times F_2$
DFA Proof for Intersection Operation
The idea would be the same: to run a string in “parallel” in the two DFAs:
\[D_1=(Q_1,\Sigma, \delta_1, q_1, F_1);\,D_2=(Q_2,\Sigma, \delta_2, q_2, F_2)\]and we construct a new DFA $D$:
\[D=(Q,\Sigma, \delta, q_0, F)\]where:
 $Q=Q_1 \times Q_2=\{(r_1,r_2)\,\vert \, r_1\in Q_1,\, r_2 \in Q_2\}$
 so basically you can run the two DFAs “in parallel”
 $q_0=(q_1,q_2)$
 $F=\{(r_1,r_2)\,\vert \,r_1\in F_1 \,\,\text{and}\,\,r_2\in F_2\}=F_1\times F_2$
 $\delta((r_1,r_2),a)=(\delta_1(r_1,a), \delta_2(r_2,a))$
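The product construction above can be sketched concretely. The two component DFAs here (even number of $0$s, even number of $1$s) are my own examples, not the lecture's:

```python
# Component DFAs: the state is a parity; only the relevant symbol flips it.
d1 = {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"}  # even # of 0s
d2 = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}  # even # of 1s

def product_accepts(w, mode):
    """Run both DFAs in parallel on w; combine their verdicts per `mode`."""
    r1, r2 = "e", "e"                                   # q_0 = (q_1, q_2)
    for symbol in w:
        r1, r2 = d1[(r1, symbol)], d2[(r2, symbol)]     # delta((r1, r2), a)
    a1, a2 = (r1 == "e"), (r2 == "e")
    return (a1 or a2) if mode == "union" else (a1 and a2)
```

The only difference between union and intersection is the final combination of the two accepting conditions, mirroring the two definitions of $F$ above.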
Alternative Proof for Intersection Operation
This basically uses De Morgan’s Law:
\[L_1 \cap L_2 = \overline{(\overline{L}_1 \cup \overline{L}_2)}\]
 if $L_1$ and $L_2$ are regular, then we know $\bar{L_1}$ and $\bar{L_2}$ are also regular (closure under complement). This means that $\bar{L_1}\cup \bar{L_2}$ is still regular (closure under union), and therefore so is its complement.
Hence $L_1 \cap L_2$ is still regular.
NFA Proof for Concatenation Operation
 Basically, we can imagine starting with $N_1$ (or in fact, $D_1$ with a DFA would work as well), and adding an $\epsilon$ transition from every accepting state in $N_1$ to the starting state of $N_2$ (again, this could be $D_2$); the accepting states of the result are those of $N_2$ only.
 TODO: the formal proof
Notice:
 For these kinds of closure proofs, we generally find it easiest to start with strict DFAs and construct an NFA from them; the NFA is looser in terms of transitions/states, and it still works, since an NFA proves a language regular.
Lastly:
Proof for Kleene Star Operation
Let $N$ be an NFA recognizing $L$, then we should be able to construct an NFA recognizing $L^*$ (hence making $L^*$ still regular, which is what we meant by closed)
 Since now we need to deal with accepted strings concatenated with each other (i.e. $L^k$), we just need to add an $\epsilon$ transition from every accepting state back to the initial state (hence allowing concatenation)
 However, the above does not take into account the remaining case $L^0=\{\epsilon\}$, which can be properly fixed by adding a new accepting start state with an $\epsilon$ transition to the old start state:
 TODO: Properly prove it
Note:
 If, instead of adding a new state with only an $\epsilon$ transition, you made the original starting state accepting, this would cause a problem: if there is a loop that goes through the starting state, the machine would accept the corresponding string as well, even though that string might not be in the language $L^*$. Therefore, that approach would be wrong.
Regular Operations and Expressions
We define regular operations being:
 union $\cup$
 concatenation $\circ$
 Kleene star $*$
This means that a regular expression is defined as:
Regular Expression Definition
 An expression is a regular expression $R$ over an alphabet $\Sigma$ if:
 $R=a$ for $a\in \Sigma$
 $R=\epsilon$
 $R=\emptyset$
 $R=(R_1 \cup R_2)$
 $R=(R_1 \circ R_2)$
 $R=(R_1)^*$
 for $R_1$ and $R_2$ being regular expressions
 also notice that the definitions are recursive
In other words, a regular expression is a string together with additional symbols allowed:
\[\Sigma \cup \{\cup, *,\circ,(,)\}\]
And regular expressions are special strings, because:
 $L(a)=\{a\}$ for $a\in \Sigma$
 $L(\epsilon)=\{\epsilon\}$
 $L(\emptyset)=\{\}$
 interestingly, this means $L(R\circ \emptyset)=L(\emptyset\circ R)=\emptyset$
 $L(R_1 \cup R_2)=L(R_1)\cup L(R_2)$
 $L(R_1 \circ R_2)=L(R_1)\circ L(R_2)$
 $L(R^*)=L(R)^*$
Therefore, besides constructing an NFA/DFA, you can also show a language to be regular by giving a regular expression for it.
Examples
Consider regular expressions over $\Sigma=\{a,b\}$

Question
then think about the language of the following regular expressions:
 $(a\cup bb)^*$
 $(abb \cup bbb)\circ (baa \cup b)$
 $aa^*$
 $(a \cup b)^*a(a\cup b)(a\cup b)(a\cup b)$

Solution
 the language of strings in which every contiguous block of $b$s has even length
 $\{abbbaa,abbb,bbbbaa,bbbb\}$
 the language $\{a^i \,\vert\, i\ge 1\}$
 sometimes also referred to as $a^+$
 the language of strings whose 4th symbol from the end is $a$
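The answers above can be sanity-checked with Python's `re` module, translating $\cup$ to `|`, dropping $\circ$ (juxtaposition), and using `fullmatch` so the whole string must match (this is just a checking aid, not part of the course material):

```python
import re

r1 = re.compile(r"(a|bb)*")                  # (a ∪ bb)*
r2 = re.compile(r"(abb|bbb)(baa|b)")         # (abb ∪ bbb)(baa ∪ b)
r3 = re.compile(r"aa*")                      # aa*
r4 = re.compile(r"(a|b)*a(a|b)(a|b)(a|b)")   # (a ∪ b)* a (a ∪ b)(a ∪ b)(a ∪ b)

def in_lang(r, w):
    # fullmatch (not match/search) so that the regex must describe the entire string
    return r.fullmatch(w) is not None
```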
Regular Language
Theorem:
 A language $L$ can be generated by a Regular Expression if and only if that language is regular.
Proof:
In the first direction, we need to show that for any regular expression $R$, $L(R)$ is regular (recognized by an NFA).
Usually those proofs use induction:
Base Cases:
Inductive Step
where we use the closure properties, which are already proven.
In the other direction, we show that if $L$ is regular/has an NFA, then $L=L(R)$ for some regular expression $R$.
So for any NFA $N$, there is a regular expression $R$ such that $L(N)=L(R)$ (basically the whole language can be expressed as a single regular expression)
 Overview:
where a GNFA is defined in the section Generalized NFA.
First, we know that a string $w$ is accepted by a GNFA if there $\exists$ a path from the start state to the accept state s.t. the concatenation of the regular expressions along the path gives a regular expression $R$ such that $w \in L(R)$.
Next, we construct a GNFA with one fewer state:
choose a state $q_{rip}$ to remove (obviously not the start/accept state)
then, for every other pair of states $q_i,q_j$ ($q_i,q_j \neq q_{rip}$), do the following:
where:
 it is all about recursively collapsing any two-step path through $q_{rip}$ into a direct transition:
 $q_i \to q_{rip} \to q_j \Rightarrow q_i \to q_j$
 $q_j \to q_{rip} \to q_i \Rightarrow q_j \to q_i$
 $q_i \to q_{rip} \to q_i \Rightarrow q_i \to q_i$
 $q_j \to q_{rip} \to q_j \Rightarrow q_j \to q_j$
 basically compute, as a regular expression, all the possible paths going from $q_i$ to $q_j$ (possibly through $q_{rip}$)
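The bullets above correspond to the standard GNFA update rule; the notes omit the formula, so I am stating the textbook version here as an assumption. For each pair $q_i, q_j$, the new transition label is

\[R^{\prime}(q_i, q_j) = R(q_i, q_j)\,\cup\, \big(R(q_i, q_{rip})\circ (R(q_{rip}, q_{rip}))^{*}\circ R(q_{rip}, q_j)\big)\]

where $R(q, q^{\prime})$ denotes the regular expression labeling the transition from $q$ to $q^{\prime}$ before the rip: either go directly, or enter $q_{rip}$, loop on it any number of times, and leave.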
Now, we repeat this recursively, until the GNFA has only the start state and the accept state left.
Then the single remaining transition label is a regular expression $R$ with $L(R)=$ the language recognized by the original NFA.
For example:

Question
Consider the language that has strings with even number of $0$, and turn it into a regular expression using the GNFA procedure.

Solution
First, we build the GNFA with
where
 technically we also need to add all the other transitions between any two states, labelled with $\emptyset$; but since they are $\emptyset$ transitions, they do not affect the computation below.
Now, we rip $q_1$:
Lastly, we rip $q_0$:
and the regular expression is $(1 \cup 01^*0)^*$

for example, this works:
Generalized NFA
It is a special case of a NFA, such that:
 it has exactly one start state, and one accept state
 transitions are labeled with regular expressions
 no incoming transitions into the start state, and no outgoing transitions from the accept state
 all transitions between states are present (similar to the requirement for a DFA)
Transform NFA to GNFA
Given an NFA, we can transform it into an equivalent GNFA:
 add new start state $q_{start}$, with $\epsilon$ transition to the old start state
 add a new accept state $q_{accept}$, add an $\epsilon$ transition from every old accept state to $q_{accept}$, and make $q_{accept}$ the only accept state
 if a transition was labeled by multiple symbols (e.g. $\{a,b\}$), label it with the union of those symbols (e.g. $a \cup b$)
 label any previously non-existing transitions with $\emptyset$
Transform GNFA to Regex
Whenever you rip one state off, you need to recompute, for every affected pair of states, the label of the transition between them.
For example:
Week 4
Up to now, we have:
However, we need to also discuss the case of proving that a language is not regular.
For example:

Question:
Prove that the following language is not regular:
\[L=\{ a^i b^i \,\vert\,i\ge 0\}\]
Solution
First, we need to understand the language. Some examples would be:
\[\{\epsilon, ab, aabb, aaabbb, ...\}\]The intuition would be that you would need an infinite number of states to know if you had the same number of $a$ and $b$.
We do this by contradiction:

Assume $L$ is regular. Let $D$ be a DFA that recognizes $L$. Let $p$ be the number of states in $D$.
Consider the string $a^pb^p \in L$.
Consider the computation of $D$ on $a^pb^p$
\[r_0 \xrightarrow{a} r_1 \xrightarrow{a} r_2 \xrightarrow{a} \cdots \xrightarrow{a} r_p \xrightarrow{b} r_{p+1} \xrightarrow{b} r_{p+2}\xrightarrow{b} \cdots \xrightarrow{b} r_{2p}\]where:
 $r_{2p}$ is an accepting state
 and those states do not have to be unique
Yet, since we said there are only $p$ states, there must be a repetition among the states.

Let $r_k = r_l$ where $k <l \le p$
Then:

$a^pb^p = a^ka^{l-k}a^{p-l}b^p = xyz$
 where $x=a^k,\, y=a^{l-k},\, z=a^{p-l}b^p$
But this means that even if we dropped $y$ from the string, $xz = a^k a^{p-l}b^p = a^{p-(l-k)}b^p$ would still be accepted, even though it is not in $L$.
Therefore, as soon as you have a cycle, other strings not in the language will also get accepted. This contradicts the assumption.

Proving Irregular Language  Part 1
Now, we generalize the above proof.
 First we want to show that all regular languages satisfy a property called the “pumping lemma”
 any long enough string in the language has a cycle that can be pumped
 Use the above to show that $L$ is not regular:
 Assume, towards contradiction, that $L$ is regular; then the pumping lemma holds. Then choose a long enough word, and pump it in a way that produces a string $\notin L$ (using the cycle we made).
Pumping Lemma:
For any regular language $L$, there exists a number $p$ (the “pumping number”), such that for all $w \in L$ with $\vert w\vert \ge p$:
\[\exists \,\,\text{a way to parse}\,\,w=xyz\]such that:
 $\vert xy\vert \le p$
 $\vert y\vert > 0$
 $\forall i \ge 0,\,\,xy^iz\in L$
 the key part
 basically the transitions along the $y$ part form a cycle, which the DFA can traverse any number of times
Now, we need to first show that all regular languages fulfill this lemma.
Proof:
Let $L$ be regular, and let $D$ be some DFA recognizing $L$, and let $p$ be the number of states in $D$.
\[\forall w \in L,\, \vert w \vert \ge p,\,\,\text{write}\,\,w=w_1w_2...w_p...w_m\]Now, consider a computation of $D$ on $w$:
Since there are only $p$ states, but the first $p$ steps already visit $p+1$ states $r_0,r_1,…,r_p$, there must be a repetition/cycle among these states.
Therefore, $\exists i<j\le p,\,\,r_i=r_j$.
 notice it is $\exists$, so we cannot freely choose where that $i$ is, but only know that $i<p$.
Then we have:
Then it follows that we can break down
$w=xyz$
where:
 $x=w_1..w_i$
 $y=w_{i+1}..w_j$
 $z=w_{j+1}..w_m$
As a result, in this construction, we have shown that all the required properties hold:
 $\vert xy\vert =j\le p$
 $\vert y\vert =j-i > 0$
 note that this parsing is determined by where the cycle falls; when using the lemma, you CANNOT choose $y$ yourself.
 $\forall\,i=0,1,2,…,\,\,xy^iz\in L$
Note:
 In general, you cannot use a specific parsing of $y$ in $w=xyz$. To reach a contradiction to the pumping lemma, you need to be able to handle (find a contradiction for) any possible parsing – you cannot choose one specific one.
 When using the pumping lemma, all you can choose/make up is the parameter $w$ and the $i$. All the other parameters have to fall off from your choice of $w$ and $i$.
Therefore, we can use the pumping lemma to prove that some $L$ is not regular:
 Assume towards contradiction $L$ is regular.
 Then there must be $p$ being the pumping number.
 I choose some string $w$ with $w\in L$ and $\vert w\vert \ge p$
 this is the tricky part that you need to figure out yourself
 Then, whenever you parse the string $w$, there will be a cycle due to the lemma with the pumping number.
 Therefore, we know $w=xyz$ where
 $\vert xy\vert \le p$
 $\vert y\vert >0$
 $\exists\, i\,\,\text{s.t.}\,\,xy^iz \notin L$
 Therefore, this contradicts the pumping lemma, and $L$ is not regular.
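For $L=\{a^ib^i\}$, the template above can be checked mechanically: for $w=a^pb^p$, every legal parse $w=xyz$ (with $\vert xy\vert \le p$, $\vert y\vert>0$) pumps out of the language already at $i=2$. A small brute-force sketch (the particular value of $p$ is arbitrary):

```python
p = 7
w = "a" * p + "b" * p

def in_L(s):
    """Membership in { a^i b^i : i >= 0 }."""
    return len(s) % 2 == 0 and s == "a" * (len(s) // 2) + "b" * (len(s) // 2)

# Every parse w = xyz with |xy| <= p and |y| > 0 has y consisting only of a's,
# so pumping with i = 2 yields a^(p+|y|) b^p, which is never in L.
all_parses_fail = all(
    not in_L(w[:i] + 2 * w[i:j] + w[j:])   # x + y*2 + z
    for j in range(1, p + 1)               # |xy| = j <= p
    for i in range(0, j)                   # |x| = i < j, so |y| = j - i > 0
)
```

This is exactly the quantifier structure of the proof: the contradiction must hold for *every* parse, since we cannot choose where the cycle falls.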
Theorem:
If we have a language:
\[L=\{w \in \{ a,b\}^*\,\vert\,w\,\,\text{is a palindrome}\}\]Then $L$ is not regular.
Proof:
 Try 1:
 Assume towards contradiction that it is regular
 Then we choose $w=a^p$, where $\vert w\vert \ge p$, and we want a parse $w=xyz$ such that pumping produces a word not in $L$.
 But it seems all the pumped words are still in $L$. Hence this choice of $w$ does not work.
 Try 2:
 Assume towards contradiction that it is regular
 Then we choose $w=a^pba^p$, where $\vert w\vert \ge p$
 by the lemma, there is a parse $w=xyz$ such that $\vert xy\vert \le p$ and $\vert y\vert >0$.
 this means that $y$ must consist only of $a$s. The $w$ has to be constructed nicely so that this result falls out.
 Then, we can construct a string, for example: $xyyz=a^{p+\vert y\vert }ba^p \notin L$.
 In general $\exists i,\,\,xy^iz=a^{p+i}ba^p \notin L$
 Hence we reach a contradiction. $L$ is not regular.
Another Example:

Question
Show that the language:
\[L = \{1^n\,\vert\,n\,\,\text{is prime}\}\]is not regular.
Solution
where:
 basically, all you can choose is $i$.
Week 5
Proving Irregular Language  Part 2
There are also other methods to prove that a language is not regular.
Using Closure Properties
Let’s start by looking at one example:

Question:
Show that $L=\{w\in\{a,b\}^*\,\vert \,\#a=\#b\}$ (the number of $a$s equals the number of $b$s) is not regular.

Solution:
Basically, we need to use something else already known to be non-regular (one of the languages we have seen before).
Assume $L$ is regular, and take $L_1 = L(a^*b^*)$, which is regular.
Now, since regular languages are closed under $\cap$, consider:
\[L \cap L_1 = \{a^nb^n\,\vert\,n\ge 0\}\]which we have proved to be non-regular.
This contradicts closure under intersection.
Therefore, $L$ must be non-regular.
In general, the template for the proof would be:
 Assume $L$ is regular. Then take another regular language $L_1$ (e.g. given by a regular expression).
 Then the language $L \cap L_1 = L_2$, or $L \circ L_1 = L_2$, or the result of any other closed operation, should still be regular.
 But $L_2$ is known to be irregular.
 Therefore we have a contradiction, since those operations are closed on regular languages. So $L$ must also be irregular.
MyhillNerode Theorem
Definitions:
Let $L \subseteq \Sigma^*$ be some language over $\Sigma$, and $x,y\in \Sigma^*$ be strings. We say $z \in \Sigma^*$ is a distinguishing extension of $x,y$ w.r.t. $L$ if exactly one of $xz$, $yz$ is in $L$ and the other string is $\notin L$.
 notice that $z$ can be $\epsilon$
If such $z$ exists, then we say that $x,y$ are distinguishable by $L$.
If no such string exists, then we say $x,y$ are indistinguishable by $L$. (So they are either both in $L$ or both not.) Denote it by
\[x \sim_L y\]
Basically, $x$ and $y$ are indistinguishable if both of them landed at the same state in a DFA/NFA, so that any ending you add to them will affect both the same way.
 if $x$ and $y$ obviously behave differently (e.g. $x$ is accepted and $y$ is not), then you simply have $z=\epsilon$
Claim:
Then $\sim_L$ is an equivalence relation.
\[[x] = \{y \in \Sigma^*\,\vert\,x\sim_Ly\}\]
 where you basically get the set of strings that are indistinguishable from $x$
This means that, if $L \subseteq \Sigma^*$ is a regular language, and $D$ is a DFA recognizing it, then $x$ and $y$ ending in the same state implies $x \sim_L y$.
 so, contrapositively, if $x$ and $y$ are distinguishable, they must end at different states.
Corollary:
 If $L$ is regular, and $D$ is a DFA for $L$, then the number of equivalence classes induced by $\sim_L$ must be $\le$ the number of states in $D$.
 kind of obvious, because strings in the same equivalence class land on the same state
MyhillNerode Theorem:
 For any $L \subseteq \Sigma^*$: $L$ is regular $\iff$ the number of equivalence classes of $\sim_L$ is finite.
 one direction is immediate, since a regular language must have a DFA with a finite number of states.
 If $L$ is regular, then the number of equivalence classes of $\sim_L$ is exactly the number of states in the minimal DFA recognizing $L$.
Note:
 This means that we can prove how many states the minimized DFA for language has.
 In fact, there is an efficient algorithm, s.t. given a DFA, it minimizes it to be the smallest DFA for $L$.
 This also means that we can prove that a language is not regular by exhibiting infinitely many strings, no two of which are equivalent under $\sim_L$; then there must be infinitely many equivalence classes, hence the language is not regular.
For example:

Question:
Consider the palindrome example:
\[L_{pali} = \{ w\in\{a,b\}^* \mid w=w^R \}\] 
Solution:
where:
 remember that all we need to show is that there exists an infinite number of “palindromes” that are not similar.
 the proof above works because $a^n$ is a palindrome itself
For example

Question:
Consider the language:
\[L=\{a^nb^n \mid n\ge0\}\]is not regular.

Solution:
In general, the template would be:
 Think about two words $w(n),w(n+k)$, and $k\neq 0$
 Find a distinguishing extension that would make exactly one of them in the language and the other not
 so that the two words are not equivalent
 Make $n\to \infty$. You have found an infinite number of equivalence classes, because $w(0), w(1), w(2), w(3), \ldots$ are pairwise non-equivalent.
 we don’t care if $w(n)$ is accepted or not at this point, we just need to show that they are all different from each other (lands in different states)
 Finished: there are an infinite number of equivalence classes, hence an infinite number of states needed.
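The template above can be sanity-checked in code. A minimal Python sketch for $L=\{a^nb^n \mid n\ge0\}$, showing that for $n\neq m$ the extension $z=b^n$ distinguishes $a^n$ from $a^m$ (the helper name `in_L` is mine):

```python
def in_L(s: str) -> bool:
    """Membership test for L = { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

for n in range(6):
    for m in range(6):
        if n == m:
            continue
        x, y, z = "a" * n, "a" * m, "b" * n
        # exactly one of xz, yz is in L, so x and y are distinguishable
        assert in_L(x + z) and not in_L(y + z)
```

So $a^0, a^1, a^2, \ldots$ are pairwise distinguishable, giving infinitely many equivalence classes.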
Other Usage:

Question:
Show that there are exactly $2^n$ equivalent classes (states) in a language:
\[L = \{w \in \{0,1\}^* \mid \text{the $n$th last symbol of $w$ is } 0\}\] 
Solution:
First, we can show that there are at least $2^n$ equivalent classes:
Then, we show that there are at most $2^n$ equivalence classes, so there are exactly $2^n$.
Context Free Language
Basically, all we have discussed so far were the regular languages, but there are wider/richer classes of languages out there:
Like regular languages, you can also define context-free languages in two ways:
 language generated (like regex); this would be the most common for context-free languages
 algorithm generated (like DFA/NFA)
One example of a contextfree language is:
\[\Sigma = \{0,1\};\,\,\\ V=\{S\};\\ \begin{cases} S\to \epsilon\\ S\to 0S1 \end{cases}\]Then, a word in the language would be:
\[S \Rightarrow 0S1 \Rightarrow 00S11 \Rightarrow 0011 \in L(G)\]where:
 you started with a start variable $S$, and then generated to $0S1$, and etc.
And, in general, you have:
\[L(G) = \{ 0^n1^n \mid n\ge0 \}\]and you get a sense that context-free languages are basically production rules:
 strings you can produce from those rules are in the language.
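The production process above can be sketched directly. A tiny Python derivation of $0^n1^n$ from $S\to\epsilon \mid 0S1$ (the function name `derive` is mine):

```python
def derive(n: int) -> str:
    """Apply S -> 0S1 exactly n times, then S -> epsilon."""
    s = "S"
    for _ in range(n):
        s = s.replace("S", "0S1")   # apply the rule S -> 0S1
    return s.replace("S", "")        # apply the rule S -> epsilon

assert derive(2) == "0011"           # S => 0S1 => 00S11 => 0011
```

Each loop iteration corresponds to one $\Rightarrow$ step in the derivation shown above.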
Definitions
Definition of Context Free Grammar:
A contextfree grammar is a tuple $(V, \Sigma, R, S)$,
where
$V$ is a finite set of single variables
 it is called context-free exactly because each rule rewrites a single variable, regardless of the surrounding context
$\Sigma$ is an alphabet (finite)
$R$ is a finite set of rules of the form:
\[A \to w;\,\,A\in V;\,\,w\in(V\cup\Sigma)^*\]so basically, you can have more than one variable, and each rule maps a variable to a string of terminals and/or other variables.
 in general, to generate a nonempty language, you need at least one rule whose right-hand side contains no variables.
$S \in V$ is the start variable
This means the following:
Definitions
If two strings $u,v \in (V \cup \Sigma)^*$, and there is a rule $A \to w \in R$, then we can write:
\[uAv \Rightarrow uwv\]which reads: $uAv$ yields $uwv$.
If two strings $u,v\in (V \cup \Sigma)^*$, we say that $u \Rightarrow^* v$ if for some $u_1, u_2, \ldots,u_n \in (V \cup \Sigma)^*$ we have:
\[u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow ... \Rightarrow v\]where:
 $u_1,u_2,…,u_n$ are just strings that you can reach from its previous string.
This basically means that $u$ generates $v$ if there is some finite number of steps that starts with $u$ and ends in $v$.
 e.g. $S \Rightarrow^* 0011$ in the example above
Notice that:
 when you specify rules, you use $\to$, when you show derivations, you use $\Rightarrow$
And correspondingly:
Context Free Language Def:
 A language $L$ is a contextfree language if there exists a context free grammar $G$ such that $L(G)=L$.
Therefore:
Definition:
We say a context-free grammar $G = (V, \Sigma, R, S)$ generates the language $L(G)$ if:
\[L(G) = \{w \in \Sigma^* \mid S\Rightarrow^* w\}\]
For example:

Question:
\[G=(\{S\},\{a,b\},R,S)\\ R:\begin{cases} S \to \epsilon \\ S \to aSa \mid bSb \end{cases}\] 
Solution:
We see that it is the language of:
\[L(G) = \{w \in \{a,b\}^* \mid w \text{ is an even-length palindrome}\}\]To prove it, we need to show the $\iff$ relationship:
 That every $w$ with $S \Rightarrow^* w$ is in fact an even-length palindrome
 That all even-length palindromes can be derived in $G$.
TODO: look at the proof in the book
For example:

Question:
\[G=(\{E\},\{a,b,+,*,(,)\},R,E)\\ R: \begin{cases} E \to E+E \mid E*E \mid (E) \mid a \mid b \end{cases}\]Derive the string $a+b*a$

Solution:
The solution is trivial:
\[E \Rightarrow E+E \Rightarrow E+E*E \Rightarrow a+E*E\Rightarrow a+b*E \Rightarrow a+b*a\]in fact, its corresponding derivation tree looks like:
or, another way is:
\[E \Rightarrow E*E \Rightarrow E+E*E \Rightarrow a+E*E\Rightarrow a+b*E \Rightarrow a+b*a\]and its derivation tree looks like:
which means that you can interpret as both:
 $a+(b*a)$
 $(a+b)*a$
Ambiguity
In fact, the above two derivations lead to two different types of derivations as well:
Rightmost Derivation
 A derivation where, in each step, the rightmost variable gets replaced.
 the first derivation above is not a rightmost derivation
Leftmost Derivation
 A derivation where, in each step, the leftmost variable gets replaced.
 the second derivation above is in fact a leftmost derivation
 this is also a left most derivation: $E \Rightarrow E+E \Rightarrow a+E\Rightarrow a+E*E \Rightarrow …$
 There exists a 1:1 correspondence between a derivation tree and a leftmost derivation.
Definition
 A context-free grammar is ambiguous if $\exists w\in L(G)$ s.t. the same string $w$ has two different derivation trees (not just two different derivations)
 Equivalently, it is also ambiguous if some string has two or more leftmost derivations (not derivation trees).
 an application: think about how a compiler parses this; there should be only one way to compile/parse it.
 the example above is ambiguous.
An unambiguous context-free grammar would be:
\[G'=(\{E,F,G\},\{a,b,+,*,(,)\},R,E)\\ R:\begin{cases} E \to E+F \mid F\\ F \to F*G \mid G\\ G \to (E) \mid a \mid b \end{cases}\]where:

the strings generated by this $G'$ are the same as those of $G$ defined above, but $G'$ is unambiguous
 in general, it is difficult to prove that any grammar is unambiguous
 an unambiguous grammar might not exist for a specific context-free language
Definition:
 A context free language $L$ is inherently ambiguous, if every context free grammar that generates $L$ is ambiguous.
An example would be:
\[L =\{a^nb^nc^md^m \mid n,m\ge0\} \cup \{a^nb^mc^md^n \mid n,m\ge0\}\]where:
 again, this would be hard to prove, but it is inherently ambiguous.
 in fact, strings of the form $a^nb^nc^nd^n$ always have two derivation trees, no matter what $G$ you have that generates $L$.
Proving the Language of CFG
In general, the proof has to go in both directions, and usually induction will be helpful.
For example:
 Question:
and we want to prove the claim.

Solution
Week 6
Regular and Context Free Language
Consider the set of all valid regular expressions over $\Sigma = \{a,b\}$.
That is, take the alphabet $\{a,b,(,),\cup,\circ,*,\empty, \epsilon\}$ and the language $REGEX=\{w \in \{a,b,(,),\cup,\circ,*,\empty, \epsilon\}^{*} \mid \text{$w$ is a regular expression}\}$
 e.g. $w=((a\circ*$ is not a regular expression, so it is not in $REGEX$, but $w=((a))$ is.
Claim:
 $REGEX$ defined above is not a regular language, but context free.
 basically, there is no DFA/NFA that can tell whether a string is a valid regular expression.
Proof: (Using pumping lemma)
Assume it was (so have a DFA/NFA for it), let $p$ be the pumping number.
Choose $w=((\cdots(a^*)^*\cdots)^*)$, i.e. $p$ opening brackets, then $a$, then $p$ copies of ${}^{*})$. This is a valid regular expression (nested $(S^*)$), and the prefix $((\cdots($ has length $p$.
 so we have $\vert w\vert > p$
For any possible parsing $w=xyz$, where $\vert xy\vert \le p$, $\vert y\vert >0$.
Then it has to be the case that $y$ contains only $($.
Now, we can simply choose any $i$, for example $i=2$
 $xyyz$ starts with $p+\vert y\vert$ opening brackets but still has only $p$ closing brackets, so it is not a valid regular expression, i.e. $xyyz \notin REGEX$. This is a contradiction.
Now, we show that it is contextfree:
 \[G=(\{S\},\{a,b,(,),\cup,\circ,*,\empty, \epsilon\},R,S)\\ R: \begin{cases} S \to a \mid b \mid \epsilon \mid \empty \\ S\to (S \cup S) \mid (S \circ S) \mid (S^*) \end{cases}\]
 and from here, we can prove that this grammar generates exactly the set of valid regular expressions over $\{a,b\}$
Therefore, the set of regular expressions is context-free.
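The grammar above can be recognized with a simple recursive-descent check. A Python sketch using ASCII stand-ins of my own choosing (`U` for $\cup$, `.` for $\circ$, `e` for $\epsilon$, `N` for $\empty$):

```python
# Grammar: S -> a | b | e | N | (SUS) | (S.S) | (S*)
def parse_S(s: str, i: int) -> int:
    """Parse one S starting at index i; return the index after it, or -1."""
    if i >= len(s):
        return -1
    if s[i] in "abeN":               # base cases: a, b, epsilon, empty set
        return i + 1
    if s[i] != "(":
        return -1
    j = parse_S(s, i + 1)            # left operand inside the brackets
    if j == -1:
        return -1
    if j < len(s) and s[j] == "*":   # (S*)
        j += 1
    elif j < len(s) and s[j] in "U.":  # (SUS) or (S.S)
        j = parse_S(s, j + 1)
        if j == -1:
            return -1
    else:
        return -1
    return j + 1 if j < len(s) and s[j] == ")" else -1

def in_REGEX(s: str) -> bool:
    return parse_S(s, 0) == len(s)

assert in_REGEX("((aUb).(a*))")
assert not in_REGEX("((a")           # unbalanced brackets, like the lecture's example
```

Note the recursion here mirrors the grammar rules, which is exactly the power a DFA lacks.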
 The class of contextfree languages is closed under $\circ,\cup,*$
 every regular language is context free
Proof for Context Free Grammar Closure
Theorem:
 The class of contextfree languages is closed under $\circ,\cup,*$
 actually, other operations such as complement are not closed
Proof for Union:
Start by constructing the grammars
Now, we need to show that it works.
Claim:
$L(G) = L(G_1) \cup L(G_2)$
so we need to show that $S \Rightarrow^* w \iff (S_1 \Rightarrow^* w \text{ or } S_2 \Rightarrow^* w)$
 this will be trivial as if you start with $S$, it must go either first to $S_1$ or $S_2$. Then the rest follows.
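The union construction can be sketched as code: merge the rule sets and add one fresh start rule. A toy Python brute-force (grammar names `T`, `U` and the helper `generates` are mine; this is a sketch, not an efficient parser):

```python
G1 = {"T": ["", "0T1"]}                 # L(G1) = { 0^n 1^n }
G2 = {"U": ["", "aU"]}                  # L(G2) = { a^n }
G  = {"S": ["T", "U"], **G1, **G2}      # union grammar: S -> T | U

def generates(G, target, depth=12):
    """Brute-force: can S derive `target` within `depth` rule applications?"""
    frontier = {"S"}
    for _ in range(depth):
        new = set()
        for s in frontier:
            for var, rhss in G.items():
                i = s.find(var)          # these grammars keep <= 1 variable per form
                if i != -1:
                    for rhs in rhss:
                        new.add(s[:i] + rhs + s[i + 1:])
        frontier |= new
    return target in frontier            # variable-free strings are words of L(G)

assert generates(G, "0011") and generates(G, "aaa")
assert not generates(G, "0a")            # mixing the two languages fails
```

The first derivation step commits to $S_1$ or $S_2$, which is exactly the $\iff$ claim above.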
Proof for Concatenation
Again, start by the definition of the grammars:
Now, we need to show that it works:
Claim:
 $L(G) = L(G_1) \circ L(G_2)$
Note:
 usually, the trick is to see that a starting variable $S_i$ of a grammar $G_i$ is representative of the entire contextfree language that it can reach
Proof for Kleene Star
Same thing, start with
Now, we need to show that it works:
Claim:
 $L(G)=L(G_1)^*$
seems to be true by definition
TODO
Regular and ContextFree Language
Theorem:
 Every regular language is context free
There are also other ways to prove it, for example using a DFA. Look at the book to see more.
Proof:
Let $L$ be regular, and generated from a regular expression $R$. We want to show that $L(R)$ is also generated by some CFG $G$.
basically, we want to show that every word in $L(R)$ is generated by some grammar $G$.
We prove this by induction:
 this part is basically the same as the procedure mentioned in Tips on Proving Regular Expressions
 this is also the same as mentioned in the tips section
Lastly, it works out by the closure property, because for each:
 for each $R_1,R_2$ involved above, we can use grammars $G_1$ and $G_2$, which exist by the induction hypothesis.
 now, since the class of CFLs is closed under the operations combining $G_1$ and $G_2$, this completes the proof that you can build a $G$.
Not ContextFree Grammar
There is a lemma similar to the pumping lemma, called the “tandem pumping lemma”
Tandem Pumping Lemma:
If $L$ is a CFL, then there exists a constant $p$, s.t. every $w\in L$ with $\vert w\vert \ge p$ can be parsed as:
\[w=uvxyz\]where:
 this is the different part from the normal pumping lemma
 $\vert vxy\vert \le p$
 different from the normal pumping lemma as well
 $\vert vy\vert >0$
 again, this and the above come from the lemma itself; you cannot choose them.
 this is the part where we can pump
 notice that $v$ or $y$ themselves could be empty
 $uv^i xy^i z \in L,\,\,\forall i$
 notice that here you can pump two of them at the same time
Proof Idea:
Let $A$ be a CFL and let $G$ be a CFG that generates it. We must show that any sufficiently long string $s$ in $A$ can be pumped and remain in $A$.
Let $s$ be a very long string in $A$. (We make clear later what we mean by “very long.”) Because $s$ is in $A$, it is derivable from $G$ and so has a parse tree. The parse tree for $s$ must be very tall because $s$ is very long. That is, the parse tree must contain some long path from the start variable at the root of the tree to one of the terminal symbols at a leaf. On this long path, some variable symbol $R$ must repeat because of the pigeonhole principle.
 e.g.
As the following figure shows, this repetition allows us to replace the subtree under the second occurrence of $R$ with the subtree under the first occurrence of R and still get a legal parse tree (and once it repeated once, you can repeat infinite number of times). Therefore, we may cut s into five pieces $uvxyz$ as the figure indicates, and we may repeat the second and fourth pieces and obtain a string still in the language. In other words, $uv^ixy^iz$ is in $A$ for any $i\ge0$.
 where $v$ or $y$ could be empty, but at least one of them is not.
 we don’t care what $u,z$ are, we just know that you can repeat $R$ many times and get infinite $R\to vRy$, and finish with $R\to x$.
Template for proving noncontextfree grammar:

Assume CF, then you get a pumping number $p$.

Choose a string $w \in L$ that is long enough, i.e. $\vert w\vert \ge p$

Think about all possible parsings $w = uvxyz$, such that:
 $\vert vxy\vert \le p$
 in particular, this forces $\vert vy\vert \le p$ as well
 and $\vert vy\vert > 0$
 as otherwise, you wouldn’t have an really long word shown in the proof sketch above

Show that for all possible parsing, you can choose an $i$ such that:
\[uv^i xy^i z \notin L\]
Example: Tandem Pumping Lemma
The following two languages are not contextfree (hence not regular as well):
For example:

Question:
Show the following language is not CF:
\[L=\{a^i b^i c^i \mid i \ge 0\}\] 
Solution:
 since $\vert vxy\vert \le p$, $vxy$ can only span up to two different characters, so $v,y$ together contain at most two of the three symbols. Therefore, since $\vert vy\vert >0$, pumping leaves at least one symbol's count unchanged while increasing another (or breaks the $a\cdots b\cdots c\cdots$ order).
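The "for all possible parsings" step can be checked exhaustively for a small instance. A Python sketch for $w=a^pb^pc^p$ with $p=3$ (this particular $p$ is just an illustration, not the language's actual pumping length):

```python
p = 3
w = "a" * p + "b" * p + "c" * p

def in_L(s):
    """Membership in { a^i b^i c^i }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

n = len(w)
for i in range(n + 1):                      # u = w[:i]
    for j in range(i, n + 1):               # v = w[i:j]
        for k in range(j, n + 1):           # x = w[j:k]
            for l in range(k, n + 1):       # y = w[k:l], z = w[l:]
                v, x, y = w[i:j], w[j:k], w[k:l]
                if len(v + x + y) <= p and len(v + y) > 0:
                    pumped = w[:i] + v * 2 + x + y * 2 + w[l:]
                    assert not in_L(pumped)  # i = 2 kills every valid split
```

Every decomposition satisfying the lemma's constraints fails under $i=2$, which is the contradiction the proof needs.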
For example:

Question:
Show that the following language is not contextfree:
\[L=\{ ww \mid w \in\{a,b\}^* \}\] 
Solution:
The draft attempts are:
 the correct way is that we need to show that for every parsing of $vxy$, it is not in the language
The attempt is:

and here, you will see that in all three cases/parsings, the pumping works:
Week 7
Other Closure Properties on CFL
Theorem:
 The class of contextfree languages is not closed under intersection.
Proof:
We will show two specific languages, $L_1,L_2$ being context free, but $L_1 \cap L_2$ is not. This is already enough to prove the theorem.
 this means that the intersection of two CFLs is sometimes context-free, but not always.
Define:
\[L_1 =\{ a^ib^ic^j \mid i,j\ge 0 \}\\ L_2 =\{ a^ib^jc^j \mid i,j\ge 0 \}\]
Claim 1: $L_1,L_2$ are context free.
Quite simple; for $L_1$ we can use:
\[S\to AV \\ A \to \epsilon \mid aAb \\ V \to \epsilon \mid cV\]For $L_2$, you need:
\[S\to VA \\ V \to \epsilon \mid aV \\ A \to \epsilon \mid bAc\]Claim 2: $L_1 \cap L_2 = \{ a^ib^ic^i \mid i\ge 0 \}$
 which is not context free.
This completes the proof, since we just need one counterexample.
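Claim 2 can be verified by brute force on short strings. A quick Python check (helper names are mine):

```python
import re
from itertools import product

def in_L1(s):   # a^i b^i c^j
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_L2(s):   # a^i b^j c^j
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(2)) == len(m.group(3))

def in_target(s):  # a^i b^i c^i
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2)) == len(m.group(3))

# exhaustively check the claim on all strings of length <= 6 over {a,b,c}
for n in range(7):
    for tup in product("abc", repeat=n):
        s = "".join(tup)
        assert (in_L1(s) and in_L2(s)) == in_target(s)
```

Of course this only checks finitely many strings; the equality itself follows directly from matching the $i$'s and $j$'s.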
Theorem:
 The class of contextfree languages is not closed under complement.
Proof:
Assume towards contradiction, that it was closed.
Now, using De Morgan's laws:
\[L_1 \cap L_2 = \overline{(\overline{L_1} \cup \overline{L_2})}\]However, we know that:
 the $LHS$ (intersection) is not closed, as just shown
 the union operation IS closed
 Therefore, if complement were also closed, the $RHS$ would always be context-free, a contradiction. So complement cannot be closed.
This completes the proof.
PushDown Automata of CFL
(This part is not on the syllabus, will not be tested.)
 A pushdown automaton is like an NFA, but also has a stack memory.
This part is interesting, because it can be shown that:
Theorem:
 $L$ is a $CFL$ $\iff$ it is recognized by a $PDA$.
 this gives another way to see that every regular language is context-free, since an NFA is just a special case of a PDA that never uses its stack.
Note:
 If we have 0 stacks in a PDA, we have an NFA
 If we use 1 stack, we have a PDA.
 If we have 2 stacks, or if we use a queue, we get something equivalent to a Turing Machine.
There will be no exam on CFL
Exam 2 Starts Here
Turing Machine
Basically, this is the model that captures what computers can do.
The definition of a Turing Machine is as follows:
Definition Sketch:
where:
 input is on a tape, which is infinite (s.t. after the finite input, it is filled with “blanks”)
 at each step of computation:
 depending on:
 the current state
 symbol on the tape
 can do:
 switch to another state
 write on the tape at the current head position
 (different from DFA)
 move the head to the left or to the right
 (different from DFA)
 e.g. from state $q_1$ with input $a$, you could write a $b$, go to $q_2$, and move the head to the left.
 there are two special states $q_{acc}, q_{rej}$ (different from DFA):
 $q_{acc}$: accept and halt/exit
 $q_{rej}$: reject and halt/exit
Hence, the summary of differences between a Turing Machine and a DFA is:
 reading input can go left or right
 infinite memory
 yet still finite amount of states in the state control
 can write as well
 immediately halts upon entering $q_{acc}$ or $q_{rej}$
Formal Definition of a TM
A Turing Machine is a 7-tuple:
\[M=(Q,\Sigma, \Gamma, \delta, q_0, q_{acc},q_{rej})\]where:
 $Q$: a finite set of states
 $\Sigma$: the input alphabet (finite)
 $\Gamma$: the tape alphabet (finite)
 this means $\Sigma \subseteq \Gamma$, and blank $\textvisiblespace \in \Gamma$ (but $\textvisiblespace \notin \Sigma$)
 $q_0, q_{acc},q_{rej} \in Q$
 $q_0$ is the starting state
 $q_{acc}$ is the accepting state
 $q_{rej}$ is the rejecting state
 $\delta: Q \times \Gamma \to Q \times \Gamma \times \{L,R\}$
 i.e. it takes a state and the symbol under the head, and does a state transition, writes a symbol, and moves the head left or right
 e.g. $\delta(q_2,a) = (q_4,b,R)$
Computation Def Sketch:
 A computation of a TM on input $w \in \Sigma^*$ (the tape content at the start, before any writes) proceeds as follows:
 start with $w$ followed by infinitely many $\textvisiblespace$ on the tape, with reading head pointing to first character
 apply $\delta$ repeatedly
 read, transition, move, and write
 if ever enter $q_{acc}$ or $q_{rej}$, halt
Note:
 If the head is pointing to the leftmost location, and $\delta$ says to move left, we assume that it stays in place.
For example:

Consider a TM with $\Sigma={a,b}$:

This recognizes $L=\{w \in \{a,b\}^* \mid w \text{ contains a } b \}$
For example:

Question:
Is the following TM the same as before?

Solution:
It is the same language, but not the same program (e.g. this one reaches the end of the input, meets $\textvisiblespace$, and terminates, while the other one in black will hang forever).
Language:
 Since a TM $M$, on input $w$, may either:
 accept: reach the accept state
 not accept: reach the reject state, or run forever
Then the language of a TM $M$ is defined as:
\[L(M)=\{w \mid \text{$M$ accepts $w$}\}\]
Configuration Def:
A configuration represents the status of computation at a certain point. The configuration is:
\[C=uqv\]where:
 $u,v \in \Gamma^*$
 $q\in Q$
and it means:
 the tape currently has $uv$ followed by infinitely many $\textvisiblespace$
 $u,v$ could also contain $\textvisiblespace$
 the current state is $q$
 reading head is pointing to the first symbol of $v$
 haven’t read it yet, it is point
Therefore, it means:
Initial Configuration:
the initial configuration would always be:
\[C_{ini}=q_0w\]where:
 $w$ is the input string
Yielding Configuration:
$C_1 \vdash C_2$ ($C_1$ configuration yields $C_2$ configuration) if applying a $\delta$ on $C_1$ gives $C_2$
e.g. $C_1 =aq_1bab$, and $\delta(q_1,b)=(q_0,a,R)$, then $C_2=aaq_0ab$
schematically:
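The yield relation can be sketched as a single-step function on configurations $C=uqv$. A Python sketch reproducing the lecture's example (the `"_"` blank symbol and function name `step` are my conventions):

```python
def step(u, q, v, delta):
    """Apply delta once to configuration u q v; return the next configuration."""
    BLANK = "_"
    sym = v[0] if v else BLANK               # symbol under the head
    q2, write, move = delta[(q, sym)]
    if move == "R":
        return u + write, q2, v[1:]
    # move == "L": head goes to the last symbol of u (or stays at the left end)
    if not u:
        return "", q2, write + v[1:]
    return u[:-1], q2, u[-1] + write + v[1:]

# C1 = a q1 bab with delta(q1, b) = (q0, a, R) yields C2 = aa q0 ab
assert step("a", "q1", "bab", {("q1", "b"): ("q0", "a", "R")}) == ("aa", "q0", "ab")
```

Note the left-move branch also encodes the "stay in place at the leftmost cell" convention from above.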
Accepting and Rejecting Configuration:
 Any $C=uq_{acc}v$ is an accepting configuration, $u,v\in \Gamma^*$
 Any $C=uq_{rej}v$ is a rejecting configuration, $u,v\in \Gamma^*$
 and there will be no further configurations once either of the above two is reached
Therefore, this means that:
Accepted String Def:
 For a TM $M=(Q,\Sigma, \Gamma, \delta, q_0, q_{acc},q_{rej})$, an input string $w \in \Sigma^*$ is accepted by $M$ if there exists a sequence of configurations $C_1,C_2,\ldots,C_n$, such that:
 $C_1=q_0 w$
 $C_i \vdash C_{i+1},\, \forall\,i=1,2,\ldots,n-1$
 $C_n=uq_{acc}v,$ and $u,v\in\Gamma^*$
TM Recognizable Language:
 A language $L$ is TMrecognizable if there exists a Turing Machine $M$ such that $L(M)=L$.
 sometimes, people also call such an $L$ recursively enumerable
 $x\in L \to$ $M$ accepts $x$
 $x\notin L \to$ $M$ rejects $x$ or runs forever
For example:

Consider the nonregular but contextfree language:
\[L=\{0^i1^i \mid i\ge0\}\] 
The TM for it would be:
 the idea is to write a $X$ for a $0$, and move to the right
 to write a $Y$ for a $1$, and move to the left
Note:
 Some transitions are omitted in the above diagram, and those transitions will be assumed to go to $q_{rej}$.
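Since the diagram is omitted here, the construction can be written out as a transition table and simulated. The concrete states and the simulator below are my own reconstruction of the $X$/$Y$ idea, not the lecture's exact diagram:

```python
BLANK = "_"
# q0: look for a 0 to mark; q1: scan right to the matching 1; q2: scan back
# left to the marker; q3: verify only Y's remain.
delta = {
    ("q0", "0"): ("q1", "X", "R"), ("q0", "Y"): ("q3", "Y", "R"),
    ("q0", BLANK): ("acc", BLANK, "R"),
    ("q1", "0"): ("q1", "0", "R"), ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),
    ("q2", "0"): ("q2", "0", "L"), ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"), ("q3", BLANK): ("acc", BLANK, "R"),
}

def run(w, max_steps=10_000):
    tape, head, q = dict(enumerate(w)), 0, "q0"
    for _ in range(max_steps):
        if q in ("acc", "rej"):
            return q == "acc"
        sym = tape.get(head, BLANK)
        # omitted transitions go to the implicit reject state, as noted above
        q, write, move = delta.get((q, sym), ("rej", sym, "R"))
        tape[head] = write
        head = max(0, head + (1 if move == "R" else -1))  # stay put at the left end
    return False

assert run("01") and run("000111")
assert not run("10")
```

Each pass marks one `0` as `X` and one `1` as `Y`, so the machine halts on every input.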
For example:

Question:
Construct the TM for:
\[\{1^{2^n} \mid n\ge0\}\] 
My Solution:
 the idea is to cross off every other $1$ in each pass, halving the number of $1$s; reject if a pass ever sees an odd number of $1$s greater than one
 if the count is a power of 2, every pass sees an even count, and you end with exactly one $1$ remaining
Week 8
Turing Machine (Continued)
Definition:
 A TM $M$ is a decider if $M$ halts for every $w \in \Sigma^*$
 i.e. every word is either accepted or rejected (the machine never loops forever)
This means that:
TMDecidable
 A language $L$ is TM-decidable if there exists a TM $M$ such that:
 $M$ recognizes $L$
 $M$ is a decider
 $x\in L \to$ $M$ accepts $x$
 $x\notin L \to$ $M$ rejects $x$
So the relationship look like:
where:
 one important thing implied from the above (which is true) is that some of these containments (regular, CF, decidable, recognizable) are proper subsets:
 this will be proven later in the course
InputOutput TM
Definition:
An inputoutput TM:
\[(Q, \Sigma, \Gamma, \delta, q_{start}, q_{halt})\]starts with input written on the tape ($C_1=q_{start}w$). If/when it changes to state $q_{halt}$, then the machine stops, and the string written on the tape (w/o the trailing infinite blanks) is the output.
 in general, such TMs could also go into an infinite loop
Computable Function Definition
A function:
\[f: \Sigma^* \to \Sigma^*\]is computable if there exists a TM $M$, such that $\forall w \in \Sigma^*$ (notice the for all here) , $M$ halts with output $f(w)$ written on the tape.
 this also means it does not go into loop
For example:
 A computable function could be: insert a $\#$ symbol at the beginning of the input (i.e. $w\to \#w$)
 TODO: implement this
 another one would be: append the first block to the end of the second block (i.e. $w_1\#w_2\to w_1\#w_2w_1$)
 TODO: implement this
 a more useful example would be unary addition: $1^n # 1^m \to 1^{n+m}$
 TODO: implement this
In fact: anything you can implement in your computer programming language can be expressed via such a function.
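The unary-addition example can be sketched as a plain tape rewrite: overwrite the `#` with a `1`, then erase one trailing `1`. A Python sketch of the input/output behavior (not a full TM transition table):

```python
def unary_add(tape: str) -> str:
    """Compute 1^n # 1^m -> 1^(n+m) as a string rewrite."""
    n, m = map(len, tape.split("#"))
    assert tape == "1" * n + "#" + "1" * m   # input must be well-formed
    out = tape.replace("#", "1")              # now 1^(n+m+1)
    return out[:-1]                            # erase the final 1

assert unary_add("111#11") == "11111"
```

A TM realizes the same two steps with one sweep right plus one write, illustrating that "computable function" just means halting with the output on the tape.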
Variants of TM
Essentially, they are all the same thing, but they can be useful for solving problems.
Definition:
 Two TMs are equivalent if, for all $x \in \Sigma^*$, both TMs do the same for each string, s.t. they either:
 halt + accept for both
 halt + reject for both
 run forever for both

TM with Doubly Infinite Tape
 where left side of the head in the beginning will just be blanks
we claim that this is equivalent to a one-sided tape.
Proof Idea:

First, we need to have a 1sided TM, and show that we can construct an equivalent twosided one
 this will be easy

Second, we need to show that if we have a 2sided TM, we can construct an equivalent onesided one

this is more tricky, but the idea will be folding the tape:


TM with command of not moving:
So you have:
\[\delta : Q\times \Gamma \to Q\times \Gamma \times \{L,R,S\}\]where:
 $S$ means not to move/stay
Proof Idea:
 Easy to simulate staying at the same place by moving right and then moving left

Multitape TM
This will be very useful. Basically it is a TM with a fixed number of tapes (i.e. you can have 3 tapes, or 6 tapes, etc.). This means the following changes:

you will have three reading heads:
in the beginning, all heads will point to the left most character

transition/movement
 \[\delta: Q \times \Gamma^k \to Q \times \Gamma^k\times \{L,R\}^k,\,\,\text{where $k$ is the number of tapes}\]

and at the beginning of the computation, input $w$ is on the first tape, everything else is $\textvisiblespace$ (blank)

the accept state and reject state are the same as in the one-tape definition
Proof Sketch:

We simulate a one-tape TM using the $k$-tape TM
 this is trivial, done by just ignoring the other tapes.

Conversely, we can simulate a $k$ tape TM using a 1tape TM as follows:
where, in the end it looks like:
 keep the contents of all tapes on one tape, separated by $\#$
 keep a specially marked version of a symbol (e.g. using a dot, so $w_1 \to \dot{w_1}$), representing each head position
 transition function:
 scan the tape from left to right, recording the marked symbol of each virtual tape (basically each head position)
 for example, in the picture above, the marked symbols would be $\dot{w_2}, \dot{x_1}$
 then update the tape based on the transition function of the multitape TM, since now you know the current state of the multitape machine and the symbols under all of its heads
In the end, you will have a quadratic overhead in runtime, since you need to run back and forth to read the symbols under the heads and then update them.

NonDeterministic TM
Definition:
Basically, it is the same as a TM, except for the transition function:
\[\delta : Q \times \Gamma \to \mathbb{P}(Q \times \Gamma \times \{L,R\})\]so that:
 a string is accepted by a NTM $N$, if there exists at least one accepting computation.
This means, given a string $w$ for a NTM, you can draw a configuration tree:
where:
 each node is a configuration
Claim:
 For every nondeterministic TM, $N$, there exists an equivalent (deterministic) TM, $M$.
 i.e. they are the same
Proof:
the direction to simulate a TM using a NTM is simple.
The reverse would be: given $N$, construct a $M$ as follows:
 the idea is to simulate a $N$ using $M$, by trying to simulate the tree (so we simulate all computations of $N$).
 If the word $w$ ever gets accepted on the tree, we accept.
 this will be done using a BFS manner. Because if you use DFS, then you might get into an infinite loop due to the possible existence of an infinitely deep branch. (Notice that one challenge here is to find a way to “backtrack” in the tree).
The solution is to use a 3Tape TM (which is the same as a normal TM $M$ as proven before):
the first tape has the input (we won’t write on this tape)
the second tape has the content of the tape of $N$ in the current execution point
 i.e. contains what is written
the third tape keeps track of the path of the node on the tree we are simulating.
and the detailed steps are as follows:
 initially, we have only tape 1 being nonempty, and tape 3 initialized with filling in a $\epsilon$
 the $\epsilon$ would represent exploring the root node, as you will see in the behavior of tape 3 below
 then, we copy tape 1 to tape 2
 erase whatever we had before on tape 2
 basically, tape 1 would be intact, and will be our backup
 simulate the computation of $N$ on tape 2, so that:
 each time we meet a nondeterministic choice, we look at tape 3 to determine which one to take.
 if choice is not valid from tape 3, or the content of the tape 3 ran out (or rejected already), it means we are done with this path. So we replace string on tape 3 with next string telling the next path to explore (see tape 3 behavior below)
 go back to step 1)
 now, for tape 3:
 Since the number of nondeterministic choices at each step is finite, we can let $b$ be the maximum branching factor (i.e. the maximum number of children a node can have).
 This means you have the alphabet $\Gamma_b=\{1,2,\ldots,b\}$
 then, each string on tape 3 is basically an address into the tree, so that:
 for the first level, you can have addresses $1,2,\ldots,b$
 where $2$ would mean: start at the root, go to its 2nd child, and finish
 so if you finished with $2$, your next string is $3$.
 for the second level, you can have addresses $11,12,\ldots,bb$
 where $12$ would mean: start at the root, go to its 1st child, then go to that node's 2nd child, then finish
 …
 in general, the addresses for level $\ell$ are the strings $11\cdots1$ through $bb\cdots b$ of length $\ell$, enumerated level by level.
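The tape-3 address scheme is exactly a breadth-first enumeration of $\Gamma_b^*$. A small Python sketch (the generator name `addresses` is mine):

```python
from itertools import count, islice, product

def addresses(b):
    """Yield addresses in BFS order: (), then length 1, then length 2, ..."""
    for length in count(0):
        for tup in product(range(1, b + 1), repeat=length):
            yield tup

first = list(islice(addresses(2), 7))
assert first == [(), (1,), (2,), (1, 1), (1, 2), (2, 1), (2, 2)]
```

Because the enumeration goes level by level, every node of the computation tree is eventually visited, even when some branch is infinitely deep.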
Note:
 Though the above algorithm would work, the overhead in the running time is exponential.
 this is because we backtracked and recomputed every step from the root to the next node/choice, by our construction.
 however, we don’t know, until today, whether if such an overhead is inherent (necessary) or not. (P vs NP)
Historical Background of TM
Turing wanted to show that there exist mathematical problems that are not computable.
 then, he found that at the time, there was no definition of a computation (there were no computers even)
 therefore, in the end he came up with his model of a TM defining a computation
Later on, there comes more models, but it turns out all of them are equivalent of a TM:
 NFA + 2 stacks
 NFA + 1 queue
 general grammar
 lambda calculus
 invented by Church
 random access machines
 any programming language
 either equivalent in power, or weaker
ChurchTuring Thesis
ChurchTuring Thesis Definition:
 Every reasonable model of computation (i.e. algorithm) can be simulated by a TM.
 note that this is not a theorem, because we don’t know what will happen in the future about our computer/computation.
 however, up to today, pretty much everyone takes the thesis as correct/sensible.
Therefore, from now on, we will specify every algorithm in “pseudo code”, such that:
 each line needs to be computable/implementable by a TM
 equivalently, each instruction can be implemented in a programming language
 if you want a TM decider, then you need to make sure that each step of computation is finite, and your description is finite
 i.e. you cannot run to infinity
Theorem
Every algorithm (TM) can be encoded as a string/finite symbol description. We will denote by
\[\langle M \rangle\]the string representing the TM $M$.
 also implies that we can make a DFA/NFA/TM/Graph/etc. themselves being an input
Week 9
Closure Property for TM
Theorem:
 The class of TMDecidable language is closed under complement.
Proof:
 This works only with a TMDecidable language:
 Set $M'$ to be the same as $M$, but with the $q_{acc}$ and $q_{rej}$ states swapped.
 Then, $w \in L(M) \iff \text{$M$ accepts $w$} \iff \text{$M'$ rejects $w$} \iff w \notin L(M')$
 and conversely $w \notin L(M) \iff \text{$M$ rejects $w$} \iff \text{$M'$ accepts $w$} \iff w \in L(M')$
 note, the line above only works if $M$ is a decider. If it is not, we would also need to handle inputs on which $M$ runs forever
Theorem:
 The language $L$ is TM Decidable $\iff$ both $L$ and $\bar{L}$ are TM recognizable.
Proof:
 The forward direction is easy:
 Assume $L$ is TM decidable
 then $\exists M$ s.t. it is a decider.
 then $L$ is recognizable because $L$ is decidable
 and by the above theorem, we can make a decidable $\bar{L}$ from flipping the states in $M$. So $\bar{L}$ is recognizable.
 The reverse direction is harder:
 Assume $L$ and $\bar{L}$ are both TM recognizable.
 then there $\exists M_1, M_2$ s.t.
 $M_1$ recognizes $L$: $\begin{cases}x\in L \Rightarrow \text{$M_1$ accepts $x$}\\x\notin L \Rightarrow \text{$M_1$ rejects or runs forever on $x$}\end{cases}$
 $M_2$ recognizes $\bar{L}$: $\begin{cases}x\in \bar{L} \Rightarrow \text{$M_2$ accepts $x$}\\x\notin \bar{L} \Rightarrow \text{$M_2$ rejects or runs forever on $x$}\end{cases}$
 Use $M_1, M_2$ to construct a decider for $L$:
 on input $x$, run $M_1$ on $x$, and run $M_2$ on $x$ in parallel
 e.g. run them on two tapes simultaneously. (Copy input on the second tape, and run on both)
 Now, we construct $M$ such that:
 if $M_1$ accepts $x$, return accept
 if $M_2$ accepts $x$, return reject
 Therefore, we have $\begin{cases}x\in L\Rightarrow \text{$M_1$ will accept} \Rightarrow \text{$M$ accepts}\\x\notin L\Rightarrow x\in \bar{L}\Rightarrow\text{$M_2$ will accept} \Rightarrow \text{$M$ rejects}\end{cases}$
 Hence we have found a decider for $L$ using $M$ above. So $L$ is TM Decidable.
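The parallel run can be sketched by modeling each machine as a step generator and dovetailing them, one step each. A toy Python sketch (the generator protocol, yielding `None` per step and `True` on acceptance, is my own convention):

```python
def decide(m1_steps, m2_steps):
    """M1 recognizes L, M2 recognizes its complement; interleave their steps."""
    while True:
        if next(m1_steps, None) is True:
            return True      # M1 accepted: x is in L
        if next(m2_steps, None) is True:
            return False     # M2 accepted: x is in the complement of L

def accepts_after(k):        # toy recognizer: accepts after k steps
    for _ in range(k):
        yield None
    yield True

def loops_forever():         # toy recognizer: never halts
    while True:
        yield None

assert decide(accepts_after(3), loops_forever()) is True
assert decide(loops_forever(), accepts_after(5)) is False
```

Since every $x$ is in $L$ or in $\bar{L}$, one of the two recognizers is guaranteed to accept, so the loop always terminates.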
TODO: Union of TM
Application of TM Decidable Languages
Now, we can use a decider TM for simulating the output of a DFA/NFA/etc.
Application:
The acceptance problem for DFAs is decidable: $$
A_{DFA}=\{ \langle D,w \rangle \vert \text{$D$ is a DFA and $D$ accepts $w$}\} $$ where:
 now, the “input” is a tuple of DFA $D$ and a word $w$.
 notice this means $\langle D,w \rangle$ is rejected either when $D$ runs on $w$ and rejects, or when the input is not even a valid encoding of a DFA and a word, and hence is rejected
The construction of the TM would be:
where
 given the encoding $\langle D\rangle$ of a DFA, it is easy to check whether the rest of the input is a word $w$ in the right format.
 this also means we have a way to check whether $\langle D\rangle$ encodes a valid DFA (by definition).
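The simulation step can be sketched directly; here a DFA is represented as a transition table (this representation is my own choice, not the lecture's encoding):

```python
def a_dfa(delta, start, accepting, w):
    """Decider sketch for A_DFA: simulate the DFA D = (delta, start, accepting)
    on w, one symbol per step. The loop makes exactly |w| moves, so this
    always halts. delta: dict mapping (state, symbol) -> state."""
    q = start
    for a in w:
        if (q, a) not in delta:
            return False   # w uses a symbol outside D's alphabet: reject
        q = delta[(q, a)]
    return q in accepting  # accept iff D ends in an accepting state

# hypothetical DFA over {0,1} accepting strings that end in '1'
delta = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 0, (1, '1'): 1}
```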
High Level Proof
Application
Similarly, doing the job of an NFA: $$
A_{NFA}=\{ \langle N,w \rangle \vert \text{$N$ is an NFA and $N$ accepts $w$}\} $$ is also decidable.
Construction of decidable TM:
where:
 “on $\langle N,w\rangle$” just means the same thing as “check whether the input $x$ is a valid encoding of some $\langle N,w\rangle$; if not, reject.”
 and this will be a decider for $A_{NFA}$
Now, a more interesting example would be:
Application:
We can use a TM decider to check whether a DFA accepts no strings at all: $$
E_{DFA}=\{\langle D\rangle \vert \text{$D$ is a DFA and $L(D)=\emptyset$}\} $$
Construction of decidable TM:
 the idea is to check whether or not the DFA $D$ has any reachable accept states.
 this takes care of the edge case where $D$ has an accept state that is simply not reachable
 basically, this will be done backwards: we start from the accepting states:
where:
 remember, when you meet statements like “repeat …”, make sure they finish in finite time (in this case, they do)
 so this algorithm is a decider.
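One way to sketch this decider (here searching forward from the start state, which is equivalent to the backward search from accepting states described above):

```python
from collections import deque

def e_dfa(delta, start, accepting):
    """Decider sketch for E_DFA: L(D) is empty iff no accepting state is
    reachable from the start state. The search visits each state at most
    once, so it always halts, and unreachable accept states are ignored."""
    seen, frontier = {start}, deque([start])
    while frontier:
        q = frontier.popleft()
        if q in accepting:
            return False           # reachable accept state: L(D) != empty
        for (p, a), r in delta.items():
            if p == q and r not in seen:
                seen.add(r)
                frontier.append(r)
    return True                    # no accept state reachable: L(D) empty

# hypothetical DFA whose only accept state 2 is unreachable from 0
delta = {(0, 'a'): 1, (1, 'a'): 0, (2, 'a'): 2}
```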
One take home message from the above is that the below construction will not work:
where:
 the step of checking every $w \in \Sigma^*$ will run forever, which makes this construction a recognizer but not a decider.
Application:
 We can use a TM decider to check whether two DFAs recognize the same language: $$
EQ_{DFA}=\{\langle D_1, D_2\rangle \vert \text{$D_1,D_2$ are DFAs and $L(D_1)=L(D_2)$}\} $$
 Construction:
 Using the idea of the symmetric difference, construct a DFA $C$ with $L(C) = (L(D_1)\cap \overline{L(D_2)}) \cup (\overline{L(D_1)} \cap L(D_2))$.
 since regular languages are closed under union, intersection, and complement, such a DFA $C$ exists and can be constructed.
 to see whether $L(C)$ is empty, we use the $E_{DFA}$ decider above
 if $L(C)$ is empty, then $L(D_1)=L(D_2)$; otherwise, they differ
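A sketch of the symmetric-difference idea via the standard product construction (the tuple encoding here is illustrative); feeding the result to the $E_{DFA}$ emptiness check then decides $EQ_{DFA}$:

```python
def sym_diff_dfa(D1, D2, alphabet):
    """Product construction: build C with L(C) = L(D1) XOR L(D2).
    Each Di = (delta, start, accepting), delta: (state, symbol) -> state.
    Then L(D1) = L(D2) iff L(C) is empty."""
    (d1, s1, F1), (d2, s2, F2) = D1, D2
    start = (s1, s2)
    delta, states, todo = {}, {start}, [start]
    while todo:                      # explore reachable product states
        p, q = todo.pop()
        for a in alphabet:
            r = (d1[(p, a)], d2[(q, a)])
            delta[((p, q), a)] = r
            if r not in states:
                states.add(r)
                todo.append(r)
    # accept exactly when one machine accepts and the other does not (XOR)
    accepting = {(p, q) for (p, q) in states if (p in F1) != (q in F2)}
    return delta, start, accepting

# hypothetical DFAs over {a}: D1 accepts even lengths, D2 odd lengths
d = {(0, 'a'): 1, (1, 'a'): 0}
D1, D2 = (d, 0, {0}), (d, 0, {1})
```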
Application:
We can also use a decider to do the job of a CFG: $$
A_{CFG}=\{ \langle G,w \rangle \vert \text{$G$ is a CFG and $G$ generates $w$}\} $$
Construction skipped.
Therefore, this means we can also construct a decider for:
 $A_{Regular Language}$ since $A_{CFG}$ works
Not TM decidable Languages
 $EQ_{CFG} = \{ \langle G_1,G_2 \rangle \vert \text{$G_1$, $G_2$ are CFGs and $L(G_1)=L(G_2)$} \}$
 $AMB_{G} = \{ \langle G\rangle \vert \text{$G$ is a CFG that is ambiguous} \}$
Week 10
TM Recognizable Languages
A language is recognizable when there exists a TM that recognizes it.
 note that now, unlike before, the TM need not halt in finitely many steps on every input, because this is about recognizers.
 hence an infinite loop is allowed on strings outside the language, as long as every string in the language is eventually accepted (accept = halt in $q_{acc}$)
Examples of Recognizable Languages
The acceptance problem for TMs, $A_{TM}$, which tells whether the input is accepted or not (with running forever counting as not accepted): $$
A_{TM}= \{ \langle M,w\rangle \vert \text{$M$ is a TM and $M$ accepts $w$} \} $$ where:
 note that a recognizer for $A_{TM}$ could run forever; as we are dealing with recognizers, acceptance purely means accepting + halting.
 $A_{TM}$ can basically be understood as a general input-output TM: output $1$ for accepted, $0$ for rejected, and an infinite loop stays an infinite loop.
Proof/Construction
where:
 notice that if $M$ runs forever, then $U$ also runs forever. This is fine: $U$ then simply does not accept, which is allowed for a recognizer.
 $U$ is not a decider, but a recognizer.
 this $U$ is also called a Universal Turing Machine, because it is general in the sense that this TM can run any other algorithm
Interestingly, we can also prove that the above is not decidable:
Examples of TM Nondecidable Languages
The same acceptance problem for TMs, $A_{TM}$, is not decidable: $$
A_{TM}= \{ \langle M,w\rangle \vert \text{$M$ is a TM and $M$ accepts $w$} \} $$
Proof:
Assume towards contradiction, that it is decidable. Then we have a decider $D$ for $A_{TM}$, such that it is a TM decider and hence always halts.
where:
 we see that $T$ is a decider, and this is because $D$ is a decider.
 Now, consider the output of $T$ on $\lang T\rang$:
 If $T$ accepts $\lang T\rang$:
 then by definition of $D$ for $A_{TM}$, it means $D$ accepts $\lang T, \lang T\rang\rang$.
 however, by construction of $T$, it means $D$ rejects. Hence contradiction
 If $T$ rejects $\lang T\rang$:
 then by definition of $D$ for $A_{TM}$, it means $D$ rejects $\lang T, \lang T\rang\rang$.
 however, by construction of $T$, it means $D$ accepts. Hence contradiction
Therefore, $D$ as a decider cannot exist.
In fact, this is the same as doing a diagonalization:
Examples of Recognizable Languages
The language of TMs whose language is nonempty is recognizable: $$
NE_{TM} = \{ \langle M \rangle \vert \text{$M$ is a TM and $L(M)\neq \emptyset$} \} $$
Proof/Construction
first attempt
 on input $\langle M\rangle$, where $M$ is a TM:
 For all strings $s \in \Sigma^*$ (this infinite loop is fine, as it is implementable for a recognizer)
 Run $M$ on $s$
 if accepts, accept
 if rejects, go on to the next string (might be stuck here)
where this does not work because:
 the loop itself is implementable for a recognizer, because our recognizer can run forever
 but some accepted strings can never be reached by this construction: if $M$ loops forever on an earlier string, the simulation gets stuck before ever trying the later ones
therefore, it does not recognize $NE_{TM}$
second attempt
 on input $\langle M\rangle$:
 for $i = 1,2,3…$
 for all $s \in \Sigma^*$, $\vert s\vert \le i$
 Run $M$ on $s$ for $i$ steps (hence this step is always finite)
 if accepts, accept (so this step can always be reached)
where this works because:
 the idea is that now, everything inside the loop is always finite, so if some string is accepted by $M$, it will eventually be accepted here
 remember, the fact that the outer loop is infinite does not matter for a recognizer
This will work because:
 if $L(M) \neq \emptyset$, then some accepted string is found after finitely many passes, so $R$ accepts $\langle M\rangle$
 if $L(M) = \emptyset$, then no pass ever accepts, so $R$ runs forever and does not accept
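The second attempt can be sketched with a step-bounded simulation `run_M(s, i)` returning `'accept'` or `None` (the cutoff `max_i` is only there so the sketch itself terminates; the real recognizer's outer loop is unbounded):

```python
from itertools import product

def ne_tm_recognizer(run_M, alphabet, max_i=50):
    """Dovetailing sketch for NE_TM: for i = 1, 2, ..., run M for i steps
    on every string of length <= i. Each pass is finite, so any string s
    that M accepts (say in t steps) is found once i >= max(t, len(s));
    if L(M) is empty, the real (unbounded) recognizer simply loops forever."""
    symbols = sorted(alphabet)
    for i in range(1, max_i + 1):
        for n in range(i + 1):
            for tup in product(symbols, repeat=n):
                if run_M(''.join(tup), i) == 'accept':
                    return 'accept'
    return None   # "still running" in the unbounded version

# hypothetical M that accepts only the string '11', after 2 steps
run_M = lambda s, steps: 'accept' if s == '11' and steps >= 2 else None
```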
Undecidability and Unrecognizability
In fact, many languages are not recognizable.
 here, we show some examples of language that are not decidable/not recognizable
 and that many natural examples (in real life) are not decidable/not recognizable
Definitions
Definition:
Two sets $A,B$ have the same cardinality ($\vert A\vert =\vert B\vert$) if there is a bijection (a one-to-one and onto function) $$
f: A \to B
$$
Definition:
A set $A$ is countable, if $A$ is either finite, or there is a bijection function: $$
f:\N \to A
$$ where:
 $\N$ is the set of natural numbers (which is infinite)
Fact:
 If $A \subseteq B$, then $\vert A\vert \le \vert B\vert$
 If $A \subseteq B$, and $B$ is countable, then $A$ is also countable.
 A countable or finite union of countable sets is countable.
For example,
 a countable set could be all even positive integers
 for a finite alphabet $\Sigma$, $\Sigma^*$ is countable
 we could use the lexicographical order, and order strings into length $i=0,1,2,3…$
 hence, we can map them to the set of natural numbers
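The lexicographic (shortlex) ordering can be sketched as a generator; listing $\Sigma^*$ in this order is exactly the bijection with $\N$:

```python
from itertools import count, islice, product

def sigma_star(alphabet):
    """Enumerate Sigma* in shortlex order: all strings of length 0, then
    length 1, and so on. Every string appears at some finite position,
    which gives the bijection N -> Sigma* showing Sigma* is countable."""
    symbols = sorted(alphabet)
    for n in count(0):
        for tup in product(symbols, repeat=n):
            yield ''.join(tup)

first_seven = list(islice(sigma_star({'0', '1'}), 7))
```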
Countability with TM
 The set of all TM is countable, because each TM can be encoded into a string, which is countable
 The set of all TM recognizable languages is countable.
 obviously, same for decidable
 basically, if you can go through everything in some defined order, then you can map it to the natural numbers.
Uncountable Languages
However, there are many things that are uncountable.
Using Diagonalization for TM Proofs
Claim:
 The set of real numbers in the interval $(0,1)$ is not countable.
Proof:
Assume towards contradiction that it is countable.
Then we can list all its members, giving a one-to-one correspondence with the natural numbers: $$
\{r_1, r_2, r_3, \dots \}\text{, with one $r_i$ for each $i \in \N$}
$$
Then suppose we have the following order for numbers:
where:
 it does not matter what number we pick for each $r_i$, they can be any number
Now, we construct a real number $r$ that is $100\%$ not in the set because it is different from every one of them by diagonalization
picking $r=0.d_1d_2d_3\dots$, such that $d_i \neq r_{i,i}$, where $r_{i,i}$ means the $i$th digit of $r_i$.
 one edge case to take care of: pick $d_i \in \{1,2,\dots,8\}$, so that we avoid the ambiguity $0.0999\dots = 0.1000\dots$, where two numbers can differ in a digit but still be equal
so the $r$ basically picks a different number than all the highlighted ones:
(e.g. you can have $r=0.253…$)
Therefore, we obtain an $r$ that is different from every member $r_i$, because at least one digit differs ($d_i \neq r_{i,i}$).
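The digit-flipping construction can be sketched directly; here `digit_rows[i][j]` stands for the $j$th digit of $r_{i+1}$ (0-indexed), and the replacement digits are drawn from $\{1,2\}$ to stay inside $\{1,\dots,8\}$:

```python
def diagonal_real(digit_rows):
    """Cantor's diagonal argument, on a finite prefix of the claimed
    enumeration: pick the i-th digit of r to differ from the i-th digit
    of r_i, using only digits in {1,...,8} to avoid 0.0999... = 0.1000..."""
    digits = []
    for i, row in enumerate(digit_rows):
        d = row[i]                         # diagonal digit of r_i
        digits.append(1 if d != 1 else 2)  # some digit in 1..8 other than d
    return digits

rows = [[5, 1, 4], [2, 6, 5], [3, 5, 9]]
r = diagonal_real(rows)   # differs from row i in position i
```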
The idea of diagonalization works often when:
 the representation of each member of the set is infinite.
 As a result, you can always construct a member that is different from each of them by not picking the diagonal members
Remember, the idea is that if a set is countable, then there is a one-to-one and onto MAPPING (an ORDERING, i.e. a bijection) to $\N$, or to anything else countable.
Claim:
For a finite alphabet $\Sigma$, the set of all languages $$
\{ L \vert L \subseteq \Sigma^* \} = \mathbb{P}(\Sigma^*)
$$ is not countable
Therefore, a lot of languages are not TM-recognizable (if all languages were recognizable, this set would be countable, since the set of TM-recognizable languages is countable)
Proof:
Again, assume towards contradiction that it is countable, so you can write $\{L_1, L_2, L_3,\dots\}$ in a way that covers all languages.
Since any language essentially tells whether each string is accepted, we construct:
break $\Sigma^*$ into $\Sigma^*=\{ s_1, s_2, s_3, \dots \}$
 remember that $\Sigma^*$ is countable, since each member $s_i$ is finite and we can put them in lexicographic order
Then a language is just telling you the T/F output of each string $s_i$:
 notice that the representation of each $L_i$ is infinite, hence hints us of diagonalization.
and the diagonal way to construct $L$ is evident from above, as:
 Construct $L = \{ s_i \in \Sigma^* \vert s_i \notin L_i \}$
 Then, for every $i$, the string $s_i$ is either:
 in $L_i$ but not in $L$
 or not in $L_i$ but in $L$
Therefore, we have found an $L$ that differs from every $L_i$, so the list did not cover all languages: contradiction.
Therefore, the set of all languages is not countable.
 in particular, there exist languages that are not TM-recognizable (e.g. take the list $L_i$ to enumerate all TM-recognizable languages; the diagonal $L$ constructed above is then not recognizable)
Corollary:
 Since we know:
 the set of all languages is not countable
 the set of all TMrecognizable language is countable
 Therefore, there is at least one language (actually uncountable many of them) that is not TM recognizable.
Claim:
We claim that the problem $$
\overline{A_{TM}}
$$ is not recognizable.
Proof:
This is purely logical
Recall that $L$ is decidable $\iff$ both $L$ and $\overline{L}$ are recognizable.
 We know that $A_{TM}$ is recognizable, and $A_{TM}$ is not decidable
 Therefore, if $\overline{A_{TM}}$ is recognizable, then there is a contradiction as it means $A_{TM}$ is decidable
 since it would mean both $A_{TM}$ and $\overline{A_{TM}}$ being recognizable
 Therefore, we have a contradiction
Hence $\overline{A_{TM}}$ is not recognizable.
One consequence of the above is that:
Corollary:
 The class of TMrecognizable languages is not closed under complement.
Using Reduction for TM Proofs
Basically, the idea of reduction is that we assume (towards contradiction) a decider $R$ for the language in question, and use it to build a TM that contradicts what we know so far:
 $A_{TM}$ is not decidable
 $\overline{A_{TM}}$ is not recognizable
 etc.
Claim:
The halting problem: $$
Halt_{TM} = \{ \langle M,w\rangle \vert \text{$M$ is a TM and $M$ halts on $w$} \} $$ is undecidable
Proof:
Again, assume towards contradiction that $Halt_{TM}$ is decidable, so has a decider $R$.
Now, the reduction method is to build up something that we know cannot exist
we use $R$ to construct a decider $S$, such that
where:
 each step is a deciding (finite) step, so $S$ is a decider
Now, notice we have constructed a TM $S$ that decides $A_{TM}$.
 but we know $A_{TM}$ is not decidable. Hence a contradiction.
In more formal detailed words:
Week 11
Using Reduction for TM Proofs (Continued)
Now, we come to the formal definition of reduction:
Reduction:
 We say a language $A$ is Turingreducible to a language $B$, written $A \le_T B$, if given a subroutine/decider/oracle that decides $B$, there exists a decider for $A$.
 i.e. if we know how to solve $B$ (harder part), then we also know how to solve $A$ (easier part).
 i.e. $A$ is at most as hard to decide as $B$: once we can decide $B$, we can decide $A$.
Theorem:
 If $A \le_T B$ and $B$ is decidable, then $A$ is decidable.
 i.e. if we can solve the hard part $B$, then we can decide the easy part $A$ using it.
Proof:
Equivalently, the opposite is also useful for proofs:
Equivalent Theorem:
 If $A \le_T B$ and $A$ is undecidable, then $B$ is undecidable.
 intuitively, if $A$ is no harder than $B$, and $A$ is already undecidable, then $B$ must be hard/undecidable too.
 logically, if we can solve $B$, then we should be able to solve $A$. But since $A$ cannot be solved, then $B$ cannot be solved.
Therefore, the above would be used as follows:
To prove $B$ is undecidable, we can prove using contradiction:
Assume $B$ is decidable (harder problem), then we get a decider for $B$.
Then we can use $B$ to solve the problem $A$, such that $A$ would be decidable as well
obviously, we need to know $A$ to be undecidable for contradiction
the construction of using $B$ to solve $A$ takes work
Since $A$ is undecidable, this contradicts the assumption that $B$ is decidable.
Hence $B$ is undecidable.
The idea is that if you can solve $B$, the harder problem, then you can use $B$ to easily solve $A$.
 e.g. if we could solve $B=Halt_{TM}$, then we could use it to solve $A=A_{TM}$ (look at the proof in the previous section). But $A_{TM}$ is not decidable, so $Halt_{TM}$ is not decidable.
For example:

Claim: $$
E_{TM}=\{ \langle M\rangle \vert \text{$M$ is a TM and $L(M) = \emptyset$} \} $$ is not decidable.

Proof using reduction:
This will be a proof by reduction from $A_{TM}$; namely, we will show that $A_{TM}$ is Turing-reducible to $E_{TM}$, i.e. $A_{TM} \le_T E_{TM}$.
 i.e. if we can decide $E_{TM}$, then we can also decide $A_{TM}$ (contradiction)
Assume towards contradiction that we have a decider $O$ that solves $E_{TM}$. Then we construct a $R$ that uses $O$ as follows:
where:
 the general format for using $O$ (the assumed decider) would always be:
 if $O$ accepts, … (here you implement what to do)
 if $O$ rejects, … (here you implement what to do)
 think of this as: we are given $O$ as a library routine, so we can call it right away
This $R$ would be a decider for $A_{TM}$:
So we showed that we can use $E_{TM}$ to decide $A_{TM}$, i.e. $A_{TM} \le_T E_{TM}$. Since $A_{TM}$ is not decidable, $E_{TM}$ cannot be decidable.
For example:

Claim: $$
EQ_{TM}=\{ \langle M_1, M_2 \rangle \vert \text{$M_1, M_2$ are TMs and $L(M_1)=L(M_2)$} \} $$ is not decidable.
 this means that, in general, we cannot decide whether two programs compute the same language.

Proof Using Reduction:
We do it with a reduction from $E_{TM}$.
 We assume $O$ is a decider for the language $EQ_{TM}$.
 We want to use $O$ to decide $E_{TM}$, which we know is undecidable.
As a result, we have used $O$ to make a decider $R$ for the language $E_{TM}$, because:
so we have showed that $E_{TM} \le_T EQ_{TM}$, since $E_{TM}$ is not decidable, then $EQ_{TM}$ is also not decidable.
For example:

Claim: $$
REG_{TM}=\{ \langle M\rangle \vert \text{$M$ is a TM and $L(M)$ is regular} \} $$ is not decidable

Proof using Reduction
Assume that we have a decider $O$ for $REG_{TM}$, then we can construct a decider for $A_{TM}$ using the algorithm below:
and we see this $R$ being the decider for $A_{TM}$ because:
 recall that $\{0^n1^n\}$ is not regular, by the pumping lemma
One definition not explicitly addressed is as follows:
TM Implementable:
 An iteration “for $s$ in $S$, do…” is implementable if and only if:
 the set $S$ is countable
 each member of the set $S$ is computable (can be found using a decider TM/halting TM)
For example:
 for every $w$ in $\Sigma^*$, it works trivially and is implementable
 iterating over every $\lang M\rang$ such that $M$ is a decider is not implementable
 the set of all TMs is a countable set, but
 whether $\lang M\rang$ is a decider is not computable, because that property is not decidable
In summary, we have the following UNDECIDABLE languages:
 $A_{TM}$
 $Halt_{TM}$
 $E_{TM}$
 $EQ_{TM}$
 $REG_{TM}$
Rice’s Theorem
(Informal) Rice’s Theorem:
Any nontrivial language of the form: $$
\{ \langle M\rangle \vert \text{$M$ is a TM and $L(M)$ satisfies } \dots\} $$ is not decidable
 i.e. we cannot decide any nontrivial property of the language of a TM.
Definition:
 $P$ is a nontrivial property of recognizable languages if:
 $P \subseteq \{\lang M\rang \vert \text{$M$ is a TM}\}$ (defining a property)
 then if $M_1, M_2$ are TMs and $L(M_1)=L(M_2)$, then either:
 both have the property $\lang M_1 \rang, \lang M_2 \rang \in P$
 or neither has the property $\lang M_1 \rang, \lang M_2 \rang \notin P$
 so the property is a property of the LANGUAGE of the TM, not of the implementation of TM
 and $P \neq \emptyset$, $P \neq \{ \lang M \rang \vert \text{$M$ is a TM} \}$ (defining nontrivial)
 i.e. $P$ is a proper, nonempty subset of all TM $\lang M\rang$
 i.e. there exists at least one $\lang M \rang \in P$, and at least one $\lang M \rang \notin P$
Rice's Theorem:
 Let $P$ be a nontrivial property of recognizable languages. Then $P$ is not decidable.
 e.g. a nontrivial property could be $L(M)$ being regular
Therefore, it means:
 once you know a set $P$ is a nontrivial property of TM languages
 the language $P$ is not decidable, by Rice's Theorem
Proof:
We will use reduction from $A_{TM}$.
 the idea is that all those proofs use the same logic/construction.
Assume $O$ is a decider for $P$ (the language of TMs whose language has the nontrivial property). We will construct a decider for $A_{TM}$.
Now, we construct the decider $R$:
Now, we prove that this is a decider for $A_{TM}$:
Therefore, since $A_{TM}$ is undecidable, then $P$ is also undecidable (by contradiction).
So we see that the general idea for reduction to $A_{TM}$ is that:
 We can construct the $M'$ such that its language:
 is empty, with $\lang M'\rang \notin P$, if $M$ rejects or runs forever on $w$
 is nonempty, with $\lang M'\rang \in P$, if $M$ accepts $w$
 or vice versa for $P$ in the above
 Then, we can run $O(\lang M’\rang)$ to decide $A_{TM}$
Extension:
The proof above for Rice's Theorem, which showed $A_{TM} \le_T P$, can be rephrased to show that $$
\begin{cases} \overline{A_{TM}} \le_M P & \text{if $\lang M_{\emptyset}\rang \in P$}\\ \overline{A_{TM}} \le_M \overline{P} & \text{if $\lang M_{\emptyset}\rang \notin P$} \end{cases}$$ which is a mapping reduction, giving the refined Rice's Theorem:
 If $P$ is a nontrivial property of recognizable languages, then:
 if $\lang M_{\emptyset}\rang \in P$, $P$ is not recognizable (because $\overline{A_{TM}}$ is not recognizable)
 if $\lang M_{\emptyset}\rang \notin P$, $\overline{P}$ is not recognizable (because $\overline{A_{TM}}$ is not recognizable)
For example:

Question:
The language: $$
P=\{\lang M\rang \vert \text{$M$ is a TM, and every string accepted by $M$ starts with $0$} \} $$ is undecidable by Rice's Theorem

Solution
We just need to prove that this $P$ is a property, and is nontrivial:
 This $P$ is a property because:
 if $M_1, M_2$ are TMs such that $L(M_1)=L(M_2)$, then it means $\lang M_1\rang \in P \iff \lang M_2\rang \in P$
 This $P$ is nontrivial because:
 we can find an $M$ such that $\lang M\rang \in P$
 e.g. a TM that accepts a string iff its first character is $0$
 and we can find an $M$ such that $\lang M\rang \notin P$
 e.g. a TM that accepts a string iff its first character is $1$
Therefore, by Rice’s Theorem, $P$ is not decidable.
When does Rice’s Theorem not work?
 If we have a property, but:
 it is a property of a DFA, graph, integers, etc. such that it cannot be rephrased as a TM property
 it is a property that depends on implementation
 e.g. $\{ \lang M\rang \vert \text{$M$ is a TM, $M$ accepts $\epsilon$, and $M$ has at least 17 states} \}$
In summary, to show that a language is undecidable:
 Diagonalization
 Turing reduction
 Rice’s Theorem
Next, we move on to proving that a language $L$ is not recognizable.
Week 12
Unrecognizable Languages
One way to prove that $L$ is not recognizable would be:
 if $L$ is not decidable but $\bar{L}$ is recognizable, then $L$ cannot be recognizable
 this we already know, because if both $L$ and $\bar{L}$ are recognizable, then $L$ is decidable.
One application of the above would be:
Claim:
 $E_{TM}$ is not recognizable
Proof:
 We have seen that $E_{TM}$ is not decidable, but $\overline{E_{TM}}$ is recognizable.
 in fact, for $\overline{E_{TM}}$, we have already proved with $NE_{TM}$, which is recognizable.
 Therefore, $E_{TM}$ is not recognizable
Other ways include:
 Diagonalization (possible for some languages)
 not on exam
 Refined Rice’s Theorem
 not on exam
 Mapping Reduction
 on exam
In fact, plain Turing reduction does not work for recognizability, because:
 if $A \le_T B$ and $B$ is recognizable, we cannot conclude that $A$ always has a working recognizer:
 the construction for $A$ may call $B$'s recognizer multiple times, and if some call hangs forever, the whole program gets stuck in the middle and cannot produce a correct answer
 however, the construction does work if and only if:
 $B$ is called exactly once, as the last step
 and $A$ outputs exactly what $B$ outputs (i.e. if the call to $B$ hangs, then $A$ should also hang on that input)
 i.e. if we hang, we need to hang at the end
Formal Definition of Mapping Reduction
A language $A$ is mapping reducible to a language $B$, denoted as $A \le_{M} B$, if there exists a computable function $$
f:\Sigma^* \to \Sigma^*
$$ such that:
$w \in A \iff f(w) \in B$
i.e. every input $w \in A$ maps to an $f(w) \in B$, and every input $w \notin A$ maps to an $f(w) \notin B$.
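A toy illustration of the definition with two decidable languages (chosen only to make $f$ concrete): $A$ = strings of even length, $B$ = strings of odd length, and $f$ appends one symbol:

```python
# computable mapping f witnessing A <=_M B for the toy languages
# A = {w : |w| even}, B = {w : |w| odd}: w in A  iff  f(w) in B
def f(w):
    return w + 'a'

in_A = lambda w: len(w) % 2 == 0   # membership test for A
in_B = lambda w: len(w) % 2 == 1   # membership test for B
```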
For example:

Question: $$
\overline{A_{TM}} \le_M REG_{TM}
$$ where:
 recall that $REG_{TM}=\{ \lang M\rang \vert \text{$M$ is a TM and $L(M)$ is regular} \}$

Solution:
We first figure out the mapping function $f$:
where:
 this $f$ is computable
then, we need to prove that:
 $w \in A \iff f(w) \in B$:
where:
 we basically mapped $\lang M,w\rang$ to $\lang M’\rang$
Theorem:
 If $A \le_M B$, then $A \le_T B$
 note that the reverse might not be true
 if $A \le_M B$, then $\bar{A} \le_M \bar{B}$
 follows trivially from the definition of mapping reducible
Proof:
Proving: if $A \le_M B$, then $A \le_T B$
Suppose $A \le_M B$. Then we get a computable mapping function $f$ such that $w \in A \iff f(w) \in B$
Now, using that $f$, we need to show $A \le_T B$
 we basically construct decider for $A$ if we know $B$ is a decider and $f$ maps the inputs of $A$ to $B$:
and the above would be a decider for $A$, because:
Proof:
 Proving: if $A \le_M B$, then $\bar{A} \le_M \bar{B}$
The more relevant theorems are as follows:
Using Mapping Reduction for Recognizability:
 If $A \le_M B$, and $B$ is recognizable, then $A$ is recognizable
 If $A \le_M B$, and $A$ is not recognizable, then $B$ is not recognizable.
 i.e. if $A$ called $B$ exactly once in the subroutine, and $A$ outputs the same as $B$:
 then if $A$ is not recognizable, $B$ cannot be recognizable. Otherwise, we would have solved $A$ to be recognizable.
Proof:
If $A \le_M B$, and $B$ is recognizable, then $A$ is recognizable
Suppose $A \le_M B$. Then we get a computable mapping function $f$ such that $w \in A \iff f(w) \in B$
If $B$ is recognizable, then we get a TM recognizer $M$ for $B$. We use $M$ and $f$ to construct a recognizer for $A$:
Then simply:
And $R$ would be a recognizer for $A$, because:
Proving “If $A \le_M B$, and $A$ is not recognizable, then $B$ is not recognizable.” would just work the same way as above.
 For instance, prove by contradiction.
Definition:
 $L$ is corecognizable if $\bar{L}$ is recognizable.
 there is no separate notion of “co-decidable”, because if $L$ is decidable, then $\bar{L}$ is decidable automatically.
For example:

Theorem: $$
ALL_{TM} = \{ \lang M\rang \vert \text{$M$ is a TM and $L(M)=\Sigma^*$} \} $$ is neither recognizable nor co-recognizable

Proof:
First, we show that $ALL_{TM}$ is not corecognizable, namely, $\overline{ALL_{TM}}$ is not recognizable.
In particular:
where:
 we would like to map:
 if $M$ accepts $w$, then $M’$ accepts all possible strings
 if $M$ does not accept $w$, then $M’$ rejects at least one string
Now, we construct the mapping as follows:
and this mapping works as we wanted, because:
Therefore:
Next, we will show that $ALL_{TM}$ is not recognizable. This means we need to construct a mapping $\overline{A_{TM}} \le_M ALL_{TM}$:
Again, the idea is to map (the reverse):
 if $M$ accepts $w$, then $M’$ rejects at least one string
 if $M$ does not accept $w$, then $M’$ accepts all possible strings
and the mapping functions is as follows:
where:
 notice we cannot write “if $M$ does not accept, accept $x$”: since $M$ may loop forever, “does not accept” can never be detected.
Now, we claim that this mapping function works:
where:
 notice that we just needed to construct $M’$ that rejects at least one string if $\lang M,w\rang \in A_{TM}$
In summary, the unrecognizable languages we know so far are:
 $\overline{A_{TM}}$
 but $A_{TM}$ is recognizable
 $ALL_{TM}$
In general:
If we have a TM reducibility, then: $$
A \le_T B \iff \bar{A} \le_T B \iff A \le_T \bar{B} \iff \bar{A} \le_T \bar{B}
$$
 since we have decidability, then we can easily flip them to still be decidable
If we have mapping reducibility, we only have: $$
A \le_M B \iff \bar{A} \le_M \bar{B}
$$ and $$
\bar{A} \le_M B \iff A \le_M \bar{B}
$$
 since now, we are talking about mapping function $f$
Week 13
Computation History (Extension)
This will be another technique we can use to prove:
 undecidability
This computation history technique will be useful if
 The language you need to prove to be recognizable/unrecognizable/decidable/undecidable might not have an obvious TM.
 Then, you cannot use what we have done before: TODO: finish this
 need to use this technique
Another technique to prove unrecognizability.
Using this, we can prove the following are not decidable
Introduction to Complexity Theory
Now, our question comes:
How efficiently can we solve problems that we know are decidable?
In the context of complexity, we will only look at decidable problems, for computing their running time/complexity.
Definition:
Given a TM $M$, the running time/time complexity of $M$ is a function: $$
f: \N \to \N
$$ where:
 $f(n)$ is the max number of steps taken by $M$ on any input of length $n$.
 e.g. $f(12)$ is the maximum number of steps that a decider $M$ can run on all input of length $12$
 a step is one application of the transition function $\delta$
Note:
 Running time is a function on input length
 It measures the worst case over all inputs of each length
However, this means we need to figure out the formal definition of a TM machine, which will take quite a lot of time.
 so we need a way to still work with this even with high level definitions.
Definition:
We define a set of language: $$
\text{TIME}(t(n)) = \{ L \vert L \text{ is decidable by a TM that runs in time } O(t(n)) \} $$
 note that:
 e.g. $L \in \text{TIME}(n^3+n)$ if we can find one implementation of $M$ that has running time of $O(n^3)$.
Reminder:
For functions $f,g: \N \to \R^+$, we say that: $$
f(n)=O(g(n))
$$ if there exist positive constants $c, n_0$ such that:
$\forall n \ge n_0$, $f(n) \le c \cdot g(n)$
basically, $g(n)$ grows at least as fast as $f(n)$ for large $n$
As a result:
From the definition of $\text{TIME}(t(n))$:
it follows that if $t(n)=O(t'(n))$, then: $$
\text{TIME}(t(n)) \subseteq \text{TIME}(t'(n))
$$
 i.e. obviously, if a TM $M$ runs in time $O(t(n))$, then it also runs in time $O(t'(n))$
Definition:
The class of poly-time computable languages is defined as: $$
P = \bigcup_{k=1}^\infty \text{TIME}(n^k)
$$
note that this means:
e.g. a language $L \in P$ if it can be decided in time $O(n^{100})$
since $\log(n) = O(n)$, we also have $\text{TIME}(\log n) \subseteq P$
Now, this means:
 if we have different models of computation/implementation of TM, we might have different running time.
 therefore $\text{TIME}(t(n))$ is sensitive to our implementation.
 e.g. RAM (random access TM, being able to jump with its head) being transformed into an equivalent single tape, standard TM requires cubic overhead.
 However, the notion of polytime is robust to the above example, and many other variations.
 therefore, we will identify $P$ (polytime computable) with “efficient computation”.
 this might sound too broad, because $n^{100}$ does not sound very efficient
 however, it is broadly useful in the other direction: if $L \notin P$, then $L$ is definitely not efficiently solvable
 and even with a high exponent like $n^{100}$, we at least know the problem is solvable in polynomial time, and over time a better solution may be found.
In fact, one important property of $P$ would be:
 $P$ is invariant for all models of computation that are polynomially equivalent to the deterministic singletape Turing machine
 this covers most, if not all, reasonable models of computation, by the strong Church-Turing Thesis.
Strong ChurchTuring Thesis:
 Every reasonable/natural notion of algorithm (model of computation) can be simulated by a TM with polynomial overhead.
 i.e. all reasonable deterministic computational models are polynomially equivalent.
 That is, any one of them can simulate another with only a polynomial increase in running time
 this is more controversial than the original Church-Turing thesis
 the original Church-Turing thesis says: every reasonable model of computation can be simulated by a TM.
 this is quite widely accepted
Therefore:
An algorithm is poly-time if its running time is polynomial in its input length $n$: $$
f(n) = O(n^k) \text{ for some constant } k
$$ however:
 this means that the representation of an input matters:
 it turns out that the following representations differ from one another by only a polynomial overhead:
 binary
 hex
 decimal
 …
 except for unary representation, which is exponentially longer
Fact:
If you have an algorithm that takes an integer as input and runs in poly-time in the length of the input, then the input should be represented in some base $b \ge 2$: a number $N$ then takes only about $\log(N)$ digits.
However, if we use unary, then $n = N$, and "poly-time in the input length" becomes exponentially weaker.
proof sketch:
suppose the running time is $$
O(n^k), \text{where $k$ is some constant}
$$
to represent a number $N$ in base $b$, with $b \neq 1$, we need about $n = \log_b(N)$ digits, i.e. $$
N \approx b^n
$$
where:
 $b$ is the base (e.g. base 2 for binary)
 therefore, with $n=\log_b(N)$, the running time is $$
f(N)=O((\log_b N)^{k})
$$ which is polylogarithmic in the value $N$; in unary ($n = N$), the same bound would instead allow time polynomial in $N$ itself.
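The gap between representations can be checked numerically (a minimal sketch; the `math.log2` rounding used here is safe only because $N$ is modest):

```python
import math

# length of the input for the same number N, under two representations
N = 1_000_000
binary_len = len(bin(N)) - 2   # digits in base 2: about log2(N)
unary_len = N                  # digits in unary: N itself

# base-2 input length is logarithmic in N; unary is exponentially longer
assert binary_len == math.floor(math.log2(N)) + 1
assert unary_len > 2 ** (binary_len - 1)
```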
For example:

Question
For the grade-school algorithm for checking whether a given input is prime, is it a poly-time algorithm?
 assuming that checking whether a number $a$ divides another number $b$ can be done in poly-time (which is true).
 assuming the input format is not unary

Solution
A simple algorithm would be:
where:
 first, prove that the above is implementable (each step is computable), which it is in this case
Running time in $\vert x\vert$ (the length of the input):
 Each iteration of the loop is poly-time in $\vert x\vert$ (since division is poly-time)
 the number of iterations is about $\sqrt{x}$
 since $x \approx 2^{\vert x\vert }$ in binary
 or $x \approx k^{\vert x\vert }=2^{\log_2(k)\cdot \vert x\vert }=2^{O(\vert x\vert )}$, for $k$ being any base except unary
 hence: \(\sqrt{x}=2^{O(\vert x\vert )/2}=2^{O(\vert x\vert )}\)
Therefore, the runtime of this algorithm is in fact exponential in the input length: \(2^{O(\vert x\vert )}\cdot \mathrm{poly}(\vert x\vert )\), which is not polynomial

However, for this problem of $\text{PRIME}$, it turns out that there is actually a polytime algorithm (the AKS primality test), which is quite sophisticated and not shown here.
 so in fact, $\text{PRIME} \in P$ even if the above construction doesn’t work to be in $P$.
Examples of PolyComputable
Basically, from now on we are:
 only talking about decidable languages
 i.e. by default, our algorithms will have boolean (accept/reject) outputs
 however, in the end it does not matter, as long as it halts
 algorithms are efficient if they run within polytime
P vs NP
 P: intuitively, language that can be decided in polytime
 NP: intuitively, language that can be verified in polytime
Verifier Definition
A verifier for a language $A$, is a TM (algorithm) $V$, s.t. $V$ gets input $(x,c)$ and outputs accept/reject, where: $$
\[ i.e. $V$ is a verifier for $A$, if $A$ consists of all strings $x$ where there **exists some $c$ (certificate/"solution")**, such that $V(x,c)$ accepts.  so\]
A={ x \text{$V(x,c)$ accepts for some $c$} } \begin{cases} x \in A \Rarr \exists c, & V(x,c)\,accepts\\ x \notin A \Rarr \forall c, & V(x,c)\,rejects \end{cases} $$
Polytime Verifier
 A polytime verifier for a language $A$ is a verifier $V$, such that $V$ runs in time $O(\vert x\vert ^k)$, i.e. polytime in the length of $x$
Therefore:
Polytime Verifiable
 A language $L$ is polytime verifiable if $\exists$ a polytime verifier $V$ for $L$.
Hence the formal definition of P and NP are:
Formal Definition of P and NP $$
NP = \{ L \mid \text{$L$ is polytime verifiable in the length of the input} \}
$$ $$
P = \{ L \mid \text{$L$ is polytime decidable in the length of the input} \}
$$
For example:

Consider the language: $$
\text{Composite}=\{ \lang x\rang \mid \text{$x$ is not prime} \}
$$ where:
 I claim that this language is in NP

Proof:
There is a simple polytime verification algorithm:
 recall that the verifier will get $\lang x, c\rang$
 where $c$ would be some “proof/solution”, which might or might not be correct
 the verifier works because:
 if $x$ is not prime then there will be some “solution” $c$ that works
 if $x$ is prime, then all “solutions” $c$ will not work
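Such a verifier can be sketched as follows, taking the certificate $c$ to be a claimed nontrivial divisor of $x$ (one possible certificate format):

```python
def V_composite(x, c):
    """Verifier for Composite: accept iff c is a nontrivial divisor of x."""
    return 1 < c < x and x % c == 0
```

If $x$ is composite, some divisor $c$ makes $V$ accept; if $x$ is prime, every $c$ is rejected, matching the two cases above.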
Now, we need to show 2 things:

$V$ runs in time $poly(\vert x\vert )$
 every step is in polytime

and prove that such a construction works
For example:

Question
Is the language $\text{PRIME}$ also in NP?

Solution
It turns out yes, because we know that $\text{PRIME}$ can be solved in polytime, so we can just wire that solution in:
where:
 we basically ignored $c$, and solved it directly.
 but this is like a "trick" $NP$ problem, because every problem in $P$ is also an $NP$ problem.
Claim: $$
P \subseteq NP
$$
 where:
 we basically see this in the example above
Proof:
If $L \in P$, there is a polytime decider $M$ for $L$. Construct the verifier $V(x,c)$ that ignores $c$ and simply runs $M$ on $x$; then $V$ is a polytime verifier for $L$, so $L \in NP$.
Claim:
However, this is not known: $$
NP \subseteq P \;\;\text{or}\;\; NP \nsubseteq P\ ?
$$ where:
 if $NP \subseteq P$, then $P=NP$
 one consequence would be that cryptography is dead
 if $NP \nsubseteq P$, then $P \neq NP$
Definition
 exponential time is defined as:
\[\text{EXP} = \bigcup_{k=1}^\infty \text{TIME}(2^{n^k})\]
Claim: $$
P \subseteq NP \subseteq \text{EXP}
$$
Proof:
We have already seen that $P \subseteq NP$
We then just need to show $NP \subseteq \text{EXP}$
 If $L \in NP$, then there exists a polytime verifier $V$ for $L$, such that $V(x,c)$ runs in time $O(\vert x\vert ^k)$
 note that, since $V(x,c)$ runs in polytime, it immediately follows that $\vert c\vert \le O(\vert x\vert ^k)$: if $c$ were any longer, then simply reading the input $c$ would already exceed the polynomial time bound, contradicting that $V$ is a polytime verifier
 Let us fix a constant $a$, such that $V$ runs in $a\vert x\vert ^k$ time, for some constant $a,k$.
 Then, we just check all $c$ up to $\vert c\vert \le a\vert x\vert ^k$
where:
 this is a loop over all strings $c$ with $\vert c\vert \le a\cdot \vert x\vert ^k$, of which there are $2^{O(\vert x\vert ^k)}$
where the runtime analysis would be: $2^{O(\vert x\vert ^k)}$ iterations, each running $V$ in time $O(\vert x\vert ^k)$, for a total of $2^{O(\vert x\vert ^k)}$ steps, so $L \in \text{EXP}$
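The brute-force search over certificates can be sketched as follows; the toy verifier and the constants $a, k$ are illustrative assumptions:

```python
from itertools import product

def exp_decider(x, verifier, a=1, k=1):
    """Decide membership in L by trying every certificate c
    with |c| <= a * |x|**k against the polytime verifier."""
    bound = a * len(x) ** k
    for length in range(bound + 1):
        for bits in product("01", repeat=length):
            if verifier(x, "".join(bits)):
                return True   # some certificate worked: x in L
    return False              # all certificates failed: x not in L

# toy verifier: accepts iff the certificate is the substring "11" of x
V = lambda x, c: c == "11" and c in x
```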
Additionally:
Claim: $$
P \subsetneq \text{EXP}
$$
 this is provable (via the time hierarchy theorem); however, we will not prove it here
Consequently, we now do not know where $NP$ fits:

the following cases could all be possible $$
\begin{cases} P = NP \subsetneq EXP \\
P \subsetneq NP \subsetneq EXP \\
P \subsetneq NP = EXP \\
\dots \end{cases}$$
Week 14
NP Complete Class
Heuristics:
 the NP Complete Class consists of languages with the property:
 if any $L \in \text{NPComplete}$ can be solved in polynomial time, then $P = NP$
The NP Complete Class plays such a role:
where:
 $\text{NPComplete}$ is of course in $NP$, and we have only two possible cases
 if we show that $L \in \text{NPComplete}$ is also $L \in P$ (can be solved in polytime) then every other language in $NP$ would also be in $P$
 intuition: those $L \in \text{NPComplete}$ would be languages that are “hardest” in $NP$. If you can solve the hardest one, then you can use it to solve the easier ones/all in $NP$.
Using Reductions
Definition
A language $A$ is polytime reducible to $B$, ($A \le_P B$), if $\exist$ a polytime computable function $f: \Sigma^* \to \Sigma^*$, such that $$
x \in A \iff f(x) \in B
$$ where again we are just mapping inputs from one language into another:
 i.e. this is basically the same as mapping reduction
Theorem
 If $A \le_P B$ and $B \in P$, then $A \in P$.
 i.e. by definition, we can map inputs of $A$ to inputs of $B$, and since $B$ is easy (polytime solvable), so is $A$
 If $A \le_P B$, and $B \in NP$, then $A \in NP$
Proof
For If $A \le_P B$ and $B \in P$, then $A \in P$.
Suppose $A \le_P B$, then there is a polytime algorithm $F$ computing a function $f$ such that $$
x \in A \iff f(x) \in B
$$ Suppose $B \in P$, then there is a polytime algorithm $M_B$ that decides $B$.
I use these two elements to construct an algorithm $M_A$ that decides $A$:
 $M_A$ = "On input $x$: compute $f(x)$ using $F$; run $M_B$ on $f(x)$; output the same answer"
Now, we need to show that:
 the construction is in polynomial time
 since every step above is polytime, and there is no loop
 show that the construction decides $A$: $x \in A \iff f(x) \in B \iff M_B$ accepts $f(x)$ $\iff M_A$ accepts $x$
This completes the proof.
Proof
If $A \le_P B$, and $B \in NP$, then $A \in NP$
This is simple, in fact similar to above
Suppose $A \le_P B$, then I get a polytime computable function $f$ with $$
w \in A \iff f(w) \in B
$$
Suppose $B \in NP$, then I get a polytime verifier $V_B(x,c)$ that verifies $B$.
I construct a verifier for $A$:
$V_A:$ On input $\lang w, c\rang$,
 Compute $f(w)$
 Run $V_B(f(w), c)$
 output same
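This composition can be sketched directly; the toy reduction $f$, verifier $V_B$, and languages below are illustrative stand-ins:

```python
def make_verifier_A(f, V_B):
    """Build a verifier for A out of a reduction f (A <=_P B)
    and a verifier V_B for B."""
    def V_A(w, c):
        return V_B(f(w), c)   # compute f(w), run V_B, output the same answer
    return V_A

# toy example: A = binary strings of even length, B = {1^n : n even},
# f(w) = 1^{|w|}, certificate c = 1^{|x|/2}
f = lambda w: "1" * len(w)
V_B = lambda x, c: set(x) <= {"1"} and x == c + c
V_A = make_verifier_A(f, V_B)
```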
Now, I prove that this is polytime and this works.
 this is skipped
Note:
 all that matters for a verifier is that, for an input $w \in L$, there $\exists c$ (no matter what format) that works
What is the use of this:
 This means that if a language $L$ is at least as hard as every other language in $NP$, and we show that $L$ can be solved in polytime, then $P = NP$.
Definition
 A language $L$ is $\text{NPHard}$ if $\forall A \in NP$, $A \le_P L$.
 so that if we can solve $L$ in polytime, then all the $A$ can be reduced to solve in polytime
Definition of NP Complete
 A language $L$ is $\text{NPComplete}$ if:
 $L \in NP$
 $L \in \text{NPHard}$
Theorem:
 If $A, B$ are $\text{NPComplete}$, then $A \le_P B$ and $B \le_P A$
 this follows from the definition of NPCompleteness
Proof
 If $A,B$ are both NPComplete, then
 $A,B \in NP$
 $A,B$ both NPHard
 I show first $A \le_P B$:
 since $B$ is $\text{NPHard}$, every language in $NP$ reduces to $B$; since $A \in NP$, this gives $A \le_P B$
 similarly, show $B \le_P A$:
 since $A$ is $\text{NPHard}$ and $B \in NP$, we get $B \le_P A$
Theorem:
 If $A \le_P B$, and $A$ is $\text{NPHard}$, then $B$ is $\text{NPHard}$
 i.e. if $B$ is at least as hard as $A$, which is already $\text{NPHard}$, then $B$ is $\text{NPHard}$
 this will be mostly used for proving problems NPComplete
 so that we first show $B \in NP$, then we use this for the reduction
Proof
Suppose \(A \le_P B\). Then I can have a polytime computable function $f$ such that: $$
x \in A \iff f(x) \in B
$$
Suppose $A$ is $\text{NPHard}$
Now, to prove $B$ is $\text{NPHard}$:
 Take any arbitrary language $C \in NP$
 need to show $C \le_P B$
Because $A$ is $\text{NPHard}$, we have $C \le_P A$. This means that I get a polytime computable function $g$: $$
w \in C \iff g(w) \in A
$$ Now, since I know $C \le_P A$ and $A \le_P B$, I can show $C \le_P B$ by constructing the **mapping (using transitivity)**: $$
w \in C \iff f(g(w)) \in B
$$ where $f \circ g$ is polytime computable, since the composition of two polytime computable functions is polytime computable
this completes the proof
This means that, to show another language is $\text{NPHard}$:
 reduce from $Ham_{cycle}$ to that language $L$
 then $L$ is also $\text{NPHard}$
But the hard part is:
 how to prove the first language (e.g. $Ham_{cycle}$ here) is $\text{NPHard}$
In general, to prove a language is $\text{NPComplete}$:
 Show that the language $L \in NP$, by giving a polytime verifier
 Show that, for some $A \in \text{NPHard}$, we can reduce $A\le_P L$
 i.e. $L$ is harder than (at least as hard as) the $\text{NPHard}$ problems
Claim:
 $Ham_{cycle}$ is $\text{NPComplete}$
 proof skipped.
Closure Properties
Theorem:
 $P$ is closed under complement
Proof:
 If $L \in P$, then there is a polytime decider $M_L$ for $L$.
 I just flip the accept/reject of the decider to get a polytime decider for $\overline{L}$
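A sketch of this flip (the toy decider $M$ is illustrative):

```python
def complement_decider(M):
    """Given a polytime decider M for L, return a decider for the
    complement of L by flipping accept/reject."""
    return lambda x: not M(x)

# toy: M decides strings of even length
M = lambda s: len(s) % 2 == 0
M_bar = complement_decider(M)
```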
However,
 we do not know whether $NP$ is closed under complement
 i.e. don’t know whether $NP=CoNP$
 e.g. try coming up with a verifier that tells whether a graph $G$ does *not* have a Hamiltonian cycle
Examples of NPComplete
For example:

Question
Given an undirected graph $G$, tell whether $G$ has a Hamiltonian Cycle (a cycle that goes through each node exactly once): $$
Ham_{cycle}=\{ \lang G\rang \mid \text{$G$ is an undirected graph and $G$ has a Hamiltonian Cycle} \}
$$ this problem is in NP

Proof
Consider the following algorithm $V$:
 where basically it needs $c$ to contain all vertices
 assuming $c$ contains the “solution” as an ordered list of vertices

where now, we just need to check that we can follow the given sequence of vertices using the edge set $E$
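Such a verifier can be sketched as follows (assuming, as above, that $c$ is an ordered list of vertices and that edges are given as pairs):

```python
def V_ham(vertices, edges, c):
    """Verifier for Ham_cycle: accept iff c lists every vertex exactly once
    and each consecutive pair (including the wrap-around) shares an edge."""
    E = {frozenset(e) for e in edges}
    if sorted(c) != sorted(vertices):
        return False
    return all(frozenset((c[i], c[(i + 1) % len(c)])) in E
               for i in range(len(c)))
```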
Therefore, we have shown that this language is in NP.
(In fact, $Ham_{cycle}$ is NPComplete)
However
 We do not know if $Ham_{cycle} \in P$, but we do know that $Ham_{cycle} \in EXP$
Proof Sketch for $Ham_{cycle} \in EXP$
 In short, we need to give an algorithm that solves it in $EXP$
 The idea is simple: try all permutations of the vertices, of which there are $\vert V\vert !$ (exponentially many), and run the verifier $V$ we had above on each
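This exponential-time brute force can be sketched as (the edge-list representation is an assumption):

```python
from itertools import permutations

def ham_cycle_brute(vertices, edges):
    """Try all |V|! orderings; accept if any forms a Hamiltonian cycle."""
    E = {frozenset(e) for e in edges}
    for perm in permutations(vertices):
        # check each consecutive pair, plus the wrap-around edge
        if all(frozenset((perm[i], perm[(i + 1) % len(perm)])) in E
               for i in range(len(perm))):
            return True
    return False
```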
Another Example

Question
The sudoku problem $$
\text{sudoku}=\{ P \mid \text{$P$ is a $n^2 \times n^2$ grid, partially filled with integers from $1$ to $n^2$, such that there exists a valid solution} \}
$$ is NPComplete.

Solution
Skipped
Another Example

Question
The problem $$
\text{SUBSETSUM}= \{ (x_1, x_2, \dots, x_n, t) \mid x_1, \dots, x_n, t \text{ are integers such that there exists a nonempty $S \subseteq \{1,2,\dots,n\}$ with $\sum_{i\in S} x_i = t$} \}
$$ where:
 since $S$ is a set, the index inside cannot be repeated.
 However, the numbers could be repeated, such that $x_2=x_5$, for example
this problem is NPComplete

Solution
Another Example

Question
The problem $$
\text{SAT}=\{ \varphi \mid \varphi \text{ is a satisfiable Boolean formula} \}
$$ where:
 a boolean formula includes
 variables $(x_1, x_2, …x_n)$
 operators $(\land, \lor, \neg)$
 $\varphi$ is satisfiable if there exists a $T/F$ assignment to each variable, such that the output $\varphi$ will be $True$
this problem is NPComplete
Another Example

Question
The variation of $SAT$ $$
\text{CNFSAT}=\{ \varphi \mid \varphi \text{ is a satisfiable Boolean formula in CNF} \}
$$ where:
 a CNF (conjunctive normal form) has:
 variables $(x_1, x_2, …x_n)$
 literals $(x_i, \overline{x_i})$
 clauses: composed of literals $l_i$
 takes the form $(l_1 \lor l_2 \lor l_3 …)$
 only $\lor$ (OR) are included
 $\varphi$ is in CNF form if it is composed of clauses $C_i$
 $\varphi = C_1 \land C_2 \land C_3 …$
 only $\land$ (AND) are included between clauses
 for example:
 $(x_1 \lor \overline{x_2}) \land (x_2 \lor \overline{x_3} \lor x_4)$ is a CNF
this problem is also NPComplete
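Satisfiability of a CNF can be checked by brute force in exponential time; a sketch, using a DIMACS-style encoding (an assumption: the integer $i$ stands for $x_i$ and $-i$ for $\overline{x_i}$):

```python
from itertools import product

def brute_force_sat(clauses, n):
    """Try all 2^n assignments; accept iff some assignment satisfies
    every clause (each clause = list of nonzero ints, DIMACS-style)."""
    for assign in product([False, True], repeat=n):
        if all(any(assign[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False
```

For example, $(x_1 \lor \overline{x_2}) \land (x_2 \lor \overline{x_3} \lor x_4)$ becomes `[[1, -2], [2, -3, 4]]`.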
Another Example

Question:
The problem/variation of $SAT$: $$
\text{3SAT} = \{ \varphi \mid \varphi \text{ is a satisfiable Boolean formula in 3CNF} \}
$$ where:
 $3CNF$ is basically the CNF of a boolean formula, except that:
 each clause of $\varphi = C_1 \land C_2 \cdots$ consists of exactly $3$ literals
is also NPComplete
Examples using NPComplete Reductions
Theorem
 $SAT$ is NPComplete
 $SAT \le_P \text{CNFSAT} \le_P \text{3SAT}$
 as a result, all of them are $\text{NPHard}$
 in fact, proving them to be $\in NP$ is easy (it is easy to build a polytime verifier for each)
 therefore, from the above, all of them can be proved to be NPComplete
Proof
$\text{CNFSAT} \le_P \text{3SAT}$
Remember, we are essentially doing a mapping reduction, so all we need to show is that
 given a CNF formula $\varphi$, I can convert/map it using a polytime computable $f$ to a 3CNF formula $\varphi'$, such that
 $\varphi$ is satisfiable $\iff$ $f(\varphi)=\varphi'$ is satisfiable
Now, the function would be:
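One standard choice for this function is clause splitting; a sketch (DIMACS-style integers; short clauses are padded by repeating a literal, long clauses are split using fresh variables):

```python
def cnf_to_3cnf(clauses, n):
    """Map a CNF (clauses of ints, variables 1..n) to an equisatisfiable
    3CNF: every output clause has exactly 3 literals."""
    out, fresh = [], n
    for clause in clauses:
        if len(clause) <= 3:
            # pad short clauses by repeating the first literal
            out.append((clause + [clause[0], clause[0]])[:3])
        else:
            # split (l1 v l2 v ... v lk) as
            # (l1 v l2 v z1) ^ (-z1 v l3 v z2) ^ ... ^ (-z v l_{k-1} v lk)
            fresh += 1
            out.append(clause[:2] + [fresh])
            rest = clause[2:]
            while len(rest) > 2:
                prev, fresh = fresh, fresh + 1
                out.append([-prev, rest[0], fresh])
                rest = rest[1:]
            out.append([-fresh, rest[0], rest[1]])
    return out, fresh
```

A satisfying assignment of the original formula extends to the fresh variables and vice versa, and the mapping clearly runs in polytime.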
Therefore, since this function is polytime computable, and it works, we have completed the proof.
NonDeterministic TM with Complexity
Definition
A nondeterministic TM runs in time $t(n)$, if for every input $x$ of length $n$, all possible computation branches on $x$ take at most $t(n)$ steps.
 i.e. even the worst sequence of nondeterministic guesses takes at most $t(n)$ steps, so the machine always halts within $t(n)$ steps
Definition
A nondeterministic time class is defined as: $$
\text{NTIME}(t(n))=\{ L \mid \exists \text{ a NTM $N$ that decides $L$, and runs in time $O(t(n))$} \}
$$
Theorem $$
NP=\bigcup_{k=1}^{\infty}\text{NTIME}(n^k)
$$
 i.e. to show that a language is in $NP$, instead of giving a verifier, we could show by giving a nondeterministic TM that actually solves it, with longest path being the order of $O(n^k)$.
Note:
 this only works for proving that a language is in $NP$.
Proof Sketch
forward direction $NP\implies \bigcup_{k=1}^{\infty}\text{NTIME}(n^k)$
The idea is that if $L \in NP$, then I can construct a NTM that runs in polynomial time:
If $L \in NP$, then I get a verifier $V$ that verifies $L$ in polynomial time
Construct:
N = “On input $w$ of length $n$:
 Nondeterministically select string $c$ of length at most $O(n^k)$
 Run $V$ on input $\lang w, c \rang$
 for a deterministic TM, this would be the same as testing all possible $c$ one by one and running $V$ on each
 If V accepts, accept ; otherwise, reject .”
where:
 basically you are just trying different $c$ simultaneously, on the premise that $\exist c$ that works
 by definition of $\text{NTIME}(n^k)$, this works because the longest branch will have ran the same time as $V \in O(n^k)$
 however, if we convert it to a deterministic TM, then the total run time would be $EXP$, which makes sense since $NP \subseteq EXP$
backward direction $\bigcup_{k=1}^{\infty}\text{NTIME}(n^k) \implies NP$
Assume you have a NTM $N$ deciding $L$, then construct verifier $V$ as follows
$V$ = “On input $\lang w, c \rang$, where $w$ and $c$ are strings:
 Simulate $N$ on input $w$, treating each symbol of $c$ as a description of the nondeterministic choice to make at each step (as we did for the proof of simulating multitape TM to singletape TM).
 i.e. $c$ tells us which branch to take
 If this branch of $N$’s computation accepts, accept ; otherwise, reject .”
Examples of NP using NTM
Though building a verifier $V$ to prove membership in $NP$ is usually easier, this construction of a NTM is sometimes useful. The overall idea is:
 the NTM $N$ will nondeterministically try all $c$
 if there is a certificate $c$ that works, then $\vert c\vert \in O(n^k)$
 hence we are just testing all branches up to depth $O(n^k)$, which would be $\in \text{NTIME}(n^k)$
For example

Question
The problem $\text{SUBSETSUM}$ is in NP, using a NTM.

Solution
$N$ = “On input $\lang S, t \rang$:
 Nondeterministically select a subset $c$ of the numbers in $S$.
 Test whether $c$ is a collection of numbers that sum to $t$.
 If the test passes, accept ; otherwise, reject .”
In fact, the verifier would also be simple, and perhaps more intuitive
$V$ = “On input $\lang \lang S, t \rang, c\rang$:
 Test whether $c$ is a collection of numbers that sum to $t$.
 Test whether $S$ contains all the numbers in $c$.
 If both pass, accept ; otherwise, reject .”
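This verifier can be sketched as follows (multiset containment checked via `collections.Counter`; the encoding is an assumption):

```python
from collections import Counter

def V_subset_sum(S, t, c):
    """Verifier for SUBSETSUM: c is a claimed nonempty sub-multiset
    of S whose numbers sum to t."""
    is_submultiset = not (Counter(c) - Counter(S))  # empty difference
    return len(c) > 0 and sum(c) == t and is_submultiset
```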
However, this NTM technique, developed in more depth, can be used for proving NPCompleteness without an explicit reduction
Example of NPComplete using NTM
CookLevin Theorem
 $SAT$ is NPComplete
Proof Sketch
 This uses the nondeterministic TM
 Cook and Levin proved that $\forall L \in NP$, $L \le_P SAT$
 if $L \in NP$, then there is a NTM $N$ deciding $L$, so $x \in L \iff$ $\exists$ an accepting computation (sequence of configurations) of $N$ on $x$
 let $\varphi$ be a formula that is satisfiable $\iff$ $\exist$ an accepting sequence of configurations (an accepting branch in the configuration tree)
More detailed Proof Sketch
 Showing that $SAT$ is in $NP$ is easy, and we do so shortly. The hard part of the proof is showing that any language in $NP$ is polynomial time reducible to $SAT$.
 To do so, we construct a polynomial time reduction for each language $A$ in $NP$ to $SAT$. The reduction for $A$ takes a string $w$ and produces a Boolean formula $\varphi$ that simulates the NP machine for $A$ on input $w$. If the machine accepts, $\varphi$ has a satisfying assignment that corresponds to the accepting computation. If the machine doesn’t accept, no assignment satisfies $\varphi$.
 Therefore, $w$ is in $A$ if and only if $\varphi$ is satisfiable.
 Actually constructing the reduction to work in this way is a conceptually simple task, though we must cope with many details. A Boolean formula may contain the Boolean operations AND, OR, and NOT, and these operations form the basis for the circuitry used in electronic computers. Hence the fact that we can design a Boolean formula to simulate a Turing machine isn’t surprising. The details are in the implementation of this idea.
Tips
General Tips:
 During proofs, the length of a string/word could be a useful parameter
 e.g. assume $w$ is a word in $L$ of smallest length.
Tips on Proving Regularity with DFA/NFA
If you are given a regular language, and asked to prove that some modification of that language is also regular:
 List out the constraints/rules you have to fulfill
 e.g. at the second state, I should have had stored the information/string of $a\circ b$.
 Try to sum them up in one line
 Using the modification summarized above, draw the modified DFA/NFA
 one thing that is usually a correct step is to make sure the original DFA/NFA is intact
 this is to account for potential loops/cycles
 more than one connection to a specific state
 Write out formally
Tips on Proving If and Only If
If proving the claim, $\text{Claim 1} \iff \text{Claim/Condition 2}$
 Show that, if I have $\text{Claim/Condition 2}$, then putting it into $\text{Claim 1}$, it is true.
 this part usually is trivial, and sometimes can be proved by words/logics
 Conversely, if I have $\text{Claim 1}$, then it implies/derives to $\text{Claim 2}$.
 this part usually needs the help of variables, such as a word $w$, and splitting it up into $w_1 \circ w_2$, and etc.
Tips on Converting Regex to NFA
If given a regular expression, and need to convert to NFA:
 For example, if the alphabet is ${a,b}$
 Start with drawing NFAs $L(a)$ and $L(b)$ for each symbol in the alphabet as the base case
 Draw an NFA for a regular expression using only one of $\circ, \cup, *$, such as $(a \cup b)$
 Do the above recursively
 here, you could either choose to process it in one go, straight from left to right
 or, you could split the expression into smaller ones such as $R_1 \circ R_2$, and then assemble them
Tips on Writing Regular Expressions
If given a language in words, you need to write a regular expression for it:
 Break the conditions down to smaller conditions.
 for example, no two consecutive symbols are the same $=$ alternating + all 1s + all 0s
 Write out the smaller conditions in diagram/figure out the expression
 Assemble the expressions together
Tips on Proving Regular Expressions
If given a regular expression $\alpha$, and we need to prove some properties of it:
 Usually, using Induction will work
 Show that it is true for $\vert \alpha\vert =1$
 $L(a)$
 $L(\epsilon)$
 $L(\empty)$
 Assume that the property is true for all $\vert \alpha\vert \le k$ (strong induction)
 Show that, for $\vert \alpha\vert =k+1$, we can have:
 $\alpha = (\alpha_1 \circ \alpha_2)$, $L(\alpha) = L(\alpha_1)\circ L(\alpha_2)$ still has that property
 notice that $\vert \alpha_1\vert ,\vert \alpha_2\vert$ are at most $k$, so the inductive assumption from the previous step can be used
 $\alpha = (\alpha_1 \cup \alpha_2)$, $L(\alpha) = L(\alpha_1)\cup L(\alpha_2)$ still has that property
 $\alpha = (\alpha_1)^*$, $L(\alpha) = L(\alpha_1)^*$ still has that property
 for this, you can split the word into $w = w_1 \circ w_2 \circ \dots \circ w_n$, where each $w_i \in L(\alpha_1)$, and $n$ can be arbitrarily large by definition
If given a regular expression $\alpha$, and we need to prove that it is the same as $\beta$
 Unless you can manipulate the expression directly, you will need to do the following
 Prove that $\alpha \subseteq \beta$ (strings generated from $\alpha$ is a subset of strings generated from $\beta$)
 sometimes, one of the two will be straightforward and you can say by words
 Prove that $\beta \subseteq \alpha$
 usually, along the way you will need to create several other $\subseteq$ sets for reaching the conclusion
Tips on Constructing ContextFree Grammar
If given a language that has some requirement on the length of the string or its subsubstrings, we can think about the following:
 Split the question into its essential, smaller parts
 e.g. if we needed the string $0^i1^{i+j}0^j$, then we could use the idea:
 first part is to get $0^i1^i$, then get $1^j0^j$,
 so you have $\begin{cases}S \to UT\\ U\to 0U1 \vert \epsilon \\ T\to 1T0 \vert \epsilon\end{cases}$
 Since each variable is contextfree, what can a variable know?
 e.g. $w$ starts with a $1$ and has a middle element $0$
 this implies the $1$ and $0$ need to be hardcoded, since a variable has no idea whether it is at the front or in the middle of the string
 What must happen when you add exactly one more variable? (Think about it in terms of symmetry, what part does not matter (could be any symbol), and what part matters)
 e.g. $w$ starts with a $1$ and has a middle element $0$ and is of odd length
 when one more variable is added, it must exactly add one symbol to the front and back:
 $\begin{cases}S \to 1T0 \vert 1T1\\ T\to UTU \vert 0 \\ U \to 1\vert 0\end{cases}$
 where $T\to UTU\vert 0$ populates the string to have $UUU…0UUU…$
Tips on Constructing TM Reduction
If we need to reduce a language $B$ to a language $A$:
 if reducible, get a decider for $B$.
 know the input of $A$ and $B$
 show, towards contradiction, that you can construct a decider for $A$, which is known to be undecidable
 i.e. having a decider for $B$ means too much power
Remember, the construction will look like:
R: on <input of A>:
1. Construct <input of B>:
 ...
2. Run decider for B on <input of B>:
 if B accept, ...
 if B reject, ...
 First think about the input of `<input of A>`
 e.g. reduce $E_{TM}$ to $A_{TM}$, (construct a decider for $A_{TM}$) then the input is for $A_{TM}$
 Think about the input of `<input of B>`, which goes to the assumed decider
 Think about what we need to map:
 e.g. if we need to reduce $E_{TM}$ to $A_{TM}$
 map $\lang M, w\rang \in A_{TM}$ to
 construct $\lang M’\rang \in E_{TM}$ or
 construct $\lang M’\rang \notin E_{TM}$ (easier in this case)
 map $\lang M, w\rang \notin A_{TM}$ to
 construct $\lang M’\rang \notin E_{TM}$ or
 construct $\lang M’\rang \in E_{TM}$ (easier in this case)
Tips on Diagonalization for Undecidability
It will only work nicely on some problems, and the idea is just three steps:
 Assume that a decider exists. Then we can get a table.
 Formalize the input/output into a table
 Flip the diagonal, and verify this is not in the table
 prove that what you produced is constructable/exists
For example:

assume that such an oracle exists, then we have a decider and a table

To prove $A_{TM}$ is not decidable, we first constructed a table
 where instead of putting $w$, we used $\lang M\rang$

Then we flipped it, and verify it is not in the table

We show that the new program exists, by constructing it
Tips on Constructing Mapping Reduction
Think in the reverse. If we have $L_1 \le_M L_2$:
 We need to construct a $f$ that:
 $input \in L_1$ $\implies$ $output \in L_2$
 $input \notin L_1$ $\implies$ $output \notin L_2$
Appendix
List of Known TM Languages
So far, we have encountered the following:
TM recognizable
 $A_{TM} = \{ \langle M,w\rangle \vert \text{$M$ is a TM, and $M$ accepts $w$} \}$
 $NE_{TM}$
TM decidable
 $A_{DFA}$
 $A_{NFA}$
 $E_{DFA}$
 $EQ_{DFA}$
 $A_{CFG}$
 $A_{regular}$
TM not recognizable
 $\overline{A_{TM}}$
 $E_{TM}$
 $ALL_{TM}$
 $\overline{ALL_{TM}}$
TM not decidable
 $A_{TM}$
 $Halt_{TM} = \{\lang M,w\rang \vert M \text{ halts on input $w$} \}$
 $E_{TM}$
 $EQ_{TM}$
 $REG_{TM}$
 $EQ_{CFG}$
 any language that depends on a nontrivial property of the language recognized by a TM (by Rice's theorem)