online advertising

Monday, March 20, 2017

NP completeness (3) - Show that Set Cover is NP complete

Problem 1 - Prove that Set Cover is in NP.

We will use the set ids as a certificate. The number of set ids is at most $ n $ and therefore the size of the certificate is $ O(n) $ and is polynomial in the size of the problem.

To verify the solution, we can check that there are $ k $ set ids, and their union is the universal set. Checking there are $ k $ set ids take $ O(1) $ time, and checking if the union covers the set take at most $ O(k) \times O(|P|) $ time, where $ |P| $ is the problem size, therefore, we can check if a certificate is valid in polynomial time.

Therefore k-SET Cover is in NP.

Problem 2 - Show that Set-Cover in NP complete

In problem 1 - we have already shown Set-Cover is in NP, so all we have to do is to reduce Vertex Cover to Set Cover. The key idea is that we map an edge to be an element in the set cover problem, and a node into a set of edges that it connects to. Now, the problem of choosing a subset of node to cover all edges becomes choosing the set of edges corresponding to the node to cover the edges, the latter is obviously a set cover problem.

Suppose there exists a set-cover of $ k $ sets, we know each of these $ k $ set corresponding to exactly one node. We claim that the nodes corresponding to the set is indeed a vertex cover. We know that the $ k $ set cover all edges, in particular, for any edge $ e $, there exists at least one set $ x $ cover it. The node $ n $ corresponding to $ x $ must have $ e $ connected to it, so $ e $ is covered. Because that must be true for all edges, we conclude when the set cover problem found a set cover, we also find a vertex cover.

Suppose there does not exist a set-cover of $ k $ sets, and suppose there does exist a vertex-cover of $ k $ vertices, we will derive a contradiction. That will conclude our proof because we show that if set-cover reports there is no set-cover, there is no vertex cover. Consider the sets associated with the $ k $ vertices, this must include all edges because it is a vertex cover, therefore, there does exist a $ k $ set-cover, so we reached the desired contradiction.

Last but not least, the mapping from a node to a set can spend at most $ |p|^2 $ time (in the worst case of a complete graph), therefore the reduction takes polynomial time.

To summarize, we have shown:

  • Set cover is in NP - in problem 1
  • A vertex cover problem instance can transform into a to set cover problem instance in polynomial time - the last part above
  • The transformation preserve yes instances - the first proof
  • The transformation preserve no instances - the second proof

Therefore we gladly conclude set-cover in NP complete!

Wednesday, March 8, 2017

NP completeness (2) - Show that k-Clique is NP complete

It is easy to show k-Clique is in NP, given a k-clique, we can use it as the certificate. The size of the certificate is at most the number of nodes, which is polynomial of the input size. We need to verify that it is indeed a k-clique, which can be simply checked by testing all pairs are edges, which take times $ O(E) $ and is certainly polynomial in time in the input size.

To prove that it is NP complete, we reduce the independent set problem to it.

Suppose we need to solve the problem that whether or not a graph of $ |V| $ node contains an independent set of $ k $ nodes, we can solve the problem by asking whether or not the complement of the graph has a clique of $ k $ nodes. Note that the complement $ \bar{G} $of a graph $ G $ is defined by having the same vertex set, $ u $ and $ v $ are connected as an edge in $ \bar{G} $ if and only if they are not an edge in $ G $

Suppose there exists a clique of $ k $ nodes in $ \bar{G} $, we conclude that the same set of nodes is an independent set of $ G $. This is obvious because those nodes are all directly connected in $ \bar{G} $, and therefore not connected to each other in $ G $.

Suppose there does not exists a clique of $ k $ node in $ \bar{G} $, we conclude that there does not exists an independent set of $ k $ nodes, for if it does, it would have been the $ k $ clique in $ \bar{G} $.

With that, we can reduce the independent set problem to $ k $ clique, therefore $ k $ clique is NP complete.

NP completeness (1) - Vertex Cover

Problem:

Prove that Vertex Cover is NP complete.

Solution:

We will assume the independent set problem is NP complete.

It is easy to show Vertex Cover is in NP, given a vertex cover, we can use it as the certificate. The size of the certificate is at most the number of nodes, which is polynomial of the input size. Given the certificate, we can simply iterate through the set of all edges and check that the given node indeed cover all edges, such an algorithm that $ O(E) $ time and is certainly polynomial in time in the input size.

To prove that it is NP complete, we reduce the independent set problem to it.

Suppose we need to solve the problem that whether or not a graph of $ |V| $ node contains an independent set of $ k $ nodes, we can solve the problem by asking whether or not we have a vertex cover of $ |V| - k $ nodes.

Suppose there exists a vertex cover of $ |V| - k $ nodes, we claim the complement of the vertex cover is an independent set. Indeed, if the complement is not an independent set, then there exists an edge such that both end points is in the complement, that means both end points were not in the vertex cover, of course that is impossible because a vertex cover must cover all edges.

Suppose there does not exist a vertex cover of $ |V| - k $ nodes, then we claim that there does not exist an independent set of $ k $ node. If there does exist an independent set of $ k $ nodes, we claim that the complement of the independent set is a vertex cover. Indeed, if the complement is not a vertex cover, then there must exists an edge that is does not cover, but that would mean both ends on that edge is included in the independent set, which is clearly impossible.

With that, we conclude that we can reduce independent set, a known NP complete problem, to vertex cover, therefore, vertex cover is NP complete too.

Monday, February 27, 2017

Recursion Theorem in Practice - building a quine in C#

The goal of this post is to share my experience building a C# program that output its own source. This is often known as a quine.

According to the recursion theorem, a Turing machine can reference its own representation. So a quine, if built, serve as an example to show recursion theorem is true in this case.

Without further ado, here is the C# program I built. It can generate it an identical copy of its own source code to the console.

class Program
{
    static void Main()
    {
        string tape = @"        string a = @""class Program
{
    static void Main()
    {
        string tape = @"""""" + tape.Replace(""\"""", ""\""\"""") + @"""""";
"";
        System.Console.WriteLine(a + tape);
    }
}";
        string a = @"class Program
{
    static void Main()
    {
        string tape = @""" + tape.Replace("\"", "\"\"") + @""";
";
        System.Console.WriteLine(a + tape);
    }
}

The coloring is intended to show how this quine is constructed. The red part of the program is intended to construct the blue part of the program. In fact, the whole highlighted part in the red program is really just a string literal that is the same as the blue part. Therefore the definition of the red part depends on the blue part.

Now the blue part must construct the red part and concatenate them together. But we have a small problem here. The blue part's definition cannot depend on the red part's, otherwise we have a dependency cycle. The key is that we compute the red part instead. That is why we used three strings to compute a variable named a, and once we get the red part, we get the blue part from the tape variable, and therefore we completed constructing the whole source code, now we output it!

Try to run the program and use a diff tool to compare the source code, they are identical! Not a space is missing!

Wednesday, January 4, 2017

Languages

Problem:

Mixing it up: Let $ F $ denote some finite language, $ R $ denote some regular language, $ C $ denote some context-free language, $ D $ denote some decidable language, $ E $ denote some
recognizable language, and $ N $ denote some non-recognizable language. For each one of the
following statements, prove whether it always holds, sometimes holds, or never holds:


$ N - D $ is recognizable

Short answer: never

Support I could enumerate $ N - D $, I can also enumerate $ D $ because $ D $ is decidable. Now I can dovetail to enumerate $ N = (N - D) \cup D $, therefore I should not be able to enumerate $ N - D $.

$ RE $ is context free 

Short answer: sometimes

A finite language is recursive enumerable and also context free.
A recursively enumerable language include $ a^n b^n c^n $, which is not context free.

$ N \cap F $ is decidable

Short answer: always

$ N \cap F $ is finite, and a finite language is always decidable.

$ E \cup F $ is recognizable

Short answer: sometimes

E is a recognizable language, it could be finite. F is finite, so E union F could be finite, and a finite language is regular.

E could be $ a^n b^n $, and $ F $ could be the empty language, so $ E \cup F $ is $ a^n b^n $ and is not regular.

$ N^* $ is non recognizable

Short answer: always

If $ N* $ is recognizable, we will have a Turing machine that halt on $ N* $. We can modify that machine such that whenever it reach a final state, any input will go to a state that just consume the input and do nothing and never halt, that would become a recognizer for $ N $.
Therefore $ N^* $ is not recognizable.

$ N \cup C $ is finite

Short answer: never

If $ N \cup C $ is finite, then $ N $ is finite, but finite language is always recognizable.

Coloring Cyborgs

Problem:

Two cyborgs walk into your home, both claiming to be oracles for the graph 3-colorability decision problem. They both always give a yes/no answer in constant time for any instance, and are each self-consistent (i.e. each always gives the same answer for the same instance). However, one is a true oracle and the other is a shameless impostor, and you have a large instance of 3-colorability upon which they disagree. Prove whether it is possible to expose the impostor within time polynomial in the size of that instance.

Short Answer:

The key idea is to use the oracle (that only tell me yes or no) to output the coloring. The one that gives a valid coloring is the right one.

Proof:

We need a gadget for that, first, we add three nodes (call them red - green - blue ) as a triangle to the graph. If we wanted to force a node to be red, simply connect green and blue to it.

Now we have a mechanism to force a node to have a certain color - now we can guess the color of the nodes. Here is an algorithm to extract the 3 coloring from it.

For the cyborg that said the graph is 3 colourable, we run this algorithm:

while (there exists uncolored node)
{
  color it red
  if (the colored graph is not 3 colourable)
  {
    color it green instead
    if (the colored graph is not 3 colourable)
    {
      color it blue instead
      if (the coloured graph is not colourable)
      {
        output "you are fake oracle!";
      }
    }
  }
}
output you are true oracle!

The algorithm do at most 3 coloring attempt for each node, so it is obviously polynomial time!

A Turing-recognizable language consisting of descriptions of Turing machines

Problem:

Let $ A $ be a Turing-recognizable language consisting of descriptions of Turing machines, $ \{(M_1 ), (M_2), \cdots \}$, where every $ M_i $ is a decider. Prove that some decidable language $ D $ is not decided by any decider $ M_i $ whose description appears in $ A $.
(Hint: You may find it helpful to consider an enumerator for $ A $.)

Short answer: 

We can adopt the diagonalization argument to produce a decidable language that is not accepted by any of the $ M_i $.

Proof:

Consider the language $ D $ decided by the following Turing machine

Reject any input that is not a binary number.
Enumerate the ith Turing machine, where $ i $ is the input string.
If $ M_i $ accept the input string, reject it, otherwise accept it.

Apparently this is a Turing machine, and it is not accepted by any of the $ M_i $ (there is always a string $ i $ where $ M_i $ does not agree with $ D $)

Most Boolean function on N input requires exponential (as a function of N) number of 2-input Boolean gates.

Problem:

Most Boolean function on N input requires exponential (as a function of N) number of 2-input Boolean gates.

Proof idea:

The number of Boolean function is huge (i.e. exponential exponential) but the number of circuit you can make out of polynomial number of gates is smaller than that.

Proof:

The number of Boolean functions of n input can be counted by listing the output of the Boolean function output starting from

00000..000 = 0
00000..001 = 0
...

There are $ 2^n $ rows, therefore every unique bit string of length $ 2^n $ represents an unique Boolean function, so we have $ 2^{2^n} $ Boolean functions.

On the other hand, if we have a polynomial number of gates, they could be arranged in anyway, the number of circuits could be at most $ (cn^s)! < 2^{cn^s} < 2^{2^n} $, therefore, most of the circuits do require more than polynomial number of gates.

Show that the set of incompressible strings contains no infinite subset that is Turing recognizable.

Problem:

Show that the set of incompressible strings contains no infinite subset that is Turing recognizable.

Short proof: 

If such an infinite subset exists, we can compress those strings using a Turing enumerator.

Proof:

If such an infinite subset exists, we build an enumerator of them. By the recursion theorem, we can build a Turing machine that enumerate the strings that infinite subset and stop on the first string that is larger than the representation of itself.

Such a machine must eventually halt because of the set must have a string with length larger than this finite machine.

Show that NP is closed under union and concatenation

Problem:

Show that NP is closed under union and concatenation

Short answer: 

We can build non-deterministic Turing machine to solve the union language and the concatenation language

Proof:

Assume language X and language Y are in NP, we wanted to show X union Y is in NP

Because X and Y are in NP, there exists non-deterministic Turing machine X and non-deterministic Turing machine Y that verifies X and Y, respectively.

Rename the states in Y so that it does not have the same name as in X.

Build a new non-deterministic Turing machine U as follow:

Create a new start state, for each input symbol, it write the same symbol back to the tape, and non-deterministically go to the start state of X or the start state of Y. Take the union of state and transition function from both X and Y, that is a non-deterministic Turing machine U.

For any string in X, non-deterministic Turing machine U can take a non-deterministic jump to the start state of X, and follow the non-deterministic Turing machine X to finally get accepted.
For any string in Y, non-deterministic Turing machine U can take a non-deterministic jump to the start state of Y, and follow the non-deterministic Turing machine Y to finally get accepted.
For any string that is neither in X nor in Y, non-deterministic jump to either X or Y cannot lead to acceptance.

Therefore we showed the non-deterministic Turing machine U precisely accept X union Y.

Assume language X and language Y are in NP, we wanted to show X concatenate Y is in NP

Because X and Y are in NP, there exists non-deterministic Turing machine X and non-deterministic Turing machine Y that verifies X and Y, respectively.

Rename the states in Y so that it does not have the same name as in X.

Build a new non-deterministic Turing machine C as follow:

For each final state in X, for each input symbol, write the same symbol back to the tape and jump to the start state of Y, and make it non-final.

For any string in X concatenate Y, it can follow the non-deterministic Turing machine X to go to final state of the original non-deterministic Turing machine X, jump to start state of Y, and then follow it to the final state of Y.

For any string, it cannot possibly go to a final state without going through a original final state of X. (All path to final state goes through it)

But if it goes through to final state through a original final state of X, the string could be split into two pieces where a prefix is in X and a suffix is in Y, that means the string has to be in X concatenate Y.
The contrapositive of this show if a string is not in X concatenate Y, it is not accepted.

Therefore we showed the non-deterministic Turing machine C precisely accept X concatenate Y.

Is NP countable?

Problem: 

Is NP countable?

Short answer: 

yes

Proof:

For each problem in NP - there exists a non-deterministic Turing machine as verifier.
As we can encode non-deterministic Turing machine as finite binary string, the set of non-deterministic Turing machine is countable, therefore the set NP is also countable.