Friday, February 11, 2011

P, NP, NP-complete, and NP-hard

What are P, NP, NP-complete, and NP-hard?

There are two classes of problems, P, and NP (there are many, many more, but we will ignore the rest here).  These stand for "Polynomial" and "Non-deterministic Polynomial".

Assuming knowledge of big-O notation, we'll move on from there.  Request an explanation and you shall have it.

Problems in P are solvable by a polynomial-time algorithm (runs in O(f(n)), where f is polynomial).  This includes things like sorting and triangulation.

Problems in NP are checkable by a polynomial-time algorithm, but not necessarily solvable.  Generally, we mean NP minus P, when we say NP (so they are only checkable).  This means that, given the solution to an instance of the problem, we can verify that it is the correct solution in polynomial time, but that we couldn't have come up with it on our own, in polynomial time.  An easy example of this is subset sum: given a set of numbers, does there exist a subset whose sum is zero?  It turns out that this is not solvable in polynomial time, but given the answer, we can easily check that it is right (the decision problem is a little harder to see, but assume you are given the actual subset).

Now, we have to discuss NP-Complete and NP-Hard, which requires a discussion of reducibility.  Reducibility is the notion that, given an instance of some hard problem, we can quickly construct an instance of some other problem, whose answer would help us solve the first problem.

The typical logic is this: you claim to have a fast algorithm to, say, solve subset sum.  I create a machine that, in polynomial time, takes a traveling salesman problem instance, and creates from it, an instance of subset sum.  If you can give my machine the answer to that question, it will, again in polynomial time, change that back into the answer for my original traveling salesman problem.  Therefore, if your subset sum algorithm is really polynomial, then I have just added some extra bits to it and created a polynomial time algorithm for solving traveling salesman problems.  If I can create such a machine, then it creates a notion of reducibility, that is, traveling salesman is reducible to subset sum, which indicates that subset sum is at least as hard as traveling salesman.  Intuitively, one might imagine that with all the universal truth out there, one could say that traveling salesman has no polynomial-time solution.  In that case, the above would be a proof by contradiction, that subset sum also has no polynomial time solution.

Now, we can pretty easily state what NP-Complete and NP-Hard are:

If two problems can be reduced to each-other, they are in some sense, "equivalent".  All problems in NP that are so equivalent, are called NP-Complete.

If there is an NP-Complete problem that can be reduced to some problem H, then H is NP-Hard.  This means that, in some sense, H is at least as hard as all the NP-Complete problems, but note that there does not need to be a reduction from H to any NP-Complete problem, so H might be harder than all the NP problems.

One proves that a problem is NP-Hard by taking an existing NP-Complete problem (3-SAT and Traveling Salesman are my favorites, since they work well for geometry), and using it to generate input to your problem, such that an answer to your problem is equivalent to solving the NP-Complete problem you chose.