We now arrive at the heart of the discussion, the definition of the
complexity class NP as computational problems where you can efficiently
recognize the solution, roughly the same thing is problem solvable is in good
force search. The idea of NP completeness, that the problem can be as
hard as any other problem, where you can efficiently recognize a solution, the
ubiquity of NP-complete problems, including possibly a problem you're
working on right now in one of your own projects, and what should be part of your
toolbox, a simple recipe for proving that problems are NP-complete.
We left off the last video noting that the travelling salesmen problem by virtue
of being solvable in the exponential time using brute force search is certainly not
as hard as notorious problems like the halting problem.
So, perhaps we can instead of mass evidence of interactability for the TSP
by showing it's as hard as any other problem solvable using brute-force
search, that is every other brute-force search solvable problem reduces to
solving the traveling salesman problem. To turn this idea into mathematics we
have to identify the minimal ingredients necessary to solve a problem using brute
force search. And the key idea turns out to be the
efficient recognition of purported solution.
[SOUND] Specifically, we say that a computational problem,
like, say the traveling salesman problem, or frankly almost any other problem we've
talked about in these classes is a member of the complexity class NP if it meets
the following two criteria. The first property is sort of a simple
prerequisite. It is search that solutions have length polynomial in the input size.
The second property, and this is really the key one, is that if someone offers up
to you a purported solution to a NP problem, you can verify its correctness
in polynomial time. To make sense of this abstract
definition, let's think about some concrete examples.
Let's start just with the traveling salesman problem.
So, consider the instance of the traveling salesman problem that's
specified by some n vertices and distances amongst all of the pairs of
vertices. And suppose we want to know whether or
not there's a tour that has a total length.
At most one thousand. So, let's for the moment set aside the
problem of checking whether or not one of the n factorial tours does indeed have
cost of at most 1000. Let's instead think about the much more
modest question, of whether or not a single proposed tour has cost at most
1000. Well, that computational problem is to
verify whether or not a single suggested tour meets the target or not.
That's an easy computational problem, right? The tour is specified just by
order in which you're supposed to visit the vertices.
The length of that ordering is certainly polynomial in the length of the input,
and all you gotta do is add up the n edges that gets traversed in this tour,
and check if this sum is at most 1000 or not.
This argument shows that checking whether or not there exists a tour, with total
cost and most sum threshold, does indeed belong to the complexity class NP.
Right, you can write down a purported solution, in polynomial length. All you
need to do is specify the ordering, and you can verify whether or not a candidate
solution, whether or not a proposed tour meets the threshold in linear time, just
by adding up the corresponding edge costs.
If it bothers you that I moved away from the problem of computing an optimal,
minimum cost tour, and studied instead the seemingly easier problem, I'm just
checking, does there exist a tour that has total cost at most a given threshold,
notice that the former problem reduces to Multiple indications of the latter
problem, just by binary search over the threshold.
The same reasoning can be applied to pretty much all the computational
problems we've seen in these courses, placing all of them into the complexity
class NP. So for the types of problems we've been
thinking about, solutions can generally be specified in linear space, correctness
can be verified in linear time. Another genre of problems that we haven't
really discussed in these classes, but are important in the application domains,
and are also canonical NP problems are constraint satisfaction problems.
In general in a constraint satisfaction problem you have a bunch of variables.
The simplest case would be binary or Boolean variables.
And then you have a list of constraints, and each constraint specifies for a
subset of the variables. a permitted list of configurations of
those variables. So one simple case which some of you've
seen and if you study NP completeness property you'll definitely see it will be
the three set problem. The three set problem does indeed
consider Boolean variables, so each variable xi can either be zero or one and
you can think of the clauses as forbidding a single assignment to a
triple vertices. So a clause might say you are not allowed
to simultaneously assign. X3 to 1, X5 to 0, and X8 to 1.
So clauses for big particular triples of assignments, and the question then is,
does there exist an assignment of all of the variables to 0 an 1, that
simultaneously satisfies all of the constraints.
These kinds of constraint satisfaction problems also belong to the complexity
class in p. So here if you have a purported
assignment of all of the variables to values that meets all of the constraints,
well you can write it down very simply just the value for each of the variables.
And it's easy to check. I just iterate through the constraints
one at a time and see if yes indeed, the values that you're suggesting for the
variables in that constraint are on the list of permissible value combinations.
At the beginning of this video, I suggested that the complexity class NP
would represent problems that are brute force solvable, in the same way that the
traveling salesman problem is. If you look at the actual definition I
just gave you of the class NP, it's in terms of efficient recognition of
solutions, but let's now observe that this definition NP, in terms of efficient
verification of purported solutions does indeed imply that these problems can be
solved in exponential time using brute-force search.
So really, all you do here is check each of the candidate solutions one at a time.
And in doing so, you see very clearly the role of the two properties in the
definition of an NP problem. The role of the first property saying solution length
has to be a polynomial in the input size, in turn, restricts the number of possible
solutions, the number of bit strings of that given length,
to be at most, exponential in the input size.
The second property of course assures you that for each of those only exponentially
many possibilities, you can verify whether or not it is a correct solution,
whether or not it's a short TSB tour, whether or not it's a satisfying
assignment to some variables in a constraint satisfaction problem in
polynomial time. Now because the requirements for the
class NP are so weak, the class NP is very big, where all
membership in NP requires essentially as the ability to officially recognize the
solution. You know one when you see one,
and as you can imagine, that property is met by many of the computational problems
that we ever think about. Factoring almost any graph problems you'd
ever think about, sequencing problems, most constraint satisfaction problems,
and so on. As we've said, it's true that not every,
natural computational problem, is NP. The halting problem is an extreme example.
That's in fact undecidable, there's no algorithm at all. There's also some
natural problems and some application domains which are neither in NP, nor they
undecidable. So, model checking is one domain that
furnishes lots of examples of problems strictly harder than NP.
The vastness of the complexity class NP means that NP-completeness is strong
evidence of intractability. Remember what it means for a problem to
be complete. For some set of problems, it means it's
as hard as any other problem in that set. Every other problem in that set reduces
to the complete problem. Therefore, suppose you had a polynomial
time algorithm for just one, NP-complete problem,
you would get, automatically, by the definition of a reduction, a polynomial
time algorithm for every single computational problem in NP.
Every single problem for which you can efficiently recognize a solution.