So the hope behind this case analysis is that we're going to be able to boil down
the possibilities for the optimal solution to merely three candidates,
one candidate for each of the three possibilities for the contents of the
final position. That would be analogous to what we did in
both the independent set and knapsack problems, where we boiled the optimal
solution down to just one of two candidates, corresponding to whether
either the final vertex or the final item, as the case may be, was in the
optimal solution. Another way of thinking about this is
we'd like to make precise the idea that if we just knew what was going on in the
final position, if only a little birdy would tell us which of the three cases
we're in, then we'd be done by just solving some smaller subproblem
recursively. So let's now state, for each of the three
possible scenarios for the final position,
what the corresponding candidate for the optimal solution is, that is, the way in
which it must necessarily be composed from an optimal solution to a smaller
subproblem. So who are going to be the protagonists
in our smaller sub-problem? Well, the smaller sub-problem's going to
involve everything except the stuff in the final position.
So it's going to involve the strings X and Y, possibly with the final
character removed. So let's let X prime be X with its final
character peeled off, and Y prime be Y with its final
character peeled off. So let me just remind you of how I
numbered the three cases. So case one is when the final position
contains the final characters of both of the two strings, that is, when there's no
gaps. Case two is when the final character of X, little x sub m, gets
matched with a gap, and case three is when little y sub n gets matched with a
gap. Alright, so let's suppose that case one
holds. This means that the contents of the final
position include both of the characters little x sub m and little y sub n.
So now what we're going to do is we want to look at a smaller sub problem.
And we want to look at the sub problem induced by the contents of all of the
rest of the positions. We're going to call that the induced
alignment. Since we started with an alignment, two
things that had equal length, and we peeled off the final position of both, we
have another thing with equal length, so we're justified in calling it an
alignment. Now what is it an alignment of?
Well, if we're in case one, that means what's missing from the induced alignment
are the final characters, little x sub m and little y sub n,
which means the induced alignment is a bona fide alignment of X prime and Y
prime. And certainly, what we're hoping is true,
is that the induced alignment is in fact, an optimal alignment of these smaller
strings x prime and y prime. This would say that when we're in case
one, the optimal solution to the original problem is built up in a simple way from
an optimal solution to a smaller sub problem.
We're of course hoping that something analogous happens in cases two and three.
The only change is going to be that the protagonists of the sub-problem will be a
little bit different. In case two, the thing which is missing
from the induced alignment is the final character of x.
So, it's going to be the induced alignment of x prime and y.
Similarly, in case three, the induced alignment is going to be an alignment of
x and y prime. So this is an assertion, a
claim. It's not completely obvious, though the proof isn't hard, as I will
show you on the next slide. But assuming for the moment that this
assertion is true, it fulfills the hope we had earlier.
It says that indeed, the optimal solution can only be one of three candidates,
one for each of the possibilities for the contents of the final position.
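To see the three candidates in code, here is a minimal recursive sketch in Python; the function name and the unit gap and mismatch penalties are assumptions for illustration.

```python
from functools import lru_cache

def opt_penalty(x, y, gap_cost=1, mismatch_cost=1):
    """Minimum alignment penalty of x and y via the three-candidate
    recurrence. Unit penalties are an illustrative assumption."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        # Base cases: one string is exhausted, so every leftover
        # character of the other must be matched with a gap.
        if i == 0 or j == 0:
            return (i + j) * gap_cost
        # Case 1: x_i and y_j share the final position.
        case1 = rec(i - 1, j - 1) + (0 if x[i-1] == y[j-1] else mismatch_cost)
        case2 = rec(i - 1, j) + gap_cost  # case 2: x_i matched with a gap
        case3 = rec(i, j - 1) + gap_cost  # case 3: y_j matched with a gap
        return min(case1, case2, case3)
    return rec(len(x), len(y))
```

With unit penalties for gaps and mismatches, this quantity coincides with the classic edit distance between the two strings.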
Alternatively, it says that if we only knew which of the three cases we were in,
we'd be done: we could recurse, we could look up a solution to a smaller
subproblem, and we could extend it in an easy way to an optimal solution for the
original problem. So let's now move on to the proof of this
assertion. Why is it true that an optimal solution
must be built up from an optimal solution to the relevant smaller sub-problem?
Well, all of the cases involve pretty much the same argument, so I'm just going
to do case one; the other cases are basically the same.
I invite you to fill in the details. So it's going to be the same type of
simple proof by contradiction that we used earlier, when reasoning about the
structure of optimal solutions for the independent set and knapsack problems.
We're going to assume the contrary, we're going to assume that the induced solution
to the smaller subproblem is not optimal, and from the fact that there is a better
solution for the subproblem, we will extract a better solution for the
original problem, contradicting the purported optimality of the solution that
we started with. So when we're dealing with case one, the
induced alignment is of the strings X prime and Y prime, X and Y with the final
character peeled off. So, for the contradiction, suppose
that this induced alignment, which has some penalty, say capital P, is not
actually an optimal alignment of X prime and Y prime.
That is, suppose that if we started from scratch, we'd come up with some superior
alignment of X prime and Y prime, with total penalty P star, strictly smaller
than P. But if that were the case, it would be a
simple matter to lift this purportedly better alignment of x prime and y prime
to an alignment of the original strings x and y.
Namely, we just reuse the exact same alignment of x prime and y prime, and
then in the final position, we just match x sub m with y sub n.
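To close the loop, here is a small self-contained sanity check of the assertion in Python: for a pair of strings, the optimal penalty really does equal the best of the three candidates built from the smaller subproblems. The function name, the example strings, and the unit penalties are illustrative assumptions.

```python
from functools import lru_cache

def opt(x, y, gap=1, mis=1):
    """Optimal alignment penalty, with illustrative unit penalties."""
    @lru_cache(maxsize=None)
    def rec(i, j):
        if i == 0 or j == 0:
            return (i + j) * gap
        return min(rec(i - 1, j - 1) + (0 if x[i-1] == y[j-1] else mis),
                   rec(i - 1, j) + gap,
                   rec(i, j - 1) + gap)
    return rec(len(x), len(y))

# Sanity check: the optimum is the best of the three candidates,
# one per possibility for the contents of the final position.
x, y = "AGTACG", "ACATAG"
case1 = (0 if x[-1] == y[-1] else 1) + opt(x[:-1], y[:-1])
case2 = 1 + opt(x[:-1], y)   # x_m matched with a gap
case3 = 1 + opt(x, y[:-1])   # y_n matched with a gap
assert opt(x, y) == min(case1, case2, case3)
```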