0:00

Okay. So we have given this dynamic programming algorithm for the problem of RNA secondary structure, and as I said, it fits within the category of algorithms that we call dynamic programming.

Okay?

So the term "programming" here does not have to do with programming in real programming languages; it is just the name of this class of algorithms.

And dynamic programming, as I said, solves certain optimization problems where we observe that the optimal solution to the big problem can be obtained from optimal solutions to sub-problems.

Like in this case here: the solution to the entire problem, OPT(1,n), can be obtained from OPT(1,n-1) and, when n pairs with some position t, from OPT(1,t-1) and OPT(t+1,n-1).

So the solution to the bigger problem (i,j) can be obtained from solutions to smaller sub-problems.
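Written out, the recurrence being described looks like the following (a reconstruction; the minimum-separation condition j - t > 4 is an assumption based on the "n minus five" loop bounds discussed later):

```latex
\mathrm{OPT}(i,j) =
\max\!\left(
  \mathrm{OPT}(i,\,j-1),\;
  \max_{\substack{i \le t < j-4 \\ (t,\,j)\ \text{can pair}}}
    \bigl\{\, 1 + \mathrm{OPT}(i,\,t-1) + \mathrm{OPT}(t+1,\,j-1) \,\bigr\}
\right),
\qquad \mathrm{OPT}(i,j) = 0 \ \text{if}\ j - i \le 4 .
```

The first argument of the max is the case where position j is unpaired; the second is the case where j pairs with some earlier position t, splitting the interval into two independent sub-problems.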

Okay, so when we look at dynamic programming, we have to have this property: that the solution to the big problem is obtained from solutions to sub-problems.

And when we have this property, we compute the solutions to the sub-problems and store them, usually in a matrix.

Then, when we come to solve the big problem and encounter these sub-problems, all we need to do, because they have already been computed, is look them up.

So looking up this value, this value, this value: it's an O(1) operation for each one of them.

1:52

Of course, the gain from dynamic programming comes in cases where the same sub-problem needs to be computed over and over and over, as it would be if we used a recursive algorithm, for example.

So if I get to a situation where the problem is solved from sub-problems, but these sub-problems never overlap, in the sense that OPT(7,19) is solved once, and OPT(7,20) is solved, or needed, only once,

2:24

Then I'm not going to gain much.

The gain comes when the same sub-problem needs to be solved over and over and over.

With dynamic programming, the way we handle that is by solving it only once and storing it in a matrix; every time we need to solve it again, all we do is look up the value in the matrix, okay?
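To see the overlap concretely, here is a small Python sketch (not from the lecture) that runs a recursion with the same call pattern as the recurrence and counts how many times each interval (i, j) is visited; every revisit is work that the memo matrix replaces with an O(1) lookup:

```python
# Count how often a plain recursion revisits the same (i, j) sub-problem.
# The base-pairing test is omitted (worst case: every split is explored),
# so the numbers show the call pattern, not the RNA scores themselves.
calls = {}

def visit(i, j):
    calls[(i, j)] = calls.get((i, j), 0) + 1
    if j - i <= 4:                 # interval too short to contain a pair
        return
    visit(i, j - 1)                # case 1: position j unpaired
    for t in range(i, j - 4):      # case 2: j pairs with position t
        if t > i:
            visit(i, t - 1)        # left sub-interval
        visit(t + 1, j - 1)        # right sub-interval

visit(0, 14)
repeats = sum(c - 1 for c in calls.values())
# `repeats` counts recomputations that storing results in a matrix avoids.
```

Even at n = 15 the same intervals are hit many times over; with the matrix, each is solved once and every later occurrence is a lookup.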

The other thing that we have to keep in mind is that the fact that I have written this kind of formula, and have dynamic programming, does not necessarily mean I have an efficient algorithm, by which we mean a polynomial-time algorithm.

For this to be polynomial, the number of sub-problems that we divide into and solve has to be polynomial as well, okay?

Because if the matrix has exponential size, there's nothing much you can do about it, since you are filling the entire matrix.

So usually the running time of a dynamic programming algorithm is dominated by the size of the matrix, because we have that many entries in the matrix, and at the very least we have to fill out those entries.

Even if we do the simplest amount of computation, a constant number of operations per entry, the number of entries in the matrix is going to give us a lower bound on the running time.

Now, what is the running time of this algorithm?

It's very simple to analyze, actually.

This loop here is on the order of n, right? n minus five iterations is order of n.

This here is also on the order of n, because in the worst case it is going to go from one to n minus five.

Here, this is going to be O(1).

Of course this is O(1), O(1), the addition is O(1) but

there is this [BLEEP] that is ranging.

If I am looking again in the worse case at position N,

T could be, any of these positions here.

So in the worse case, it is O(n).

So this, we see, might in the worst case be looking at O(n) values, so this can be O(n).

So we have O(n), this is nested within it, and this is nested within that.

So this algorithm is on the order of n cubed, O(n³).

Okay?
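As a concrete reference, the whole O(n³) fill can be sketched in Python. The lecture does not fix the details, so two assumptions are made explicit in the code: only Watson-Crick pairs (A-U, C-G) are allowed, and paired positions must be more than four apart, matching the "n minus five" loop bounds:

```python
def nussinov(rna, min_sep=4):
    """Fill opt[i][j] = max number of pairs in rna[i..j] (0-indexed).

    Assumptions (not fixed in the lecture): only Watson-Crick pairs
    A-U and C-G, and a pair (t, j) requires j - t > min_sep.
    """
    n = len(rna)
    can_pair = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    opt = [[0] * n for _ in range(n)]
    for k in range(min_sep + 1, n):          # interval span: O(n) iterations
        for i in range(n - k):               # left endpoint: O(n) iterations
            j = i + k
            best = opt[i][j - 1]             # case 1: j is unpaired
            for t in range(i, j - min_sep):  # case 2: j pairs with t: O(n)
                if (rna[t], rna[j]) in can_pair:
                    left = opt[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt
```

Three nested O(n) loops give the O(n³) time derived above, and the matrix itself is the O(n²) memory cost that comes up later in the lecture.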

And this is much better than trying a brute-force approach that would, for example, look at every possible way of pairing indexes in the RNA secondary structure, then check feasibility, and then find the one with the maximum number of pairs, okay?

And if you want to think about whether this is faster than the recursive algorithm, I highly encourage you to implement this as dynamic programming, using a matrix.

Then implement the recursive version, where OPT is the name of a function instead of a matrix, so that you are calling the function here every time.

Run them both, look at the running times of the two algorithms, and see which one is going to be faster.

5:32

Of course, with a dynamic programming implementation, there is always an issue with memory, because we have to build that matrix, okay?

Usually the memory is not a big deal when we are looking at small strings, because in this case it is going to be on the order of n squared.

But if n gets very, very large, dynamic programming approaches start suffering because of the memory issue.

They are still efficient in terms of time, but the amount of memory they need becomes very large, and that becomes the bottleneck, okay?

One final word about this algorithm for RNA secondary structure, with respect to dynamic programming: remember that the biologist was not interested in the number.

The biologist did not ask me for the number of pairs that are going to match; the biologist wanted the actual pairs, because they wanted the actual structure.

So how would I get the actual pairs themselves, rather than just the number?

Because when I say return OPT(1,n), this can be, say, the number five.

But the biologist is not interested in five; the biologist is interested in the five pairs that I produced.

That's very easy to obtain if you think about this algorithm carefully.

When you look at OPT(1,n), it is either the max of this or this.

If this is the value that gives us the max, then we know that n is not part of the solution, and we repeat the exercise on 1 to n minus 1.

If this is the part that gives us the solution, then we find the t that maximizes this value.

If t is, let's say, 7 and n is 100, then we tell the biologists that 7 and 100 are going to be one of the pairs.

Then we repeat the exercise and find the other pairs.

Okay?
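That walk back through the matrix can be sketched in Python as follows (same assumptions as before; the DP fill is repeated here so the snippet stands alone):

```python
def fill(rna, min_sep=4):
    """DP fill: opt[i][j] = max pairs in rna[i..j] (Watson-Crick only)."""
    n = len(rna)
    wc = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    opt = [[0] * n for _ in range(n)]
    for k in range(min_sep + 1, n):
        for i in range(n - k):
            j = i + k
            best = opt[i][j - 1]
            for t in range(i, j - min_sep):
                if (rna[t], rna[j]) in wc:
                    left = opt[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt

def traceback(rna, opt, min_sep=4):
    """Walk back through opt to recover the pairs, not just the count."""
    wc = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    pairs, stack = [], [(0, len(rna) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_sep:
            continue
        if opt[i][j] == opt[i][j - 1]:
            stack.append((i, j - 1))          # j was left unpaired
            continue
        for t in range(i, j - min_sep):       # find the t that pairs with j
            if (rna[t], rna[j]) in wc:
                left = opt[i][t - 1] if t > i else 0
                if opt[i][j] == 1 + left + opt[t + 1][j - 1]:
                    pairs.append((t, j))      # report this pair
                    if t > i:
                        stack.append((i, t - 1))
                    stack.append((t + 1, j - 1))
                    break
    return pairs
```

At each interval, the traceback just re-asks which case of the recurrence produced the stored value: if the "j unpaired" case matches, it moves on; otherwise it finds the t achieving the max, reports the pair (t, j), and recurses into both sub-intervals.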

So this is a technique that's called backtracking, or traceback.

When we compute answers using dynamic programming, usually we are computing a number that is related to the solution, which is usually the optimality score.

Here it's the maximum number of pairs, or something like that within other algorithms.

So usually dynamic programming algorithms are implemented in such a way that the entries of the matrix contain the optimality score.

But using the traceback technique, we can find the actual solution, not just its score, okay?

And again, for the traceback: as you go back, remember that in the matrix we built, (1,n) was here; this is OPT(1,n).