In this module, we will be walking through a review of linear optimization using a hedging example.
We are going to see that there are two kinds of optimization problems, something called a primal optimization problem and something called a dual optimization problem, and how they are related. We will also introduce the idea of Lagrangian relaxation.
Here's the hedging problem that we are interested in: I've got d assets.
Just to fix ideas, take d to be 3 or 4; some number of assets that are there in the market.
The price of these assets at time 0 is a vector p in R^d: a price vector with d components, where every component tells me the price of that particular asset. At time t equal to 1, the market outcomes are uncertain. I don't know which state the market is going to be in; it's going to be in one of m possible states.
So here's the situation: at time t equal to 0 I'm in state 0, and I could go to one of m different possible states, from state 1, state 2, all the way down to state m. What do I mean by a state?
For my purposes, a state simply tells me the different prices that can happen. So I'm going to characterize every state by the prices of the assets in that particular state, and I can do it in two different ways. I could either tell you the price of all the assets in any given state, or I can tell you the price of a given asset in all possible states, and we'll flip between these two ideas as we go through.
We'll use the power of matrices to see what this means. Sometimes it's easier to represent it one way, sometimes the other, and understanding both ways gives us insight into which one is more useful for the particular application that I'm looking at.
So, to motivate that, I'm going to define something called Sj. S sub j will be a column vector that tells me the price of asset j in all possible states: it's s1j, s2j, all the way down to smj. Here j is fixed, it's asset j, and the state index runs from 1 to m.
Now how do I represent all the assets in the market? I'm just going to take these columns and stack them side by side: S1 refers to the column corresponding to asset one, S2 to asset two, and so on up to Sd. If I write them out in gory detail, you end up getting this matrix. The matrix has m rows, because there are m states, and d columns, because there are d assets.
And what's going on here? Every row tells you the price of all the assets in a particular state. So what I've circled here are the prices in state 1: s11, s12, all the way up to s1d. Generically, the last row is going to be sm1, sm2, all the way to smd. So every row tells me what happens to all the assets in a given state, and every column tells me what happens to a particular asset in all the different states.
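To make the two representations concrete, here is a small sketch in Python with NumPy; the specific prices are made up for illustration:

```python
import numpy as np

# Toy market: m = 3 states, d = 2 assets.
# Row i holds the prices of all assets in state i;
# column j (the vector S_j) holds the prices of asset j across all states.
S = np.array([
    [1.0, 3.0],   # state 1: price of asset 1, price of asset 2
    [2.0, 1.0],   # state 2
    [1.5, 2.0],   # state 3
])

row_state_1 = S[0, :]   # prices of every asset in state 1
col_asset_2 = S[:, 1]   # S_2: price of asset 2 in every state

print(row_state_1.tolist())  # [1.0, 3.0]
print(col_asset_2.tolist())  # [3.0, 1.0, 2.0]
```

Reading a row answers "what does the whole market look like in this state?"; reading a column answers "what can happen to this one asset?"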
Alright, so that's how I'm going to describe my market. At time 0 I know the prices; at time 1, which is when the market opens again, I don't know the prices. They're uncertain; the market is going to be in one of m possible states, but I do know what the prices are once the state is given to me.
Now you might think that this is a very simple representation of the market, and indeed it is. But it turns out that even in the most modern methods of risk management, like value at risk and conditional value at risk, people represent what happens to the market by a model which looks very much like this, except that instead of talking about states, we'll talk about simulations: I'm going to simulate returns and ask, under all of these different scenarios, what happens? But essentially the main ideas are going to be captured by this simple toy model. Alright, I know what happens to the assets.
Now, what I want to do with these assets is to hedge an obligation. I'm going to walk through what hedging means.
So, I have an obligation. What is an obligation? It's a vector x in R^m. Why m? Because the obligation depends on the state. In a good state I might have to pay more, in a bad state I might have to pay less.
So depending upon which state occurs, I'm going to pay an obligation: xi if state i occurs. And what I want to do is buy a portfolio now, a certain number of shares of the assets right now, in order to have enough money to pay my obligation.
So, I'm going to choose a portfolio theta. Theta 1 through theta d are going to be the number of shares that I purchase of each of these assets. I'm going to allow for the possibility of short selling, so the thetas could be negative, or I can go long, which is when the thetas are positive. And I want to choose this portfolio so that I can hedge the obligation I'm interested in hedging. So I'll step you through what hedging means, and then we'll get to the linear optimization problem that we end up with.
Okay! So at time 0, I'm going to buy a position theta in R^d. Why d? Because there are d different assets: theta j is the number of shares of asset j that I purchase, where j goes from 1 through d.
What's the cost of this purchase? The cost of the position theta is simply the price of every asset times the number of shares that I purchase, summed over all of the assets. So it's pj times theta j, summed from j equal to 1 through d. That's the inner product of the vector p with the vector theta, and we know that inner products are nothing but p transpose times theta. What happens at time t equal to 1?
So, if a state i occurs, then I'm going to liquidate my position, and when I liquidate the position, I'm going to sell the assets. And what's going to happen? In state i, the price of asset j is sij, and I hold theta j shares of this asset, so by selling asset j I get sij times theta j. But I'm going to liquidate the entire portfolio, so I sum over j from 1 to d, and that's the amount of money, the pay-off, that I get in that state. It's just sij times theta j, summed from j equal to 1 through d. That gives me yi.
And if you just think about it for a moment: if I stack up all the pay-offs as a vector, if I call the vector y to be y1, y2, all the way up to ym, the pay-offs in the m states, this is nothing but the matrix S times the vector theta. Just to remind you, this is my matrix S; it has the prices for every state as its rows. So the payoff in the first state would be
this row times the vector. And the second state would be this row
times the vector. And that's exactly what is being done here: sij, as j goes from 1 through d, is a row.
You take the i'th row, multiply it by the vector, you get the payoff in the i'th
state. And therefore, the vector y is simply the
matrix S times theta.
Now I want to look at this in a different way: not as multiplying row by row, but column by column. So, instead of looking at this matrix S row by row as we have done so far, I'm going to view it in columns. I've got the column for the first asset, the column for the second asset, up to the column for the d-th asset, and that multiplies theta 1, theta 2, up to theta d. And you can see that this multiplication is nothing but this row times this column, so you'll get theta j times Sj, with j summed from 1 through d. That should not be an n on the slide, but a d: it goes from j equals 1 through d. Interpreting it this way, you end up getting a different interpretation of what is going on.
Now, the pay-offs y are nothing but linear combinations of the columns. So I can generate a particular pay-off y only if it is in the range of the matrix S. Remember, we introduced this concept in the module on matrices: a vector y belongs to the range of S if it is one of the vectors S theta for theta in R^d. And therefore, any achievable y belongs to the range of S.
And remember, we had also introduced the notion of rank.
We had said that the rank tells me how rich that space is: how many different vectors can I generate in the range?
So if the rank of the matrix S is equal to m, meaning all possible pay-offs can be generated, then I can hedge everything. If, on the other hand, the rank of the matrix S is less than m, that means there are certain pay-off vectors that cannot be generated, because they cannot be produced by taking the matrix S and multiplying it by some theta. That's a concept that's going to play a role in the course. We're going to talk about complete markets and incomplete markets, and that has to do with the rank of this matrix.
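A quick way to see whether a toy market of this kind is complete is to compute the rank of S; the matrices below are made up for illustration:

```python
import numpy as np

m = 3  # number of states

# rank(S) = m: every payoff vector in R^m is attainable -> complete market.
S_complete = np.array([[1.0, 3.0, 0.0],
                       [2.0, 1.0, 1.0],
                       [1.5, 2.0, 4.0]])

# Columns are proportional, so rank(S) = 1 < m -> incomplete market.
S_incomplete = np.array([[1.0, 2.0],
                         [2.0, 4.0],
                         [3.0, 6.0]])

rank_complete = np.linalg.matrix_rank(S_complete)      # 3
rank_incomplete = np.linalg.matrix_rank(S_incomplete)  # 1

print(rank_complete == m)   # True  -> can hedge any payoff vector
print(rank_incomplete < m)  # True  -> some payoffs cannot be generated
```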
Before we get there, here's a simpler notion.
We'll say that a pay-off y hedges x if y is greater than or equal to x, component by component: the pay-off that I've generated using my portfolio is greater than or equal to the pay-off that I need to give at time 1. Alright, now comes our first optimization problem. What's the optimization problem?
I want to minimize the price of the portfolio such that it hedges my obligation: minimize p transpose theta, such that S theta is greater than or equal to x. And what are some features of this optimization problem? The objective, what I'm trying to maximize or minimize, is a linear function of theta. The constraints on the thetas that I can choose are again linear: S theta must be greater than or equal to x, a linear inequality. Any optimization problem which has a linear objective function, whether minimization or maximization it doesn't matter, and in which all the constraints are either linear inequalities, as is the case here, or linear equalities, we will call a linear optimization problem, or a linear program.
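The hedging problem itself, minimize p transpose theta subject to S theta greater than or equal to x, can be handed to an off-the-shelf LP solver. A minimal sketch using scipy.optimize.linprog, with made-up prices and obligation; note that linprog expects less-than-or-equal constraints, so S theta >= x is passed as -S theta <= -x, and bounds=(None, None) allows short selling:

```python
import numpy as np
from scipy.optimize import linprog

p = np.array([1.0, 2.0])            # time-0 prices of the d = 2 assets
S = np.array([[1.0, 3.0],
              [2.0, 1.0],
              [1.5, 2.0]])          # payoff matrix, m = 3 states
x = np.array([2.0, 2.0, 2.0])       # obligation to cover in each state

# minimize p.theta  subject to  S theta >= x  (theta free, shorting allowed)
res = linprog(c=p, A_ub=-S, b_ub=-x, bounds=(None, None), method="highs")

assert res.success
print(res.x)    # cheapest hedging portfolio
print(res.fun)  # its cost, p.theta
assert np.all(S @ res.x >= x - 1e-9)  # payoff dominates the obligation
```

For this particular made-up data the optimal portfolio is roughly theta = (0.8, 0.4) with cost 1.6.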
It turns out that linear programs are very rich, there's a rich theory about them,
and you can do a lot of interesting things with them.
You can model lots of problems, you can solve them very efficiently, you can get a
lot of interpretation out of them. So the one thing that we're going to focus on in linear optimization, and in interpreting linear optimization, is the notion of duality. And what do I mean by the notion of
duality? For every linear program, I can write
another linear program which is intimately connected to it, and this connection is
called duality. So I've got a linear program here: minimize over x, c transpose x, subject to Ax greater than or equal to b. Here's another linear program: maximize over u, b transpose u, subject to A transpose u equal to c and u greater than or equal to 0. Min goes to max. Now, for the purposes of this course, you will not be responsible for how I generated the dual linear program from the primal linear program, that is, the second linear program from the first; we will give you that in the course. The only thing that you're going to be responsible for is to understand the relationship, and we'll emphasize this again during the course.
Here are some of the interesting results that come from this duality concept. The first thing is something called weak duality. Here's the picture that I want you to keep in mind. On one side is my value P: for all feasible xs, all xs such that Ax is greater than or equal to b, I get values of c transpose x on this side. Why are they larger? Because I'm trying to minimize, so the lowest possible value is P. I've got another number D, which is the value of the second linear program down here. For all us that are feasible, meaning they satisfy A transpose u equal to c and u greater than or equal to 0, I'll get values that are less than this D. Why? Because this is a maximization problem, so the largest possible value of b transpose u is going to be equal to D.
The first theorem says that, in fact, this picture that I'm drawing is correct: P is going to be greater than or equal to D. There's no reason a priori for it to be that way; they could have crossed. But the nice thing is that this holds if you construct the dual linear program correctly, and in the next slide I'm going to show you a simple example. Again, let me emphasize, you're not responsible for knowing it; we're just walking through the exercise to understand how I'm going to use duality.
So the primal and the dual are intimately connected: the optimum value of the primal, P, is greater than or equal to the optimum value of the dual, D. And because this is true, you end up getting a very useful chain of inequalities. We know that c transpose x is greater than or equal to P. Why? Because I'm trying to minimize c transpose x, so for any feasible x, anything that satisfies the inequality Ax greater than or equal to b, I'll get a value greater than or equal to P. The second piece is also true: D is greater than or equal to b transpose u, because I'm trying to maximize b transpose u. And the middle part comes because of weak duality. So now this gives you a very interesting test: if I can find an x and a u such that c transpose x is close to b transpose u, then I know that x must be very close to optimal, and u must also be very close to optimal for the dual. Why? Because P is greater than or equal to D; if these two values are very close, it must be that P is also very close to c transpose x, which means that x is close to optimal. Similarly, b transpose u must be close to D, which means that u is close to optimal. You can go one step further and say that when either P or D is finite, that is, either the primal value is finite or the dual value is finite, then in fact they must be equal. And finally, the reason why we call them dual is that if you go from the primal to the dual and then take the dual of the dual, you get back the primal, and that's why they're called dual linear programs.
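Strong duality for this primal-dual pair can be watched numerically. The sketch below, with made-up A, b, c, solves the primal min c transpose x subject to Ax >= b and the dual max b transpose u subject to A transpose u = c, u >= 0, and checks that the optimal values coincide:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])
c = np.array([2.0, 3.0])

# Primal: minimize c.x  s.t.  Ax >= b  (x free).
primal = linprog(c=c, A_ub=-A, b_ub=-b, bounds=(None, None), method="highs")

# Dual: maximize b.u  s.t.  A^T u = c, u >= 0  (linprog minimizes, so use -b).
dual = linprog(c=-b, A_eq=A.T, b_eq=c, bounds=(0, None), method="highs")

P = primal.fun   # optimal primal value
D = -dual.fun    # optimal dual value
assert primal.success and dual.success
assert abs(P - D) < 1e-8   # strong duality: P = D
print(P)  # 7.0
```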
Going a little bit further, here's another pair of primal-dual linear programs that we'll be using in the course: minimize over x, c transpose x, subject to Ax equal to b, is equal to maximize over u, b transpose u, subject to A transpose u equal to c. And this equality, I'm putting it there to emphasize the fact that there is strong duality between them. But keep in mind that this equality holds only if you can show that either this one, P, or this one, D, is finite. So if P or D is not equal to, say, plus infinity or minus infinity, then these two values are the same, and in that sense the problems are equal.
So, in the last piece of this module, we're going to walk through how to construct a dual. It's a very general concept called Lagrangian relaxation, and we'll use it in the next module, on nonlinear programming, as well.
So here's the problem. The primal problem is: minimize c transpose x, subject to Ax greater than or equal to
Now I'm going to take a vector u, which is greater than or equal to zero.
And so u is component wise greater then or equal to 0, and remember Ax is going to be
greater than equal to b. So Ax minus b is component wise greater
than equal to 0, you take some vector which is component wise greater than equal
to the oh, multiplied to another vector that is component wise greater than equal
to 0, you end up getting a number, which is greater than or equal to 0.
So I'm subtracting it, so I'm getting a number, which is less than c transpose x.
So this modified problem, whose objective function involves the vector u, is going to have a value which is less than or equal to P. Now, b transpose u does not involve the decision x; the minimization is going on over x, and this term does not involve x, so I can pull it out. I end up getting a new problem: minimize (c minus A transpose u) transpose times x. And what happened to the constraint?
I threw it away! Why did I throw it away? It's complicated; I don't want to deal with the constraint, so I threw it away. But I'm guaranteed that if I throw away these constraints, the set over which I can optimize, the xs that I can choose from, becomes larger, and therefore the minimum only becomes smaller. So I end up getting that this quantity is going to be smaller than the previous line. Now, because I don't have any constraints,
I have a very simple problem. I've got some vector; let's call it d, so the objective is d transpose x, and I want to minimize d transpose x where I can choose x to be anything. So what am I going to do? I'm going to choose x in the direction of minus d and send it off to infinity. Here's my d, here's my x! If I take the inner product of d and x, I get a very large negative number. So if the vector d is not equal to 0, I can make this optimization problem equal to minus infinity, and that's what this says: if d, which, just to emphasize, is equal to c minus A transpose u, is not equal to 0, you get minus infinity. If, in fact, d is equal to 0, which means that c is equal to A transpose u, then I can't do anything over here: this vector is 0, and whatever I multiply it by, I'll get 0. So you end up getting that this minimization problem has a value equal to 0, and P is greater than or equal to b transpose u.
Now u is arbitrary; the only condition, which is missing over here, is that u must be greater than or equal to zero. So now you have that P must be greater than or equal to the maximum of b transpose u, which is the value here, provided A transpose u is equal to c and u is greater than or equal to 0. That immediately gives you weak duality, and a little bit more work gives you strong duality. So here's the connection: minimize c transpose x subject to Ax greater than or equal to b is equal to maximize b transpose u subject to A transpose u equal to c and u greater than or equal to 0.
That's what we derived over here. What we did was dualize the constraints, and this word will show up sometimes during the course. Dualize means I take the constraints and multiply them by a variable which has a particular sign: we had a constraint Ax minus b greater than or equal to 0, and I multiplied it by a vector u which is greater than or equal to 0. That's dualization. And then I relaxed the constraint: I had this constraint Ax greater than or equal to b, I didn't like it, it was too complicated, so I threw it away. I relaxed it! And by doing dualization and relaxation, you end up getting something called a Lagrangian relaxation, which gives you duals, and gives you some very nice properties that we'll explore more in the course.
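The dualize-and-relax argument gives a concrete bound: for any u >= 0 with A transpose u = c, the number b transpose u sits below c transpose x for every feasible x. A small numeric sketch of that bound, with made-up data:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])
c = np.array([2.0, 3.0])

# A dual-feasible multiplier: u >= 0 and A^T u = c.
u = np.array([1.0, 2.0, 1.0])
assert np.all(u >= 0) and np.allclose(A.T @ u, c)

# For any primal-feasible x (Ax >= b):
#   c.x = (A^T u).x = u.(Ax) >= u.b,  since u >= 0 and Ax >= b.
bound = b @ u
for x in [np.array([1.0, 2.0]), np.array([2.0, 1.0]), np.array([5.0, 5.0])]:
    assert np.all(A @ x >= b)      # x is feasible
    assert c @ x >= bound - 1e-12  # the Lagrangian lower bound holds

print(bound)  # 6.0
```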