>> .
In this module, we will be walking through a review of linear optimization using a
hedging example. We are going to see how there are two
kinds of optimization problems, something called a primal optimization problem,
something called a dual optimization problem, how they are related.
And we also introduced the idea of La Grangian relaxation.
Here's the hedging problem that we are interested in.
I've got d assets. So take d, just to fix ideas, take d to be
3 or 4. But some number of assets that are there
in the market. The price of these assets at time 0 is r
to the d. It's some price vector, which has d
components, where every component tells me the price of that particular asset.
At time t equal to 1, the market outcomes are uncertain.
I don't know what, which state the market is going to be, it's going to be in one of
m possible states. So here's the situation at time T equal to
zero, and I'm in state zero, And I could go to one of n different possible states.
Go from one, two, all the way to, down to m.
What do I mean by a state? For my purposes, state is simply telling
me what are the different prices that can happen.
So I'm going to characterize every state by the prices of the asset in that
particular state And I can do it in 2 different ways.
I could either tell you, what is the price of all the assets in any given state, Or I
can tell you the price of a given in all possible states and we'll flip between
these two ideas as we go through. We'll use the power of matrices to see
what it means, sometimes, it's easier to represent it one way, Sometimes it's
easier to represent it the other way. And understanding it both ways gives us an
insight of which one is more beneficial for the particular application that I'm
looking for. So, to motivate that, I'm going to define
something called Sj. So S sub j, will be a column vector, and
it will tell me the price of asset j in all possible states, So it's s1j, s2j all
the way down to smj. J is fixed, it's asset j and the number of
states would ran-, go from 1 to m. Now how do I represent all the assets in
the market? I'm just going to take these column and
stack them. S1 would refer to the column corresponding
to S, asset one S2, asset two and so one up to S D.
If I write them out in gory detail, you end up getting this matrix.
The matrix has M rows, because it's got M states, its got D columns because its got
D assets. And what going on here, every row here,
tells you the price of all the assets in a particular state.
So what I've circled here, are the prices in state 1, S1d, 12S, 11S, 12, all the way
up to S1D. Genetically, somewhere it's going to be
Sm1, Sm2 , all to Smd. So every row tells me what happens to all
the assets in a given state, every column tells me what happens to a particular
asset in all the different states. Alright, so that's how I'm going to
describe my market. At times 0 I know the price, at time 1,
which is the place where the market opens again, I don't know the price, it's
uncertain, it's going to be one of n possible states, but I know that if the
state is given to me, what the prices are. Now you might think that this is a very
simple representation of the market, and indeed it is.
But it turns out, that even in the most modern methods, of risk management, like
value at risk, and conditional value at risk, people represent what happens to the
market By a model which looks very much like this.
Except instead of talking about states, we'll talk about simulations.
I'm going to simulate returns and I'm going to say, depending on all of these
different scenarios, what happens? But essentially the, the main ideas are
going to be captured by this simple toy model.
Alright, I know what happens to the asset. Now, what I want to do with these assets
is to hedge an obligation. I'm going to walk through what does
hedging mean. So, I have an obligation.
What is an obligation? It's a vector x in rm.
Why m? Because the obligation depends on the
state. In a good state I might had to pay more,
in a bad state I might had to pay less. So depending upon which state occurs, I'm
going to pay an obligation xi, if this state i occurs.
And what I want to do, is I want to buy a portfolio now, I want to buy a certain
number of shares of the assets right now, in order to have enough money to pay my
obligation. So, I'm going to choose a portfolio theta.
Theta-1 through theta-d are going to be the number of shares that I'm going to
purchase of each of these assets. I'm going to allow for the possibility of
short selling, so thetas could be negative, or I'll buy them long, which is,
when thetas are positive. And I want to do, I want to choose this
portfolio, so I can hedge the obligation that, That I'm interested in hedging.
So I'll step you through what it means and what, what hedging will ensue and then
we'll go to what is the linear optimization problem that we end up
getting. Okay!
So at time 0, I'm going to put, buy a position theta at rd.
Why d? Because it's got d different assets.
So theta J's the number of shares of assets J that I purchased, where J goes
from 1 through d. What's the cost of this purchase?
The cost of the position theta, is simply the price of every asset times the number
of shares that I produce. A lot of those assets!
So it's pj times theta j, sum from j equal to 1 through d.
It's the inner product of the vector p, with the vector theta.
And we know that inner products are nothing but p transpose times theta.
What happens at time t equal to 1? So, if a state i occurs, then I'm going to
liquidate my position. And when I liquidate positions, I'm going
to sell the assets. And what's going to happen?
In state i, the price of asset j is s, i, j, I hold theta j positi-, shares of this
asset, so by selling asset j I get Sij times theta j But I'm going to liquidate
the entire portfolio. So j I'm going to sum from 1 equal to d
and that's the amount of money, that's the pay-off that I'm going to get in the
stake. Its just Sij time theta j sum from j equal
to 1 through d, That gives me yi. And if you just think about it for a
moment, if I stock up all the pay-offs as a vector.
If I call a vector y to be y1, y2, all the way up to y m, the pay-offs in the m
states This is nothing but the matrix S, Times the vector theta.
Just to remind you, this is my matrix S, it has prices for every state as rows.
So the payoff in the first state would be this row times the vector.
And the second state would be this row times the vector.
And that's exactly what is being done here.
Sij, as j goes from 1 through D, is a row. You take the i'th row, multiply it by the
vector, you get the payoff in the i'th state.
And therefore, the vector y is simply the matrix, times theta.
Now I want to look at this, in a different way.
I want to think of this, as not multiplying row, row by row but column by
column. So, instead of looking at this matrix S,
as row by row as we have done so far, I'm going to put them in columns.
I've got the column for the first asset, column for the second asset, column for
the d asset and, that's going to multiply theta-1 theta-2 up to theta-d.
And you can see that this multiplication is nothing but, this row, times this
column, so you'll get theta j times Sj. J is summed from 1 through, that should
not be an n, but a d. It goes from j equals 1, through d.
Interpreting it this way, you end up getting a different interpretation of what
is going on. Now, the pay-offs ys are nothing but
linear combinations of the columns. So vector y, will, I can represent, I can
generate a particular pay-off, if it's in the range of the matrix S.
Remember in the concept, we had introduced this concept in the model of matrices,
that a vector y, belongs to the range, of s, if these are all vectors s-theta, where
theta is in, r to the d. And therefore, y belongs to the range of
S. And remember, we had also introduced the
notion of rank. We had said, that rank tells me how rich
that space is. Can I, how many different vectors can I
generate in the range? So if the rank of the matrix S is n, is
equal to m, meaning all possible pay-offs can be generated, then I can hedge
everything. If on the other hand, the rank of the
matrix s, is less than m, that means there are certain pay-offs.
There are certain pay-off vectors that can not be generated because they can not be
produced as if I'm make, taking the matrix S and multiplying it to theta.
So that's the concept that's going to play a role in the course.We're going to talk
about complete markets and incomplete markets, and that has to do with the rank
of this matrix. Before we get there, here's a simpler
notion. We'll say that a pay-off y hedges x, if y
is greater than or equal to x. Component by component, the pay-off that
I've generated using my portfolio is greater than the pay-off that I need to
hedge or give-, the payoff that I need to give at time 1.
Alright, now comes our first optimization problem.
What's the optimization problem? I want to minimize the price of the
portfolio such that it hedges my obligation.
I want to minimize p transpose theta, such that s-theta is greater or equal to x.
And, what are some features about this optimization problem?
The objective! What I'm trying to maximize or minimize is
a linear function, linear function of theta.
The constraints! Constraints, and what thetas that I can
choose, are again linear constraints, as theta must be greater than or equal to x.
Linear or inequality. So any optimization problem, which has a
linear objective function, either minimization or maximization it doesn't
matter, and all the constraints are either linear inequalities, as is the case here,
or linear equalities. We will call that problem, a linear
optimization problem, or a linear program. It turns out that linear programs are very
rich, there's a rich theory about them, and you can do a lot of interesting things
with them. You can model lots of problems, you can
solve them very efficiently, you can get a lot of interpretation out of them.
So the one thing that we're going to focus on in linear optimization, and the
interpretation of linear optimization is the notion of duality.
And what do I mean by the notion of duality?
For every linear program, I can write another linear program which is intimately
connected to it, and this connection is called duality.
So I've got a linear program here minimize x, c transpose x, Ax greater than or equal
to b. Here's another linear program.
Maximize B transpose U, A transpose U equal to C and U is greater than or equal
to 0. Min goes to max.
Now, for the purposes of this course, you will not be responsible for how I
generated the dual linear program from the primal linear program, or the second
linear program from the first linear program, we will give you that in the
course. The only thing that you're going to be
responsible for is, to understand the relationship, and we'll emphasize this
again during the course. They're here from-, some of the
interesting resolve that comes from this duality concept.
The first thing is something called weak duality.
What it says is that this minimization problem.
So the way I, the picture that I have, I wanted to keep in mind is that here's my
value P. For all feasible values, for all xs, such
that Ax is greater than or equal to wha-, b, I get some numbers on this side.
Why do I get greater? Because I'm trying to minimize.
So I get the lowest possible number is p. I've got another number d.
Which is the value of this second linear program down here.
For all Us that are feasible, meaning that satisfies A transpose U equal to C, and U
is greater than or equal to 0, I'll get values that are less than this D.
Why? Because this is a maximization problem.
The largest possible thing that I can get is B transpose U, is going to be equal to
D. The first theorem says, that in fact, this
picture that I'm drawing is correct. That P is going to be greater than D.
There's no reason for it to be that way, they could have crossed.
But the nice thing is, if you construct the dual linear program correctly, and in
the next slide I'm going to show you a simple example.
Again let me emphasize you're not responsible for knowing it.
Just walking through the exercise, and understanding how I'm going to use the
duality. So the primal and the dual are intimately
connected; the optimum value of the primal P is greater than equal to the optimum
value of the dual D. And because this is true, you end up
getting a chain of inequalities that are very good.
We know that c transpose x is greater than equal to p.
Why? Because I''m trying to minimize c
transpose x, so for any feasible x, anything that satisfies the inequality Ax
greater than equal to b, I'll get a value greater that b.
The second piece is also true, D is greater than equal to b, transpose u
because I'm trying to maximize b transpose u.
And the inner part comes because of the v duality.
So now this gives you a very interesting way, if I can find an x, and I can find a
u, such that c transpose x is close to b transpose u, then I know that the x must
be very close to optimal, And u must also be very close to optimal for the dual.
Why? Because P is greater than, equal to D, if
these two at, points are very close, it must, it must be that P is also very close
to c transpose x, which means that it's optimum.
Similarly, b transpose u must be close to D, which means that is optimum.
You end up, you can go one step further, and say that when either P, or D, is
finite, either the primal value is finite, or the dual value is finite, then in fact,
they must be equal. And finally, the reason why we call them
dual, is because if you go from the primal to the dual, and from, you take the dual
of the dual, you get back the primal, and that's why they're called dual linear
programs. Going a little bit further.
Here's another pair of primal dual linear programs that we'll be, that we'll be
using in the course. Minimize over x, c transpose x, Ax equals
b, they, it's equal to maximize u, b transpose u, A transpose u equal to c.
And this equality, I'm putting it there to emphasize the fact that there is strong
duality between them. But to keep in mind that this equality
holds only if you can show that either this one, p or this one, b is finite.
So if p, or d is less than, is less than, is not equal to let's say, plus infinity
or minus infinity, in that case these two values are the same, and in that sense
they are equal. So, the last piece in this module, we're
going to walk through and tell you how to construct a duel, it's a very general
concept called La Grangian relaxation. And we'll use that in the next module on
non linear programming, as well. So here's the problem.
The primal problem is, minimize c transpose x, Ax, greater than or equal to
b. Now I'm going to take a vector u, which is
greater than or equal to zero. And so u is component wise greater then or
equal to 0, and remember Ax is going to be greater than equal to b.
So Ax minus b is component wise greater than equal to 0, you take some vector
which is component wise greater than equal to the oh, multiplied to another vector
that is component wise greater than equal to 0, you end up getting a number, which
is greater than or equal to 0. So I'm subtracting it, so I'm getting a
number, which is less than c transpose x. So this linear program, which has a
changed objective function involving this vector u, is going to have a value, which
is going to be less than equal to P. B transpose U does not involve the
decision x, it does not involve the minimization, the minimization is going on
over x, it does not involve x, so I pull that out.
I end up getting a new problem, which is minimize c minus a transpose u times x.
And what happened to this constraint? I threw it away!
Why did I throw it away? It's complicated.
I don't know how to deal with constraints, so I threw it away.
But I'm guaranteed that if I throw away these constraints my set over which I can
optimize, the xs over that I can chose become larger, and therefore this minimum
only becomes smaller. So I end up getting that this quantity, is
going to be smaller than the previous line.
Now, because I don't have any constraints, I have a very simple problem.
I've got some vector, let's call this vector d.
I've got d transpose x. So I want to minimize d transpose x.
So here's my d vector. I want to minimize this, d transpose x,
and I can choose my x to be anything. So what am I going to do?
I'm going to choose my x to be in the negative direction and going off to
infinity. Here's my b, here's my x!
If I multiply d and x together I get a very large negative number.
So if I have vector d which is not equal to 0, I can make this optimization problem
equal to minus infinity and that's what this says.
If d is not equal to 0 and, just to emphasize d's equal to c minus A transpose
u, if this is not equal to 0 you get to minus infinity.
If in fact, d is equal to 0, which means that c is equal to A transpose u, then I
can't do anything over here. This vector is equal to 0, I multiply to
any other vector, I'll get a 0. So you end up getting that this
minimization problem has a value equal to 0.
And p is greater then equal to b transpose u.
Now u is arbitrary, the only thing that I needed to do was, which is missed over
here, is that I needed to have that u must be greater than equal to zero.
So now, you have p must be greater than equal to maximize b transpose u, which is
the value here, provided A transpose u is equal to c and u is greater than equal to
0. That immediately gives you a weak duality,
a little bit more work gives you a strong duality.
So here's the connection. Max-, minimize C transpose x, Ax greater
than or equal to B is equal to maximize B transpose U, A transpose U equal to C, and
U greater than equal to 0. That's what we derived over here.
What we did, we dualize constraints, and this word will show up some times during
the course. Dualize means I take the constraints and
multiply them by a variable which has a particular sign.
We had a constraint Ax minus b, I multiplied by a vector of u which is
greater than equal to 0, that's dualization and then I'd relax the
constraint. I have this constraint Ax greater than
equal to b, I didn't like it, it was too complicated, I threw it away.
I'd relaxed them! And by doing dualization and relaxation,
you end up getting something called a La Grangian relaxation, which gives you
duals, And gives you some very nice properties that we'll explore more in the
course. .