>> . In this module, we will be walking through a review of linear optimization using a hedging example. We are going to see how there are two kinds of optimization problems, something called a primal optimization problem, something called a dual optimization problem, how they are related. And we also introduced the idea of La Grangian relaxation. Here's the hedging problem that we are interested in. I've got d assets. So take d, just to fix ideas, take d to be 3 or 4. But some number of assets that are there in the market. The price of these assets at time 0 is r to the d. It's some price vector, which has d components, where every component tells me the price of that particular asset. At time t equal to 1, the market outcomes are uncertain. I don't know what, which state the market is going to be, it's going to be in one of m possible states. So here's the situation at time T equal to zero, and I'm in state zero, And I could go to one of n different possible states. Go from one, two, all the way to, down to m. What do I mean by a state? For my purposes, state is simply telling me what are the different prices that can happen. So I'm going to characterize every state by the prices of the asset in that particular state And I can do it in 2 different ways. I could either tell you, what is the price of all the assets in any given state, Or I can tell you the price of a given in all possible states and we'll flip between these two ideas as we go through. We'll use the power of matrices to see what it means, sometimes, it's easier to represent it one way, Sometimes it's easier to represent it the other way. And understanding it both ways gives us an insight of which one is more beneficial for the particular application that I'm looking for. So, to motivate that, I'm going to define something called Sj. So S sub j, will be a column vector, and it will tell me the price of asset j in all possible states, So it's s1j, s2j all the way down to smj. J is fixed, it's asset j and the number of states would ran-, go from 1 to m. Now how do I represent all the assets in the market? I'm just going to take these column and stack them. S1 would refer to the column corresponding to S, asset one S2, asset two and so one up to S D. If I write them out in gory detail, you end up getting this matrix. The matrix has M rows, because it's got M states, its got D columns because its got D assets. And what going on here, every row here, tells you the price of all the assets in a particular state. So what I've circled here, are the prices in state 1, S1d, 12S, 11S, 12, all the way up to S1D. Genetically, somewhere it's going to be Sm1, Sm2 , all to Smd. So every row tells me what happens to all the assets in a given state, every column tells me what happens to a particular asset in all the different states. Alright, so that's how I'm going to describe my market. At times 0 I know the price, at time 1, which is the place where the market opens again, I don't know the price, it's uncertain, it's going to be one of n possible states, but I know that if the state is given to me, what the prices are. Now you might think that this is a very simple representation of the market, and indeed it is. But it turns out, that even in the most modern methods, of risk management, like value at risk, and conditional value at risk, people represent what happens to the market By a model which looks very much like this. Except instead of talking about states, we'll talk about simulations. I'm going to simulate returns and I'm going to say, depending on all of these different scenarios, what happens? But essentially the, the main ideas are going to be captured by this simple toy model. Alright, I know what happens to the asset. Now, what I want to do with these assets is to hedge an obligation. I'm going to walk through what does hedging mean. So, I have an obligation. What is an obligation? It's a vector x in rm. Why m? Because the obligation depends on the state. In a good state I might had to pay more, in a bad state I might had to pay less. So depending upon which state occurs, I'm going to pay an obligation xi, if this state i occurs. And what I want to do, is I want to buy a portfolio now, I want to buy a certain number of shares of the assets right now, in order to have enough money to pay my obligation. So, I'm going to choose a portfolio theta. Theta-1 through theta-d are going to be the number of shares that I'm going to purchase of each of these assets. I'm going to allow for the possibility of short selling, so thetas could be negative, or I'll buy them long, which is, when thetas are positive. And I want to do, I want to choose this portfolio, so I can hedge the obligation that, That I'm interested in hedging. So I'll step you through what it means and what, what hedging will ensue and then we'll go to what is the linear optimization problem that we end up getting. Okay! So at time 0, I'm going to put, buy a position theta at rd. Why d? Because it's got d different assets. So theta J's the number of shares of assets J that I purchased, where J goes from 1 through d. What's the cost of this purchase? The cost of the position theta, is simply the price of every asset times the number of shares that I produce. A lot of those assets! So it's pj times theta j, sum from j equal to 1 through d. It's the inner product of the vector p, with the vector theta. And we know that inner products are nothing but p transpose times theta. What happens at time t equal to 1? So, if a state i occurs, then I'm going to liquidate my position. And when I liquidate positions, I'm going to sell the assets. And what's going to happen? In state i, the price of asset j is s, i, j, I hold theta j positi-, shares of this asset, so by selling asset j I get Sij times theta j But I'm going to liquidate the entire portfolio. So j I'm going to sum from 1 equal to d and that's the amount of money, that's the pay-off that I'm going to get in the stake. Its just Sij time theta j sum from j equal to 1 through d, That gives me yi. And if you just think about it for a moment, if I stock up all the pay-offs as a vector. If I call a vector y to be y1, y2, all the way up to y m, the pay-offs in the m states This is nothing but the matrix S, Times the vector theta. Just to remind you, this is my matrix S, it has prices for every state as rows. So the payoff in the first state would be this row times the vector. And the second state would be this row times the vector. And that's exactly what is being done here. Sij, as j goes from 1 through D, is a row. You take the i'th row, multiply it by the vector, you get the payoff in the i'th state. And therefore, the vector y is simply the matrix, times theta. Now I want to look at this, in a different way. I want to think of this, as not multiplying row, row by row but column by column. So, instead of looking at this matrix S, as row by row as we have done so far, I'm going to put them in columns. I've got the column for the first asset, column for the second asset, column for the d asset and, that's going to multiply theta-1 theta-2 up to theta-d. And you can see that this multiplication is nothing but, this row, times this column, so you'll get theta j times Sj. J is summed from 1 through, that should not be an n, but a d. It goes from j equals 1, through d. Interpreting it this way, you end up getting a different interpretation of what is going on. Now, the pay-offs ys are nothing but linear combinations of the columns. So vector y, will, I can represent, I can generate a particular pay-off, if it's in the range of the matrix S. Remember in the concept, we had introduced this concept in the model of matrices, that a vector y, belongs to the range, of s, if these are all vectors s-theta, where theta is in, r to the d. And therefore, y belongs to the range of S. And remember, we had also introduced the notion of rank. We had said, that rank tells me how rich that space is. Can I, how many different vectors can I generate in the range? So if the rank of the matrix S is n, is equal to m, meaning all possible pay-offs can be generated, then I can hedge everything. If on the other hand, the rank of the matrix s, is less than m, that means there are certain pay-offs. There are certain pay-off vectors that can not be generated because they can not be produced as if I'm make, taking the matrix S and multiplying it to theta. So that's the concept that's going to play a role in the course.We're going to talk about complete markets and incomplete markets, and that has to do with the rank of this matrix. Before we get there, here's a simpler notion. We'll say that a pay-off y hedges x, if y is greater than or equal to x. Component by component, the pay-off that I've generated using my portfolio is greater than the pay-off that I need to hedge or give-, the payoff that I need to give at time 1. Alright, now comes our first optimization problem. What's the optimization problem? I want to minimize the price of the portfolio such that it hedges my obligation. I want to minimize p transpose theta, such that s-theta is greater or equal to x. And, what are some features about this optimization problem? The objective! What I'm trying to maximize or minimize is a linear function, linear function of theta. The constraints! Constraints, and what thetas that I can choose, are again linear constraints, as theta must be greater than or equal to x. Linear or inequality. So any optimization problem, which has a linear objective function, either minimization or maximization it doesn't matter, and all the constraints are either linear inequalities, as is the case here, or linear equalities. We will call that problem, a linear optimization problem, or a linear program. It turns out that linear programs are very rich, there's a rich theory about them, and you can do a lot of interesting things with them. You can model lots of problems, you can solve them very efficiently, you can get a lot of interpretation out of them. So the one thing that we're going to focus on in linear optimization, and the interpretation of linear optimization is the notion of duality. And what do I mean by the notion of duality? For every linear program, I can write another linear program which is intimately connected to it, and this connection is called duality. So I've got a linear program here minimize x, c transpose x, Ax greater than or equal to b. Here's another linear program. Maximize B transpose U, A transpose U equal to C and U is greater than or equal to 0. Min goes to max. Now, for the purposes of this course, you will not be responsible for how I generated the dual linear program from the primal linear program, or the second linear program from the first linear program, we will give you that in the course. The only thing that you're going to be responsible for is, to understand the relationship, and we'll emphasize this again during the course. They're here from-, some of the interesting resolve that comes from this duality concept. The first thing is something called weak duality. What it says is that this minimization problem. So the way I, the picture that I have, I wanted to keep in mind is that here's my value P. For all feasible values, for all xs, such that Ax is greater than or equal to wha-, b, I get some numbers on this side. Why do I get greater? Because I'm trying to minimize. So I get the lowest possible number is p. I've got another number d. Which is the value of this second linear program down here. For all Us that are feasible, meaning that satisfies A transpose U equal to C, and U is greater than or equal to 0, I'll get values that are less than this D. Why? Because this is a maximization problem. The largest possible thing that I can get is B transpose U, is going to be equal to D. The first theorem says, that in fact, this picture that I'm drawing is correct. That P is going to be greater than D. There's no reason for it to be that way, they could have crossed. But the nice thing is, if you construct the dual linear program correctly, and in the next slide I'm going to show you a simple example. Again let me emphasize you're not responsible for knowing it. Just walking through the exercise, and understanding how I'm going to use the duality. So the primal and the dual are intimately connected; the optimum value of the primal P is greater than equal to the optimum value of the dual D. And because this is true, you end up getting a chain of inequalities that are very good. We know that c transpose x is greater than equal to p. Why? Because I''m trying to minimize c transpose x, so for any feasible x, anything that satisfies the inequality Ax greater than equal to b, I'll get a value greater that b. The second piece is also true, D is greater than equal to b, transpose u because I'm trying to maximize b transpose u. And the inner part comes because of the v duality. So now this gives you a very interesting way, if I can find an x, and I can find a u, such that c transpose x is close to b transpose u, then I know that the x must be very close to optimal, And u must also be very close to optimal for the dual. Why? Because P is greater than, equal to D, if these two at, points are very close, it must, it must be that P is also very close to c transpose x, which means that it's optimum. Similarly, b transpose u must be close to D, which means that is optimum. You end up, you can go one step further, and say that when either P, or D, is finite, either the primal value is finite, or the dual value is finite, then in fact, they must be equal. And finally, the reason why we call them dual, is because if you go from the primal to the dual, and from, you take the dual of the dual, you get back the primal, and that's why they're called dual linear programs. Going a little bit further. Here's another pair of primal dual linear programs that we'll be, that we'll be using in the course. Minimize over x, c transpose x, Ax equals b, they, it's equal to maximize u, b transpose u, A transpose u equal to c. And this equality, I'm putting it there to emphasize the fact that there is strong duality between them. But to keep in mind that this equality holds only if you can show that either this one, p or this one, b is finite. So if p, or d is less than, is less than, is not equal to let's say, plus infinity or minus infinity, in that case these two values are the same, and in that sense they are equal. So, the last piece in this module, we're going to walk through and tell you how to construct a duel, it's a very general concept called La Grangian relaxation. And we'll use that in the next module on non linear programming, as well. So here's the problem. The primal problem is, minimize c transpose x, Ax, greater than or equal to b. Now I'm going to take a vector u, which is greater than or equal to zero. And so u is component wise greater then or equal to 0, and remember Ax is going to be greater than equal to b. So Ax minus b is component wise greater than equal to 0, you take some vector which is component wise greater than equal to the oh, multiplied to another vector that is component wise greater than equal to 0, you end up getting a number, which is greater than or equal to 0. So I'm subtracting it, so I'm getting a number, which is less than c transpose x. So this linear program, which has a changed objective function involving this vector u, is going to have a value, which is going to be less than equal to P. B transpose U does not involve the decision x, it does not involve the minimization, the minimization is going on over x, it does not involve x, so I pull that out. I end up getting a new problem, which is minimize c minus a transpose u times x. And what happened to this constraint? I threw it away! Why did I throw it away? It's complicated. I don't know how to deal with constraints, so I threw it away. But I'm guaranteed that if I throw away these constraints my set over which I can optimize, the xs over that I can chose become larger, and therefore this minimum only becomes smaller. So I end up getting that this quantity, is going to be smaller than the previous line. Now, because I don't have any constraints, I have a very simple problem. I've got some vector, let's call this vector d. I've got d transpose x. So I want to minimize d transpose x. So here's my d vector. I want to minimize this, d transpose x, and I can choose my x to be anything. So what am I going to do? I'm going to choose my x to be in the negative direction and going off to infinity. Here's my b, here's my x! If I multiply d and x together I get a very large negative number. So if I have vector d which is not equal to 0, I can make this optimization problem equal to minus infinity and that's what this says. If d is not equal to 0 and, just to emphasize d's equal to c minus A transpose u, if this is not equal to 0 you get to minus infinity. If in fact, d is equal to 0, which means that c is equal to A transpose u, then I can't do anything over here. This vector is equal to 0, I multiply to any other vector, I'll get a 0. So you end up getting that this minimization problem has a value equal to 0. And p is greater then equal to b transpose u. Now u is arbitrary, the only thing that I needed to do was, which is missed over here, is that I needed to have that u must be greater than equal to zero. So now, you have p must be greater than equal to maximize b transpose u, which is the value here, provided A transpose u is equal to c and u is greater than equal to 0. That immediately gives you a weak duality, a little bit more work gives you a strong duality. So here's the connection. Max-, minimize C transpose x, Ax greater than or equal to B is equal to maximize B transpose U, A transpose U equal to C, and U greater than equal to 0. That's what we derived over here. What we did, we dualize constraints, and this word will show up some times during the course. Dualize means I take the constraints and multiply them by a variable which has a particular sign. We had a constraint Ax minus b, I multiplied by a vector of u which is greater than equal to 0, that's dualization and then I'd relax the constraint. I have this constraint Ax greater than equal to b, I didn't like it, it was too complicated, I threw it away. I'd relaxed them! And by doing dualization and relaxation, you end up getting something called a La Grangian relaxation, which gives you duals, And gives you some very nice properties that we'll explore more in the course. .