
Why financial markets?

Financial markets enable efficient allocation of resources

across time
across states of nature
Young worker with a high salary. What should she do?
Financial markets: invest in stocks and bonds to finance retirement, home
ownership, education, etc.
No financial markets: consumption? what else?
Farmer producing oranges.
Financial markets: Hedge using futures markets, weather-related derivatives.
No financial markets: only the spot market available.

More on markets and products
Role of Markets
Gather information
Aggregate liquidity i.e. supply and demand
Promote efficiency and fairness
Products: satisfy needs
Hedge risk
Allow speculation
Raise funds
Fund liabilities

Modeling financial markets
Two kinds of market models
Discrete time models
Single period models
Multi-period models
Continuous time models
Pros/Cons of discrete time models
Pros: All important concepts with less sophisticated mathematics
Cons: No closed form solutions ... have to resort to numerical calculations.
Focus of this course: Discrete time multi-period models
Caveat: Very, very few continuous time concepts covered, e.g. the Black-Scholes
formula.

Financial Economics vs Financial Engineering
Financial Economics: Use equilibrium arguments to
Price equities, bonds and other assets
Set interest rates
Financial Engineering: Assume prices of equities and interest rates given
Price derivatives on equities, bonds, interest rates, etc., using the
no-arbitrage condition
Not even close to being a complete separation
For example, the Capital Asset Pricing Model is of interest to both

Central problems of FE
Security pricing
Primary securities: stocks and bonds ... financial economics
Derivative securities: forwards, swaps, futures, options, on the underlying
securities.
Portfolio selection: choose a trading strategy to maximize the utility of
consumption and final wealth.
Intimately related to security pricing
Single-period models: Markowitz portfolio selection
Real options, e.g. options on gas pipelines, oil leases, mines.
Risk management: understand the risks inherent in a portfolio
Tail risk: probability of large losses
Value-at-risk and conditional value-at-risk
Starting to become important for portfolio selection as well.
Led to interesting applied math / operations research problems.

Discrete Random Variables
Definition. The cumulative distribution function (CDF), $F(\cdot)$, of a random
variable, $X$, is defined by
$$F(x) := P(X \leq x).$$
Definition. A discrete random variable, $X$, has probability mass function (PMF),
$p(\cdot)$, if $p(x) \geq 0$ and for all events $A$ we have
$$P(X \in A) = \sum_{x \in A} p(x).$$
Definition. The expected value of a discrete random variable, $X$, is given by
$$E[X] := \sum_i x_i\, p(x_i).$$
Definition. The variance of any random variable, $X$, is defined as
$$\mathrm{Var}(X) := E\left[(X - E[X])^2\right] = E[X^2] - E[X]^2.$$

The Binomial Distribution
We say $X$ has a binomial distribution, or $X \sim \mathrm{Bin}(n, p)$, if
$$P(X = r) = \binom{n}{r} p^r (1-p)^{n-r}.$$
For example, $X$ might represent the number of heads in $n$ independent coin
tosses, where $p = P(\text{head})$. The mean and variance of the binomial distribution
satisfy
$$E[X] = np, \qquad \mathrm{Var}(X) = np(1-p).$$

A Financial Application
Suppose a fund manager outperforms the market in a given year with
probability $p$ and that she underperforms the market with probability $1-p$.
She has a track record of 10 years and has outperformed the market in 8 of
the 10 years.
Moreover, performance in any one year is independent of performance in
other years.
Question: How likely is a track record as good as this if the fund manager had no
skill so that $p = 1/2$?
Answer: Let $X$ be the number of outperforming years. Since the fund manager
has no skill, $X \sim \mathrm{Bin}(n = 10, p = 1/2)$ and
$$P(X \geq 8) = \sum_{r=8}^{n} \binom{n}{r} p^r (1-p)^{n-r}.$$
Question: Suppose there are $M$ fund managers. How well should the best one do
over the 10-year period if none of them had any skill?
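A quick numerical check of both questions using scipy's binomial distribution (the values of $M$ below are illustrative choices, not from the slides):

```python
from scipy.stats import binom

n, p = 10, 0.5

# P(X >= 8) for a single manager with no skill
print(binom.sf(7, n, p))               # sf(7) = P(X >= 8), approx 0.0547

# With M independent, skill-free managers, the best record Y = max X_i satisfies
# P(Y >= 8) = 1 - P(X <= 7)^M, so 8 out of 10 becomes unremarkable for moderate M.
for M in (10, 50, 100):                # illustrative values of M
    print(M, 1 - binom.cdf(7, n, p) ** M)
```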

The Poisson Distribution
We say $X$ has a $\mathrm{Poisson}(\lambda)$ distribution if
$$P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}.$$
$E[X] = \lambda$ and $\mathrm{Var}(X) = \lambda$.
For example, the mean is calculated as
$$E[X] = \sum_{r=0}^{\infty} r\, P(X = r) = \sum_{r=0}^{\infty} r\, \frac{\lambda^r e^{-\lambda}}{r!}
= \sum_{r=1}^{\infty} \frac{\lambda^r e^{-\lambda}}{(r-1)!}
= \lambda \sum_{r=1}^{\infty} \frac{\lambda^{r-1} e^{-\lambda}}{(r-1)!}
= \lambda \sum_{r=0}^{\infty} \frac{\lambda^r e^{-\lambda}}{r!} = \lambda.$$

Bayes Theorem
Let $A$ and $B$ be two events for which $P(B) \neq 0$. Then
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\, P(A)}{P(B)} = \frac{P(B \mid A)\, P(A)}{\sum_j P(B \mid A_j)\, P(A_j)}$$
where the $A_j$'s form a partition of the sample space.

An Example: Tossing Two Fair 6-Sided Dice
$$\begin{array}{c|cccccc}
Y_2 \backslash Y_1 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
6 & 7 & 8 & 9 & 10 & 11 & 12 \\
5 & 6 & 7 & 8 & 9 & 10 & 11 \\
4 & 5 & 6 & 7 & 8 & 9 & 10 \\
3 & 4 & 5 & 6 & 7 & 8 & 9 \\
2 & 3 & 4 & 5 & 6 & 7 & 8 \\
1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\end{array}$$
Table: $X = Y_1 + Y_2$
Let $Y_1$ and $Y_2$ be the outcomes of tossing two fair dice independently of
one another.
Let $X := Y_1 + Y_2$. Question: What is $P(Y_1 \geq 4 \mid X \geq 8)$?
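A brute-force check of this conditional probability by enumerating the 36 equally likely outcomes (a minimal sketch, not part of the slides):

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))           # all (Y1, Y2) pairs
cond = [(y1, y2) for y1, y2 in outcomes if y1 + y2 >= 8]  # condition on X >= 8
favourable = [(y1, y2) for y1, y2 in cond if y1 >= 4]     # Y1 >= 4 within that event

print(len(favourable), "/", len(cond))   # 12 / 15, i.e. P(Y1 >= 4 | X >= 8) = 4/5
```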

Continuous Random Variables
Definition. A continuous random variable, $X$, has probability density function
(PDF), $f(\cdot)$, if $f(x) \geq 0$ and for all events $A$
$$P(X \in A) = \int_A f(y)\, dy.$$
The CDF and PDF are related by
$$F(x) = \int_{-\infty}^{x} f(y)\, dy.$$
It is often convenient to observe that
$$P\left(X \in \left(x - \tfrac{\epsilon}{2},\, x + \tfrac{\epsilon}{2}\right)\right) \approx \epsilon\, f(x).$$

The Normal Distribution
We say $X$ has a normal distribution, or $X \sim N(\mu, \sigma^2)$, if
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$$
The mean and variance of the normal distribution satisfy
$$E[X] = \mu, \qquad \mathrm{Var}(X) = \sigma^2.$$

The Log-Normal Distribution
We say $X$ has a log-normal distribution, or $X \sim \mathrm{LN}(\mu, \sigma^2)$, if
$$\log(X) \sim N(\mu, \sigma^2).$$
The mean and variance of the log-normal distribution satisfy
$$E[X] = \exp(\mu + \sigma^2/2), \qquad \mathrm{Var}(X) = \exp(2\mu + \sigma^2)\left(\exp(\sigma^2) - 1\right).$$
The log-normal distribution plays a very important role in financial applications.
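A quick simulation check of these two formulas (the parameter values below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.1, 0.4                                 # illustrative parameters

x = np.exp(rng.normal(mu, sigma, size=1_000_000))    # X = e^Z with Z ~ N(mu, sigma^2)

print(x.mean(), np.exp(mu + sigma**2 / 2))                          # sample vs exact mean
print(x.var(),  np.exp(2*mu + sigma**2) * (np.exp(sigma**2) - 1))   # sample vs exact variance
```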

Conditional Expectations and Variances
Let X and Y be two random variables.
The conditional expectation identity says
E[X] = E[E[X|Y]]
and the conditional variance identity says
Var(X) = Var(E[X|Y]) + E[Var(X|Y)].
Note that E[X|Y] and Var(X|Y) are both functions of Y and are therefore
random variables themselves.

A Random Sum of Random Variables
Let $W = X_1 + X_2 + \ldots + X_N$ where the $X_i$'s are IID with mean $\mu_x$ and variance
$\sigma_x^2$, and where $N$ is also a random variable, independent of the $X_i$'s.
Question: What is $E[W]$?
Answer: The conditional expectation identity implies
$$E[W] = E\left[E\left[\sum_{i=1}^{N} X_i \,\Big|\, N\right]\right] = E[N \mu_x] = \mu_x E[N].$$
Question: What is $\mathrm{Var}(W)$?
Answer: The conditional variance identity implies
$$\mathrm{Var}(W) = \mathrm{Var}(E[W \mid N]) + E[\mathrm{Var}(W \mid N)]
= \mathrm{Var}(\mu_x N) + E[N \sigma_x^2]
= \mu_x^2 \mathrm{Var}(N) + \sigma_x^2 E[N].$$
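Both identities are easy to sanity-check by simulation; here is a minimal sketch with $N$ Poisson and the $X_i$'s normal (both distributional choices and all parameter values are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, mu_x, sig_x = 5.0, 2.0, 3.0
trials = 200_000

n = rng.poisson(lam, size=trials)                             # random number of terms
w = np.array([rng.normal(mu_x, sig_x, k).sum() for k in n])   # W = X_1 + ... + X_N

print(w.mean(), mu_x * lam)                            # E[W] = mu_x E[N]
print(w.var(),  mu_x**2 * lam + sig_x**2 * lam)        # Var(W), using Var(N) = E[N] = lam for Poisson
```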

An Example: Chickens and Eggs
A hen lays $N$ eggs where $N \sim \mathrm{Poisson}(\lambda)$. Each egg hatches and yields a chicken
with probability $p$, independently of the other eggs and $N$. Let $K$ be the number
of chickens.
Question: What is $E[K \mid N]$?
Answer: We can use indicator functions to answer this question.
In particular, we can write $K = \sum_{i=1}^{N} 1_{H_i}$ where $H_i$ is the event that the $i$-th egg
hatches. Therefore
$$1_{H_i} = \begin{cases} 1, & \text{if the } i\text{-th egg hatches;} \\ 0, & \text{otherwise.} \end{cases}$$
Also clear that $E[1_{H_i}] = 1 \cdot p + 0 \cdot (1-p) = p$ so that
$$E[K \mid N] = E\left[\sum_{i=1}^{N} 1_{H_i} \,\Big|\, N\right] = \sum_{i=1}^{N} E[1_{H_i}] = Np.$$
The conditional expectation formula then gives $E[K] = E[E[K \mid N]] = E[Np] = \lambda p$.
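A short simulation of the hen-and-eggs example confirming $E[K] = \lambda p$ (the values of $\lambda$ and $p$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, p = 6.0, 0.3
trials = 500_000

n = rng.poisson(lam, size=trials)     # number of eggs in each trial
k = rng.binomial(n, p)                # chickens: each egg hatches with probability p

print(k.mean(), lam * p)              # sample mean vs lambda * p = 1.8
```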

Multivariate Distributions I
Let $X = (X_1, \ldots, X_n)^\top$ be an $n$-dimensional vector of random variables.

Definition. For all $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$, the joint cumulative distribution
function (CDF) of $X$ satisfies
$$F_X(x) = F_X(x_1, \ldots, x_n) = P(X_1 \leq x_1, \ldots, X_n \leq x_n).$$
Definition. For a fixed $i$, the marginal CDF of $X_i$ satisfies
$$F_{X_i}(x_i) = F_X(\infty, \ldots, \infty, x_i, \infty, \ldots, \infty).$$
It is straightforward to generalize the previous definition to joint marginal
distributions. For example, the joint marginal distribution of $X_i$ and $X_j$ satisfies
$$F_{ij}(x_i, x_j) = F_X(\infty, \ldots, \infty, x_i, \infty, \ldots, \infty, x_j, \infty, \ldots, \infty).$$
We also say that $X$ has joint PDF $f_X(\cdot, \ldots, \cdot)$ if
$$F_X(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f_X(u_1, \ldots, u_n)\, du_1 \ldots du_n.$$

Multivariate Distributions II
Definition. If $X_1 = (X_1, \ldots, X_k)^\top$ and $X_2 = (X_{k+1}, \ldots, X_n)^\top$ is a partition of
$X$ then the conditional CDF of $X_2$ given $X_1$ satisfies
$$F_{X_2 | X_1}(x_2 \mid x_1) = P(X_2 \leq x_2 \mid X_1 = x_1).$$
If $X$ has a PDF, $f_X(\cdot)$, then the conditional PDF of $X_2$ given $X_1$ satisfies
$$f_{X_2 | X_1}(x_2 \mid x_1) = \frac{f_X(x)}{f_{X_1}(x_1)} = \frac{f_{X_1 | X_2}(x_1 \mid x_2)\, f_{X_2}(x_2)}{f_{X_1}(x_1)} \qquad (1)$$
and the conditional CDF is then given by
$$F_{X_2 | X_1}(x_2 \mid x_1) = \int_{-\infty}^{x_{k+1}} \cdots \int_{-\infty}^{x_n} \frac{f_X(x_1, \ldots, x_k, u_{k+1}, \ldots, u_n)}{f_{X_1}(x_1)}\, du_{k+1} \ldots du_n$$
where $f_{X_1}(\cdot)$ is the joint marginal PDF of $X_1$, which is given by
$$f_{X_1}(x_1, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_X(x_1, \ldots, x_k, u_{k+1}, \ldots, u_n)\, du_{k+1} \ldots du_n.$$

Independence
Definition. We say the collection $X$ is independent if the joint CDF can be
factored into the product of the marginal CDFs so that
$$F_X(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdots F_{X_n}(x_n).$$
If $X$ has a PDF, $f_X(\cdot)$, then independence implies that the PDF also factorizes
into the product of marginal PDFs so that
$$f_X(x) = f_{X_1}(x_1) \cdots f_{X_n}(x_n).$$
Can also see from (1) that if $X_1$ and $X_2$ are independent then
$$f_{X_2 | X_1}(x_2 \mid x_1) = \frac{f_X(x)}{f_{X_1}(x_1)} = \frac{f_{X_1}(x_1)\, f_{X_2}(x_2)}{f_{X_1}(x_1)} = f_{X_2}(x_2)$$
so having information about $X_1$ tells you nothing about $X_2$.

Implications of Independence
Let $X$ and $Y$ be independent random variables. Then for any events, $A$ and $B$,
$$P(X \in A, Y \in B) = P(X \in A)\, P(Y \in B) \qquad (2)$$
More generally, for any functions, $f(\cdot)$ and $g(\cdot)$, independence of $X$ and $Y$ implies
$$E[f(X)g(Y)] = E[f(X)]\, E[g(Y)]. \qquad (3)$$
In fact, (2) follows from (3) since
$$P(X \in A, Y \in B) = E\left[1_{\{X \in A\}} 1_{\{Y \in B\}}\right] = E\left[1_{\{X \in A\}}\right] E\left[1_{\{Y \in B\}}\right] = P(X \in A)\, P(Y \in B),$$
where the middle equality follows from (3).

Implications of Independence
More generally, if $X_1, \ldots, X_n$ are independent random variables then
$$E[f_1(X_1) f_2(X_2) \cdots f_n(X_n)] = E[f_1(X_1)]\, E[f_2(X_2)] \cdots E[f_n(X_n)].$$
Random variables can also be conditionally independent. For example, we say $X$
and $Y$ are conditionally independent given $Z$ if
$$E[f(X)g(Y) \mid Z] = E[f(X) \mid Z]\, E[g(Y) \mid Z]$$
- used in the (in)famous Gaussian copula model for pricing CDOs!
In particular, let $D_i$ be the event that the $i$-th bond in a portfolio defaults.
Not reasonable to assume that the $D_i$'s are independent. Why?
But maybe they are conditionally independent given $Z$ so that
$$P(D_1, \ldots, D_n \mid Z) = P(D_1 \mid Z) \cdots P(D_n \mid Z)$$
- often easy to compute this.

The Mean Vector and Covariance Matrix
The mean vector of $X$ is given by
$$E[X] := (E[X_1], \ldots, E[X_n])^\top$$
and the covariance matrix of $X$ satisfies
$$\Sigma := \mathrm{Cov}(X) := E\left[(X - E[X])(X - E[X])^\top\right]$$
so that the $(i, j)$-th element of $\Sigma$ is simply the covariance of $X_i$ and $X_j$.
The covariance matrix is symmetric and its diagonal elements satisfy $\Sigma_{i,i} \geq 0$.
It is also positive semi-definite so that $x^\top \Sigma x \geq 0$ for all $x \in \mathbb{R}^n$.
The correlation matrix, $\rho(X)$, has $(i, j)$-th element $\rho_{ij} := \mathrm{Corr}(X_i, X_j)$
- it is also symmetric, positive semi-definite and has 1's along the diagonal.

Variances and Covariances
For any matrix $A \in \mathbb{R}^{k \times n}$ and vector $a \in \mathbb{R}^k$ we have
$$E[AX + a] = A\, E[X] + a \qquad (4)$$
$$\mathrm{Cov}(AX + a) = A\, \mathrm{Cov}(X)\, A^\top. \qquad (5)$$
Note that (5) implies
$$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\, \mathrm{Cov}(X, Y).$$
If $X$ and $Y$ are independent, then $\mathrm{Cov}(X, Y) = 0$
- but the converse is not true in general.

The Multivariate Normal Distribution I
If the $n$-dimensional vector $X$ is multivariate normal with mean vector $\mu$ and
covariance matrix $\Sigma$ then we write
$$X \sim \mathrm{MN}_n(\mu, \Sigma).$$
The PDF of $X$ is given by
$$f_X(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)}$$
where $|\cdot|$ denotes the determinant.
The standard multivariate normal has $\mu = 0$ and $\Sigma = I_n$, the $n \times n$ identity matrix
- in this case the $X_i$'s are independent.
The moment generating function (MGF) of $X$ satisfies
$$\phi_X(s) = E\left[e^{s^\top X}\right] = e^{s^\top \mu + \frac{1}{2} s^\top \Sigma s}.$$

The Multivariate Normal Distribution II
Recall our partition of $X$ into $X_1 = (X_1, \ldots, X_k)^\top$ and $X_2 = (X_{k+1}, \ldots, X_n)^\top$.
Can extend this notation naturally so that
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$
are the mean vector and covariance matrix of $(X_1, X_2)$.
Then have the following results on the marginal and conditional distributions of $X$:
Marginal Distribution
The marginal distribution of a multivariate normal random vector is itself normal.
In particular, $X_i \sim \mathrm{MN}(\mu_i, \Sigma_{ii})$, for $i = 1, 2$.

The Bivariate Normal PDF
(Figure: plot of the bivariate normal PDF.)

The Multivariate Normal Distribution III
Conditional Distribution
Assuming $\Sigma$ is positive definite, the conditional distribution of a multivariate
normal distribution is also a multivariate normal distribution. In particular,
$$X_2 \mid X_1 = x_1 \ \sim\ \mathrm{MN}(\mu_{2.1}, \Sigma_{2.1})$$
where
$$\mu_{2.1} = \mu_2 + \Sigma_{21} \Sigma_{11}^{-1} (x_1 - \mu_1) \quad \text{and} \quad \Sigma_{2.1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}.$$
Linear Combinations
A linear combination, $AX + a$, of a multivariate normal random vector, $X$, is
normally distributed with mean vector, $A E[X] + a$, and covariance matrix,
$A\, \mathrm{Cov}(X)\, A^\top$.
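The conditioning formulas are easy to evaluate directly with numpy; below is a small sketch for a bivariate case (the particular $\mu$, $\Sigma$ and observed $x_1$ are made up for illustration):

```python
import numpy as np

mu = np.array([1.0, 2.0])                       # (mu_1, mu_2)
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.5]])                  # partitioned into 1x1 blocks here
x1 = np.array([0.5])                            # observed value of X_1

S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

mu_cond = mu[1:] + S21 @ np.linalg.inv(S11) @ (x1 - mu[:1])   # mu_{2.1}
Sigma_cond = S22 - S21 @ np.linalg.inv(S11) @ S12             # Sigma_{2.1}
print(mu_cond, Sigma_cond)
```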

Martingales
Definition. A random process, $\{X_n : 0 \leq n \leq \infty\}$, is a martingale with respect
to the information filtration, $\mathcal{F}_n$, and probability distribution, $P$, if
1. $E^P[|X_n|] < \infty$ for all $n \geq 0$
2. $E^P[X_{n+m} \mid \mathcal{F}_n] = X_n$ for all $n, m \geq 0$.
Martingales are used to model fair games and have a rich history in the modeling
of gambling problems.
We define a submartingale by replacing condition #2 with
$$E^P[X_{n+m} \mid \mathcal{F}_n] \geq X_n \quad \text{for all } n, m \geq 0.$$
And we define a supermartingale by replacing condition #2 with
$$E^P[X_{n+m} \mid \mathcal{F}_n] \leq X_n \quad \text{for all } n, m \geq 0.$$
A martingale is both a submartingale and a supermartingale.

Constructing a Martingale from a Random Walk
Let $S_n := \sum_{i=1}^{n} X_i$ be a random walk where the $X_i$'s are IID with mean $\mu$.
Let $M_n := S_n - n\mu$. Then $M_n$ is a martingale because:
$$E_n[M_{n+m}] = E_n\Big[\sum_{i=1}^{n+m} X_i - (n+m)\mu\Big]
= E_n\Big[\sum_{i=1}^{n+m} X_i\Big] - (n+m)\mu$$
$$= \sum_{i=1}^{n} X_i + E_n\Big[\sum_{i=n+1}^{n+m} X_i\Big] - (n+m)\mu
= \sum_{i=1}^{n} X_i + m\mu - (n+m)\mu = M_n.$$

A Martingale Betting Strategy
Let $X_1, X_2, \ldots$ be IID random variables with
$$P(X_i = 1) = P(X_i = -1) = \frac{1}{2}.$$
Can imagine $X_i$ representing the result of a coin-flipping game:
Win $1 if the coin comes up heads
Lose $1 if the coin comes up tails
Consider now a doubling strategy where we keep doubling the bet until we
eventually win. Once we win, we stop and our initial bet is $1.
First note that the size of the bet on the $n$-th play is $2^{n-1}$,
assuming we're still playing at time $n$.
Let $W_n$ denote the total winnings after $n$ coin tosses, assuming $W_0 = 0$.
Then $W_n$ is a martingale!

A Martingale Betting Strategy
To see this, first note that $W_n \in \{1, -2^n + 1\}$ for all $n$. Why?
1. Suppose we win for the first time on the $n$-th bet. Then
$$W_n = -\left(1 + 2 + \cdots + 2^{n-2}\right) + 2^{n-1} = -\left(2^{n-1} - 1\right) + 2^{n-1} = 1.$$
2. If we have not yet won after $n$ bets then
$$W_n = -\left(1 + 2 + \cdots + 2^{n-1}\right) = -2^n + 1.$$
To show $W_n$ is a martingale only need to show $E[W_{n+1} \mid W_n] = W_n$
- it then follows by iterated expectations that $E[W_{n+m} \mid W_n] = W_n$.

A Martingale Betting Strategy
There are two cases to consider:
1: $W_n = 1$: then $P(W_{n+1} = 1 \mid W_n = 1) = 1$ so
$$E[W_{n+1} \mid W_n = 1] = 1 = W_n \qquad (6)$$
2: $W_n = -2^n + 1$: bet $2^n$ on the $(n+1)$-th toss so $W_{n+1} \in \{1, -2^{n+1} + 1\}$.
Clear that
$$P(W_{n+1} = 1 \mid W_n = -2^n + 1) = 1/2$$
$$P(W_{n+1} = -2^{n+1} + 1 \mid W_n = -2^n + 1) = 1/2$$
so that
$$E[W_{n+1} \mid W_n = -2^n + 1] = (1/2) \cdot 1 + (1/2)\left(-2^{n+1} + 1\right) = -2^n + 1 = W_n. \qquad (7)$$
From (6) and (7) we see that $E[W_{n+1} \mid W_n] = W_n$.
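A small simulation of the doubling strategy, confirming that the average of $W_n$ stays at 0 even though every individual path eventually ends at +1 (the horizon and trial counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n_steps, trials = 10, 200_000

wins = np.empty(trials)
for t in range(trials):
    w, bet, playing = 0, 1, True
    for _ in range(n_steps):
        if playing:
            if rng.random() < 0.5:   # heads: win the current bet and stop
                w += bet
                playing = False
            else:                    # tails: lose the bet and double it
                w -= bet
                bet *= 2
    wins[t] = w

print(wins.mean())   # close to 0: W_n is a martingale with W_0 = 0
```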

Polya's Urn
Consider an urn which contains red balls and green balls.
Initially there is just one green ball and one red ball in the urn.
At each time step a ball is chosen randomly from the urn:
1. If the ball is red, then it's returned to the urn with an additional red ball.
2. If the ball is green, then it's returned to the urn with an additional green ball.
Let $X_n$ denote the number of red balls in the urn after $n$ draws. Then
$$P(X_{n+1} = k + 1 \mid X_n = k) = \frac{k}{n + 2}$$
$$P(X_{n+1} = k \mid X_n = k) = \frac{n + 2 - k}{n + 2}.$$
Show that $M_n := X_n / (n + 2)$ is a martingale.
(These martingale examples taken from Introduction to Stochastic Processes
(Chapman & Hall) by Gregory F. Lawler.)
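A quick simulation sketch for the Polya urn exercise, checking that the average of $M_n = X_n/(n+2)$ stays at its initial value $1/2$ (the horizon is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)
n_draws, trials = 50, 100_000

m_final = np.empty(trials)
for t in range(trials):
    red = 1                                   # X_0 = 1 red (and 1 green) ball
    for n in range(n_draws):
        if rng.random() < red / (n + 2):      # draw a red ball with prob k/(n+2)
            red += 1
    m_final[t] = red / (n_draws + 2)          # M_n = X_n / (n + 2)

print(m_final.mean())   # approx 0.5 = M_0, consistent with the martingale property
```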

Brownian Motion
Definition. We say that a random process, $\{X_t : t \geq 0\}$, is a Brownian motion
with parameters $(\mu, \sigma)$ if
1. For $0 < t_1 < t_2 < \ldots < t_{n-1} < t_n$
$$(X_{t_2} - X_{t_1}),\ (X_{t_3} - X_{t_2}),\ \ldots,\ (X_{t_n} - X_{t_{n-1}})$$
are mutually independent.
2. For $s > 0$, $X_{t+s} - X_t \sim N(\mu s, \sigma^2 s)$ and
3. $X_t$ is a continuous function of $t$.
We say that $X_t$ is a $B(\mu, \sigma)$ Brownian motion with drift $\mu$ and volatility $\sigma$.
Property #1 is often called the independent increments property.
Remark. Bachelier (1900) and Einstein (1905) were the first to explore Brownian
motion from a mathematical viewpoint whereas Wiener (1920s) was the first to
show that it actually exists as a well-defined mathematical entity.

Standard Brownian Motion
When $\mu = 0$ and $\sigma = 1$ we have a standard Brownian motion (SBM).
We will use $W_t$ to denote an SBM and we always assume that $W_0 = 0$.
Note that if $X_t \sim B(\mu, \sigma)$ and $X_0 = x$ then we can write
$$X_t = x + \mu t + \sigma W_t \qquad (8)$$
where $W_t$ is an SBM. Therefore see that $X_t \sim N(x + \mu t, \sigma^2 t)$.

Sample Paths of Brownian Motion

Information Filtrations
For any random process we will use $\mathcal{F}_t$ to denote the information available
at time $t$
- the set $\{\mathcal{F}_t\}_{t \geq 0}$ is then the information filtration
- so $E[\,\cdot \mid \mathcal{F}_t]$ denotes an expectation conditional on the time $t$ information available.
Will usually write $E[\,\cdot \mid \mathcal{F}_t]$ as $E_t[\,\cdot\,]$.
Important Fact: The independent increments property of Brownian motion
implies that any function of $W_{t+s} - W_t$ is independent of $\mathcal{F}_t$ and that
$$(W_{t+s} - W_t) \sim N(0, s).$$

A Brownian Motion Calculation
Question: What is $E_0[W_{t+s} W_s]$?
Answer: We can use a version of the conditional expectation identity to obtain
$$E_0[W_{t+s} W_s] = E_0[(W_{t+s} - W_s + W_s) W_s]
= E_0[(W_{t+s} - W_s) W_s] + E_0\left[W_s^2\right]. \qquad (9)$$
Now we know (why?) that $E_0[W_s^2] = s$.
To calculate the first term on the r.h.s. of (9), a version of the conditional expectation
identity implies
$$E_0[(W_{t+s} - W_s) W_s] = E_0[E_s[(W_{t+s} - W_s) W_s]]
= E_0[W_s\, E_s[(W_{t+s} - W_s)]]
= E_0[W_s \cdot 0]
= 0.$$
Therefore obtain $E_0[W_{t+s} W_s] = s$.
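A Monte Carlo sanity check of $E_0[W_{t+s} W_s] = s$, simulating the two increments directly (the values of $t$ and $s$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
t, s, trials = 2.0, 1.5, 1_000_000

w_s = rng.normal(0.0, np.sqrt(s), trials)             # W_s ~ N(0, s)
w_ts = w_s + rng.normal(0.0, np.sqrt(t), trials)      # W_{t+s} = W_s + independent N(0, t) increment

print(np.mean(w_ts * w_s), s)                         # sample average vs the exact value s
```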

Geometric Brownian Motion
Definition. We say that a random process, $X_t$, is a geometric Brownian motion
(GBM) if for all $t \geq 0$
$$X_t = X_0\, e^{\left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W_t}$$
where $W_t$ is a standard Brownian motion.
We call $\mu$ the drift, $\sigma$ the volatility and write $X_t \sim \mathrm{GBM}(\mu, \sigma)$.
Note that
$$X_{t+s} = X_0\, e^{\left(\mu - \frac{\sigma^2}{2}\right)(t+s) + \sigma W_{t+s}}
= X_0\, e^{\left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W_t}\, e^{\left(\mu - \frac{\sigma^2}{2}\right) s + \sigma (W_{t+s} - W_t)}
= X_t\, e^{\left(\mu - \frac{\sigma^2}{2}\right) s + \sigma (W_{t+s} - W_t)} \qquad (10)$$
- a representation that is very useful for simulating security prices.
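A minimal sketch of how (10) is used to simulate GBM price paths on an equally spaced time grid (all parameter values are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, x0 = 0.08, 0.2, 100.0     # drift, volatility, initial price (illustrative)
T, n_steps, n_paths = 1.0, 252, 5
dt = T / n_steps

# Apply (10) step by step: X_{t+dt} = X_t * exp((mu - sigma^2/2) dt + sigma * sqrt(dt) * Z)
z = rng.standard_normal((n_paths, n_steps))
log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
paths = x0 * np.exp(np.cumsum(log_increments, axis=1))

print(paths[:, -1])                  # terminal prices of the simulated paths
```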

Geometric Brownian Motion
Question: Suppose $X_t \sim \mathrm{GBM}(\mu, \sigma)$. What is $E_t[X_{t+s}]$?
Answer: From (10) we have
$$E_t[X_{t+s}] = E_t\left[X_t\, e^{\left(\mu - \frac{\sigma^2}{2}\right) s + \sigma (W_{t+s} - W_t)}\right]
= X_t\, e^{\left(\mu - \frac{\sigma^2}{2}\right) s}\, E_t\left[e^{\sigma (W_{t+s} - W_t)}\right]
= X_t\, e^{\left(\mu - \frac{\sigma^2}{2}\right) s}\, e^{\frac{\sigma^2}{2} s}
= e^{\mu s} X_t$$
so the expected growth rate of $X_t$ is $\mu$.

Sample Paths of Geometric Brownian Motion

Geometric Brownian Motion
The following properties of GBM follow immediately from the definition of BM:
1. Fix $t_1, t_2, \ldots, t_n$. Then
$$\frac{X_{t_2}}{X_{t_1}},\ \frac{X_{t_3}}{X_{t_2}},\ \ldots,\ \frac{X_{t_n}}{X_{t_{n-1}}}$$
are mutually independent.
2. Paths of $X_t$ are continuous as a function of $t$, i.e., they do not jump.
3. For $s > 0$,
$$\log\left(\frac{X_{t+s}}{X_t}\right) \sim N\left(\left(\mu - \frac{\sigma^2}{2}\right) s,\ \sigma^2 s\right).$$

Modeling Stock Prices as GBM
Suppose $X_t \sim \mathrm{GBM}(\mu, \sigma)$. Then clear that:
1. If $X_t > 0$, then $X_{t+s}$ is always positive for any $s > 0$
- so limited liability of the stock price is not violated.
2. The distribution of $X_{t+s}/X_t$ only depends on $s$ and not on $X_t$.
These properties suggest that GBM might be a reasonable model for stock prices.
Indeed it is the underlying model for the famous Black-Scholes option formula.

Real numbers and vectors
We will denote the set of real numbers by $\mathbb{R}$
Vectors are finite collections of real numbers
Vectors come in two varieties
Row vectors: $v = \begin{bmatrix} v_1 & v_2 & \ldots & v_n \end{bmatrix}$
Column vectors: $w = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}$
By default, vectors are column vectors
The set of all vectors with $n$ components is denoted by $\mathbb{R}^n$

Linear independence
A vector $w$ is linearly dependent on $v_1, v_2$ if
$$w = \lambda_1 v_1 + \lambda_2 v_2 \quad \text{for some } \lambda_1, \lambda_2 \in \mathbb{R}$$
Example:
$$\begin{bmatrix} 2 \\ 6 \\ 4 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + 4 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
Other names: linear combination, linear span
A set $V = \{v_1, \ldots, v_m\}$ is linearly independent if no $v_i$ is linearly
dependent on the others, $\{v_j : j \neq i\}$

Basis
Every $w \in \mathbb{R}^n$ is a linear combination of the linearly independent set
$$B = \left\{ \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\ \ldots,\ \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right\}$$
$$w = w_1 \underbrace{\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{e_1} + w_2 \underbrace{\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}}_{e_2} + \ldots + w_n \underbrace{\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}}_{e_n}$$
Basis: any linearly independent set that spans the entire space
Any basis for $\mathbb{R}^n$ has exactly $n$ elements

Norms
A function $\rho(v)$ of a vector $v$ is called a norm if
$\rho(v) \geq 0$ and $\rho(v) = 0$ implies $v = 0$
$\rho(\lambda v) = |\lambda|\, \rho(v)$ for all $\lambda \in \mathbb{R}$
$\rho(v_1 + v_2) \leq \rho(v_1) + \rho(v_2)$ (triangle inequality)
- generalizes the notion of length
Examples:
$\ell_2$ norm: $\|x\|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2}$ ... usual length
$\ell_1$ norm: $\|x\|_1 = \sum_{i=1}^{n} |x_i|$
$\ell_\infty$ norm: $\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$
$\ell_p$ norm, $1 \leq p < \infty$: $\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$
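These norms map directly onto numpy.linalg.norm; a small illustration (the vector is arbitrary):

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])

print(np.linalg.norm(x, 2))        # l2 norm: sqrt(9 + 16 + 144) = 13
print(np.linalg.norm(x, 1))        # l1 norm: 3 + 4 + 12 = 19
print(np.linalg.norm(x, np.inf))   # l_infinity norm: max |x_i| = 12
print(np.linalg.norm(x, 3))        # general l_p norm with p = 3
```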

Inner product
The inner product or dot product of two vectors $v, w \in \mathbb{R}^n$ is defined as
$$v \cdot w = \sum_{i=1}^{n} v_i w_i$$
The $\ell_2$ norm: $\|v\|_2 = \sqrt{v \cdot v}$
The angle $\theta$ between two vectors $v$ and $w$ is given by
$$\cos(\theta) = \frac{v \cdot w}{\|v\|_2\, \|w\|_2}$$
Will show later: $v \cdot w = v^\top w$ = product of $v$ transpose and $w$



Matrices
Matrices are rectangular arrays of real numbers
Examples:
$$A = \begin{bmatrix} 2 & 3 & 7 \\ 1 & 6 & 5 \end{bmatrix} : \ 2 \times 3 \text{ matrix}$$
$$B = \begin{bmatrix} 2 & 3 & 7 \end{bmatrix} : \ 1 \times 3 \text{ matrix ... row vector}$$
$$A = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{bmatrix} : \ m \times n \text{ matrix ... } \mathbb{R}^{m \times n}$$
$$I = \begin{bmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{bmatrix} \ \ldots\ n \times n \text{ identity matrix}$$
Vectors are clearly also matrices

Matrix Operations: Transpose
Transpose: $A \in \mathbb{R}^{m \times d}$
$$A = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1d} \\ a_{21} & a_{22} & \ldots & a_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{md} \end{bmatrix}
\quad \Rightarrow \quad
A^\top = \begin{bmatrix} a_{11} & a_{21} & \ldots & a_{m1} \\ a_{12} & a_{22} & \ldots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1d} & a_{2d} & \ldots & a_{md} \end{bmatrix} \in \mathbb{R}^{d \times m}$$
Transpose of a row vector is a column vector
Example:
$$A = \begin{bmatrix} 2 & 3 & 7 \\ 1 & 6 & 5 \end{bmatrix} : \ 2 \times 3 \text{ matrix ... } A^\top = \begin{bmatrix} 2 & 1 \\ 3 & 6 \\ 7 & 5 \end{bmatrix} : \ 3 \times 2 \text{ matrix}$$
$$v = \begin{bmatrix} 2 \\ 6 \\ 4 \end{bmatrix} : \text{ column vector ... } v^\top = \begin{bmatrix} 2 & 6 & 4 \end{bmatrix} : \text{ row vector}$$

Matrix Operations: Multiplication
Multiplication: $A \in \mathbb{R}^{m \times d}$, $B \in \mathbb{R}^{d \times p}$, then $C = AB \in \mathbb{R}^{m \times p}$ with
$$c_{ij} = \begin{bmatrix} a_{i1} & a_{i2} & \ldots & a_{id} \end{bmatrix} \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{dj} \end{bmatrix}$$
A row vector $v \in \mathbb{R}^{1 \times d}$ times a column vector $w \in \mathbb{R}^{d \times 1}$ is a scalar.
Identity times any matrix: $A I_d = I_m A = A$
Examples:
$$\begin{bmatrix} 2 & 3 & 7 \\ 1 & 6 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ 6 \\ 4 \end{bmatrix} = \begin{bmatrix} 2(2) + 3(6) + 7(4) \\ 1(2) + 6(6) + 5(4) \end{bmatrix} = \begin{bmatrix} 50 \\ 58 \end{bmatrix}$$
$\ell_2$ norm:
$$\left\| \begin{bmatrix} 1 \\ -2 \end{bmatrix} \right\|_2 = \sqrt{1^2 + (-2)^2} = \sqrt{\begin{bmatrix} 1 & -2 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix}} = \sqrt{\begin{bmatrix} 1 \\ -2 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ -2 \end{bmatrix}}$$
inner product: $v \cdot w = v^\top w$

Linear functions
A function $f : \mathbb{R}^d \to \mathbb{R}^m$ is linear if
$$f(\alpha x + \beta y) = \alpha f(x) + \beta f(y), \quad \forall\, \alpha, \beta \in \mathbb{R},\ x, y \in \mathbb{R}^d$$
A function $f$ is linear if and only if $f(x) = Ax$ for some matrix $A \in \mathbb{R}^{m \times d}$
Examples
$f(x) : \mathbb{R}^3 \to \mathbb{R}$:
$$f(x) = \begin{bmatrix} 2 & 3 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = 2x_1 + 3x_2 + 4x_3$$
$f(x) : \mathbb{R}^3 \to \mathbb{R}^2$:
$$f(x) = \begin{bmatrix} 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2x_1 + 3x_2 + 4x_3 \\ x_1 + 2x_3 \end{bmatrix}$$
Linear constraints define sets of vectors that satisfy linear relationships
Linear equality: $\{x : Ax = b\}$ ... line, plane, etc.
Linear inequality: $\{x : Ax \leq b\}$ ... half-space

Rank of a matrix
column rank of $A \in \mathbb{R}^{m \times d}$ = number of linearly independent columns
$\mathrm{range}(A) = \{y : y = Ax \text{ for some } x\}$
column rank of $A$ = size of basis for $\mathrm{range}(A)$
column rank of $A$ = $m$ $\iff$ $\mathrm{range}(A) = \mathbb{R}^m$
row rank of $A$ = number of linearly independent rows
Fact: row rank = column rank $\leq \min\{m, d\}$
Example:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}, \quad \mathrm{rank} = 1, \quad \mathrm{range}(A) = \left\{ \lambda \begin{bmatrix} 1 \\ 2 \end{bmatrix} : \lambda \in \mathbb{R} \right\}$$
$A \in \mathbb{R}^{n \times n}$ and $\mathrm{rank}(A) = n$ $\iff$ $A$ invertible, i.e. $\exists\, A^{-1} \in \mathbb{R}^{n \times n}$ with
$$A^{-1} A = A A^{-1} = I$$

Hedging problem
$d$ assets
Prices at time $t = 0$: $p \in \mathbb{R}^d$
Market in $m$ possible states at time $t = 1$
Price of asset $j$ in state $i$ = $S_{ij}$
$$S_j = \begin{bmatrix} S_{1j} \\ S_{2j} \\ \vdots \\ S_{mj} \end{bmatrix}, \qquad
S = \begin{bmatrix} S_1 & S_2 & \ldots & S_d \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} & \ldots & S_{1d} \\ S_{21} & S_{22} & \ldots & S_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ S_{m1} & S_{m2} & \ldots & S_{md} \end{bmatrix} \in \mathbb{R}^{m \times d}$$
Hedge an obligation $X \in \mathbb{R}^m$
Have to pay $X_i$ if state $i$ occurs
Buy/short sell $\theta = (\theta_1, \ldots, \theta_d)^\top$ shares to cover the obligation



Hedging problem (contd)
Position $\theta \in \mathbb{R}^d$ purchased at time $t = 0$
$\theta_j$ = number of shares of asset $j$ purchased, $j = 1, \ldots, d$
Cost of the position = $\sum_{j=1}^{d} p_j \theta_j = p^\top \theta$
Payoff from liquidating the position at time $t = 1$
payoff $y_i$ in state $i$: $y_i = \sum_{j=1}^{d} S_{ij} \theta_j$
Stacking payoffs for all states: $y = S\theta$
Viewing the payoff vector $y$: $y \in \mathrm{range}(S)$
$$y = \begin{bmatrix} S_1 & S_2 & \ldots & S_d \end{bmatrix} \begin{bmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_d \end{bmatrix} = \sum_{j=1}^{d} \theta_j S_j$$
Payoff $y$ hedges $X$ if $y \geq X$.

Hedging problem (contd)
Optimization problem:
$$\min_\theta\ \sum_{j=1}^{d} p_j \theta_j \quad (\equiv p^\top \theta)$$
$$\text{subject to}\ \sum_{j=1}^{d} S_{ij} \theta_j \geq X_i, \quad i = 1, \ldots, m \quad (\equiv S\theta \geq X)$$
Features of this optimization problem
Linear objective function: $p^\top \theta$
Linear inequality constraints: $S\theta \geq X$
Example of a linear program
Linear objective function: either a min/max
Linear inequality and equality constraints
$$\max/\min_x\ c^\top x \quad \text{subject to}\ A_{eq} x = b_{eq},\quad A_{in} x \leq b_{in}$$
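A minimal sketch of solving this hedging LP with scipy.optimize.linprog; the prices, payoff matrix and obligation below are invented toy data, and the constraint $S\theta \geq X$ is passed as $-S\theta \leq -X$ because linprog expects upper-bound inequalities:

```python
import numpy as np
from scipy.optimize import linprog

p = np.array([1.0, 0.9])                  # asset prices at t = 0 (toy data)
S = np.array([[1.0, 1.2],                 # payoff of each asset in each state
              [1.0, 0.8],
              [1.0, 1.0]])
X = np.array([2.0, 1.5, 1.8])             # obligation to cover in each state

# min p' theta  subject to  S theta >= X, theta unrestricted in sign
res = linprog(c=p, A_ub=-S, b_ub=-X, bounds=[(None, None)] * len(p))
print(res.x, res.fun)                     # hedging position and its cost
```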

Linear programming duality
Linear program
$$P = \min_x\ c^\top x \quad \text{subject to}\ Ax \geq b$$
Dual linear program
$$D = \max_u\ b^\top u \quad \text{subject to}\ A^\top u = c,\ u \geq 0$$
Theorem.
Weak Duality: $P \geq D$
Bound: $x$ feasible for $P$, $u$ feasible for $D$ $\Rightarrow$ $c^\top x \geq P \geq D \geq b^\top u$
Strong Duality: Suppose $P$ or $D$ is finite. Then $P = D$.
Dual of the dual is the primal (original) problem

More duality results
Here is another primal-dual pair
$$\min_x\ c^\top x\ \ \text{subject to}\ Ax = b \qquad = \qquad \max_u\ b^\top u\ \ \text{subject to}\ A^\top u = c$$
General idea for constructing duals
$$P = \min\{c^\top x : Ax \geq b\}$$
$$\geq \min\{c^\top x - u^\top (Ax - b) : Ax \geq b\} \quad \text{for all } u \geq 0$$
$$\geq b^\top u + \min\{(c - A^\top u)^\top x : x \in \mathbb{R}^n\}$$
$$= \begin{cases} b^\top u & \text{if } A^\top u = c \\ -\infty & \text{otherwise} \end{cases}$$
$$\Rightarrow \quad P \geq \max\{b^\top u : A^\top u = c,\ u \geq 0\}$$
Lagrangian relaxation: dualize constraints and relax them!

Unconstrained nonlinear optimization
Optimization problem
$$\min_{x \in \mathbb{R}^n} f(x)$$
Categorization of minimum points
$x^*$ global minimum if $f(y) \geq f(x^*)$ for all $y$
$x^*_{loc}$ local minimum if $f(y) \geq f(x^*_{loc})$ for all $y$ such that $\|y - x^*_{loc}\| \leq r$
Sufficient condition for local min
gradient $\nabla f(x) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)^\top = 0$: local stationarity
Hessian
$$\nabla^2 f(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \ldots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \ldots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \ \text{positive semidefinite}$$
Gradient condition is sufficient if the function $f(x)$ is convex.

Unconstrained nonlinear optimization
Optimization problem
$$\min_{x \in \mathbb{R}^2}\ x_1^2 + 3x_1 x_2 + x_2^3$$
Gradient
$$\nabla f(x) = \begin{bmatrix} 2x_1 + 3x_2 \\ 3x_1 + 3x_2^2 \end{bmatrix} = 0 \ \Rightarrow\ x = 0,\ \begin{bmatrix} -\tfrac{9}{4} \\ \tfrac{3}{2} \end{bmatrix}$$
Hessian at $x$: $H = \begin{bmatrix} 2 & 3 \\ 3 & 6x_2 \end{bmatrix}$
$x = 0$: $H = \begin{bmatrix} 2 & 3 \\ 3 & 0 \end{bmatrix}$. Not positive definite. Not a local minimum.
$x = \begin{bmatrix} -\tfrac{9}{4} \\ \tfrac{3}{2} \end{bmatrix}$: $H = \begin{bmatrix} 2 & 3 \\ 3 & 9 \end{bmatrix}$. Positive semidefinite. Local minimum.
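A quick numerical check of the two stationary points using numpy (gradient values and Hessian eigenvalues):

```python
import numpy as np

def grad(x1, x2):
    return np.array([2*x1 + 3*x2, 3*x1 + 3*x2**2])

def hessian(x1, x2):
    return np.array([[2.0, 3.0], [3.0, 6.0*x2]])

for x1, x2 in [(0.0, 0.0), (-9/4, 3/2)]:
    print(grad(x1, x2))                         # both gradients are (0, 0)
    print(np.linalg.eigvalsh(hessian(x1, x2)))  # mixed signs at 0; all positive at (-9/4, 3/2)
```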

Lagrangian method
Constrained optimization problem
$$\max_{x \in \mathbb{R}^2}\ 2 \ln(1 + x_1) + 4 \ln(1 + x_2), \quad \text{s.t. } x_1 + x_2 = 12$$
Convex problem. But constraints make the problem hard to solve.
Form a Lagrangian function
$$L(x, v) = 2 \ln(1 + x_1) + 4 \ln(1 + x_2) - v(x_1 + x_2 - 12)$$
Compute the stationary points of the Lagrangian as a function of $v$
$$\nabla_x L(x, v) = \begin{bmatrix} \frac{2}{1 + x_1} - v \\ \frac{4}{1 + x_2} - v \end{bmatrix} = 0 \ \Rightarrow\ x_1 = \frac{2}{v} - 1, \quad x_2 = \frac{4}{v} - 1$$
Substituting in the constraint $x_1 + x_2 = 12$, we get
$$\frac{6}{v} = 14 \ \Rightarrow\ v = \frac{3}{7} \ \Rightarrow\ x = \frac{1}{3} \begin{bmatrix} 11 \\ 25 \end{bmatrix}$$
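The closed-form answer is easy to cross-check with a generic constrained solver; a sketch using scipy.optimize.minimize (minimizing the negated objective):

```python
import numpy as np
from scipy.optimize import minimize

objective = lambda x: -(2*np.log(1 + x[0]) + 4*np.log(1 + x[1]))   # negate to maximize
constraint = {"type": "eq", "fun": lambda x: x[0] + x[1] - 12}

res = minimize(objective, x0=[6.0, 6.0], constraints=[constraint])
print(res.x)                 # approx [3.667, 8.333] = [11/3, 25/3]
```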

Portfolio Selection
Optimization problem
$$\max_x\ \mu^\top x - x^\top V x \quad \text{s.t. } \mathbf{1}^\top x = 1$$
Constraints make the problem hard!
Lagrangian function
$$L(x, v) = \mu^\top x - x^\top V x - v(\mathbf{1}^\top x - 1)$$
Solve for the maximum value with no constraints
$$\nabla_x L(x, v) = \mu - 2Vx - v\mathbf{1} = 0 \ \Rightarrow\ x = \frac{1}{2} V^{-1} (\mu - v\mathbf{1})$$
Solve for $v$ from the constraint
$$\mathbf{1}^\top x = 1 \ \Rightarrow\ \mathbf{1}^\top V^{-1} (\mu - v\mathbf{1}) = 2 \ \Rightarrow\ v = \frac{\mathbf{1}^\top V^{-1} \mu - 2}{\mathbf{1}^\top V^{-1} \mathbf{1}}$$
Substitute back in the expression for $x$
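Carrying out that substitution numerically for a small made-up example (the $\mu$ and $V$ below are illustrative, not from the slides):

```python
import numpy as np

mu = np.array([0.10, 0.07, 0.05])             # expected returns (illustrative)
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.03, 0.01],
              [0.00, 0.01, 0.02]])            # covariance matrix (illustrative)
ones = np.ones(len(mu))

Vinv = np.linalg.inv(V)
v = (ones @ Vinv @ mu - 2) / (ones @ Vinv @ ones)   # multiplier from the constraint
x = 0.5 * Vinv @ (mu - v * ones)                    # optimal portfolio weights

print(x, x.sum())   # weights sum to 1, as required by 1'x = 1
```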
