Topic One - Review of Math and Stat - Nov2nd

Topic One: Review of Math and Statistics
Learning goal:
Understand the basic math and statistics that are useful for asset pricing theory.
Math:
 Functions
 Systems of equations and matrices
 Differential calculus
 Extreme values
 Constrained optimization
Statistics:
 Probability distribution
 Descriptive statistics
I. Functions
o A mapping from each $x$ in $X$ (domain) to some $y$ in $Y$ (range), $f: X \to Y$
o $y = f(x)$
o i.e. each $x$ maps into exactly one $y$
 Is there any mapping that is not a function?
 Any examples of functions?
 What can $X$ be?
 Let $y = \prod_{i=1}^{n} x_i^{a_i}$, where the $a_i$ are constants. Is it a function?
o Properties of functions
 Monotonicity (strict, inverse)
 concavity
 extreme values (local, global)
 slope
 continuity
 Examples of common functions
 $f(x) = a x^b$
 $f(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_n x^n$
 $f(x) = a b^x$
Quadratic Equations: let $f(x) = a x^2 + b x + c$
Roots: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
Extreme value: $f(x) = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}$
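The root formula and the completed-square form can be checked with a short Python sketch. The function names and the sample coefficients are illustrative assumptions, not part of the slides; only real roots are returned.

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 as a sorted tuple (empty if there are none)."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = b * b - 4 * a * c          # discriminant b^2 - 4ac
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return tuple(sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)}))

def quadratic_vertex(a, b, c):
    """(x*, f(x*)) read off the completed-square form: x* = -b/(2a), f(x*) = c - b^2/(4a)."""
    return (-b / (2 * a), c - b * b / (4 * a))

print(quadratic_roots(1, -3, 2))    # roots of x^2 - 3x + 2
print(quadratic_vertex(1, -3, 2))   # a minimum, because a > 0
```

The extreme value is a minimum when $a > 0$ and a maximum when $a < 0$.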
II. System of equations and matrix
 Example: two equations with two unknowns
$2x + 3y = 10$
$5x + 8y = 22$
 solve for $x$ and $y$
 what if there are 5 equations with 5 unknowns?
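One way to solve the two-equation example is Cramer's rule, which the summary of this topic lists. The sketch below is a minimal illustration for the 2-by-2 case only; the function name `solve_2x2` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a21 * a12
    if det == 0:
        raise ValueError("no unique solution: the determinant is zero")
    x = Fraction(b1 * a22 - a12 * b2, det)   # replace column 1 by (b1, b2)
    y = Fraction(a11 * b2 - a21 * b1, det)   # replace column 2 by (b1, b2)
    return x, y

x, y = solve_2x2(2, 3, 5, 8, 10, 22)
print(x, y)  # 14 -6
```

For larger systems, such as 5 equations with 5 unknowns, the same idea is usually expressed in matrix form, which motivates the notation that follows.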

Matrices and Vectors

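$A_{(n \times m)} = \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix}$
(when $n = m$, the matrix is called a square matrix)

$\mathbf{x}_{(n \times 1)} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$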
Transpose of a Matrix, $A'$ (or $A^T$)

Example: $A = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} \Rightarrow A' = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$
Basic Matrix Operations
1. Addition and Subtraction (element-by-element)

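$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} + \begin{pmatrix} 5 & -6 \\ -6 & 8 \end{pmatrix} = \begin{pmatrix} 6 & -4 \\ -3 & 12 \end{pmatrix}$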
2. Matrix Multiplication (not element-by-element)

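$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$, $B = \begin{pmatrix} 5 & 8 \\ 6 & 9 \\ 7 & 10 \end{pmatrix}$
$AB = \begin{pmatrix} 1 \cdot 5 + 2 \cdot 6 + 3 \cdot 7 & 1 \cdot 8 + 2 \cdot 9 + 3 \cdot 10 \\ 4 \cdot 5 + 5 \cdot 6 + 6 \cdot 7 & 4 \cdot 8 + 5 \cdot 9 + 6 \cdot 10 \end{pmatrix}$;
$BA = ?$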
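The row-by-column rule can be sketched in plain Python, with matrices stored as lists of rows. The helper name `matmul` is an assumption made for illustration; it is not a library function.

```python
def matmul(A, B):
    """Multiply A (n x k) by B (k x m); entry (i, j) is row i of A times column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must agree")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]
B = [[5, 8], [6, 9], [7, 10]]
AB = matmul(A, B)   # (2 x 3)(3 x 2) gives a 2 x 2 matrix
BA = matmul(B, A)   # (3 x 2)(2 x 3) gives a 3 x 3 matrix
print(AB)
print(BA)
```

The two products have different shapes here, which shows that matrix multiplication is not commutative in general.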
Commonly Used Matrices
1. Identity Matrix
$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
2. Inverse Matrix
Given a square matrix $A$, its inverse $A^{-1}$ (if it exists) satisfies
$A^{-1} A = I$
$A A^{-1} = I$
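1. Determinant
Given $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, $\det(A) = |A| = a_{11} a_{22} - a_{21} a_{12}$
Given $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$,
$|A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$
2. Linear independence
Given non-zero vectors $v$ and $w$, the two vectors are linearly independent
iff $a v + b w \neq 0$, $\forall a, b \in \mathbb{R} \setminus \{0\}$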
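The determinant formulas, and their link to linear independence, can be illustrated with a small sketch. The helper names are hypothetical, and the independence check covers only two vectors of length two: they are linearly independent exactly when the matrix with columns $v$ and $w$ has a non-zero determinant.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[1][0] * M[0][1]

def det3(M):
    """Determinant of a 3 x 3 matrix, expanded along the first column."""
    def minor(r):
        # Delete row r and the first column.
        return [row[1:] for i, row in enumerate(M) if i != r]
    return (M[0][0] * det2(minor(0))
            - M[1][0] * det2(minor(1))
            + M[2][0] * det2(minor(2)))

def independent(v, w):
    """True if the 2-vectors v and w are linearly independent."""
    return det2([[v[0], w[0]], [v[1], w[1]]]) != 0

print(det2([[2, 3], [5, 8]]))
print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
print(independent((1, 2), (2, 4)), independent((1, 0), (0, 1)))
```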
III. Differential calculus
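Let $y = f(x)$
The difference quotient of $y$ at $x = x_0$: $\left.\frac{\Delta y}{\Delta x}\right|_{x = x_0} \equiv \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}$
 The derivative is the limit of the difference quotient as $\Delta x \to 0$
$\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = f'(x)$
 The derivative is the slope of the tangent line
 example: tax rate and tax revenue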

Let $f(x)$ and $g(x)$ be differentiable functions and $k$ a constant.
Rules of differentiation
1. Sum rule: if $y = f(x) + g(x)$, then $\frac{dy}{dx} = f'(x) + g'(x)$
2. Scale rule: if $y = k f(x)$, then $\frac{dy}{dx} = k f'(x)$
3. Product rule: if $y = f(x) g(x)$, then $\frac{dy}{dx} = f'(x) g(x) + f(x) g'(x)$
4. Power rule: if $y = x^k$, then $\frac{dy}{dx} = k x^{k-1}$
5. Exponential rule: if $y = e^{kx}$, then $\frac{dy}{dx} = k e^{kx}$
6. Logarithmic rule: if $y = \ln x$, then $\frac{dy}{dx} = \frac{1}{x}$
7. Chain rule: if $y = g(f(x))$, then $\frac{dy}{dx} = g'(f(x)) f'(x)$
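The rules above can be sanity-checked numerically by comparing each analytic derivative with a central-difference approximation. This is only a sketch: the evaluation point `x0`, the constant `k`, the step size `h`, and the sample functions are arbitrary choices made for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0, k = 2.0, 3.0
checks = {
    # rule name: (numeric derivative, analytic derivative from the rule)
    "power":       (derivative(lambda x: x ** k, x0), k * x0 ** (k - 1)),
    "exponential": (derivative(lambda x: math.exp(k * x), x0), k * math.exp(k * x0)),
    "logarithmic": (derivative(math.log, x0), 1 / x0),
    "product":     (derivative(lambda x: x ** 2 * math.log(x), x0),
                    2 * x0 * math.log(x0) + x0 ** 2 * (1 / x0)),
    "chain":       (derivative(lambda x: math.log(x ** 2 + 1), x0),
                    (1 / (x0 ** 2 + 1)) * (2 * x0)),
}
for name, (numeric, analytic) in checks.items():
    print(f"{name:12s} numeric={numeric:.6f} analytic={analytic:.6f}")
```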
 Second derivative
$\frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d^2 y}{dx^2} = f''(x)$
 Example:
1. $y = x^{1/2}$
2. $y = a \ln x$
 Convexity
 convex
 concave
 Calculus with many variables (multivariate)
$y = f(x_1, x_2, \ldots, x_n)$
 The partial derivative is the change in $y$ in response to an infinitesimal change in a single variable $x_i$, holding all other variables constant
$\frac{\partial y}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_1, \ldots, x_{i-1}, x_i + \Delta x_i, x_{i+1}, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{\Delta x_i}$
 Example: Cobb-Douglas production function, $Q = A K^{\alpha} L^{1-\alpha}$
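For the Cobb-Douglas example, the power rule gives $\partial Q / \partial K = \alpha Q / K$ and $\partial Q / \partial L = (1 - \alpha) Q / L$. The sketch below compares these with finite-difference estimates; the parameter values ($A = 1$, $\alpha = 0.3$) and input levels are hypothetical.

```python
def cobb_douglas(K, L, A=1.0, alpha=0.3):
    """Q = A * K**alpha * L**(1 - alpha), with illustrative default parameters."""
    return A * K ** alpha * L ** (1 - alpha)

def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f in argument i."""
    up, down = list(point), list(point)
    up[i] += h      # move only argument i, hold the others constant
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

K, L = 4.0, 9.0                       # hypothetical input levels
Q = cobb_douglas(K, L)
dQ_dK = partial(cobb_douglas, (K, L), 0)
dQ_dL = partial(cobb_douglas, (K, L), 1)
print(dQ_dK, 0.3 * Q / K)             # numeric vs. analytic alpha * Q / K
print(dQ_dL, 0.7 * Q / L)             # numeric vs. analytic (1 - alpha) * Q / L
```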

Definition of Taylor series
$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)(x - a)^2}{2!} + \cdots + \frac{f^{(n-1)}(a)(x - a)^{n-1}}{(n-1)!} + R_n$
where $R_n = \frac{f^{(n)}(\xi)(x - a)^n}{n!}$, with $a \le \xi \le x$
This result holds if $f(x)$ has continuous derivatives of order $n$ at least.
If $\lim_{n \to \infty} R_n = 0$, the infinite series obtained is called the Taylor series for $f(x)$ about $x = a$
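As an illustration, the sketch below applies the expansion to $f(x) = e^x$ about $a = 0$, where every derivative is again $e^x$. The choice of function and of $x = 1$ is an assumption made for the example. The printed bound uses $|R_n| \le e^{x} x^n / n!$, which follows from the remainder formula with $0 \le \xi \le x$.

```python
import math

def taylor_exp(x, n, a=0.0):
    """Sum of the first n terms (orders 0 to n-1) of the Taylor series of e^x about a."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n))

x = 1.0
for n in (2, 4, 8, 12):
    approx = taylor_exp(x, n)
    error = abs(math.exp(x) - approx)
    bound = math.exp(x) * x ** n / math.factorial(n)   # upper bound on |R_n|
    print(n, approx, error, bound)
```

The error shrinks quickly as $n$ grows, consistent with $R_n \to 0$ for the exponential function.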
 Calculus with many variables (multivariate)
$y = f(x_1, x_2, \ldots, x_n)$
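 Derivative of matrix functions [just remember it!!]
Let $A$ be an $(n \times n)$ symmetric matrix and $x$, $y$ be $n \times 1$ vectors
1. $\frac{\partial}{\partial x}(x'y) = y$
2. $\frac{\partial}{\partial x}(Ax) := \begin{pmatrix} \frac{\partial}{\partial x_1}(Ax)' \\ \vdots \\ \frac{\partial}{\partial x_n}(Ax)' \end{pmatrix} = A$
3. $\frac{\partial}{\partial x}(x'Ax) := \begin{pmatrix} \frac{\partial}{\partial x_1}(x'Ax) \\ \vdots \\ \frac{\partial}{\partial x_n}(x'Ax) \end{pmatrix} = 2Ax$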
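The third rule can be checked numerically for a small symmetric matrix. The sketch below uses plain Python lists; the matrix `A`, the vector `x`, and the helper names are illustrative assumptions.

```python
def quad_form(A, x):
    """Quadratic form x'Ax for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def numeric_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f of a vector x."""
    grad = []
    for i in range(len(x)):
        up, down = x[:], x[:]
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric, illustrative values
x = [1.0, -2.0]
numeric = numeric_gradient(lambda v: quad_form(A, v), x)
analytic = [2 * sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
print(numeric)    # finite-difference gradient
print(analytic)   # 2Ax from rule 3
```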
III. Extreme Values
Univariate function ($y = f(x)$)
 First-order condition
If $f(x)$ is everywhere differentiable and reaches a maximum or minimum at $x^*$, then $f'(x^*) = 0$
 Second-order condition
if $f''(x^*) < 0$, $x^*$ is a local maximum
if $f''(x^*) > 0$, $x^*$ is a local minimum
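Multivariate functions ($y = f(x_1, x_2, \ldots, x_n)$)
 first-order condition (here $f_i$ denotes $\partial f / \partial x_i$)
$f_1(x_1^*, x_2^*, \ldots, x_n^*) = 0$
$\vdots$
$f_n(x_1^*, x_2^*, \ldots, x_n^*) = 0$
 Second-order (sufficient) condition in the bivariate case, $f(x_1, x_2)$ (here $f_{ij}$ denotes $\partial^2 f / \partial x_i \partial x_j$)
for local minimum: $f_{11} > 0$ and $f_{11} f_{22} > f_{12}^2$
for local maximum: $f_{11} < 0$ and $f_{11} f_{22} > f_{12}^2$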
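The bivariate conditions can be applied mechanically once the second partial derivatives are known. The sketch below uses a hypothetical function, $f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2 - 3 x_1$, chosen only for illustration; its first-order conditions give $x_1^* = 2$, $x_2^* = -1$, and its second partials are constants.

```python
def classify(f11, f22, f12):
    """Apply the bivariate second-order sufficient conditions at a critical point."""
    if f11 * f22 > f12 ** 2:
        return "local minimum" if f11 > 0 else "local maximum"
    return "conditions not satisfied"

# Hypothetical example: f(x1, x2) = x1**2 + x1*x2 + x2**2 - 3*x1
x1, x2 = 2.0, -1.0
f1 = 2 * x1 + x2 - 3      # first partial with respect to x1
f2 = x1 + 2 * x2          # first partial with respect to x2
f11, f22, f12 = 2.0, 2.0, 1.0

print(f1, f2)                    # both zero, so (2, -1) satisfies the FOCs
print(classify(f11, f22, f12))   # local minimum
```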
IV. Constrained optimization
 objective function + constraint(s)
 Let us focus on the bivariate case: optimize $f(x_1, x_2)$ subject to $g(x_1, x_2) = c$
 substitution method
 Lagrange method
$L(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda \left( c - g(x_1, x_2) \right)$
(details about SOC)
 Optimization with inequality constraints [Kuhn-Tucker method]
 Key: complementary slackness conditions
 we will not cover this part in our course
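A small worked case may help compare the substitution and Lagrange methods for an equality constraint. The problem below (maximize $x_1 x_2$ subject to $x_1 + 2 x_2 = 10$) is a hypothetical example, not taken from the slides. Setting the partial derivatives of the Lagrangian to zero gives $x_2 = \lambda$, $x_1 = 2\lambda$ and $x_1 + 2 x_2 = 10$, so $\lambda = 2.5$, $x_1 = 5$, $x_2 = 2.5$. The sketch checks these first-order conditions numerically and cross-checks with a coarse grid search after substitution.

```python
def f(x1, x2):
    return x1 * x2            # hypothetical objective

def g(x1, x2):
    return x1 + 2 * x2        # hypothetical constraint function

c = 10.0

def lagrangian(x1, x2, lam):
    return f(x1, x2) + lam * (c - g(x1, x2))

def gradient(fn, point, h=1e-6):
    """Central-difference gradient of fn at point."""
    grads = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grads.append((fn(*up) - fn(*down)) / (2 * h))
    return grads

# Lagrange method: candidate obtained by solving the FOCs by hand
lam = c / 4
x1, x2 = 2 * lam, lam
foc = gradient(lagrangian, (x1, x2, lam))
print(x1, x2, lam, foc)       # all three partial derivatives are close to zero

# Substitution method: x1 = c - 2*x2, then search over x2 on a grid
best_x2 = max((i / 1000 for i in range(5001)), key=lambda t: f(c - 2 * t, t))
print(best_x2)
```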
IV. Probability distribution
Assume a random variable $X$ takes on a finite number ($n$) of possible values
 The probability (mass) distribution of a random variable $X$ gives the probability that the random variable will take on each of its possible values, $P(x) = P(X = x)$
 Rules for probability distributions
 Distribution function, $F(x) = P(X \le x)$
IV. Descriptive Statistics
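 Expected value
$E[X] = \sum_{i=1}^{n} p_i x_i$
 Variance
$\sigma^2 = Var(X) = \sum_{i=1}^{n} p_i \left( x_i - E[X] \right)^2$
 standard deviation, $\sigma$
 Covariance
$Cov(X, Y) = E\left[ (X - E[X]) (Y - E[Y]) \right]$
 correlation, $\rho_{XY} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$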
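These definitions translate directly into code for a discrete distribution. In the sketch below, the three states, their probabilities, and the values of $X$ and $Y$ are hypothetical numbers chosen for illustration.

```python
import math

p = [0.25, 0.50, 0.25]    # hypothetical state probabilities (sum to 1)
x = [-0.10, 0.05, 0.20]   # hypothetical values of X in each state
y = [0.00, 0.04, 0.02]    # hypothetical values of Y in each state

def mean(p, v):
    """E[V] = sum of p_i * v_i."""
    return sum(pi * vi for pi, vi in zip(p, v))

def cov(p, v, w):
    """Cov(V, W) = E[(V - E[V]) (W - E[W])]."""
    mv, mw = mean(p, v), mean(p, w)
    return sum(pi * (vi - mv) * (wi - mw) for pi, vi, wi in zip(p, v, w))

def var(p, v):
    """Var(V) = Cov(V, V)."""
    return cov(p, v, v)

def corr(p, v, w):
    """Correlation: covariance scaled by both standard deviations."""
    return cov(p, v, w) / math.sqrt(var(p, v) * var(p, w))

print(mean(p, x), var(p, x), math.sqrt(var(p, x)))
print(cov(p, x, y), corr(p, x, y))
```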
 Skewness
$skewness = \frac{E\left[ (X - E[X])^3 \right]}{(\sigma^2)^{3/2}}$
 Kurtosis
$Kurtosis = \frac{E\left[ (X - E[X])^4 \right]}{(\sigma^2)^2}$
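Both measures are standardized central moments, so one helper covers them. The sketch below uses two hypothetical discrete distributions: a symmetric three-point one, whose skewness is zero, and a lopsided two-point one, whose skewness is positive.

```python
def central_moment(p, v, k):
    """E[(V - E[V])**k] for a discrete distribution with probabilities p and values v."""
    m = sum(pi * vi for pi, vi in zip(p, v))
    return sum(pi * (vi - m) ** k for pi, vi in zip(p, v))

def skewness(p, v):
    return central_moment(p, v, 3) / central_moment(p, v, 2) ** 1.5

def kurtosis(p, v):
    return central_moment(p, v, 4) / central_moment(p, v, 2) ** 2

sym_p, sym_x = [0.25, 0.5, 0.25], [-1.0, 0.0, 1.0]   # symmetric, hypothetical
lop_p, lop_x = [0.9, 0.1], [0.0, 1.0]                # right-skewed, hypothetical

print(skewness(sym_p, sym_x), kurtosis(sym_p, sym_x))
print(skewness(lop_p, lop_x), kurtosis(lop_p, lop_x))
```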
What you have learnt:
1. What is a function?
2. How to find the root(s) and extreme value(s) of quadratic equations
3. The basic rules of differentiation
4. The FOC(s) and SOC(s) for simple optimization problems
5. A brief introduction to the Lagrange multiplier method
6. Basic matrix operations
7. The relationship between linear independence and the determinant
8. How to use Cramer's rule to solve linear equations
9. How to describe a random variable
10. How to use covariance to describe the relationship between two random variables
