You are on page 1of 10

UC Berkeley Quantitative Coursework

Hamilton Hoxie Ackerman


Department of Statistics, PhD Program
www.hhackerman.com
Spring 2012
CS61A: Structure and Interpretation of Computer Programs
Professor: Paul Hil
nger (http://www.cs.berkeley.edu/~hilfingr/)
Text: None, though it followed Structure and Interpretation of Computer Programs
by Abelson et al. closely.
Technologies: Python 3, Scheme, Emacs
Topics: Higher-order functions, environments, data abstraction, mutable data and
basic data types, object-
oriented programming, recursion, interpreters, client/server programming, concur
rency, the Halting Problem.
Grade: A
CS194-21: Networks, Crowds, and Markets
Professors: Christos Papadimitriou (http://www.cs.berkeley.edu/~christos/)
Richard Karp (http://www.eecs.berkeley.edu/~karp/)
Text: Networks, Crowds, and Markets by Easley and Kleinberg
Topics: Graph ideas such as triadic closure, homophily, and structural balance.
Basic game theory, opti-
mality, and Nash equilibria. Braess's Paradox. First and second price auctions.
Bargaining and power in
networks. Web structure and search. Online advertising and the VCG protocol. Inf
ormation cascades and
network eects. Power laws. The small-world phenomenon. Voting and Arrow's Impossi
bility Theorem.
Grade: A
AY250: Python Computing for Scienti
c Research
Professors: Josh Bloom (http://astro.berkeley.edu/~jbloom/)
Technologies: IPython, SciPy, NumPy, Matplotlib, Emacs
Topics: I dropped this course six weeks into the semester due to time con icts. I enjoyed six busy weeks,
though, covering object-oriented programming, advanced plotting and brushing via
Matplotlib, rejection
sampling, computer vision and classi
cation, XML-RPC servers, audio pitch recognition, databases and
Pythonic front ends, HTML parsing via BeautifulSoup, and GUI basics.
Fall 2011
STAT 204: Probability for Applications
Professor: Allan Sly (http://www.stat.berkeley.edu/~sly/)
Text: Probability and Random Processes, by Grimmett and Stirzaker
Topics: Probability basics, common distributions and inequalities, simple random
walks, distributional ap-
proximations, generating functions, renewal processes, branching processes, conv
ergence of random variables,
1
limit theorems, Markov chains, birth and death processes, Markov Chain Monte Car
lo, Metropolis-Hastings
algorithm, coupling, Poisson processes, the Stein-Chen method, martingales, diusi
ons, Brownian motion.
No measure theory was used.
Grade: A
STAT 241: Statistical Learning Theory
Professors: Martin Wainwright (http://www.cs.berkeley.edu/~wainwrig/)
Michael Jordan (http://www.cs.berkeley.edu/~jordan/)
Text: An Introduction to Probabilistic Graphical Models, by Jordan
Topics: Graphical models (GMs) as a unifying quantitative framework, GM basics,
directed and undirected
graphs, conditional independence, the Bayes Ball algorithm, graph factorizations
, inference algorithms (Elim-
ination, Sum-Product, Junction Tree), linear and logistic regressions via GMs an
d stochastic gradient descent,
exponential family, suciency, iterative proportional
tting, maximum likelihood and KL-divergence, mix-
ture models, K-means, EM algorithm, hidden Markov models, factor analysis, princ
ipal components analysis,
Monte Carlo methods, sampling procedures, MCMC, model selection, variational met
hods basics.
Grade: S (Satisfactory, i.e. Pass)
Spring 2010
STAT 215B: Statistical Models: Theory and Applications
Professor: Elizabeth Purdom (http://www.stat.berkeley.edu/~epurdom/)
Texts: Excerpts from various experimental design texts,
The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman,
Linear Models with R by Faraway
Topics: A project-based applied statistics course, where projects involved data
analysis in R and com-
plemented the topics from lecture: ordinary least squares, ANOVAs, estimability
and identi
ability, ran-
domization and experimental design, 2k factorial and fractional designs, optimal
ity in experimental design,
nested designs, random and mixed eects models, parametric bootstrap, Bayesian mod
eling basics, logistic
regression, LDA/QDA, support vector machines, CART, bagging, random forests, boo
sting.
Grade: A
STAT 210B: Theoretical Statistics
Professor: Michael Jordan (http://www.cs.berkeley.edu/~jordan/)
Text: Asymptotic Statistics, by van der Vaart
Topics: Continuous Mapping Theorem, Prohorov's Theorem, Portmanteau Theorem, Gli
venko-Cantelli
Theorem, Lindeberg-Feller CLT, U statistics, the Delta Method, empirical process
basics, Hoeding's in-
equality, VC theory, entropy, covering numbers, M- and Z-estimation, Donsker the
ory, Radon-Nikodym The-
orem, Le Cam theory, asymptotic relative eciency, Neyman-Pearson testing, Rao sco
re statistics, likelihood
ratio tests, supereciency, minimax estimators, Functional Delta method, G^ateaux/
Hadamard/ Frechet dif-
ferentiability, Assouad's Lemma, bootstrapping, nonparametric regression, Bernst
ein-von Mises theorem.
Grade: A
2
Fall 2009
STAT 215A: Statistical Models: Theory and Applications
Professor: Bin Yu (http://www.stat.berkeley.edu/~binyu/)
Texts: Statistical Models: Theory and Practice by Freedman
The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman
Topics: A project-based applied statistics course, where projects involved data
analysis in R and com-
plemented the topics from lecture: modeling philosophy, exploratory data analysi
s, human perception and
visualization, kernel density plots, bias/variance trade-o, LOESS, principal comp
onents analysis, k-means,
silhouette plots, EM algorithm, OLS and WLS, hypothesis testing, bootstrapping,
cross-validation, model
selection, logistic regression, Newton-Raphson algorithm, exponential family, GL
M, ridge regression and
LASSO, boosting.
Grade: A+
Note: For project samples, please visit www.hhackerman.com/professional/
STAT 210A: Theoretical Statistics
Professor: Haiyan Huang (http://www.stat.berkeley.edu/~hhuang/)
Text: Mathematical Statistics, Volume 1, by Bickel and Doksum
Topics: The decision-theoretic framework, admissibility, suciency, completeness,
Basu's Theorem, UMVUEs,
the exponential family, conjugate priors, Cramer-Rao bound and Fisher's Informati
on, Bayes risk and es-
timators, minimax estimators, method of moments and MLE, consistency, inequaliti
es and limit theorems,
hypothesis testing.
Grade: A??
3
Unit 1: Derivatives
A. What is a derivative?
? Geometric interpretation
? Physical interpretation
? Important for any measurement (economics, political science, finance, physics,
etc.)
B. How to differentiate any function you know.
d ? ?
For example: ex arctan x ? . We will discuss what a derivative is today. Figurin
g out how to
dx
differentiate any function is the subject of the first two weeks of this course.
Lecture 1: Derivatives, Slope, Velocity, and Rate
of Change
Geometric Viewpoint on Derivatives
Tangent line
Secant line
f(x)
P
Q
x0 x0+x
y
Figure 1: A function with secant and tangent lines
The derivative is the slope of the line tangent to the graph of f(x). But what i
s a tangent line,
exactly?
1
Lecture 1 18.01 Fall 2006
? It is NOT just a line that meets the graph at one point.
? It is the limit of the secant line (a line drawn between two points on the gra
ph) as the distance
between the two points goes to zero.
Geometric definition of the derivative:
Limit of slopes of secant lines PQ as Q P (P fixed). The slope of PQ:
P
Q
(x0+x, f(x0+x))
(x0, f(x0))
x
f
Secant Line
Figure 2: Geometric definition of the derivative
lim
f = lim f(x0 + x) ? f(x0)
= f?(x0)
x0 x x0 ? ??x ? ? ?? ?
difference quotient derivative of f at x0
1
Example 1. f(x) =
x
One thing to keep in mind when working with derivatives: it may be tempting to p
lug in x = 0
f 0
right away. If you do this, however, you will always end up with = . You will al
ways need to
x 0
do some cancellation to get at the answer.
f 1 1 1
?
x0 ? (x0 + x)
?
1
? ?
= x0+x ? x0 = = ?x = ?1
x x x (x0 + x)x0 x (x0 + x)x0 (x0 + x)x0
Taking the limit as x 0,
lim ?1
= ?1
x0 (x0 + x)x0 x20
2
Lecture 1 18.01 Fall 2006
y
x
x0
Figure 3: Graph of x
1
Hence,
f?(x0) = ?
2
1
x0
Notice that f?(x0) is negative ? as is the slope of the tangent line on the grap
h above.
Finding the tangent line.
Write the equation for the tangent line at the point (x0, y0) using the equation
for a line, which you
all learned in high school algebra:
y ? y0 = f?(x0)(x ? x0)
Plug in y0 = f(x0) =
1
and f?(x0) = ?
2
1
to get:
x0 x0
y ?
x
1
0
= ?
x20
1
(x ? x0)
3
Lecture 1 18.01 Fall 2006
y
x
x0
Figure 4: Graph of x
1
Just for fun, lets compute the area of the triangle that the tangent line forms wi
th the x- and
y-axes (see the shaded region in Fig. 4).
First calculate the x-intercept of this tangent line. The x-intercept is where y
= 0. Plug y = 0
into the equation for this tangent line to get:
0 ?
1
= ?
2
1
(x ? x0)
x0 x0
?1 ?1 1
= x + x 2 0 x0 x0
1 2
x = x20 x0
2
x = x20
( ) = 2x0 x0
So, the x-intercept of this tangent line is at x = 2x0.
1 1
Next we claim that the y-intercept is at y = 2y0. Since y = and x = are identica
l equations,
x y
the graph is symmetric when x and y are exchanged. By symmetry, then, the y-inte
rcept is at
y = 2y0. If you dont trust reasoning with symmetry, you may follow the same chain
of algebraic
reasoning that we used in finding the x-intercept. (Remember, the y-intercept is
where x = 0.)
Finally,
1 1
Area = (2y0)(2x0) = 2x0y0 = 2x0( ) = 2 (see Fig. 5)
2 x0
Curiously, the area of the triangle is always 2, no matter where on the graph we
draw the tangent
line.
4
Lecture 1 18.01 Fall 2006
y
x
x0 2x0
y0
2y0
x-1
Figure 5: Graph of x
1
Notations
Calculus, rather like English or any other language, was developed by several pe
ople. As a result,
just as there are many ways to express the same thing, there are many notations
for the derivative.
Since y = f(x), its natural to write
y = f = f(x) ? f(x0) = f(x0 + x) ? f(x0)
We say Delta y or Delta f or the change in y.
If we divide both sides by x = x ? x0, we get two expressions for the difference q
uotient:
y =
f
x x
Taking the limit as x 0, we get
y
x

dy
dx
(Leibniz notation)
f
x
f?(x0) (Newtons notation)
When you use Leibniz notation, you have to remember where youre evaluating the deriv
ative
? in the example above, at x = x0.
Other, equally valid notations for the derivative of a function f include
df
, f?, and Df
dx
5
Lecture 1 18.01 Fall 2006
Example 2. f(x)= xn where n =1, 2, 3...
d
What is xn?
dx
To find it, plug y = f(x) into the definition of the difference quotient.
nn
y =(x0 +x)n ? x0 =(x +x)n ? xx x x
(From here on, we replace x0 with x, so as to have less writing to do.) Since
(x +x)n =(x +x)(x +x)...(x +x) n times
We can rewrite this as ?? xn + n(x)xn?1 + O (x)2
O(x)2 is shorthand for all of the terms with (x)2, (x)3, and so on up to (x)n. (Th
of what is known as the binomial theorem; see your textbook for details.)
nn
y =(x +x)n ? x= xn + n(x)(xn?1)+ O(x)2 ? x= nxn?1 + O(x)x x x Take the limit:
y
lim = nxn?1
x0 x

Therefore,
d
n
x = nxn?1
dx
This result extends to polynomials. For example,
d 9
(x2 +3x10)=2x + 30x
dx
Physical Interpretation of Derivatives
You can think of the derivative as representing a rate of change (speed is one e
xample of this). On Halloween, MIT students have a tradition of dropping pumpkin
s from the roof of this building, which is about 400 feet high. The equation of
motion for objects near the earths surface (which we will just accept for now) imp
lies that the height above the ground y of the pumpkin is: y = 400 ? 16t2 y distan
ce travelled
The average speed of the pumpkin (difference quotient) = =
t time elapsed When the pumpkin hits the ground, y = 0,
400 ? 16t2 =0
6
Lecture 1 18.01 Fall 2006
Solve to find t = 5. Thus it takes 5 seconds for the pumpkin to reach the ground
. 400 ft
Average speed = = 80 ft/s
5 sec A spectator is probably more interested in how fast the pumpkin is going w
hen it slams into the ground. To find the instantaneous velocity at t = 5, lets ev
aluate y?: y? = ?32t =(?32)(5) = ?160 ft/s (about 110 mph) y? is negative becaus
e the pumpkins y-coordinate is decreasing: it is moving downward.
7

You might also like