Neural Networks
Vladimir Krasnopolsky
NCEP/NOAA (SAIC)
• Introduction: Motivation
• Classical Statistical Framework: Regression Analysis
• Regression Models (Linear & Nonlinear)
• NN Tutorial
• Some Atmospheric & Oceanic Applications
  – Accelerating Calculations of Model Physics in Numerical Models
• How to Apply NNs
• Conclusions
Motivations for This Seminar
Objects studied: from simple, linear or quasi-linear, single-disciplinary, low-dimensional systems to complex, nonlinear, multi-disciplinary, high-dimensional systems.
The materials presented here reflect the personal opinions and experience of the author!
Problem:
Information exists in the form of a set of values of several related variables (a sample, or training set) – a part of the population:
  {(x1, x2, ..., xn)p, zp}, p = 1, 2, ..., N
  – x1, x2, ..., xn – independent variables (accurate),
  – z – response variable (may contain observation errors ε).
We want to find the responses z′q for another set of independent variables {(x′1, x′2, ..., x′n)q}, q = 1, ..., M.
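To make the notation concrete, here is a minimal sketch (hypothetical data and function; NumPy) of a training set {(x1, ..., xn)p, zp} and a new input set awaiting responses:

```python
import numpy as np

# Hypothetical training set: N = 200 records of n = 3 independent
# variables, with a response z carrying observation errors eps.
rng = np.random.default_rng(0)
N, n = 200, 3
X = rng.uniform(-1.0, 1.0, size=(N, n))        # (x1, ..., xn)_p, p = 1..N
z = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]        # underlying response f(X)...
z += rng.normal(0.0, 0.05, size=N)             # ...plus observation errors eps

# Another set of M = 50 inputs for which the responses z'_q are wanted.
M = 50
X_new = rng.uniform(-1.0, 1.0, size=(M, n))    # (x'1, ..., x'n)_q, q = 1..M
# Task: estimate z'_q for X_new using only the training set {X, z}.
```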
Regression Analysis (1):
General Solution and Its Limitations
(Sir Ronald A. Fisher, ~1930)
[Diagram: INDUCTION leads from the training set to the regression function z = f(X), for all X – an ill-posed problem; DEDUCTION leads from the regression function to the predicted responses – a well-posed problem; the diagram also labels the paths IMITATION and IDENTIFICATION.]
• Linear Regression:

$$z = a_0 + a_1 x_1 + a_2 x_2 + \cdots + \varepsilon = a_0 + \sum_{i=1}^{n} a_i x_i + \varepsilon$$

• Generalized Linear Regression: a mapping Y = F(X) from the space ℜⁿ into the space ℜᵐ:

$$\left.
\begin{aligned}
Y &= F(X)\\
X &= \{x_1, x_2, \ldots, x_n\} \in \Re^n\\
Y &= \{y_1, y_2, \ldots, y_m\} \in \Re^m
\end{aligned}
\right\}
\;\Rightarrow\;
\begin{bmatrix}
y_1 = f_1(x_1, x_2, \ldots, x_n)\\
y_2 = f_2(x_1, x_2, \ldots, x_n)\\
\vdots\\
y_m = f_m(x_1, x_2, \ldots, x_n)
\end{bmatrix}$$
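As a hedged illustration (toy data, hypothetical coefficients), the linear-regression coefficients a0, ..., an can be estimated from a training set by ordinary least squares:

```python
import numpy as np

# Toy training set (hypothetical): z = 2 + 3*x1 - 1*x2 + noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
z = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0.0, 0.1, size=100)

# Prepend a column of ones so a0 is estimated together with a1..an.
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)  # [a0, a1, a2]
print(coeffs)  # approximately [2, 3, -1]
```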
Mapping Y = F(X): Examples

$$Y = F_{NN}(X):\quad y_q = a_{q0} + \sum_{j=1}^{k} a_{qj}\, t_j = a_{q0} + \sum_{j=1}^{k} a_{qj}\, \phi\Big(b_{j0} + \sum_{i=1}^{n} b_{ji}\, x_i\Big) = a_{q0} + \sum_{j=1}^{k} a_{qj} \tanh\Big(b_{j0} + \sum_{i=1}^{n} b_{ji}\, x_i\Big); \quad q = 1, 2, \ldots, m$$

Because F_NN is given in closed analytic form, its Jacobian can be computed explicitly.
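A minimal sketch of this formula in code (NumPy; the weight names b0, b, a0, a are hypothetical labels chosen to match the symbols above):

```python
import numpy as np

def nn_forward(x, b0, b, a0, a):
    """y_q = a_q0 + sum_j a_qj * tanh(b_j0 + sum_i b_ji * x_i).

    x  : (n,)   input vector
    b0 : (k,)   hidden biases b_j0
    b  : (k, n) hidden weights b_ji
    a0 : (m,)   output biases a_q0
    a  : (m, k) output weights a_qj
    """
    t = np.tanh(b0 + b @ x)   # hidden-neuron outputs t_j, j = 1..k
    return a0 + a @ t         # outputs y_q, q = 1..m

# Usage with random (hypothetical) weights: n = 4 inputs, k = 3, m = 2.
rng = np.random.default_rng(2)
n, k, m = 4, 3, 2
y = nn_forward(rng.normal(size=n),
               rng.normal(size=k), rng.normal(size=(k, n)),
               rng.normal(size=m), rng.normal(size=(m, k)))
print(y.shape)  # (2,)
```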
Some Popular Activation Functions
[Plots: tanh(x) and the sigmoid, (1 + exp(−x))⁻¹, both shown as functions of x.]
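The two functions are closely related; a quick NumPy check (a sketch, nothing here is from the slides beyond the two formulas) of the identity sigmoid(x) = (1 + tanh(x/2))/2:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
# The sigmoid is a shifted, rescaled tanh: sigmoid(x) = (1 + tanh(x/2)) / 2.
assert np.allclose(sigmoid(x), 0.5 * (1.0 + np.tanh(0.5 * x)))
```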
NN as a Universal Tool for Approximation of
Continuous & Almost Continuous Mappings
Some Basic Theorems:
Any function or mapping Z = F(X) that is continuous on a compact subset can be approximately represented by a p-layer (p ≥ 3) NN in the sense of uniform convergence (e.g., Chen & Chen, 1995; Blum & Li, 1991; Hornik, 1991; Funahashi, 1989).

The error bound for uniform approximation on compact sets (Attali & Pagès, 1997):

$$\|Z - Y\| = \|F(X) - F_{NN}(X)\| \sim \frac{C}{k}$$

where k is the number of neurons in the hidden layer, and C does not depend on n (avoiding the curse of dimensionality!).
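A rough numerical illustration of the C/k behavior (a sketch, not a proof; scikit-learn's MLPRegressor on a hypothetical 1-D target):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-3.0, 3.0, size=(2000, 1))
z = np.sin(X[:, 0]) * np.exp(-0.1 * X[:, 0] ** 2)   # toy target F(X)

for k in (2, 4, 8, 16):
    nn = MLPRegressor(hidden_layer_sizes=(k,), activation='tanh',
                      solver='lbfgs', max_iter=5000, random_state=0)
    nn.fit(X, z)
    err = np.max(np.abs(nn.predict(X) - z))          # sup-norm error on the sample
    print(f"k = {k:2d}  max |F - F_NN| = {err:.4f}")
# The error should drop roughly like C/k as the hidden layer grows.
```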
NN Training (1)
[Flowchart: the training set supplies inputs X and desired outputs Z; the NN with weights {W} maps X to outputs Y; the error E = ||Z − Y|| is computed and compared with the tolerance ε; if E ≤ ε, training ends; if not, backpropagation (BP) computes weight adjustments ΔW and the cycle repeats. A code sketch of this loop follows the next slide.]
Backpropagation (BP) Training Algorithm
• BP is a simplified steepest descent:

$$\Delta W = -\eta\,\frac{\partial E}{\partial W}, \qquad \eta > 0,$$

where W is any weight, E is the error function, η is the learning rate, and ΔW is the weight increment, W^{r+1} = W^r + ΔW.
[Figure: the error E as a function of W, with successive iterates W^r → W^{r+1} stepping down the slope toward the minimum.]
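To make the loop concrete, here is a minimal sketch (a hypothetical toy problem, one hidden layer, hand-coded gradients) of steepest-descent training with the E ≤ ε stopping rule from the flowchart above:

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical training set: scalar input x, desired output z = sin(x).
X = rng.uniform(-2.0, 2.0, size=(64, 1))
Z = np.sin(X[:, 0])

k = 8                                       # hidden neurons
b0 = rng.normal(size=k); B = rng.normal(size=(k, 1))
a0 = 0.0;                A = rng.normal(size=k)

eta, eps = 0.05, 1e-2                       # learning rate eta, tolerance epsilon
for step in range(20000):
    T = np.tanh(X @ B.T + b0)               # hidden outputs t_j      (N, k)
    Y = T @ A + a0                          # NN outputs y            (N,)
    R = Y - Z                               # residuals
    E = np.sqrt(np.mean(R ** 2))            # error E = ||Z - Y|| (RMS)
    if E <= eps:                            # the flowchart's stopping test
        break
    # Delta W = -eta * dE/dW, using the mean-squared-error gradient
    # (constant factors are absorbed into eta).
    G = R[:, None] * A * (1.0 - T ** 2)     # signal back-propagated to layer 1
    a0 -= eta * np.mean(R)
    A  -= eta * (T.T @ R) / len(X)
    b0 -= eta * np.mean(G, axis=0)
    B  -= eta * (G.T @ X) / len(X)
print(f"stopped after {step + 1} iterations with E = {E:.4f}")
```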
• Overfitting, Extrapolation:
[Figure illustrating overfitting and the behavior of an NN extrapolating outside its training domain.]
Generic Problem in Numerical Models
Parameterizations of Physics are Mappings
[Diagram: inside a GCM, a parameterization F maps the inputs x₁, x₂, x₃, …, xₙ to the outputs y₁, y₂, y₃, …, yₘ, i.e., Y = F(X).]
Generic Solution – “NeuroPhysics”
Accurate and Fast NN Emulation for Physics Parameterizations
Learning from data:
[Diagram: the original parameterization inside the GCM maps X to Y; running it produces a training set …, {Xᵢ, Yᵢ}, …, ∀Xᵢ ∈ D_phys; the NN emulation F_NN is trained on this set to reproduce the same mapping X → Y.]

For example, the longwave radiative fluxes:

$$F^{\downarrow}(p) = B(p_t)\,\varepsilon(p_t, p) + \int_{p_t}^{p} \alpha(p, p')\, dB(p')$$

$$F^{\uparrow}(p) = B(p_s) - \int_{p}^{p_s} \alpha(p, p')\, dB(p')$$

$$\varepsilon(p_t, p) = \frac{\displaystyle\int_{0}^{\infty} B_\nu(p_t)\,\big(1 - \tau_\nu(p_t, p)\big)\, d\nu}{B(p_t)}$$

where B_ν(p) is the Planck function.
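A schematic of the emulation recipe itself (everything here is a hypothetical stand-in: a cheap analytic "parameterization" plays the role of the expensive original, and scikit-learn is used for brevity):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def original_parameterization(X):
    """Hypothetical stand-in for an expensive physics parameterization."""
    return np.column_stack([np.sin(X[:, 0]) * X[:, 1],
                            np.exp(-X[:, 2] ** 2) + 0.1 * X[:, 0]])

# 1. Generate the training set {X_i, Y_i} over the physical domain D_phys.
rng = np.random.default_rng(5)
X_train = rng.uniform(-1.0, 1.0, size=(5000, 3))
Y_train = original_parameterization(X_train)

# 2. Train the NN emulation F_NN.
emulator = MLPRegressor(hidden_layer_sizes=(50,), activation='tanh',
                        solver='adam', max_iter=2000, random_state=0)
emulator.fit(X_train, Y_train)

# 3. Validate: the emulation should reproduce Y = F(X) on new inputs.
X_new = rng.uniform(-1.0, 1.0, size=(1000, 3))
rmse = np.sqrt(np.mean((emulator.predict(X_new)
                        - original_parameterization(X_new)) ** 2))
print(f"emulation RMSE: {rmse:.4f}")
```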
Magic of NN Performance
[Table: the original parameterization Y = F(X) vs. the NN emulation Y_NN = F_NN(X) for the NCAR shortwave radiation (SWR), in °K/day; quoted statistics: 6·10⁻⁴, 0.19, 1.47, 1.89; the NN150 emulation is ∼20 times faster than the original.]
[Figure: heating-rate profiles (°K); black – original parameterization, red – NN with 100 neurons, blue – NN with 150 neurons; panel PRMSE values: 0.18 & 0.10, 0.11 & 0.06, and 0.05 & 0.04 K/day.]
How to Apply NNs
• Problem Analysis:
  – Are traditional approaches unable to solve your problem?
    • At all
    • With the desired accuracy
    • With the desired speed, etc.
  – Are NNs well suited to solving your problem?
    • Nonlinear mapping
    • Classification
    • Clustering, etc.
  – Do you have a first guess for the NN architecture?
    • Number of inputs and outputs
    • Number of hidden neurons
• Training
  – Try different initializations
  – If the results are not satisfactory, go back to Data Analysis or Problem Analysis
• Validation (a must for any nonlinear tool!)
  – Apply the trained NN to independent validation data
  – If the statistics are not consistent with those for the training and test sets, go back to Training or Data Analysis (a validation sketch follows below)
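A minimal sketch of that validation discipline (hypothetical data and consistency threshold; scikit-learn for brevity): train on one subset, check that the error statistics on independent validation data stay consistent with the training/test statistics, and retrain from a different initialization if they do not.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)
X = rng.uniform(-1.0, 1.0, size=(3000, 4))
z = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] - 0.5 * X[:, 3]

# Split into training, test, and independent validation sets.
X_tr, X_te, X_va = X[:2000], X[2000:2500], X[2500:]
z_tr, z_te, z_va = z[:2000], z[2000:2500], z[2500:]

def rmse(nn, X, z):
    return np.sqrt(np.mean((nn.predict(X) - z) ** 2))

# Try different initializations; accept an NN only if its validation
# statistics are consistent with its training/test statistics.
for seed in range(5):
    nn = MLPRegressor(hidden_layer_sizes=(20,), activation='tanh',
                      max_iter=2000, random_state=seed)
    nn.fit(X_tr, z_tr)
    e_tr, e_te, e_va = rmse(nn, X_tr, z_tr), rmse(nn, X_te, z_te), rmse(nn, X_va, z_va)
    print(f"seed {seed}: train {e_tr:.3f}  test {e_te:.3f}  validation {e_va:.3f}")
    if e_va <= 1.5 * max(e_tr, e_te):   # hypothetical consistency criterion
        break                           # statistics consistent; keep this NN
```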