Lecture 1
Introduction
1 Course goals
Most of your engineering courses teach you theory and analytic methods using the techniques of
mathematics (calculus, linear algebra, differential equations, and so on). The purpose is to train
you to understand significant engineering problems from a fundamental theoretical perspective.
You need this foundation to be able to design effective solutions to difficult problems; it's
practically impossible to solve a problem you don't thoroughly understand.
Unfortunately, the set of engineering problems that can be solved analytically (with pencil, paper
and even a calculator) is limited to relatively simple and idealized cases. Indeed, many textbook
problems are carefully designed so that you can end up with the exact answer. However, most
applications in your engineering career will involve more complex systems for which you can
mathematically formulate the problem (if you've learned your theory) but cannot solve the
resulting equations analytically. People have long realized that in these cases numerical solutions
are possible. Yet before the development of digital computers, when calculations had to be done
by hand or slide rule, the application of numerical computing was fairly limited. As a result,
engineers in the past were forced to make many simplifying assumptions, build scale models, use
trial and error, and so on.
In the last several decades the development of computer technology has revolutionized
engineering. Almost any problem that can be formulated mathematically can be solved using
numerical techniques. Systems as complex as the supersonic flow of air through a jet engine or
the operation of an integrated circuit containing millions of transistors are now routinely
analyzed in minute detail using numerical methods implemented on computers.
This is the motivation behind EE 221 Numerical Computing for Engineers. The purpose of this
course is for you to:
1. learn fundamental numerical methods so you can understand how to formulate numerical
solutions to difficult engineering problems, and
2. develop programming skills using Scilab/Matlab so you are able to effectively implement
your numerical solutions.
The beginning of the course will be devoted to basic programming techniques, control structures,
input/output, graphics and the like. For the remainder of the course we will use those skills to
learn and implement numerical methods for important classes of problems.
2 Course topics
The main topics we will cover are listed below. Additional topics and programming project
examples will be covered as time permits.
Scott Hudson
2016-01-08
Lecture 1: Introduction
Scilab/Matlab Basics
Introduction
Arrays
Programming structures
2D plots
Numerical methods
Root finding
Polynomials
Linear algebra
Linear systems
Nonlinear systems
Interpolation
Optimization
Curve fitting
Numerical calculus
Random numbers
Sparse systems
3 Grading
Exams count for 75% of the course grade, 25% for each of three exams. The exams will include
programming problems to be done in the lab during the specified exam times as well as
pencil-and-paper problems.
4.1 History
(From http://en.wikipedia.org/wiki/Matlab)
Short for "matrix laboratory", MATLAB was invented in the late 1970s by Cleve Moler,
then chairman of the computer science department at the University of New Mexico. He
designed it to give his students access to LINPACK and EISPACK without having to
learn Fortran. It soon spread to other universities and found a strong audience within the
applied mathematics community. Jack Little, an engineer, was exposed to it during a visit
Moler made to Stanford University in 1983. Recognizing its commercial potential, he
joined with Moler and Steve Bangert. They rewrote MATLAB in C and founded The
MathWorks in 1984 to continue its development.
(From http://en.wikipedia.org/wiki/Scilab)
Scilab was created in 1990 by researchers from INRIA and École nationale des ponts et
chaussées (ENPC). It was initially named Ψlab (Psilab). The Scilab Consortium was
formed in May 2003 to broaden contributions and promote Scilab as worldwide reference
software in academia and industry. In July 2008, in order to improve the technology
transfer, the Scilab Consortium joined the Digiteo Foundation. Since July 2012,
Scilab is developed and published by Scilab Enterprises.
There are other free/open-source Matlab alternatives in addition to Scilab. One of the most
widely used is GNU Octave. In some ways Octave is arguably even more Matlab-compatible than
Scilab. However, it's primarily written for Linux systems and is not as easy to install on
Windows PCs or Macs. The Python programming language has been gaining ground in the
numerical computing community, but differs significantly from Matlab. For these reasons I prefer
Scilab as the best free/open-source Matlab alternative.
In the middle is the console where we will do most of our work. The other windows can be X'd
out for now. We can open them later if need be. This leaves just the console.
Type help at the prompt, or use the ? menu to open the help browser. From there you can
search or browse the documentation. This is a good way to find out about all the capabilities of
Scilab. In this class we will use only a small subset of these.
6 Interactive use
There are two basic ways to use Scilab/Matlab. As an interactive environment you can type a
command directly into the console at the prompt and Scilab will print the results. Then type
another command and so on. This is how we will start out using Scilab, as essentially a fancy
calculator. Later we'll learn how to use the editor to write structured programs.
6.1 Arithmetic
From the command line you can enter arithmetic expressions involving the addition, subtraction,
multiplication and division operators (+ - * /). For example
-->10+6/3-4*2
ans =
4.
The ans variable stores the result of the last calculation and can be used in the next calculation.
Here is the same series of calculations performed one step at a time.
-->10
ans =
10.
-->ans+6/3
ans =
12.
-->ans-4*2
ans =
4.
Operator precedence rules apply. Operations are performed from left to right. Multiplication and
division are done before addition and subtraction. These can be overridden with parentheses. For
example
-->3+2/4
ans =
3.5
first divides 2 by 4 and then adds 3 to get 3.5; the + operator appears first (reading left to right)
but division has precedence over addition. However
-->(3+2)/4
ans =
1.25
Here the parentheses tell Scilab to perform the addition operation first, 3+2=5, followed by the
division 5/4=1.25. I find it good practice in all but the simplest cases to use parentheses to
make the order of operations explicit. It is perfectly fine to use redundant parentheses if this aids
the readability of an expression. For example, the parentheses in the expression
-->3+(2/4)
ans =
3.5
have no effect, but you might find they make the order of operations clearer.
A very useful feature of Scilab/Matlab is that by pressing the up or down arrow keys (Ctrl-P and
Ctrl-N) on your keyboard, you can cycle through the command line history. This allows you to
easily repeat previous commands. The commands can also be edited using the arrow and
Backspace keys.
6.2 Functions
Scilab/Matlab also has many built-in functions. The help menu provides complete listings. For
example, the trig functions sin, cos, tan as well as their inverses asin, acos, and
atan are available. You use parentheses to denote the argument.
-->sin(0.5)
ans =
0.4794255
-->asin(ans)
ans =
0.5
In Scilab the atan function is an overloaded function which allows you to provide different
numbers of arguments. The expression atan(z) returns the angle between -π/2 and π/2 for
which the tan function is z. It is a two-quadrant inverse tangent. The expression atan(y,x)
returns the angle between -π and π for which the tan function is y/x and the angle corresponds to
the polar angle of the rectangular coordinates point (x,y). In Matlab the same functionality is
provided by atan2(y,x).
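The difference shows up for points outside the first and fourth quadrants. For example, at the point (x,y) = (-1,-1):
-->atan(-1)
ans =
 - 0.7853982
-->atan(-1,-1)
ans =
 - 2.3561945
The one-argument form cannot tell (-1,-1) apart from (1,-1), while the two-argument form correctly places the angle at -3π/4 in the third quadrant.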
The standard trig functions operate in radians. Thus cos(1) is the cosine of 1 radian. Since it is
often useful to work in degrees, Scilab provides the functions cosd, sind, atand and so
on where the appended d indicates degrees. So
-->atan(1)
ans =
0.7853982
-->atand(1)
ans =
45.
In many engineering books log x is used to refer to the base 10 logarithm. In Scilab/Matlab the
base 10 logarithm is denoted by log10(x).
-->log10(2)
ans =
0.30103
-->10^ans
ans =
2.
Scott Hudson
2016-01-08
Lecture 1: Introduction
9/16
This distinction has caused problems for many an engineering student because we are so used to
using log(x) to denote the base-10 logarithm; in Scilab/Matlab, log(x) is the natural (base-e)
logarithm. You should pay particular attention to this.
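A quick check at the console makes the distinction clear — log(x) itself computes the natural (base-e) logarithm:
-->log(10)
ans =
   2.3025851
-->log10(10)
ans =
   1.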
6.3 Variables
You can define variables and then operate on those. For example,
-->x = 2
x =
2.
-->x^2+3*x+7
ans =
17.
Typing in the variable name displays the variable value. If no such variable exists you'll get an
error message.
-->x
x =
2.
-->y
!--error 4
undefined variable : y
6.4 Strings
Variables are typically numerical in nature. However, variables can be assigned text values using
quotation marks, as in the following examples.
-->a = 'this is a string'
a =
this is a string
-->b = ' and here is some more text'
b =
and here is some more text
Note that in Scilab you can also use double quotes, as in "this is a string", but not in
Matlab. Therefore if you use single quotes in Scilab your code will be more Matlab-compatible.
You can concatenate two strings with the + operator in Scilab. In Matlab you can form an array or
use the strcat function.
-->a+b //a+b in Scilab, in Matlab [a,b] or strcat(a,b)
ans =
this is a string and here is some more text
Strings are particularly useful for labeling and describing numerical output.
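For example, a number can be converted to a string with the string() function (num2str() in Matlab) and concatenated with a label; the digits shown follow the current display format:
-->x = sqrt(2);
-->'The square root of 2 is ' + string(x)
ans =
 The square root of 2 is 1.4142136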
-->%pi
%pi =
   3.1415927
The % prefix protects the constant and keeps you from redefining it.
-->%pi = 3
!--error 13
redefining permanent variable
Many people find this notation messy but it has a very real advantage (see next section). If you
don't like this you can simply create a regular variable with the same value, as in
-->pi = %pi
pi =
3.1415927
In Matlab, pi is predefined but not protected. This is a weakness of Matlab, in my opinion, because you can accidentally redefine it, as in
>> pi = 3
pi =
3
If you later use the variable pi thinking its value is π you'll end up with erroneous results.
In Scilab the imaginary unit is the predefined constant %i. Note that you need to explicitly
include the multiplication operator *; you cannot just juxtapose two variables to denote
multiplication. If you don't like the % sign notation, you can define a variable to equal %i and
use that instead. For example,
-->j = %i
j =
i
-->z = 1+2*j
z =
1. + 2.i
In Matlab the imaginary unit is the predefined variable i (also j). Again this is dangerous because
you can redefine this variable accidentally. In fact i and j are very commonly used as index
variables in for loops.
>> i
ans =
   0 + 1.0000i
>> i = 2
i =
2
To avoid this problem Matlab defines 1i as a protected variable equal to the imaginary unit
>> 2+3*1i
ans =
2.0000 + 3.0000i
which is a nice feature that Scilab does not have. Scilab/Matlab functions can operate on and
return complex numbers.
-->exp(%i*3)
ans =
- 0.9899925 + 0.1411200i
The D+01 notation represents the exponent of 10 in scientific notation. D stands for double
precision (more on that later). So a displayed value such as 3.1412D+01 indicates 3.1412×10^1. You can also
control the number of digits that Scilab outputs using the format command with an optional second argument, as in
-->format('v',5);
-->%pi
%pi =
3.14
-->format('v',10);
-->%pi
%pi =
3.1415927
-->sqrt(3)
ans =
1.7320508
-->diary(0)
that is, the file records everything in the interactive session after the diary command. The diary command also
works in Matlab. The only difference is that the file is closed with the command diary('off')
rather than diary(0).
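A short session might be recorded as follows (session.txt is just an illustrative file name):
-->diary('session.txt')
-->2+2
ans =
   4.
-->diary(0)
The file session.txt then contains a transcript of the commands and their output.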
8 Numerical limitations
If

x = (1 + 10^-20) - 1

then what is x? Obviously x = 10^-20. Let's verify this with Scilab
-->x = (1+1e-20)-1
x =
0.
Scilab says x = 0, which is wrong. Why? Is 10^-20 too small for Scilab to represent?
-->x = 1e-20
x =
1.000D-20
Clearly that's not the problem. Instead we are seeing a round-off error. Scilab, like your
calculator, is limited in the number of digits it can use to represent a number.
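In fact Scilab provides the predefined constant %eps (eps in Matlab): the spacing between 1 and the next larger representable number, about 2.2×10^-16. Adding anything much smaller than this to 1 has no effect:
-->%eps
%eps =
   2.220D-16
-->(1 + 1e-20) == 1
ans =
 T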
Suppose we represent numbers in scientific notation as
x = d1.d2d3d4 × 10^e

where the di are decimal digits and e is an integer exponent. With only four significant digits a
quantity like 1 + 10^-20 rounds to exactly 1, and subtracting 1 then gives 0. Scilab keeps about
16 significant digits, but the same effect occurs for terms below about 10^-16.
Consider the derivative of f(x) = e^x. We might estimate this numerically as

df/dx ≈ [f(x+δ) - f(x)] / δ

for δ very small. Ideally the approximation should get better and better as δ decreases. At x = 0
the exact value is

df/dx = e^0 = 1.
-->delta = 1e-3;
-->(exp(delta)-exp(0))/delta
ans =
1.000500166708385
15/16
-->delta = 1e-6;
-->(exp(delta)-exp(0))/delta
ans =
1.000000499962184
-->delta = 1e-9;
-->(exp(delta)-exp(0))/delta
ans =
1.000000082740371
-->delta = 1e-12;
-->(exp(delta)-exp(0))/delta
ans =
1.000088900582341
-->delta = 1e-15;
-->(exp(delta)-exp(0))/delta
ans =
1.110223024625157
Notice that as we decrease δ, initially the approximation to the derivative improves. For δ = 10^-9
we get

df/dx ≈ 1.000000082740371

which is accurate to about 8 digits. But further reduction of δ actually leads to worse accuracy.
For δ = 10^-15 we have

df/dx ≈ 1.110223024625157

which is not even accurate to 2 digits! In fact going to δ = 10^-20 results in
-->delta = 1e-20;
-->(exp(delta)-exp(0))/delta
ans =
0.
which is completely wrong! The lesson is that we need to consider numerical limitations very
carefully when we develop and implement numerical algorithms.
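As an aside, one standard remedy is the central difference [f(x+δ) - f(x-δ)]/(2δ), whose truncation error falls off faster with δ, so good accuracy is reached at a relatively large δ where round-off is not yet a problem:
-->delta = 1e-5;
-->(exp(delta)-exp(-delta))/(2*delta)
This agrees with the exact value 1 to about 10 digits, better than the forward difference achieved at any δ.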
9 References
Numerical methods books used as fundamental references throughout these notes
1. Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T., Numerical Recipes in
C, Cambridge, 1988, ISBN: 0-521-35465-X.
2. Recktenwald, G., Numerical Methods with Matlab: Implementation and Application,
Prentice Hall, 2000, ISBN: 0-201-30860-6.
3. Heath, M. T., Scientific Computing: An Introductory Survey, McGraw Hill, 2002, ISBN:
0-07-239910-4.
4. Urroz, G. E., Numerical and Statistical Methods with SCILAB for Science and
Engineering Vol. 1, BookSurge Publishing, 2001, ISBN-13: 978-1588983046.
5. http://www.mathworks.com/moler/
Software sites
1. http://www.scilab.org, official Scilab website
2. http://www.mathworks.com, official MatLab website
3. https://www.gnu.org/software/octave/, official GNU Octave website
4. https://www.python.org/, official Python website
5. http://www.gnu.org/software/gsl/, GNU Scientific Library
Wikipedia articles
1. http://en.wikipedia.org/wiki/Scilab
2. http://en.wikipedia.org/wiki/Matlab
3. http://en.wikipedia.org/wiki/GNU_Octave
4. http://en.wikipedia.org/wiki/Python_(programming_language)
Lecture 2
Arrays
1 Introduction
As the name Matlab is a contraction of matrix laboratory, you would be correct in assuming
that Scilab/Matlab have a particular emphasis on matrices, or more generally, arrays. Indeed, the
manner in which Scilab/Matlab handles arrays is one of its great strengths. Matrices, vectors and
the operations of linear algebra are of tremendous importance in engineering, so this material is
quite foundational to much of what we will do in this course. In this lesson we are going to focus
on generating and manipulating arrays. In later lessons we will use them extensively to solve
problems. Consider the following session.
-->A = [1,2;3,4]
A =
   1.    2.
   3.    4.
-->x = [1;2]
x =
1.
2.
-->y=A*x
y =
5.
11.
Here we defined a 2-by-2 matrix A and a 2-by-1 vector x. We then computed the matrix-vector
product y=A*x which is a 2-by-1 vector.
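Writing out the arithmetic confirms the result: y(1) = 1*1 + 2*2 = 5 and y(2) = 3*1 + 4*2 = 11, each row of A multiplying the column vector x.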
There are various ways to create an array. In general the elements of an array are entered between
the symbols [...]. A space or a comma between numbers moves you to the next column while
a carriage return ("enter" key on the keyboard) or a semi-colon moves you to the next row. For
example
-->B = [ 1 2
-->
3 4]
B =
   1.    2.
   3.    4.
creates a 2-by-2 array using spaces and a carriage return. The following example shows how
either commas or spaces can be used to separate columns.
-->u = [1,2,3]
u =
   1.    2.    3.
-->v = [1 2 3]
v =
   1.    2.    3.
It's a matter of preference, but I find that using commas increases readability, especially when
entering more complicated expressions. The comma clearly delimits the column entries. Now we
demonstrate how either semi-colons, carriage returns or both can be used to separate rows.
-->x = [1;2;3]
x =
1.
2.
3.
-->y = [1
-->
2
-->
3]
y =
1.
2.
3.
-->z = [1;
-->
2;
-->
3]
z =
1.
2.
3.
2 Initializing an array
As shown above, arrays can be entered directly at the command line (or within a program). There
are some special arrays that are used frequently and can be created using built-in functions. An
array of all zeros can be created using the zeros(m,n)command
-->A = zeros(2,3)
A =
   0.    0.    0.
   0.    0.    0.
An array with 1's on the diagonal and 0's elsewhere is created using the eye(m,n) command
(eye as in identity matrix).
-->C = eye(3,3)
C =
   1.    0.    0.
   0.    1.    0.
   0.    0.    1.
Here this creates a 3-by-3 identity matrix. However, the matrix need not be square, as in
-->D = eye(2,3)
D =
   1.    0.    0.
   0.    1.    0.
To create a square matrix of all 0's except for specified values along the diagonal we use the
diag([...]) command
-->D = diag([1,2,3])
D =
   1.    0.    0.
   0.    2.    0.
   0.    0.    3.
The diag command also allows you to extract the diagonal elements of a matrix
-->A = [1,2;3,4]
A =
   1.    2.
   3.    4.
-->diag(A)
ans =
1.
4.
The size() command returns the number of rows and columns of a matrix.
-->A = [1,2,3;4,5,6]
A =
   1.    2.    3.
   4.    5.    6.
-->size(A)
ans =
   2.    3.
If you only want the number of rows or the number of columns you can specify that with a
second argument, as in size(A,1) or size(A,2).
Vectors are one-dimensional arrays. The size() command works on vectors, but the
length() command can be more convenient in that it returns a single number which is the
number of elements in the vector. Consider the following.
-->v = [1;2;3]
v =
1.
2.
3.
-->size(v)
ans =
   3.    1.
-->length(v)
ans =
3.
An array operator that can be useful in entering array values is the transpose operator. In
Scilab/Matlab this is the single quote sign. Consider the following
-->v = [1 2 3]'
v =
1.
2.
3.
Sometimes it is convenient to produce arrays with random values. The rand(m,n) command
does this.
-->B = rand(2,3)
B =
   0.2113249    0.7560439    0.6653811
   0.0002211    0.3303271    0.6283918
The random numbers are uniformly distributed between 0 and 1. You can use commands like
zeros, ones and rand to create an array with the same dimensions as an existing array. For
example, in Scilab
-->A = eye(3,2)
A =
   1.    0.
   0.    1.
   0.    0.
-->ones(A)    //Scilab specific
ans =
   1.    1.
   1.    1.
   1.    1.
The ones(A) command uses the dimensions of A to form the array of ones. Matlab is slightly
different. In Matlab the corresponding command would be
>>A = eye(3,2)
A =
   1.    0.
   0.    1.
   0.    0.
>>ones(size(A))   %Matlab specific
ans =
   1.    1.
   1.    1.
   1.    1.
In Matlab you need to use the size() function to pass the dimensions of the array A. In Scilab
you do not.
In many applications you want to create a vector of equally spaced values. For example
-->t = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
t =
   0.    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8
Instead of entering all the values directly you can use the following short cut
-->t = 0:0.1:0.8
t =
   0.    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8
This "colon" notation tells Scilab/Matlab to create a vector named t, starting at 0 and
incrementing by 0.1 up to the value 0.8. This approach allows you to define the increment (0.1 in
this case). The default increment is 1, as shown here
-->x = 1:5
x =
   1.    2.    3.    4.    5.
Sometimes you are more concerned about the total number of elements in the vector rather than a
specific increment. The linspace() command allows you to specify the start and end values
and the total number of elements. Consider the following.
-->x = linspace(0,1,6)
x =
   0.    0.2    0.4    0.6    0.8    1.
This creates a vector with values ranging from 0 to 1 and a total of 6 elements. It is sometimes
convenient to be able to "reshape" an array. In Scilab the matrix() command allows you to do
this.
-->A = [1,2,3;4,5,6]
A =
   1.    2.    3.
   4.    5.    6.
-->matrix(A,3,2) //Scilab specific
ans =
   1.    5.
   4.    3.
   2.    6.
Finally, the elements of an array are not limited to numerical values but can consist of defined
constants, variables or functions. For example,
-->x = 0.2
x =
0.2
-->C = [cos(x),sin(x);-sin(x),cos(x)]
C =
   0.9800666    0.1986693
 - 0.1986693    0.9800666
3 Combining arrays
Two or more existing arrays can be combined into a single array by using the existing arrays as
elements in the new array. Consider the following.
-->v = [1,2,3]
v =
   1.    2.    3.
-->A = [v;v]
A =
   1.    2.    3.
   1.    2.    3.
-->B = [v,v]
B =
   1.    2.    3.    1.    2.    3.
The arrays being combined must fit together by having compatible dimensions, otherwise you
receive an error.
-->u = [4,5]
u =
   4.    5.
-->C = [u;v]
!--error 6
Inconsistent row/column dimensions.
However consider
-->[u,0;v]
ans =
   4.    5.    0.
   1.    2.    3.
-->D = [u,v]
D =
   4.    5.    1.    2.    3.
-->A = [1,2;3,4]; B = [5,6;7,8];
-->[A;B]
ans =
   1.    2.
   3.    4.
   5.    6.
   7.    8.
These kinds of operations are very useful in many applications where large matrices are
constructed by stacking sub-matrices together. The submatrices might represent specific pieces of
a system, and the stacking operation corresponds to assembling the system.
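As a small illustration (the particular matrices here are arbitrary), a 4-by-4 matrix can be assembled from 2-by-2 blocks:
-->I = eye(2,2); Z = zeros(2,2);
-->M = [I,Z;Z,I]
M =
   1.    0.    0.    0.
   0.    1.    0.    0.
   0.    0.    1.    0.
   0.    0.    0.    1.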
4 Addressing array elements
The element in row i, column j of an array A is addressed as A(i,j). For example, starting from
A = [1,2,3;4,5,6;7,8,9] we can assign a new value to the element in row 3, column 2
-->A(3,2) = 10
A =
   1.    2.    3.
   4.    5.    6.
   7.    10.   9.
We can both access and assign the value of A(3,2) directly. Often you want to extract a row or
column of a matrix. The "colon" notation allows you to do this. For example,
-->A(:,1)
ans =
1.
4.
7.
-->v = A(2,:)
v =
   4.    5.    6.
In the first case we extract the 1st column of A. In the second case we extract the 2nd row of A
and assign it to the variable v. In general the colon tells Scilab/Matlab to run through all
values of the corresponding subscript or index. More generally you can extract a subrange of
values. Consider
-->A(1:2,2:3)
ans =
   2.    3.
   5.    6.
This extracts rows 1 through 2 and columns 2 through 3 to create a new 2-by-2 matrix from the
elements of the original 3-by-3 matrix. Elements of a vector can be deleted in the following
manner.
-->t = 0:0.2:1
t =
   0.    0.2    0.4    0.6    0.8    1.
-->t(2) = []
t =
   0.    0.4    0.6    0.8    1.
-->t(3:5) = []
t =
   0.    0.4
In the first case we delete the second element of t. In the second case we delete elements 3
through 5. This technique applies to deleting rows or columns of a matrix.
-->A
A =
   1.    2.    3.
   4.    5.    6.
   7.    10.   9.
-->A(1,:) = []
A =
   4.    5.    6.
   7.    10.   9.
-->A(:,2) = []
A =
   4.    6.
   7.    9.
5 Array arithmetic
An array can be multiplied or divided by a scalar; the operation is applied to each element. For
example
-->[1,2,3]/2
ans =
   0.5    1.    1.5
An operation such as A+1 where A is a matrix makes no sense algebraically, but Scilab/Matlab
interprets this as a shorthand way of saying you want to add 1 to each element of A
-->[1,2;3,4]+1
ans =
   2.    3.
   4.    5.
Addition, subtraction, and multiplication of two arrays follow the rules of linear algebra. To add or
subtract arrays they must be of the same size. If they are not you get an error.
-->A = [1,2;3,4]
A =
   1.    2.
   3.    4.
-->B = [1,2,3;4,5,6]
B =
   1.    2.    3.
   4.    5.    6.
-->A+B
!--error 8
Inconsistent addition.
Otherwise each element of the resulting array is the sum or difference of the corresponding
elements in the two arrays.
-->C = [2,1;4,3]
C =
   2.    1.
   4.    3.
-->A+C
ans =
   3.    3.
   7.    7.
-->A-C
ans =
 - 1.    1.
 - 1.    1.
To multiply two arrays as in A*B the number of columns of A must equal the number of rows of
B. If A is m-by-n then B must be n-by-p. The product A*B will be m-by-p.
-->A
A =
   1.    2.
   3.    4.
-->B
B =
   1.    2.    3.
   4.    5.    6.
-->A*B
ans =
   9.     12.    15.
   19.    26.    33.
-->B*A
!--error 10
Inconsistent multiplication.
The inner product, or dot product of two vectors can be calculated by transposing one and
performing an array multiplication.
-->u = [1;2;3]
u =
1.
2.
3.
-->v = [4;5;6]
v =
4.
5.
6.
-->u'*v
ans =
32.
Arrays cannot be divided, but we do have the concept of "inverting" a matrix to solve a linear
equation. If A x = b and A is a square, non-singular matrix, then x = A^-1 b is the solution to
this system of linear equations. This is sort of like "dividing A into b." In Scilab/Matlab we use
the notation A\b to represent this. You can also think of the \ operator as representing the inverse
operation, so that A\ functions as A^-1. Consider the following.
-->A = [1,2;3,4]
A =
   1.    2.
   3.    4.
-->b = [5;6]
b =
5.
6.
-->x = A\b
x =
- 4.
4.5
-->A*x
ans =
5.
6.
If the matrix A is singular (has no inverse) you'll get an error. We'll talk more about solving
systems of linear equations later.
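As a sanity check, the inv() function gives the same solution via the explicit inverse, though the backslash operator is preferred in practice because it is faster and more accurate:
-->inv(A)*b
ans =
 - 4.
   4.5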
6 Vectorized functions
A very powerful feature of Scilab/Matlab is that functions can be vectorized. In a language
such as C, if I have an array x and I want to calculate the sin of each element of x, I need to use
a for loop, as in
for (i = 0; i < n; i++) {
    y[i] = sin(x[i]);
}
This creates a new array y of the same size as x. Each element of y is the sin of the
corresponding element of x. For example
-->x = [0,0.1,0.2,0.3]
x =
   0.    0.1    0.2    0.3
-->y = sin(x)
y =
   0.    0.0998334    0.1986693    0.2955202
7 Array operators
In linear algebra, arrays (e.g., vectors and matrices) are considered entities in their own right and
there are rules for operating on them, such as matrix multiplication and the inverse. In practice,
sometimes an array is just a convenient collection of numbers. In this case you might want to
perform operations on the elements of the array independent of the rules of linear algebra. For
example, suppose u=[1 2 3] and v=[4 5 6] and you want to multiply each element of u by
the corresponding element of v to get w = [1*4, 2*5, 3*6] = [4 10 18]. This is not a standard operation
in linear algebra. To perform component-by-component operations (or "array operations") you
prefix the operator with a period. For example,
-->u = [1,2,3]
u =
   1.    2.    3.
-->v = [4,5,6]
v =
   4.    5.    6.
-->u.*v
ans =
   4.    10.    18.
-->u.^2
ans =
   1.    4.    9.
-->A = [1,2;3,4];
-->A^2
ans =
   7.     10.
   15.    22.
-->A.^2
ans =
   1.    4.
   9.    16.
Notice the very different results. The operation A^2 tells Scilab/Matlab to use the rules of matrix
multiplication to calculate A*A. The operation A.^2 tells Scilab/Matlab to square each element
of A.
When an array represents a collection of values, say measurements, we often want to look at
properties such as the sum or average of the values. Consider the following.
-->x = 0:0.1:1
x =
   0.    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.
-->sum(x)
ans =
5.5
-->mean(x)
ans =
0.5
The function sum() adds all the elements of an array while mean() calculates the average
value of the elements. If we wanted to find the mean-square value (average of the squared values)
we could use the command
-->mean(x.^2)
ans =
0.35
This first squares each element of x and computes the mean of those squared values. Here's an
important point. Consider the following
-->x = [1,2];
-->1./x
ans =
   0.2    0.4
In this instance Scilab interpreted the dot as a decimal point and returned
-->(1.)/x
ans =
   0.2    0.4
which is something called the pseudoinverse and not what we were after. We need to use
parentheses to avoid this interpretation
-->(1)./x
ans =
   1.    0.5
8 Complex arrays
Array elements can be complex. Consider
-->A = [1,%i;2,-%i]
A =
   1.    i
   2.  - i
-->A^2
ans =
   1. + 2.i    1. + i
   2. - 2.i  - 1. + 2.i
The output can be a bit difficult to read with complex arrays. A^2 produces a 2-by-2 matrix of
complex values each expressed as real and imaginary parts. It can be helpful to view these
separately. The real and imag commands allow you to do this.
-->B = A^2
B =
   1. + 2.i    1. + i
   2. - 2.i  - 1. + 2.i
-->real(B)
ans =
   1.    1.
   2.  - 1.
-->imag(B)
ans =
   2.    1.
 - 2.    2.
One subtle point is that the single-quote operator is actually the "transpose-conjugate" operator. It
takes the transpose of a matrix followed by its complex conjugate. For a real matrix this is simply
the transpose. But for a complex matrix you need to remember that there is also a conjugation
operation.
-->A
A =
   1.    i
   2.  - i
-->A'
ans =
   1.    2.
 - i     i
If you just want the transpose of a complex matrix use the .' operator.
-->A.'
ans =
   1.    2.
   i   - i
9 Structures
Arrays are convenient ways to organize data of a common type. Often you want to organize
different types of data as a single entity. Many programing languages allow you to do this using
structures. In Scilab/Matlab a structure is defined as follows.
St = struct('field1',value1,'field2',value2,...);
Here's an example
-->Object = struct('name','ball joint','mass',2.7)
Object =
name: "ball joint"
mass: 2.7
This creates a structure named Object having two fields: name and mass. Fields are accessed
using the syntax struct.field. Here we change the mass
-->Object.mass = 3.2
Object =
name: "ball joint"
mass: 3.2
Fields can be strings, numbers or even arrays. For example, say we have a rigid body. The
orientation of this body in space is specified by six numbers: three coordinates of its center of
mass, and three rotation angles. We might define a structure such as
-->Body = struct('pos',[-2,0,3],'ang',[12,-20,28])
Body =
pos: [-2,0,3]
ang: [12,-20,28]
The use of structures can greatly streamline complicated programing projects by uniting various
data into a single logical entity. Arrays of structures are possible, as in
-->Person = struct('name','','age',0);
-->Member = [Person,Person,Person];
-->Member(2).name = 'Tom';
-->Member(2).age = 32;
-->Member(2)
ans =
name: "Tom"
age: 32
This defines a structure Person and then an array of three of these structures named Member.
Each element could refer to a member of a three-member team. Fields are accessed as shown.
You can even have structures serve as fields of other structures. Here's an example.
-->Book = struct('title','','author','');
-->Class = struct('name','','text',Book);
-->Class.name = "Engl 101";
-->Class.text.title = "How to Write";
-->Class.text.author = "Kay Smith";
-->Class.text
ans =
title: "How to Write"
author: "Kay Smith"
You can skip the initializing of the structure and just start assigning values to the fields. For
example.
-->clear
-->Class.name = 'Intro Psych';
-->Class.enrollment = 24;
15/15
10 Multi-dimensional arrays
Arrays can have more than two dimensions. For example, zeros(2,1,2) creates a
three-dimensional 2-by-1-by-2 array of zeros.
-->A = zeros(2,1,2)
A =
(:,:,1)
   0.
   0.
(:,:,2)
   0.
   0.
Elements are accessed in the usual manner
-->A(2,1,2) = 3
A =
(:,:,1)
   0.
   0.
(:,:,2)
   0.
   3.
Lecture 3
Programming structures
1 Introduction
In this lecture we introduce the basic programming tools that you will use to build programs. In
particular we will be introduced to the if statement and the for and while loops.
2 Program files
2.1 Directories
The pwd command (print working directory) tells you what directory you are currently working
in.
-->pwd
ans =
C:\Users\Hudson
Use the cd (change directory) command followed by a valid directory name to move to another
directory.
-->cd Desktop
ans =
C:\Users\Hudson\Desktop
In Scilab the cd command with no argument moves you to your home directory.
-->cd
ans =
C:\Users\Hudson
Entering '..' for the directory name moves you up one directory level.
-->cd '..'
ans =
C:\Users
-->pwd
ans =
C:\Users
The dir and ls commands list the contents of the current directory. If you want to list specific
files you can enter a name. You can use the '*' character as a wild card. For example
-->ls *menu
ans =
Start Menu
lists all file names that end in 'menu' (note this is not case sensitive).
Scilab program files have the extension *.sce. In Windows you can right-click on the desktop, or
in a directory, and use the New menu to create an empty *.sce file named something like
New scilab-5.5.0 (64-bit) Application (.sce)
In either case double-click on a Scilab file icon to open it in SciNotes. From there you can edit
and execute it.
2.3 Comments
A program is just a file containing one or more Scilab commands. In principle you could have
entered these one at a time on the command line, but when structured as a program they are much
more convenient to develop, save, modify and rerun. This brings up a key point about
programming. Generally the goal when writing a program is to create a resource that can be
reused and modified in the future, either by yourself or by others. When you are writing a
program you (hopefully) are clear about the purpose of the various commands and how they fit
together to implement an algorithm that solves the problem of interest. However, if you come
back in a month and look at the list of commands in your program it is quite common to have
forgotten all this. And, obviously, if someone else opens your program they will probably have
little to no idea of what it's supposed to do. For this reason, programming languages allow programs to
include comments designed to explain in words what the program commands are doing. A well-written program will contain comments that clearly explain what the program as a whole and its
various components do. This allows you or another programmer to easily understand how to use
and/or modify the program.
Scilab follows the C++ syntax for comments. Any text from the double forward-slash symbol (//)
until the end of a line is treated as a comment. (Matlab instead uses the percent sign %.) For example,
the rather wordy code
//tryit.sce, Scott Hudson, 2011-06-03
//This program is an example, blah blah blah
x = 2; //this is an end-of-line comment
disp(sqrt(x)); //here's another
The comment text is there only for the benefit of the human trying to understand the program.
3 Flow control
Flow control refers to program statements that determine which parts of the program get
executed and under what conditions based on logical (true/false) conditions. The main flow
control statements we will use are: if, select, for, and while.
Logical conditions are formed using the relational and logical operators:
<    less than
<=   less than or equal to
>    greater than
>=   greater than or equal to
==   equal to
~=   not equal to
&    logical and
|    logical or
~    logical not
The statement 1<2 asks Scilab/Matlab to answer the question "Is 1 less than 2?" The answer is
yes, so Scilab returns T for true.
-->1<2
ans =
T
We use the equal sign = to assign values to variables. To test the equality of two expressions we
use the double-equal symbol ==. Here we have Scilab tell us that 1 is not equal to 2 (F stands for
false).
-->1==2
ans =
F
We can combine multiple T/F statements using the and/or/not logical operators.
-->(1<2)&(3>=4)
ans =
F
This statement is false because the and operator & requires both expressions to be true. (1<2)
is true but (3>=4) is false. Therefore the entire statement is false. On the other hand
-->(1<2)|(3>=4)
ans =
T
An important point is that an interval test, such as a ≤ x ≤ b, has to be performed as the and of
two inequalities, such as
((a<=x)&(x<=b))
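For instance, a sketch of testing whether x lies in the interval [0,1] (the values are illustrative):

```scilab
x = 0.5;
if ((0<=x)&(x<=1))      // the interval test 0 <= x <= 1
    disp('x is in [0,1]');
end
```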
3.2 if statement
The if statement is one of the most commonly encountered in programming. It allows you to do
something in response to a logical condition. In Scilab/Matlab its syntax is
if expression
statements
end
Since Matlab does not use the then syntax we will avoid it in this class. This will make our
Scilab code more compatible with Matlab.
The way an if statement works is that Scilab/Matlab evaluates expression. If true, all
statements before the end keyword are executed. If false the statements are skipped and
execution continues at the statement immediately following the end keyword.
if (1<2)
disp('that is true');
end
This code will write that is true to the command environment since 1 is less than 2. On the
other hand
if (1>2)
disp('that is true');
end
will not write anything since the expression (1>2) is false. Any number and kind of statements
can appear between the if and end lines.
if (x>1)
y = x;
x = x^2;
end
The general form with multiple branches is
if expression1
    statements1
elseif expression2
    statements2
else
    statements3
end
Here expression1 is evaluated. If it's true then statements1 are executed and the
program continues after the end statement. If expression1 is false then expression2 is
evaluated. If it is true then statements2 are executed and the program continues after the
end statement. This can be repeated for as many expressions as you want. If the optional else
is present then if none of the expressions are true, the else statements will execute. It's
important to note that no more than one set of statements will be executed, even if more than one
of the expressions are true. Only the statements corresponding to the first true expression are
executed, or failing that, the else statements if present.
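A small sketch of this first-true-wins behavior (the values are illustrative):

```scilab
x = 5;
if (x>1)
    disp('x>1');      // this branch runs
elseif (x>2)
    disp('x>2');      // also true, but never reached
end
```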
As an example
if (x<1)
y = 0;
elseif (x>1)
y = 1;
else
y = x;
end
Here is an important error to watch out for. You should generally not use arrays in logical
expressions. For example, consider the following
x = [0,1];
disp(x<0.5);
Scilab goes through the array x and evaluates the expression (x<0.5) for each element, producing
an array of boolean T/F values. Now consider
x = [0,1];
if (x<0.5)
disp('x<0.5');
end
This produces no output, even though 0<0.5. Scilab can't both execute the disp statement and not
execute it. It will only execute the statement (and only once at that) if (x<0.5) is true for all
elements of x. If your intention is to run the if statement for each element of x independently, then
you need to add a for loop (see below), as in
x = [0,1];
for i=1:2
if (x(i)<0.5)
disp('x<0.5');
end
end
3.3 select statement
The select statement (switch in Matlab) chooses among several cases based on the value of an
expression:
select expression
case v1
    statements1
case v2
    statements2
else
    statements3
end
If expression has the value v1 then the statements listed under case v1 are executed. If it has
value v2 then the v2 statements are executed. If it has none of the specified values then the
optional else statements are executed. For example,
select x
case 1
msg = "x is 1";
case 2
msg = "x is 2";
else
msg = "x is something else";
end
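Running the select example above for a particular value (a sketch):

```scilab
x = 2;
select x
case 1
    msg = "x is 1";
case 2
    msg = "x is 2";
else
    msg = "x is something else";
end
disp(msg);      // displays "x is 2"
```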
3.4 for loop
The for loop repeats a block of statements, stepping an index variable through a list of values:
for variable = expression
    statements
end
As an example
x = zeros(1,5);
for i=1:5
x(i) = i;
end
The result is
x =
   1.   2.   3.   4.   5.
The for loop evaluates 1:5 as the list of numbers [1,2,3,4,5]. It initializes the index
variable i to the first value (i=1) then runs the statement x(i) = i causing x(1) to be
assigned the value 1. When the end statement is reached the for loop assigns to i the second
value in the list (2) and once again executes the statements causing x(2) to be assigned the
value 2. It continues until it exhausts the list.
The expression can be a sequence defined with colon notation, such as 1:5, or 2:0.1:3, or it
can be an array. An example of the latter case would be
y = [1,5,-3];
for x=y
x^3
end
The output is
ans =
   1.
ans =
   125.
ans =
 - 27.
3.5 while loop
The while loop repeats a block of statements as long as a condition holds:
while expression
    statements
end
The loop evaluates expression. If it's false then execution skips to the statement following
end. If it's true then statements are executed and expression is evaluated again. If it's
still true then statements are once again executed. This continues until expression is
false. At that point execution continues with the first statement after the end keyword. As an
example, consider
x = 1;
while (x<10)
x = 2*x;
end
This produces
-->x
x =
16.
The loop executes as follows
1. (x=1)<10 so x=2*x=2
2. (x=2)<10 so x=2*x=4
3. (x=4)<10 so x=2*x=8
4. (x=8)<10 so x=2*x=16
5. (x=16) is not <10 so the loop terminates with x=16
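The same pattern can also count iterations; a sketch (the counter n is illustrative):

```scilab
x = 1; n = 0;
while (x<10)
    x = 2*x;
    n = n+1;    // count the doublings
end
// afterwards x is 16 and n is 4
```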
4 Functions
Functions allow us to break large programs into conceptual blocks that can be reused and
reorganized. One form of "top down" programming is to take a large problem and break it into
manageable parts. If those parts are themselves large they can be broken up into subparts and so
on. A programming solution to some problem might then have the structure
Big problem
    Part A
        Subpart A1
        Subpart A2
    Part B
        Subpart B1
        Subpart B2
    ...
In this approach, each of the subparts might be a separate function that is called by its "parent"
part. Those parent parts might also be functions which are called by the main program. These
functions are logically separate and often exist in separate files. We can even build up libraries
of useful functions which can be used repeatedly in different programs.
There are a few important differences between the ways Scilab and Matlab treat functions.
This defines f(x) to return the value x^2+1 by first calculating z=x^2 and then y=z+1.
Multiple statements on a single line can be separated by commas.
A function such as
function y = myabs(x)
    if (x<0)
        y = -x;
    else
        y = x;
    end
endfunction
implements the absolute value operation. This function is now defined and can be called from the
command line
-->myabs(-3)
ans =
3.
A function can return more than one value, as in
-->[a,b] = moms(x)
b =
   11.
a =
   3.
In Scilab multiple functions can reside in a single program file. The filename does not have to be
related to any of the function names. In Matlab, by contrast, each function is normally stored in its
own file named after the function. In other words if you write a function y=myfunc(x) it must be
saved in a file named myfunc.m (an m-file). You can then call myfunc(x) from the command line
or in other functions. The Scilab approach is closer to languages such as C and Fortran.
If n=1 then n!=1 is the returned value. For n>1 we use the fact that n! = n(n-1)!. Suppose
we call myfact(3). The function wants to return m = 3*myfact(2). But myfact(2)
needs to be evaluated first. Scilab opens a new instance of the function and passes the argument
2. This second function call wants to return m = 2*myfact(1). But myfact(1) needs to
be evaluated. So Scilab opens a third instance of the function and passes the argument 1. In this
case the condition (n==1) is true and the value m = 1 is returned to the second function call.
The second function call now has the value of m = 2*1 and returns this to the first function
call. Finally the first function call can now evaluate m = 3*2 and return this to the user.
Obviously a recursive function will work only if eventually it stops calling itself and returns a
specific value.
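Based on the description above, the recursive factorial might be sketched as follows (the original listing is not reproduced in these notes, so this is a reconstruction):

```scilab
function m = myfact(n)
    if (n==1)
        m = 1;              // base case: 1! = 1
    else
        m = n*myfact(n-1);  // n! = n*(n-1)!
    end
endfunction
```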
Function g(u) is defined inside function f(x). When we evaluate f(x) at a value of 2, that
function sets y = g(2)*sin(2) where g(2)=cos(2), resulting in a returned value of
cos(2)sin(2) = -0.3784012
When we try to call g(u) outside of f(x), however, we get an error. Since g(u) is defined
inside of f(x) it is not visible outside of f(x). We say that the scope of the function g(u) is
limited to the inside of f(x). Now consider the following
function v = g(u)
v = sin(u);
endfunction
function y = f(x)
function v = g(u)
v = cos(u);
endfunction
y = g(x)*sin(x);
endfunction
-->f(2)
ans =
- 0.3784012
-->g(2)
ans =
0.9092974
We have two functions named g(u). When g(u) is called inside f(x) it clearly returns the
value g(2)=cos(2) since we get the same output for f(2) as before. But now calling g(2)
outside of f(x) does not produce an error. Instead it returns
g(2) = sin(2) = 0.9092974
which is the definition of g(u) given outside of f(x). Generally speaking, variables and
functions are only visible inside the function in which they are defined, and inside all nested
functions. Thus variables in the main program are globally visible. If a variable or function is
defined a second time inside a function, that definition overrides the previous definition within
the scope of that function (and nested functions).
global a //case 4
a = 2;
function y = f(x)
global a
a = 4;
y = a*x;
endfunction
disp('f(2)='+string(f(2))+', a='+string(a));
f(2)=8, a=4
However we can call it with a single input and assign its output to a single variable
u = f(3);
disp('u='+string(u));
u=9
Or supply both input variables and assign its output to two variables
[u,v] = f(3,2);
disp('u='+string(u)+' and v='+v);
u=18 and v=big
The first line of the function assigns values to nargin, the number of input arguments, and
nargout, the number of output arguments. (This assignment is needed in Scilab; in Matlab
these variables are automatically assigned.) If nargin==1 we know that only an x value was
passed to the function. Otherwise we know that both x and z values are available. Likewise, if
nargout==1 we know that only y needs to be calculated. Otherwise we also assign a string
value to ySize.
Lecture 4
Input and output
1 Introduction
The deliverable of a computer program is its output. Output may be in graphical form as in a
two-dimensional function plot, or it may be in text form as in a table of data values. In addition
we often need to provide a program with input data, either interactively from the console or from
a disk file. We will cover graphics in future lectures. Here we look at various ways to input and
output data to and from the console and disk files.
In a program, on the other hand, the appearance of a variable name does not produce printed
output. Moreover, even with console I/O we may want more control over the output format. We
also need a way for a program to display values and ask for input, either from the user or from a
file. There are various ways to approach this. We start with the simplest.
Suppose x = 2.3 and y = 4.1. The disp function displays values:
-->disp([x,y])
   2.3   4.1
This works from within a program also. You can also combine it with the string() function
(num2str() in Matlab) and the string concatenate operation. This creates one long string as in
the following
-->disp(string(x)+' plus '+string(y)+' equals '+string(x+y));
2.3 plus 4.1 equals 6.4
Scott Hudson
2016-02-02
The input function also works with strings:
-->fname = input('output file : ')
output file : 'test.txt'
fname =
 test.txt
Here's a little snippet of code that prompts the user for an array of data and prints the average
value.
z = input('enter a 1-by-n array of numbers : ');
disp('the average value is '+string(mean(z)));
this produces
enter a 1-by-n array of numbers : [1,2,3,4,5,6]
the average value is 3.5
The save command saves all variable names and values, essentially your entire Scilab session, as in
-->save('test.dat')
You can exit Scilab and in a later session use the load command to recover these saved values.
-->load('test.dat')
-->A
A =
   1.   2.
   3.   4.
-->x
x =
1.
2.
3.
4.
5.
-->s
s =
hello there
This works the same in Matlab, except the file name should not have an extension (for example
'test') as Matlab appends the .mat extension automatically. You can explicitly specify the
variables you want to save/load as in
-->save('Ax.dat','A','x'); //Scilab
>> save('Ax','A','x'); %Matlab
This saves only the variables A and x. To load one or more specific variables you do the
following
-->load('Ax.dat','A'); //Scilab
>> load('Ax','A'); %Matlab
3 Formatted I/O
Scilab/Matlab implement versions of the C fprintf and sprintf functions for formatted output.
The Scilab form is
mfprintf(fd, format, var_1, ..., var_n)
where fd is a file descriptor and format is a string that specifies how you want the output formatted.
var_1 through var_n are the variables you want displayed. In Scilab the number 6 is the file
descriptor for the console. This is also stored in the protected variable %io(2).
An example is
-->fd = %io(2);
-->x = 1.23;
-->mfprintf(fd,'the value of x is %f\n',x);
the value of x is 1.230000
The \n symbol denotes a "new line." If you omit this then subsequent mfprintf commands
will be appended to the same line. Consider the following.
-->y = 3.21;
-->mfprintf(fd,'the value of x is %f ',x);
-->mfprintf(fd,'and y is %f\n',y);
produces
the value of x is 1.230000 and y is 3.210000
Finally, if you want to control the precise format of the numerical output, the syntax for a floating
point number is %m.nf where m is the total number of spaces (you need one for the decimal
point and you might need one for the sign) and n is the number of decimal places. For example
-->mfprintf(fd,'%4.2f\n',x);
1.23
-->mfprintf(fd,'%6.2f\n',x);
  1.23
-->mfprintf(fd,'%6.3f\n',x);
 1.230
For %d and %s formats you can use the syntax %md or %ms where m is the total number of
spaces to be displayed. Note that if you don't allocate enough, the full value will be printed
anyway. If you allocate "too much" then blank space will be added. In the following example we
generate a formatted table of trig values.
N = 4;
x = linspace(0,%pi/2,N);
y = sin(x);
z = cos(x);
mfprintf(fd,'\n'); //creates blank line at start
mfprintf(fd, '%6s %6s %6s\n','x','sin','cos');
for i=1:N
mfprintf(fd,'%6.3f %6.3f %6.3f\n',x(i),y(i),z(i));
end
Note the mfprintf(fd,'\n'); statement used to clear any previously "open" lines of
output. The output is
     x    sin    cos
 0.000  0.000  1.000
 0.524  0.500  0.866
 1.047  0.866  0.500
 1.571  1.000  0.000
The format %-3d causes the output to be left aligned as opposed to the default right alignment.
The format %03d causes the output to be right aligned but all remaining space to the left is filled
with zeros.
In keeping with the vectorized nature of Scilab/Matlab, the mfprintf (and fprintf)
function is also vectorized. For example
-->x = 1:3
x =
   1.   2.   3.
-->mfprintf(fd,'%d %d %d\n',x)
1 2 3
Scilab/Matlab recognizes that x is an array. It fills in the 3 %d formats with x(1), x(2) and
x(3). Now consider this
A = [1,2;3,4];
-->mfprintf(fd,'%f %f\n',A)
1.000000 2.000000
3.000000 4.000000
mfprintf repeats itself for each row of matrix A. In addition to mfprintf there is a Scilab
function mprintf that does not require the file descriptor argument and prints directly to the
console.
-->x = 2;
-->mprintf('%f %f %f\n',x,x^2,x^3)
2.000000 4.000000 8.000000
The advantage of using mfprintf with fd = %io(2) for console output is that it is very
simple to modify your code to output to a file. You merely need to assign the fd variable using
the mopen command described below.
The msprintf function (sprintf in Matlab) applies the same formatting but returns the result as a
string instead of printing it, creating a string with the values embedded. An example where this is
very useful is in creating frames for an animation where k goes from 1 to N and each file output is
a single frame.
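A sketch of generating per-frame file names this way (the name pattern frame%03d is illustrative):

```scilab
N = 3;
for k=1:N
    fname = msprintf('frame%03d.dat', k);  // e.g. frame001.dat
    disp(fname);
end
```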
To write to a file we first open it, as in
[fd,err] = mopen('test.txt','wt');
This opens the file test.txt for output in the current directory. If it doesn't exist it is created.
If it does exist it is overwritten. The 'wt' notation indicates that we are opening this file for
writing in text format. You can also write in binary format, but we won't cover that. 'at'
designates a text file opened for appending. To open a text file for reading we use 'rt'.
[fd,err] = mopen('test.txt','rt');
The resulting file descriptor fd can be used to refer to the file and err is an error flag. It is zero
or empty if the open process worked properly and non-zero otherwise. You will get an error if
you try to open a file for reading that doesn't exist, or a file for writing in a directory where you
don't have write permission. Good programming practice is to always include error checking. For
example
[fd,err] = mopen('test.txt','rt');
if (err)
error('cannot open test.txt');
end
If there is an error opening 'test.txt' you will get a message and execution will stop via the
error() function. It is very important to always close a file when you have finished reading
or writing. This is done with the mclose(fd) command (fclose(fd) in Matlab).
Here's an example
A = [1,2;3,4];
[fd,err] = mopen('A.txt','wt');
if (err)
error('cannot open file');
end
mfprintf(fd,'%f %f\n',A);
mclose(fd);
A common source of errors when trying to open a file for reading is that the file does not exist. A
way to test for this is using the isfile() command. In the following example a file named
'test.txt' exists in the current directory but a file named 'test2.txt' does not.
-->isfile('test.txt')
ans =
T
-->isfile('test2.txt')
ans =
F
Suppose the file data.txt contains the numbers 1 through 6. We can read these numbers into a
vector x as follows (note that for compactness we are not including error checking for the mopen
command).
[fd,err] = mopen('data.txt','rt');
[n,x] = mfscanf(6,fd,'%d');
mclose(fd);
disp(x');
   1.   2.   3.   4.   5.   6.
Variable n indicates the number of successful reads. It is -1 if the end of file was reached before
all desired data were read. The 6 tells mfscanf to read six times in the %d format. These are
assigned as the elements of x. Here is a variation
[fd,err] = mopen('data.txt','rt');
[n,x] = mfscanf(3,fd,'%d');
[n,y] = mfscanf(3,fd,'%d');
mclose(fd);
-->x'
ans =
   1.   2.   3.
-->y'
ans =
   4.   5.   6.
This reads 3 numbers and assigns them to x, then another 3 numbers are read and assigned to y.
In Matlab the ordering is slightly different (and fscanf returns the data first, then the count)
[fd,err] = fopen('data.txt','rt');
[x,n] = fscanf(fd,'%d',3);
[y,n] = fscanf(fd,'%d',3);
fclose(fd);
Here is another example in Scilab. This time the file data.txt looks like this
east 2 23.75
west 4 -94.5
south 1 8.2
north 3 -7.9
The commands
[fd,err] = mopen('data.txt','rt');
for i=1:4
    [n,s(i),d(i),x(i)] = mfscanf(fd,'%s %d %f');
end
mclose(fd);
fill the arrays s, d and x with the corresponding string, decimal and floating point entries
-->s'
ans =
!east  west  south  north  !
-->d'
ans =
   2.   4.   1.   3.
-->x'
ans =
   23.75   - 94.5   8.1999998   - 7.9000001
What do we do if we know the file contains some numbers but we don't know how many? One
way to read all the available numbers in the file is to read them one at a time followed by a check
for an end-of-file condition using the meof function (feof in Matlab). This returns a non-zero
value if the last input operation reached the end of the file. Here's an example of how we can use
this in a program.
[fd,err] = mopen('test.txt','rt');
i = 1; //use i for an array index
while (~meof(fd)) //while we haven't reached the end of the file
x(i) = mfscanf(fd,'%f'); //read the next number
i = i+1; //increment the array index
end
mclose(fd);
disp(x');
   1.   2.   3.   4.
We open the file and then as long as we have not reached the end-of-file condition we read a
single number into an element of an array x, increment the array index i and try again. This
reads in the values 1, 2, 3 and 4 then stops when the end of the file is reached. One thing to
notice is that Scilab ignores the white space in the file (spaces, tabs and line-feed characters)
and only looks for printable characters.
There are many other input/output functions. In the Scilab Help Browser see the sections titled
Files : Input/Output functions
Input/Output functions
5 Spreadsheet support
Scilab/Matlab can read and write data in spreadsheet format. We will only consider spreadsheets
with numeric data and using comma-delimited text format (csv files). Suppose the spreadsheet
ss.csv contains the 2-by-2 array
2,3
5,6
It is possible to specify a separator other than a comma and to read and write strings. See the
Spreadsheet section of the help menu for more information.
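A minimal sketch of reading and writing such a file with Scilab's csvRead and csvWrite (the file names are illustrative):

```scilab
M = csvRead('ss.csv');      // read the numeric csv into a matrix
csvWrite(2*M, 'ss2.csv');   // write a scaled copy to a new csv file
```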
Lecture 5
2D plots
1 Introduction
The built-in graphics capabilities of Scilab/Matlab are one of its strongest features. Compiled
languages such as C and Fortran are generally text-based. Generating graphics requires either
separate programs or libraries which are often operating-system specific. Scilab/Matlab, on the
other hand, provides a complete environment in which graphical routines are an integral
component and consistent across different operating systems.
There are many graphics routines. As always the Help command is a good way to see what is
available. We are going to focus on a few of the most useful. This is one area where there is a fair
amount of difference between Scilab and Matlab. We will try to emphasize those aspects in
common, but our primary focus will be on Scilab.
To put multiple graphs on the same figure you can either execute multiple plot commands, or you
can enter multiple x,y vector pairs in a single plot command as in
x = linspace(0,6,100);
y1 = sin(x);
y2 = cos(x);
plot(x,y1,x,y2);
Scilab/Matlab will automatically assign line attributes and/or colors as well as numerical axis
labels. If you want to choose these yourself you can add a text argument after each x,y pair. For
example
x = linspace(0,6,100);
y1 = sin(x);
y2 = cos(x);
y3 = 0.5-sin(x).^2;
y4 = 0.5-cos(x).^2;
plot(x,y1,'r-',x,y2,'g-',x,y3,'b-.',x,y4,'k:');
The letters denote colors and the symbols denote line types. Note that the single quote marks are
required. Here are the basic color options
r   red
g   green
b   blue (default for first plot)
k   black
c   cyan
m   magenta
y   yellow
w   white
Line attributes such as thickness can also be specified, as in
plot(x,y1,'r--','linewidth',3);
plot(x,y2,'b:','linewidth',2);
This produces a red, dashed plot of thickness 3 and a blue dotted plot of thickness 2.
In place of a line, you can plot your data as discrete points using various symbols. For example
plot(x,y,'r*')
plots red asterisks at each (x,y) pair. The available symbols are
+   plus sign
o   circle
*   asterisk
.   point
x   cross
s   square
d   diamond
^   up triangle
v   down triangle
>   right triangle
<   left triangle
p   pentagram (star)
A grid is often useful. To add a grid in Scilab use the xgrid command. The corresponding
command in Matlab is grid.
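For example, a sketch combining a plot with a grid:

```scilab
x = linspace(0,6,100);
plot(x,sin(x),'r-');
xgrid;                      // add a grid (use grid in Matlab)
```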
A plot command produces the graphic window shown in Fig. 1. We can move and resize this window with the
Fig. 1
mouse. We can also use the Edit menu to select Axes properties. This opens the Axes Editor
(Fig. 2). In this window there are tabs labeled X, Title, Style, Aspect, and Viewpoint.
Figures 2, 3 and 4 show the windows corresponding to the Style, X and Aspect tabs, respectively.
In the Style window (Fig. 2) we can select the font type, color and size used for the numerical
labels on the axes, in addition to other properties. These can best be learned by generating a plot
and playing around with the various settings.
Fig. 2
In the X window (Fig. 3) we can specify a text label for the x axis. This is entered into the Text
box. Note that it must be enclosed in double quotes. We can also choose the font type, color and
size of the label. The Location pull-down menu allows us to reposition the x axis. Selecting the
Grid color slider causes an x grid of the chosen color to appear. The Data bounds boxes initially
contain the minimum and maximum x values of the plotted data. Suppose we wanted our plot to
scale such that it displayed the x axis region −1 ≤ x ≤ 10 (for instance we might want to compare
it to another plot). We manually enter -1 and 10 in the Data bounds boxes to achieve this. We can
also select Linear (default) or Logarithmic axis scaling. (Log scaling can only be used if x > 0
for all data.)
The Aspect tab window is shown in Fig. 4. Checking the Isoview box causes the plot to be scaled
so that one unit has the same length along the y axis as it does along the x axis. This is very
useful if our (x,y) data represent distance measurements, say in cm. Then the resulting plot is
geometrically accurate. If the Isoview box is unchecked (default) the x and y data will be
separately scaled to fill up the graphic window.
Fig. 3
In our example the data have x values over the range 0 ≤ x ≤ 2π = 6.283. By default Scilab will
display x values over a range such as 0 ≤ x ≤ 7 so that the plot begins and ends on a major tic
mark. Clicking the Tight bounds box overrides this behavior. The Margins boxes specify the
space between the plotted axes and the edges of the figure. These values are fractions of the plot
width or height, and we can change them as desired.
In the Objects Browser portion of the Axes Editor window appears the hierarchy Figure, Axes,
Compound. Expanding the Compound object (click on the + sign) shows a Polyline object.
Selecting this brings up the Polyline Editor (Fig. 5). Here we can modify the properties of the
plotted line(s). If we uncheck the Visibility box the corresponding line will become invisible. We
can also modify the line type, width and color. Alternately (as shown in the figure) we can
uncheck the Line mode box and check the Mark mode to cause the data to appear as discrete
symbols. The marker type, color and size can be selected as desired. The result is the graphic
illustrated in Fig. 7.
Fig. 4
Fig. 5
The scf (set current figure) command selects or creates a graphic window. For example, scf(1)
and scf(2) generate separate blank graphic windows 1 and 2. Plot commands are directed to the
most recently set figure. Therefore
scf(1);
plot(x1,y1);
scf(2);
plot(x2,y2);
would plot (x1,y1) in Figure 1 and (x2,y2) in Figure 2. The scf command returns a
handle to a structure defining the figure properties. So
fg = scf(1);
assigns such a handle to the variable fg.
Fig. 6
A sequence of scf commands generates four graphic windows (1,2,3,4) arranged in a 2-by-2 array as shown in Fig. 7.
We can then select graphic window 2 and draw a plot in it with the following commands
scf(2);
x = linspace(0,2*%pi,50);
y = sin(x);
plot(x,y,'r-','linewidth',3);
xgrid;
We have seen that repeated plot commands add curves to a figure. If we want to start over we
need to clear the figure using the clf() command. This applies to the currently selected figure.
If we want to clear a specific figure, say graphic window 3, we can specify clf(3).
The command
ax = gca();
assigns to variable ax a handle to a structure specifying the axes properties of the currently
selected figure.
Properties can be modified by assigning the corresponding variables new values. For example
ax.isoview = "on";
causes the x and y axes to have the same scaling. To change the font style and font size of the
axes tic labels and add a black grid we might use
ax.font_size = 4;
ax.font_style = 3;
ax.grid = [1,1];
To add a label to the x axis with a desired font and font size we can execute the following
commands
ax.x_label.text = "this is the x label";
ax.x_label.font_size = 5;
ax.x_label.font_style = 3;
Similarly for the y (and z) labels and title. To find out more about, say, font_style settings
see
help graphics_fonts
6 Learning more
In this lecture we have focused on rectangular x,y plots. These are not the only types of plots that
we may want to generate. Polar coordinate r,θ plots are common. Other examples are plots of
vector fields (such as fluid velocity) and histograms. To learn more use the help window and
follow the links
Help => Graphics => 2d_plot //Scilab
Help => graph2d %Matlab
You'll find routines such as polarplot, histplot, and many others. Once you are comfortable with
rectangular plots you should find it easy to use the other plotting routines. The help sections on
most of these have examples that you can run and code you can examine or use as a template for
your own programs.
Lecture 6
3D plots and animation
1 Introduction
The majority of graphing tasks we face are typically two-dimensional functions of the form
y = f(x). However, not all functions have a single input and a single output. The motion of
a particle through space is described by vector position vs time
r(t) = [x(t), y(t), z(t)]
We could represent this by three 2D plots, but a more physical representation would be to trace
the particle trajectory in a single 3D plot. In this case the independent variable t does not form
one of the plot axes. Instead it is a parameter of the motion. The resulting graph is called a
parametric plot.
Some engineering problems deal with fields. A field is a physical property which can vary
throughout space. For example, the variation of ground elevation across a region of Earth's
surface can be expressed as
z = f(x,y)
Here the two coordinates x,y might correspond to longitude and latitude and z to ground
elevation, possibly obtained from surveying. Or, z might represent surface temperature or
atmospheric pressure. In those cases we might also be interested in variation through time as
well as through space.
Since a computer screen is two dimensional, plots in three (and higher) dimensions will
necessarily have to represent a single projection of the function. Different projections might
highlight certain aspects of the function and obscure others. This problem grows with the number
of dimensions and is why scientific visualization is an active field of research.
In this lecture we want to learn a few basic 3D plotting techniques. We will use the following code
Nx = 80;
Ny = 40;
x = linspace(-6,6,Nx);
y = linspace(-3,3,Ny);
z = zeros(Ny,Nx);
for i=1:Nx
for j=1:Ny
z(j,i) = cos(x(i))*sin(y(j));
end
end
to generate an array of z values which we will plot in various ways. Notice that the first index of
the z array corresponds to the y coordinate and the second index to the x coordinate. This relates
to the raster scan format traditionally used on computer monitors and the way arrays appear in
graphics cards. Both Scilab and Matlab use this convention.
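The same row-equals-y, column-equals-x convention can be checked outside Scilab. Here is a small Python sketch (an illustration of my own, not part of the lecture's code) that builds the same Ny-by-Nx array with z[j][i] = cos(x(i))·sin(y(j)):

```python
import math

Nx, Ny = 80, 40
# linspace equivalents: Nx points on [-6, 6] and Ny points on [-3, 3]
x = [-6 + 12*i/(Nx - 1) for i in range(Nx)]
y = [-3 + 6*j/(Ny - 1) for j in range(Ny)]

# z[j][i] holds the value at (x[i], y[j]): row index <-> y, column index <-> x,
# matching the raster-scan convention Scilab and Matlab use for surface data.
z = [[math.cos(x[i]) * math.sin(y[j]) for i in range(Nx)] for j in range(Ny)]
```

The first index runs over rows (the y direction), so `len(z)` is Ny and `len(z[0])` is Nx, just as in the Scilab loop above.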
The commands
mesh(z);
surf(z);
will plot the z values as the elevation field of a 3D surface. The mesh command shows this
surface as a wire mesh while the surf command shows it as a solid color-coded surface. The x
and y coordinates are the integer indices of the array. Alternatively we can explicitly provide x and
y values
surf(x,y,z);
The result for our data is shown in Fig. 3. Because this is a 2D projection of a 3D surface, some
parts of the surface may be obscured. To get different views use the Rotation tool from the Tools
menu. Click (right button in Scilab, left button in Matlab) and drag with your mouse to reorient
the surface. As in the 2D case, we can export a figure to a graphics file for inclusion in a
presentation or paper.
The figure's color map can be set with jetcolormap(n). The integer n determines the number of discrete colors. A larger number, such as 256, gives a
smooth variation of color throughout the figure. But you may want to have only eight discrete
colors, in which case use 8 as the argument.
Using the Axes Editor we can change axis labels, figure title and the numerical label font for the
x, y and also z axes. In the Aspect submenu deselecting Cube scaling will produce a more
geometrically accurate representation of the surface. By also selecting the Isoview option the
surface plot will correspond to a physical representation of the surface with equal scaling for the
x,y,z axes.
Another useful command is
colorbar(zmin,zmax);
which adds a color bar to the figure showing the relationship between color and z value. After
various formatting changes our figure appears as shown in Fig. 4.
The get current entity command gce() and the color_flag subtly change the way the
surface color is interpolated. You can experiment with flag values of 0 through 4. If you use these
commands they must come immediately after the surf plotting command.
A contour plot is produced by the command
contour(x,y,z,n);
Here n is the number of (uniformly spaced) contour levels you want drawn on the figure.
Replacing contour by contourf creates a filled contour plot. One irritation in Scilab is that
if we follow the raster-scan format we used with our initial data-generating code, we have to
replace the z argument with its transpose z.' (this is not the case in Matlab). As with all
graphics, we can adjust the formatting to our liking to get something such as shown in Fig. 4.
For the plain contour command Scilab adds numerical labels to the contours by default. I find
these to be too messy to be of much use and prefer a color bar as shown in the figure. To turn off
labeling use the xset('fpf',' ') command before the plotting, as shown below.
fg = scf(0);
clf();
fg.figure_size = [800,400];
fg.color_map = jetcolormap(11);
xset('fpf',' ');
contourf(x,y,z.',9);
ax = gca();
ax.isoview = "on";
ax.auto_ticks = ["on","on","on"];
ax.font_style = 3;
ax.font_size = 4;
ax.x_label.text = "longitude";
ax.x_label.font_size = 4;
ax.y_label.text = "latitude";
ax.y_label.font_size = 4;
ax.title.text = "ground elevation";
ax.title.font_size = 6;
colorbar(min(z),max(z));
Fig. 4: Output of the contour and contourf commands with added formatting
4 Parametric plots
Trajectory plots of the form [x(t), y(t), z(t)] can be generated by the commands
param3d(x,y,z); //Scilab
plot3(x,y,z); %Matlab
As an example, below is Scilab code to generate a spiral trajectory starting at the origin and
extending up along the z direction as time increases. After some interactive orientation and
formatting we end up with the graph of Fig. 5.
Nt = 200;
t = linspace(0,10,Nt);
x = zeros(Nt,1);
y = zeros(Nt,1);
z = zeros(Nt,1);
for i=1:Nt
r = t(i)/10;
x(i) = r*cos(2*%pi*t(i));
y(i) = r*sin(2*%pi*t(i));
z(i) = r;
end
param3d(x,y,z);
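The same trajectory data can be generated in a few lines of Python (an illustrative port of my own; only the array generation is shown, not the plotting):

```python
import math

Nt = 200
t = [10 * k / (Nt - 1) for k in range(Nt)]   # t from 0 to 10

# Spiral: the radius grows linearly with t while the point circles the
# z axis once per unit time; the height z equals the radius.
x = [(ti/10) * math.cos(2*math.pi*ti) for ti in t]
y = [(ti/10) * math.sin(2*math.pi*ti) for ti in t]
z = [ti/10 for ti in t]
```

The trajectory starts at the origin and ends at height z = 1, exactly as in the Scilab loop above.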
5 Animation (Scilab)
Animation is simply the process of generating a series of graphic figures, one for each frame of
the animation. The frames can be displayed in real time on the screen or saved as graphics files
which can later be assembled into a video file. There are a few subtle points which arise when
generating animations. Let's illustrate with an example. We start with the following code
x = linspace(0,2*%pi,100);
y = sin(x);
nFrames = 200;
t = linspace(0,4*%pi,nFrames);
This is going to represent a vibrating string. Let's try to generate an animation as follows.
for i=1:nFrames
plot(x,y*cos(t(i)));
end
As we should have guessed, this plots one position of the string after another on the same axes,
producing the result in Fig. 6. We need to erase the old curve before plotting a new one. So we
try
for i=1:nFrames
clf();
plot(x,y*cos(t(i)));
end
This produces a blinking mess in which the y axis is constantly rescaling. We can get rid of the
rescaling by explicitly setting the data bounds
for i=1:nFrames
clf();
plot(x,y*cos(t(i)));
ax = gca();
ax.data_bounds = [0,-1;2*%pi,1];
end
The problem is that the screen is still a blinking mess. What is happening is that when we tell
Scilab to clear the figure, we see it go blank. Then when we tell Scilab to draw a new figure, we
see that appear. The result is irritating on/off video flicker. What we really want is that as we
are viewing one frame we are generating a new frame behind the scenes. When we are ready
for it we want the new frame to swap out the old frame instantly. This requires two segments of
memory or video buffers. One holds the currently visible frame. The other background buffer
is where the computer is generating the next frame. When ready, the computer rapidly copies the
background buffer contents into the visible buffer. In video systems this process is called double
buffering. Triple buffering is used in high-end video (e.g., gaming systems) so that while the
buffer copying is occurring the computer can already be working on another frame. Scilab
provides two commands to implement double buffering: drawlater() and drawnow().
They are very simple to use as shown in the following code.
for i=1:nFrames
drawlater(); //turn on double buffering so that operations
clf();
//occur in the background
plot(x,y*cos(t(i)));
ax = gca();
ax.data_bounds = [0,-1;2*%pi,1];
drawnow(); //copy the background buffer to the visible buffer
end
This solves our problems. In between the drawlater() and drawnow() commands we can
modify the figure in any way we wish: adding labels and titles, changing color maps, and so on.
The speed with which frames update depends on how long it takes to generate a new frame. If the
goal is to produce an independent video file then we want to save each frame to disk. One
approach is shown here.
scf(0);
for i=1:nFrames
drawlater();
//generate a new frame here
drawnow();
fname = msprintf("frames/f%03d.png",i);
while (~isfile(fname))
xs2png(0,fname);
end
end
First we create a subdirectory named frames before running the animation code. During the
animation rendering the msprintf function creates a series of file names from the frame index
i. If the png file does not already exist it is written from the current graphics frame. The result is
a sequence of png files
f001.png , f002.png , ...
in the subdirectory frames. From there video editing software can be used to produce an
animation file. A useful free and open-source program for this is Virtualdub (virtualdub.org).
Scott Hudson
2016-01-08
Lecture 7
Root finding I
1 Introduction
For our present purposes, root finding is the process of finding a real value of x which solves
the equation f(x) = 0. Since the equation g(x) = h(x) can be rewritten as
f(x) = g(x) - h(x) = 0, this encompasses the solution of any single equation in a single
unknown. Ideally we would want to know how many roots exist and what their values are. In
some cases, such as for polynomials, there are theoretical results for the number of roots (some of
which might be complex) and we have clues about what we are looking for. However, for an
arbitrary function f(x) there is not much we can say. Our equation may have no real roots, for
example 1 + e^x = 0, or, as in the case of sin(x) = 0 with roots x = nπ, n = 0, ±1, ±2, …, there
may be an infinite number of roots. We will limit our scope to finding one root, any root. If we
fail to find a root it will not mean that the function has no roots, just that our algorithm was
unable to find one.
To have any hope of solving the problem we need to make basic assumptions about f(x) that
allow us to know if an interval contains a root, if we are close to a root, or in what direction
(left or right) along the x axis a root might lie. At a minimum we will have to assume that our
function is continuous. Intuitively, a continuous function is one that can be plotted without
lifting pen from paper, while the plot of a discontinuous function has breaks. Formally, a
function f(x) is continuous at x = c if for any ε > 0 there exists a δ > 0 such that
|x - c| < δ implies |f(x) - f(c)| < ε
If f(x) is continuous at all points in some interval a ≤ x ≤ b then it is continuous over that
interval. Continuity allows us to assume that a small change in x results in a small change in
f(x). It also allows us to know that if f(a) > 0, f(b) < 0 then there is some x in the interval
(a, b) such that f(x) = 0, because a continuous curve cannot go from above the x axis to
below without crossing the x axis.
In some cases we will also assume that f(x) is differentiable, meaning the limit
f'(x) = lim (h→0) [f(x+h) - f(x)] / h
exists for all x of interest. This allows us to approximate the function near a point x_0 by the
tangent line y = f(x_0) + f'(x_0)(x - x_0).
2 Graphical solution
The easiest and most intuitive way to solve f(x) = 0 is to simply plot the function and zoom in
on the region where the graph crosses the x axis. For example, say we want to find the first root
of cos(x) = 0 for x ≥ 0. We could run the command
x = 0:0.01:3;
y = cos(x);
plot(x,y);
to get the plot shown in Fig. 1. We can see that there is a root in the interval 1.5 ≤ x ≤ 1.6. We
then use the magnifying lens tool (Fig. 2) to zoom in on the root and get the plot shown in Fig. 3.
From this we can read off the root to three decimal places as
x = 1.571
This approach is easy and intuitive. We can readily search for multiple roots by plotting different
ranges of x. However, in many situations we need an automated way to find roots. Scilab (and
Matlab) have built-in functions to do this, and we will learn how to use those tools. But, in order
to understand what those functions are doing, and what limitations they may have, we need to
study root-finding algorithms. We start with one of the most basic algorithms, called bisection.
3 Bisection
If the product of two numbers is negative then one of the numbers must be positive and the other
must be negative. Therefore, if f(x) is continuous over an interval a ≤ x ≤ b, and if
f(a) f(b) < 0, then f(x) is positive at one end of the interval and negative at the other end.
Since f(x) is continuous we can conclude that f(r) = 0 for some r in the interior of the
interval, because you cannot draw a continuous curve from a positive value of f(x) to a
negative value of f(x), or vice versa, without passing through f(x) = 0. This is the basis of
the bisection method, illustrated in Fig. 4.
Fig. 4: Bisection
If f(a) f(b) < 0 then we can estimate the root by the interval's midpoint with an uncertainty of
half the length of the interval, that is, r = (b+a)/2 ± (b-a)/2. To reduce the uncertainty by half
we evaluate the function at the midpoint x = (b+a)/2. If f(x), f(a) have the same sign (as in
the case illustrated in Fig. 4) we set a = x. If f(x), f(b) have the same sign we set b = x. In
the (very unlikely) event that f(x) = 0 then r = x is the root. In either of the first two cases we
have bisected the interval a ≤ x ≤ b into an interval half the original size. We then simply
repeat this process until the uncertainty |b - a|/2 is smaller than desired. This method is simple
and guaranteed to work for any continuous function. The algorithm can be represented as follows
Bisection algorithm
repeat until |b - a|/2 is smaller than tol
set x = midpoint of interval (a ,b)
if f (x ) has the same sign as f (a ) then set a= x
else if f (x ) has the same sign as f (b) then set b=x
else f (x) is zero and we've found the root!
Function rootBisection in the Appendix implements the bisection algorithm.
3.1 Root bracketing
The bisection method requires a starting interval (a, b) that brackets a root. One way to find such an interval is a grid search: evaluate f(x) at a sequence of points and look for a sign change. This can work, but if root finding is buried inside a loop that will be repeated many times, then a grid search is not likely to be practical.
Another way to approach root bracketing is to start at an arbitrary value x = a with some step
size h. We then move along the x axis in the direction in which |f(x)| is decreasing (that is, we are moving
towards y = 0) until we find a bracket interval (a, b). If |f(x)| starts to increase before we
find a bracket then we give up. Increasing the step size at each iteration protects us from getting
into a situation where we are inching along a function that has a far-away zero. Function
rootBracket in the Appendix implements this idea.
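As a sketch of the idea (a hypothetical Python port of the Appendix routine, with names of my own choosing):

```python
import math

def root_bracket(f, x0, h):
    """Search for an interval [a, b] over which f changes sign.

    Starts at x0 with step h, reverses direction if |f| is growing,
    and doubles the step each iteration so a far-away root is still
    reached quickly. Raises RuntimeError if |f| starts increasing."""
    a, fa = x0, f(x0)
    b, fb = x0 + h, f(x0 + h)
    if (fa > 0) == (fb > 0):            # no sign change yet
        if abs(fb) > abs(fa):           # moving away from the x axis
            h, b, fb = -h, a, fa        # step the other way, from a
    while (fa > 0) == (fb > 0):
        a, fa = b, fb                   # take another step
        h *= 2                          # bigger step each time
        b = a + h
        fb = f(b)
        if abs(fb) > abs(fa) and (fa > 0) == (fb > 0):
            raise RuntimeError("cannot find a bracket")
    return (a, b) if a < b else (b, a)
```

For example, `root_bracket(math.cos, 0.0, 0.1)` walks right with growing steps and returns an interval containing the first root of cos(x) at π/2.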
3.2 Convergence
The uncertainty in the root, the maximum error in our bisection estimate, is ε = (b-a)/2. This
decreases by a factor of 1/2 with each bisection. Therefore the relationship between the error at
step k and step k+1 is
ε_{k+1} = (1/2) ε_k
(1)
More generally, the errors of a root-finding method are related by
ε_{k+1} = λ ε_k^q
(2)
The exponent q is called the order of convergence. If q = 1, as it is for bisection, we say the
convergence is linear, and we call λ the rate of convergence. For q > 1 the convergence is said
to be superlinear, and specifically, if q = 2 the convergence is quadratic. For superlinear
convergence λ is called the asymptotic error constant.
From (1) and (2) we see that bisection converges linearly with a rate of convergence of 1/2. If the
initial error is ε_0 then after k iterations we have
ε_k = (1/2)^k ε_0
In order to add one decimal digit of accuracy we need to decrease the error by a factor of 1/10. To
increase accuracy by n digits we require
(1/2)^k = 1/10^n
which gives
k = n / log10(2) = 3.32 n
It takes k ≈ 10 iterations to add n = 3 digits of accuracy. All linearly converging root finding
algorithms have the characteristic that each additional digit of accuracy requires a given number
of algorithm iterations.
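We can verify the iteration count numerically. The following Python sketch (my own example, bisecting cos(x) on [1, 2], not a routine from these notes) counts iterations for tol = 1e-6:

```python
import math

def bisect(f, a, b, tol):
    """Bisection: halve the bracket until its half-width is below tol.
    Returns the midpoint estimate and the number of iterations used."""
    fa = f(a)
    iters = 0
    while (b - a) / 2 > tol:
        x = (a + b) / 2
        fx = f(x)
        if (fx > 0) == (fa > 0):    # root lies in (x, b)
            a, fa = x, fx
        else:                       # root lies in (a, x)
            b = x
        iters += 1
    return (a + b) / 2, iters

root, iters = bisect(math.cos, 1.0, 2.0, 1e-6)
# Half-width starts at 0.5, so we need 0.5/2^k <= 1e-6, i.e. k = 19 iterations.
```

Each added decimal digit of tolerance costs a fixed number (about 3.32) of extra iterations, as the analysis above predicts.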
Now consider a quadratically convergent method with
ε_{k+1} = ε_k^2
If ε_0 = 0.1 the successive errors are 10^-2, 10^-4, 10^-8, …: each iteration doubles the number of accurate digits.
4 Fixed-point iteration
The goal of root finding is to arrive at x = r where f(r) = 0. Now consider the following three
equations
0 = f(x)
0 = a f(x)
x = x + a f(x)
Multiplying the first equation through by a produces the second equation. Adding x to both sides
of the second equation produces the third. All three will have the same roots, provided a ≠ 0.
Now let's define a new function
g(x) = x + a f(x)
Then our root-finding problem can be written
x = g(x)
the solution of which is
r = g(r)
What's attractive about this is that it has the form "x equals something," which is almost an
explicit formula for x. The problem is that the "something" itself depends on x. So let's imagine
starting with a guess for the root x = x_0. We might expect that x_1 = g(x_0) would be an improved
estimate, with the iteration x_{k+1} = g(x_k) repeated until it (hopefully) converges to the root.
Writing x_k = r + ε_k and expanding g in a Taylor series about the root gives
r + ε_{k+1} = g(r + ε_k) ≈ g(r) + g'(r) ε_k
and, since g(r) = r,
ε_{k+1} = g'(r) ε_k
It follows that
ε_k = [g'(r)]^k ε_0
(3)
This converges to zero provided
|g'(r)| < 1
(4)
so fixed-point iteration converges linearly with rate of convergence λ = |g'(r)|. The value r is said to be a
fixed point of the iteration since the iteration stays fixed at r = g(r). Since g'(x) = 1 + a f'(x), condition (4) requires
|1 + a f'(r)| < 1
Example: Consider
f(x) = x^2 - 2 = 0
By inspection the roots are ±√2. First take a = 1, so that
g(x) = x + f(x) = x + x^2 - 2
Since
g'(√2) = 1 + 2√2 > 1
we expect that fixed-point iteration will fail. Starting very close to a root with
x = 1.4, iteration gives the sequence of values
1.36, 1.2096, 0.6727322, 0.8746993, 2.1096004
which is clearly moving away from the root. On the other hand taking
g(x) = x - 0.25 f(x) = x - 0.25(x^2 - 2)
for which
g'(√2) = 1 - 0.5√2 = 0.293
we get the sequence of values
1.412975, 1.4138504, 1.4141072, 1.4141824, 1.4142044, 1.4142109
which is converging to √2 = 1.4142136.
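Both sequences are easy to reproduce. A Python sketch (an illustration of my own, following the function and constants of the example above):

```python
import math

def fixed_point(a, x, n):
    """Iterate x <- g(x) = x + a*f(x) for f(x) = x^2 - 2, n times."""
    seq = []
    for _ in range(n):
        x = x + a*(x*x - 2)          # g(x) = x + a*(x^2 - 2)
        seq.append(x)
    return seq

diverging = fixed_point(1.0, 1.4, 5)     # |g'(sqrt(2))| = 1 + 2*sqrt(2) > 1
converging = fixed_point(-0.25, 1.4, 8)  # |g'(sqrt(2))| = 0.293 < 1
err = abs(converging[-1] - math.sqrt(2)) # within about 1e-6 of the root
```

The first call wanders away from the root (1.36, 1.2096, ...) while the second homes in on √2, exactly as condition (4) predicts.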
The fastest convergence should occur when g'(r) = 1 + a f'(r) = 0. Solving for a
a = -1/f'(r)
(5)
It seems we might achieve superlinear convergence for this choice of a. This is one way to
motivate Newton's method, which we cover in the next lecture.
//////////////////////////////////////////////////////////////////////
// rootBisection.sci
// 2014-06-04, Scott Hudson, for pedagogic purposes
// Implements bisection method for finding a root f(x) = 0.
// Requires a and b to bracket a root, f(a)*f(b)<0.
// Returns root as r with maximum error tol.
//////////////////////////////////////////////////////////////////////
function r=rootBisection(a, b, f, tol)
fa = f(a);
fb = f(b);
if (fa*fb>=0) //make sure a,b bracket a root
error('rootBisection: fa*fb>=0');
end
while (abs(b-a)/2>tol) //stop when error in root < tol
x = (a+b)/2; //midpoint of interval
fx = f(x);
if (sign(fx)==sign(fa)) //r is in (x,b)
a = x;
//move a to x
fa = fx;
elseif (sign(fx)==sign(fb)) //r is in (a,x)
b = x;
//move b to x
fb = fx;
else //unlikely case that fx==0, sign(fx)==0, we found the root
a = x; //shrink interval to zero width a=b=x
b = x;
end
end
r = (a+b)/2; //midpoint of last bracket interval is root estimate
endfunction
//////////////////////////////////////////////////////////////////////
// rootBracket.sci
// 2014-06-04, Scott Hudson, for pedagogic purposes
// Given a function f(x), starting point x=x0 and a stepsize h
// search for a and b such that f(x) changes sign over [a,b] hence
// bracketing a root.
//////////////////////////////////////////////////////////////////////
function [a, b]=rootBracket(f, x0, h)
a = x0;
fa = f(a);
b = a+h;
fb = f(b);
done = (sign(fa)~=sign(fb)); //if the signs differ we're done
if (~done) //if we don't have a bracket
if (abs(fb)>abs(fa)) //see if a->b is moving away from x axis
h = -h; //if so step in the other direction
b = a; //and we will start from a instead of b
fb = fa;
end
end
while (~done)
a = b; //take another step
fa = fb;
h = 2*h; //take bigger steps each time
b = a+h;
fb = f(b);
done = (sign(fa)~=sign(fb));
if ((abs(fb)>abs(fa))&(~done)) //we're now going uphill, give up
error("rootBracket: cannot find a bracket\n");
end
end
endfunction
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% rootBracket.m
%% 2014-06-04, Scott Hudson, for pedagogic purposes
%% Given a function f(x), starting point x=x0 and a stepsize h
%% search for a and b such that f(x) changes sign over [a,b] hence
%% bracketing a root.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [a,b] = rootBracket(f,x0,h)
a = x0;
fa = f(a);
b = a+h;
fb = f(b);
done = (sign(fa)~=sign(fb)); %%if the signs differ we're done
if (~done) %%if we don't have a bracket
if (abs(fb)>abs(fa)) %%see if a->b is moving away from x axis
h = -h; %%if so step in the other direction
b = a; %%and we will start from a instead of b
fb = fa;
end
end
while (~done)
a = b; %%take another step
fa = fb;
h = 2*h; %%take bigger steps each time
b = a+h;
fb = f(b);
done = (sign(fa)~=sign(fb));
if ((abs(fb)>abs(fa))&(~done)) %%we're now going uphill, give up
error('rootBracket: cannot find a bracket');
end
end
end
Lecture 8
Root finding II
1 Introduction
In the previous lecture we considered the bisection root-bracketing algorithm. It requires only
that the function be continuous and that we have a root bracketed to start. Under those conditions
it is guaranteed to converge, with error
ε_k = (1/2)^k ε_0
at the kth iteration.
Because of its general applicability and guaranteed convergence, bisection has much to
recommend it.
We also studied fixed-point iteration
x_{k+1} = g(x_k) = x_k + a f(x_k)
where a is a constant. We found that provided we start out close enough to a root r the method
converges linearly with error
ε_k = [g'(r)]^k ε_0
(1)
As this is a root-polishing algorithm, it does not require an initial root bracketing, which might be
considered a plus. Still, for one-dimensional functions f (x ) , fixed-point iteration is not an
attractive algorithm. It provides the same linear convergence as bisection without any guarantee
of finding a root, even if one exists. However, it can be useful for multidimensional problems.
What we want to investigate here is the tantalizing prospect of choosing a so that g'(r) = 0, which in
light of (1) suggests superlinear convergence of some sort.
2 Newton's method
In the last lecture, equation (5) suggested the choice a = -1/f'(r), for which fixed-point iteration becomes
x_{k+1} = x_k - f(x_k)/f'(r)
The only problem is that to compute f'(r) we'd need to know r, and that's what we are
searching for in the first place. But if we are close to the root, so that f'(x_k) ≈ f'(r), then it
makes sense to use
x_{k+1} = x_k - f(x_k)/f'(x_k)
Let's derive this formula another way. If f(x) is continuous and differentiable, then for small
changes in x, f(x) is well approximated by a first-order Taylor series. If we expand the Taylor
series about the point x = x_k and take x_{k+1} = x_k + h then, assuming h is small, we can write
f(x_{k+1}) ≈ f(x_k) + h f'(x_k) = 0
Solving for the step h = -f(x_k)/f'(x_k) gives
x_{k+1} = x_k - f(x_k)/f'(x_k)
(2)
which is called Newton's method. In Newton's method we model the function by the tangent
line at the current point (x_k, f(x_k)). The root of the tangent line is our next estimate of the root
of f(x) (Fig. 1).
Taking x_k = r + ε_k and assuming the error ε_k is small enough that the 2nd-order Taylor series
f(x_k) ≈ f(r) + ε_k f'(r) + (1/2) ε_k^2 f''(r) = ε_k f'(r) + (1/2) ε_k^2 f''(r)
is accurate, it's straightforward (but a bit messy) to show that Newton's method converges
quadratically with
ε_{k+1} = [f''(r) / (2 f'(r))] ε_k^2
Example: Let's take
f(x) = x^2 - 2 = 0
for which Newton's method (2) gives
x_{k+1} = x_k - (x_k^2 - 2)/(2 x_k)
Let's start at x_0 = 1. Four iterations produce the sequence of numbers
1.5, 1.4166667, 1.4142157, 1.4142136
This is very rapid convergence toward the root √2 = 1.4142136. In fact the
final value (as stored in the computer) turns out to be accurate to about 11
decimal places.
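The example is easy to reproduce in any language; here is a Python sketch of it (an illustration, not the Appendix routine):

```python
import math

# Newton's method for f(x) = x^2 - 2: x_{k+1} = x_k - (x_k^2 - 2)/(2*x_k)
x = 1.0
iterates = []
for _ in range(4):
    x = x - (x*x - 2) / (2*x)
    iterates.append(x)
# iterates -> 1.5, 1.4166667, 1.4142157, 1.4142136 (approaching sqrt(2))
```

Note the doubling of accurate digits from one iterate to the next, the signature of quadratic convergence.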
A Scilab implementation of Newton's method is given in the Appendix. We require two functions as
arguments: f(x) and f'(x). Newton's method fails if f'(x_k) = 0 and more generally performs
poorly if f'(x_k) is very small. It can also fail if we start far away from a root.
With root-bracketing methods, such as bisection, we know that the error in our root estimate is
less than or equal to half the last bracketing interval. This gives us a clear termination condition;
we stop when the maximum possible error is less than some defined tolerance. With root-polishing
methods, such as Newton's method, we don't have any rigorous bound on the error in
our root. Deciding when to stop iterating involves some guess work. A simple criterion is
|x_k - x_{k-1}| ≤ tol
(3)
that is, we terminate when the change in root estimates is less than some specified tolerance. This
is a reasonable way to estimate the actual uncertainty in our root estimate, but keep in mind that
it is not a rigorous bound on the actual error, as is the case with bisection. Since Newton's
method is not guaranteed to converge (it may just bounce around forever) it is a good idea to
also terminate if the number of iterations exceeds some maximum value.
It's natural to look for methods that converge with order q = 3, 4, …. Householder's method
generalizes Newton's method to higher orders. The q = 3 version is called Halley's method. It
requires evaluating f(x), f'(x) and f''(x) at each iteration. In some cases it may not be possible or practical to compute
f'(x) explicitly. We would like a method that gives us the rapid convergence of Newton's
method without the need of calculating derivatives.
3 Secant method
In the secant method we replace the derivative appearing in Newton's method by the
approximation
f'(x_k) ≈ [f(x_k) - f(x_{k-1})] / (x_k - x_{k-1})
This gives the iteration
x_{k+1} = x_k - f(x_k) (x_k - x_{k-1}) / (f(x_k) - f(x_{k-1}))
(4)
Another way to view the secant method is as follows. Suppose we have evaluated the function at
two points (x_k, f_k = f(x_k)) and (x_{k-1}, f_{k-1} = f(x_{k-1})). Through these two points we can draw
a line, the formula for which is
y = f_k + [(f_k - f_{k-1}) / (x_k - x_{k-1})] (x - x_k)
Setting this formula equal to zero and solving for x we obtain (4). In the secant method we model
the function f(x) by a line through our last two root estimates (Fig. 2). Solving for the root of
that line provides our next estimate.
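A minimal Python sketch of the secant iteration, applied to the familiar example f(x) = x^2 - 2 (my own example, using stopping criterion (3)):

```python
import math

def f(x):
    return x*x - 2

# Secant iteration starting from x = 1 and x = 2.
x_prev, x_curr = 1.0, 2.0
f_prev, f_curr = f(x_prev), f(x_curr)
for _ in range(40):                      # give up after 40 iterations
    if abs(x_curr - x_prev) < 1e-12:     # stopping criterion (3)
        break
    # Root of the line through the last two points, formula (4):
    x_next = x_curr - f_curr*(x_curr - x_prev)/(f_curr - f_prev)
    x_prev, f_prev = x_curr, f_curr
    x_curr, f_curr = x_next, f(x_next)
# x_curr is now sqrt(2) = 1.4142136... to high accuracy
```

Compared with the Newton sequence above, a few more iterations are needed: superlinear order 1.6 rather than quadratic.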
When the secant method works it converges with order q ≈ 1.6, superlinear but not quadratic.
Comparison of Figs. 1 and 2 suggests why this is. The secant method will tend to underestimate or
overestimate (depending on the 2nd derivative of the function at r) the actual slope at the point
(x_k, f(x_k)). We are effectively using an average of the slopes at (x_k, f(x_k)) and
(x_{k-1}, f(x_{k-1})), whereas we use the true slope at (x_k, f(x_k)) in Newton's method. It can be
shown [2] that the secant method converges as
ε_{k+1} = (1/2) [f''(r)/f'(r)] ε_k ε_{k-1}
or
ε_{k+1} ≈ |(1/2) f''(r)/f'(r)|^0.618 ε_k^1.618
so the secant method is of order q ≈ 1.6, which is superlinear but less than quadratic. A Scilab
implementation of the secant method is given in the appendix.
As is so often the case with numerical methods, we are presented with a trade-off. The secant method
does not require explicit calculation of derivatives while Newton's method does. But the secant
method does not converge as fast as Newton's method. As with all root-polishing methods,
deciding when to stop iterating involves some guess work. Criterion (3) is an obvious choice.
As illustrated in Fig. 3, the secant method can fail, even when starting out with a bracketed root.
There we start with points 1 and 2 on the curve. The line through those points crosses the x axis
at s. The corresponding point on the curve is point 3. Now we draw a line through points 2 and 3.
This gives the root estimate t. The corresponding point on the curve is point 4. We are actually
moving away from the root.
In the case illustrated points 1 and 2 bracket a root while points 2 and 3 do not. Clearly if we
have a root bracketed we should never accept a new root estimate that falls outside that bracket.
The false position method is a variation of the secant method in which we use the last two points
6/10
which bracket a root to represent our linear approximation. In that case we would have drawn a
line between point 3 and point 1. Of course, as in the bisection method, this would require us to
start with a bracketed root. Moreover in cases where both methods would converge the false
position method does not converge as fast as the secant method.
4 Inverse quadratic interpolation
We might guess that a quadratic model of the function,
y = f(x) ≈ c_1 + c_2 x + c_3 x^2 = 0
might provide a better root approximation than the secant method. Unfortunately we would have
to solve a quadratic equation in this case. An alternative approach is inverse quadratic
interpolation where we represent x as a quadratic function of y
x = f^-1(y) = c_1 + c_2 y + c_3 y^2
Setting y = f(x) = 0 we simply have x = c_1 (Fig. 4). It turns out there is an explicit formula for
this value
x = x_1 f_2 f_3 / [(f_1 - f_2)(f_1 - f_3)] + x_2 f_3 f_1 / [(f_2 - f_3)(f_2 - f_1)] + x_3 f_1 f_2 / [(f_3 - f_1)(f_3 - f_2)]
(We will understand this formula when we study Lagrange interpolation.) Inverse quadratic
7/10
interpolation allows us to exploit information about both the first derivative (slope) and second
derivative (curvature) of the function.
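The formula can be exercised directly. Below is a Python sketch (my own example, not from the notes) that applies repeated inverse quadratic steps to f(x) = x^2 - 2, always keeping the three most recent points:

```python
import math

def iqi_step(x1, x2, x3, f1, f2, f3):
    """One inverse quadratic interpolation step: the value at y = 0 of the
    quadratic x = c1 + c2*y + c3*y^2 through the three (x, f) points."""
    return (x1*f2*f3/((f1 - f2)*(f1 - f3))
          + x2*f3*f1/((f2 - f3)*(f2 - f1))
          + x3*f1*f2/((f3 - f1)*(f3 - f2)))

def f(x):
    return x*x - 2

xs = [1.0, 1.5, 2.0]                   # three starting points
fs = [f(v) for v in xs]
for _ in range(40):
    if abs(xs[-1] - xs[-2]) <= 1e-12:  # stop when estimates settle
        break
    x_new = iqi_step(xs[0], xs[1], xs[2], fs[0], fs[1], fs[2])
    xs = xs[1:] + [x_new]              # keep the three newest points
    fs = fs[1:] + [f(x_new)]
# xs[-1] converges rapidly to sqrt(2) = 1.4142136...
```

Only a handful of steps are needed, consistent with order q ≈ 1.8 convergence.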
Like the secant method, inverse quadratic interpolation can also fail. But when it works it
converges with order q ≈ 1.8. Still not as rapid as Newton's method, but it does not require
evaluation of derivatives. It converges faster than the secant method (q ≈ 1.6) but at the cost of
more bookkeeping and a more complicated recursion formula.
5 Hybrid methods
In general, numerical root finding is a difficult problem. We are presented with various trade-offs,
such as that between the guaranteed convergence of the bisection method and the faster
convergence of the Newton, secant or inverse quadratic methods. Consequently people have
developed hybrid methods that seek to combine the best of two or more simpler methods. One of
the most widely used is Brent's method [1] (used in the Matlab fzero function). This method
combines the bisection, secant and inverse quadratic methods. Brent's method starts off with a
bracketed root. If we don't have a bracket then we have to search for one. With those two initial
points, Brent's method applies the secant method to get a third point. From there it tries to use
inverse quadratic interpolation for rapid convergence, but applies tests at each iteration to see if it
is actually converging superlinearly. If not, Brent's method falls back to the slow-but-sure
bisection method. At the next iteration it again tries inverse quadratic interpolation. Another
example is Powell's hybrid method (used in the Scilab fsolve function).
There is no single agreed-upon, one-size-fits-all algorithm for root finding. Hybrid methods seek
to use the fastest algorithm that seems to be working, with the option to fall back to slower but
surer methods as a backup. These methods are recommended for most root-finding applications.
However there may be specific applications where a particular method will be superior, such as Newton's method when the derivative is available and a good starting guess is known.
6 Built-in functions
Scilab's built-in root finder is fsolve, called as
r = fsolve(x0,f);
Here x0 is an initial guess at a root of the function f(x). The returned value r is the estimated
root. For example
-->deff('y=f(x)','y=cos(x)');
-->r = fsolve(1,f)
r =
1.5707963
The Matlab equivalent is called fzero, although it has a slightly different syntax. Note,
however, that fsolve may fail to find a root but may still return a value r. For example
-->deff('y=f(x)','y=2+cos(x)');
-->r = fsolve(1,f)
r =
3.1418148
The function 2+cos ( x) is never 0 yet Scilab returned a value for r. This value is actually where
the function gets closest to the x axis; it's where f ( x) is a minimum. Therefore, you should
always check the value of the function at the reported root. Running the command using the
syntax
[r,fr] = fsolve(x0,f);
fr =
1.
r =
3.1418148
shows us that r is not actually a root since f(r) = 1. On the other hand
-->deff('y=f(x)','y=0.5+cos(x)');
-->[r,fr] = fsolve(1,f)
fr =
2.220D-16
r =
2.0943951
makes it clear that r is a root in this case. By default, fsolve tries to find the root to within an
estimated tolerance of 10^-10. You can specify the tolerance explicitly as in
[r,fr] = fsolve(x0,f,tol);
Because of limitations due to round-off error it is not recommended to use a smaller tolerance
than the default. There may be situations where you don't need much accuracy and using a larger
tolerance might save a few function calls.
7 References
1. Brent, Richard P. Algorithms for Minimization Without Derivatives. Dover Publications.
Kindle edition. ASIN: B00CRW5ZTK. 2013 (Originally published 1973)
2. http://www.math.drexel.edu/~tolya/300_secant.pdf
//////////////////////////////////////////////////////////////////////
// rootNewton.sci
// 2014-06-04, Scott Hudson
// Implements Newton's method for finding a root f(x) = 0.
// Requires two functions: y=f(x) and y=fp(x) where fp(x) is
// the derivative of f(x). Search starts at x0. Root is returned as r.
//////////////////////////////////////////////////////////////////////
function r=rootNewton(x0, f, fp, tol)
  MAX_ITERS = 40; //give up after this many iterations
  nIters = 1; //1st iteration
  r = x0-f(x0)/fp(x0); //Newton's formula for next root estimate
  while (abs(r-x0)>tol) & (nIters<=MAX_ITERS)
    nIters = nIters+1; //keep track of # of iterations
    x0 = r; //current root estimate is last output of formula
    r = x0-f(x0)/fp(x0); //Newton's formula for next root estimate
  end
endfunction
//////////////////////////////////////////////////////////////////////
// rootSecant.sci
// 2014-06-04, Scott Hudson
// Implements secant method for finding a root f(x) = 0.
// Requires two initial x values: x1 and x2. Root is returned as r
// accurate to (hopefully) about tol.
//////////////////////////////////////////////////////////////////////
function r=rootSecant(x1, x2, f, tol)
  MAX_ITERS = 40; //maximum number of iterations allowed
  nIters = 1; //1st iteration
  fx2 = f(x2);
  r = x2-fx2*(x2-x1)/(fx2-f(x1));
  while (abs(r-x2)>tol) & (nIters<=MAX_ITERS)
    nIters = nIters+1;
    x1 = x2;
    fx1 = fx2;
    x2 = r;
    fx2 = f(x2);
    r = x2-fx2*(x2-x1)/(fx2-fx1);
  end
endfunction
Scott Hudson
2015-08-18
Lecture 9
Polynomials
1 Introduction
The equation
p(x) = c1 + c2 x + c3 x^2 + c4 x^3 = 0
(1)
is one equation in one unknown, and the root-finding methods we developed previously can be applied to solve it. However this is a polynomial equation, and there are theoretical results that can be used to develop specialized root-finding methods that are more powerful than general-purpose methods.
For polynomials of order n = 1, 2, 3, 4 there are analytic formulas for all roots. The single root of
c1 + c2 x = 0
is x = -c1/c2. The quadratic formula gives the two roots of
c1 + c2 x + c3 x^2 = 0
as
x = (-c2 ± sqrt(c2^2 - 4 c1 c3)) / (2 c3)
The formulas for order 3 and 4 polynomials are too complicated to be of practical use. Therefore
to find the roots of order 3 and higher polynomials we are forced to use numerical methods. Yet
it is still the case that we have important theoretical results to guide us.
The fundamental theorem of algebra states that a polynomial of degree n has precisely n roots
(some of which may be repeated). However, these roots may be real or complex. Most of the root
finding algorithms we have studied so far apply only to a real function of a real variable. They
can find the real roots of a polynomial (if there are any) but not complex roots.
If the polynomial has real coefficients c1, c2, ..., then complex roots (if any) come in complex-conjugate pairs. Let z = x + i y be a general complex number with real part x and imaginary part y. If
c1 + c2 z + c3 z^2 + ... + c(n+1) z^n = 0
then taking the complex conjugate of both sides we have
c1 + c2 z* + c3 (z^2)* + ... + c(n+1) (z^n)* = 0
(2)
where z* = x - i y. Suppose the coefficients are real so that ck* = ck. Since (z^k)* = (z*)^k, (2) becomes
c1 + c2 z* + c3 (z*)^2 + ... + c(n+1) (z*)^n = 0
which tells us that if z is a root of the polynomial then so is z*. Therefore complex roots must come in complex-conjugate pairs. From this we know that a polynomial of odd order has at least one real root, since we must always have an even number of complex roots.
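The conjugate-pair property is easy to spot-check numerically. A quick Python sketch (the quadratic here is an arbitrary example with real coefficients):

```python
# p(z) = z^2 - 2z + 2 has real coefficients and roots 1 + i and 1 - i
def p(z):
    return z*z - 2*z + 2

z = 1 + 1j
# if z is a root, so is its complex conjugate
print(p(z), p(z.conjugate()))  # both evaluate to 0
```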
Another way to see this is in terms of root bracketing. For very large (real) values of x, an nth order polynomial is dominated by its highest power term
c1 + c2 x + c3 x^2 + ... + c(n+1) x^n ≈ c(n+1) x^n
For n odd, x^n is positive for positive x and negative for negative x. Therefore the polynomial must change sign between x → -∞ and x → ∞. Since polynomials are continuous functions we conclude that there must be a real root somewhere on the x axis.
For a complex root
p(z) = c1 + c2 (x + i y) + c3 (x + i y)^2 + ... + c(n+1) (x + i y)^n
Expanding each term into real and imaginary parts, along the lines of
(x + i y)^2 = x^2 - y^2 + i 2 x y
(x + i y)^3 = x^3 - 3 x y^2 - i y (y^2 - 3 x^2)
we end up with an equation of the form
p(z) = f(x, y) + i g(x, y) = 0
which is actually two real equations in two real unknowns
f(x, y) = 0
g(x, y) = 0
-->p = poly([1,2,3],'x','coeff')
p =
2
1 + 2x + 3x
Notice that Scilab outputs an ascii typeset polynomial in the variable of interest. To form a
polynomial with roots at x=1 , x=2 we use the command
-->q = poly([1,2],'x','roots')
q =
2
2 - 3x + x
Note that
-->p*q
ans =
            2    3     4
2 + x + x - 7x + 3x
-->p/q
ans =
            2
1 + 2x + 3x
-----------
           2
2 - 3x + x
In the second case we obtain a rational function of x. Now consider the somewhat redundant
appearing command
-->x = poly(0,'x')
x =
x
This assigns to the Scilab variable x a polynomial in the symbolic variable x having a single root
at x=0 , that is, it effectively turns x into a symbolic variable. Now we can enter expressions
such as
-->h = 3*x^3-4*x^2+7*x-15
h =
              2     3
- 15 + 7x - 4x + 3x
-->g = (x-1)*(x-2)*(x-3)
g =
               2    3
- 6 + 11x - 6x + x
To evaluate a polynomial (or rational function) at a specific number we use the horner
command
-->horner(h,3)
ans =
51.
Scott Hudson
2015-08-18
Lecture 9: Polynomials
4/14
-->horner(h,1-%i)
ans =
- 14. - 5.i
-->v = [1;2;3];
-->horner(h,v)
ans =
- 9.
7.
51.
As always, we are only scratching the surface. See the Polynomials section of the Scilab Help
Browser for more information.
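The nested-multiplication idea behind horner is simple to express in any language. A Python sketch (the function name is ours, not Scilab's), reproducing the evaluations above:

```python
def horner(c, x):
    """Evaluate c[0] + c[1]*x + ... + c[-1]*x^n by nested multiplication."""
    w = 0
    for ck in reversed(c):
        w = ck + w * x
    return w

h = [-15, 7, -4, 3]        # h(x) = -15 + 7x - 4x^2 + 3x^3
print(horner(h, 3))        # 51
print(horner(h, 1 - 1j))   # (-14-5j)
```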
Notice that the common factor of x-1 has been canceled. This can be used for deflation
-->p/(x-2)
ans =
- 1 + x
-------
1
We need to consider the effect of finite precision. Suppose we've calculated a polynomial root r
numerically. We don't expect it to be exact. Will Scilab be able to factor it out of the polynomial?
Look at the following
-->p = (x-sqrt(2))*(x-%pi)
p =
                          2
4.4428829 - 4.5558062x + x
-->p/(x-%pi*(1+1e-6))
ans =
                          2
4.4428829 - 4.5558062x + x
--------------------------
- 3.1415958 + x
-->p/(x-%pi*(1+1e-9))
ans =
- 1.4142136 + x
---------------
1
The polynomial has a factor of (x - π). In the first ratio we are trying to cancel a factor of (x - π[1+10^-6]). Now π[1+10^-6] is very close to π, but not close enough for Scilab to consider them the same number, so no deflation occurs. On the other hand, in the second ratio Scilab treats (x - π[1+10^-9]) as numerically equivalent to (x - π) and deflates the polynomial by that factor. The lesson is that a root estimate must be very accurate for it to be successfully factored out of a polynomial.
4 Horner's method
Let's turn to the numerical mechanics of evaluating a polynomial. To compute
c1 + c2 x + c3 x^2 + c4 x^3
for some value of x, it is generally not a good idea to directly evaluate the terms as written. Instead, consider the following factorization of a quadratic
c1 + c2 x + c3 x^2 = c1 + x (c2 + c3 x)
We must have
c1 + c2 x + c3 x^2 + c4 x^3 = (x - r)(b1 + b2 x + b3 x^2)
= -r b1 - r b2 x - r b3 x^2
+ b1 x + b2 x^2 + b3 x^3
Equating coefficients of like powers of x we have
c4 = b3
c3 = -r b3 + b2
c2 = -r b2 + b1
c1 = -r b1
We can rearrange this to get
b3 = c4
b2 = c3 + r b3
b1 = c2 + r b2
0 = c1 + r b1
Generalizing to an arbitrary order polynomial we have
bn = c(n+1)
bk = c(k+1) + r b(k+1) for k = n-1, n-2, ..., 1
(3)
Additionally, the equation c1 + r b1 = 0 must automatically be satisfied if r is a root. This algorithm appears in the Appendix as polyDeflate.
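The recurrence (3) translates directly into code. A Python sketch (0-based indexing, so b[k] here corresponds to b(k+1) in the text):

```python
def poly_deflate(c, r):
    """Divide p(z) = c[0] + c[1] z + ... + c[n] z^n by (z - r), r a root."""
    n = len(c) - 1
    b = [0] * n
    b[n-1] = c[n]                    # b_n = c_{n+1}
    for k in range(n-2, -1, -1):     # b_k = c_{k+1} + r*b_{k+1}, recurrence (3)
        b[k] = c[k+1] + r * b[k+1]
    return b

# p(z) = (z-1)(z-2)(z-3) = -6 + 11z - 6z^2 + z^3; deflate out the root r = 1
print(poly_deflate([-6, 11, -6, 1], 1))  # [6, -5, 1], i.e. (z-2)(z-3)
```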
Now, one subtle point. If the coefficients ck are all real, then so are the derivative coefficients bk. It follows that if x is real then x - f(x)/f'(x) is real also. Therefore, if Newton's method starts on the real axis, it can never leave the real axis. For that reason we need to start with a complex value of x0.
The function polyRoots shown in the Appendix is our implementation. We use
polyHorner to evaluate the polynomials. Iteratively we use rootNewton to find a (any)
root. Then we use polyDeflate to remove that root's factor and reduce the order of the
polynomial.
This simple code actually works pretty well. Run on several hundred randomly generated 7th-order polynomials it failed only about one percent of the time. However, those failures demonstrate why numerical analysis is an active field of research. In any type of problem there are almost always some hard cases which thwart a given algorithm. This motivates people to develop more advanced methods. For polynomial root finding some of the more advanced methods are Laguerre's method, Bairstow's method, the Durand-Kerner method, the Jenkins-Traub algorithm and the companion-matrix method. In its built-in root-finding function Matlab uses the companion-matrix method, while in Scilab you can use either the companion-matrix method (default) or the Jenkins-Traub algorithm.
Both Scilab and Matlab provide a built-in function with the syntax r = roots(p), where p is a polynomial and r is an array containing all roots of p. For example, in Scilab
-->p = poly([1,2,3,4,5,6],'x','coeff')
p =
             2     3     4     5
1 + 2x + 3x + 4x + 5x + 6x
-->r = roots(p)
r =
0.2941946 + 0.6683671i
0.2941946 - 0.6683671i
- 0.6703320
- 0.3756952 + 0.5701752i
- 0.3756952 - 0.5701752i
Once you know the roots you can, if you wish, write the polynomial in factored form. If the polynomial has real coefficients then complex roots come in conjugate pairs, and a product of the form (x - z)(x - z*) = x^2 - 2 Re(z) x + |z|^2 has real coefficients. Here the factored form is
p = 6 (0.5332650 - 0.5883891x + x^2)(0.4662466 + 0.7513904x + x^2)(0.6703320 + x)
//////////////////////////////////////////////////////////////////////
// polyHorner.sci
// 2014-06-04, Scott Hudson
// Horner's method for polynomial evaluation. c=[c(1),c(2),...,c(n+1)]
// are coefficients of polynomial
// p(z) = c(1)+c(2)*z+c(3)*z^2+...+c(n+1)*z^n
// z is the number (can be complex) at which to evaluate polynomial
//////////////////////////////////////////////////////////////////////
function w=polyHorner(c, z)
  n = length(c)-1;
  w = c(n)+c(n+1)*z;
  for i=n-1:-1:1
    w = c(i)+w*z;
  end
endfunction
//////////////////////////////////////////////////////////////////////
// polyDeflate.sci
// 2014-06-04, Scott Hudson
// Given the coefficients c = [c(1),c(2),...,c(n+1)] of polynomial
// p(z) = c(1)+c(2)*z+c(3)*z^2+...+c(n+1)*z^n
// and root r, p(r)=0, remove a factor of (z-r) from p(z) resulting in
// q(z) = b(1)+b(2)*z+...+b(n)*z^(n-1) of order one less than p(z)
// Return array of coefficients b = [b(1),b(2),...,b(n)]
//////////////////////////////////////////////////////////////////////
function b=polyDeflate(c, r)
  n = length(c)-1;
  b = zeros(1,n);
  b(n) = c(n+1);
  for k=n-1:-1:1
    b(k) = c(k+1)+r*b(k+1);
  end
endfunction
//////////////////////////////////////////////////////////////////////
// polyRoots.sci
// 2014-06-04, Scott Hudson, for pedagogic purposes only!
// Given an array of coefficients c=[c(1),c(2),...,c(n+1)]
// defining a polynomial p(z) = c(1)+c(2)*z+...+c(n+1)*z^n
// find the n roots using Newton's method (with complex arguments)
// followed by polynomial deflation. The derivative polynomial is
// b(1)+b(2)*z+b(3)*z^2+...+b(n)*z^(n-1) =
// c(2)+2*c(3)*z+3*c(4)*z^2+...+n*c(n+1)*z^(n-1)
//////////////////////////////////////////////////////////////////////
function r=polyRoots(c)
  n = length(c)-1; //order of polynomial
  b = c(2:n+1).*(1:n); //coefficients of derivative: b(k) = c(k+1)*k
  deff('y=f(z)','y=polyHorner(c,z)'); //f(x) for Newton method
  deff('y=fp(z)','y=polyHorner(b,z)'); //fp(x) for same
  r = zeros(n,1);
  z0 = 1+%i; //initial search point, should not be real
  for i=1:n-1
    r(i) = rootNewton(z0,f,fp,1e-8);
    c = polyDeflate(c,r(i)); //remove the factor (z-r(i))
    m = length(c)-1; //order of deflated polynomial
    b = c(2:m+1).*(1:m); //b(k) = c(k+1)*k
  end
  r(n) = -c(1)/c(2); //last root is solution of c(1)+c(2)*z=0
endfunction
Scott Hudson
2015-08-18
Lecture 10
Linear algebra
1 Introduction
Engineers deal with many vector quantities, such as forces, positions, velocities, heat flow, stress
and strain, gravitational and electric fields and on and on. In this lecture we want to review basic
concepts and operations on vectors. We will see how a linear system of equations can naturally
arise from physical constraints on linear combinations of vectors, and how the resulting
bookkeeping naturally leads to the idea of a matrix and matrix-vector equations.
2 Vectors
Abstractly a vector is simply a one-dimensional array of numbers. The dimension of the vector is
the number of elements in the array. When arranged vertically we call this a column vector. The
following are three-dimensional column vectors
u = [u1; u2; u3],  v = [4; 3; 1],  r = (x, y, z)
In engineering applications a vector almost always represents a physical quantity that has both
magnitude and direction. The elements of the array are the components of the vector along the
corresponding coordinate axis. It's very useful to visualize a vector as an arrow in space with the
same direction and the arrow length representing the magnitude.
As an example, the two-dimensional vector
r = [3; 2]
can be graphically represented (Fig. 1) as an arrow from the origin to a point with rectangular
coordinates (3,2) . We say r has component 3 in the x direction and component 2 in the y
direction. Converting the x,y coordinates into polar form
x = ρ cos θ,  y = ρ sin θ
ρ = sqrt(x^2 + y^2) = sqrt(13) ≈ 3.61
θ = tan^-1(y/x) = tan^-1(2/3) ≈ 33.7°
We identify the length ρ as the magnitude of the vector and θ as its direction (relative to the x axis).
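As a quick numerical check of this conversion, a Python sketch:

```python
import math

x, y = 3.0, 2.0
rho = math.hypot(x, y)                  # magnitude sqrt(x^2 + y^2)
theta = math.degrees(math.atan2(y, x))  # direction relative to the x axis
print(rho, theta)  # about 3.61 and 33.7 degrees
```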
Here's a potential source of confusion. There is not necessarily anything physically significant
about the arrow's endpoint, (3,2) in this case. Say this vector represents a force acting on a
particle at the origin. Then the force exists at a single point, the origin. It does not exist at the
point (3,2) or anywhere except the origin. A force acting on a point particle has no extension in
space. We are simply using a physical arrow to visualize the magnitude and direction of the
force. The coordinates in this case might have units of newtons. On the other hand, suppose the
vector r represents a displacement in which a particle originally at the origin is moved to the
point x=3 , y=2 . In this case there is a physical significance to the arrow's endpoint. This is just
something to keep in mind. We get so accustomed to representing physical quantities such as
forces and velocities by arrows that it's easy to forget that they do not necessarily physically
coincide with these arrows.
A vector does not have to start at the origin. Suppose a particle is at location x=3 m , y=2 m
and moving with velocity vx = 2 m/s, vy = -1 m/s. We could illustrate this situation as shown in
Fig. 2.
In this case it's more physically meaningful to put the tail of the v vector at the particle's location.
Again, the location of the head of the v vector is not physically significant. In fact the r and v vectors don't even have the same units (m vs. m/s). However, we could say that if the particle traveled at velocity v for 1 second it would end up at the head of the vector v, which is location
Scott Hudson
2015-08-18
3/13
x=5 , y=1 . In fact, assuming v remains constant, we could write the particle position as the
vector
r(t) = [x(t); y(t)] = [3; 2] + [2; -1] t = [3 + 2t; 2 - t]
This illustrates the concept of vector addition. Often, however, v will represent the instantaneous velocity of a particle following a curved trajectory, as illustrated in Fig. 3. In this case the particle will not follow the v vector (except for an instant) or arrive at its head.
Even though the length of a ten-dimensional vector doesn't have a direct physical meaning, the norm concept is very useful. The Euclidean norm
||u|| = sqrt(u1^2 + u2^2 + ... + un^2)
is actually only one way to define a vector norm. The p-norm is defined as
||u||p = (|u1|^p + |u2|^p + ... + |un|^p)^(1/p)
The Euclidean norm of the Pythagorean theorem would then be called the 2-norm. When p → ∞ we obtain the infinity-norm
||u||∞ = max |uk|
which has important applications in, among other things, control systems theory. Suppose the kth
element of u represents the distances traveled during the kth segment of a trip. Then the 1-norm
||u||1 = |u1| + |u2| + ... + |un|
is just the total length of the trip. When vectors are represented by boldface letters the
corresponding italic letter is often taken to represent the norm
w = ||w||
In both Scilab and Matlab the function norm(x,p) calculates the p-norm of vector x. For
example
-->x = [1;2;3]
x =
1.
2.
3.
-->norm(x,1)
ans =
6.
-->norm(x,2)
ans =
3.7416574
-->norm(x,'inf')
ans =
3.
-->norm(x)
ans =
3.7416574
Note that norm(x) gives the default 2-norm or Euclidean norm. In this class we will take
norm to mean Euclidean norm, unless explicitly stated otherwise.
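The p-norms themselves are one-liners. A Python sketch mirroring the Scilab session above:

```python
def p_norm(u, p):
    """The p-norm: (sum of |u_k|^p)^(1/p)."""
    return sum(abs(uk) ** p for uk in u) ** (1.0 / p)

u = [1, 2, 3]
print(p_norm(u, 1))               # 6.0, the 1-norm
print(p_norm(u, 2))               # 3.7416..., the Euclidean norm
print(max(abs(uk) for uk in u))   # 3, the infinity-norm
```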
A vector with norm of 1 is called a unit vector. We can make any non-zero vector a unit vector
by dividing it by its norm. Commonly a hat or the letter a with a subscript is used to denote a
unit vector, for example
a_u = û = u/||u||
A unit vector represents a pure direction. In two and three dimensions this is literally a
direction in space (Fig. 4), but in higher dimensions it's a direction only in an abstract sense.
The concept is still very useful, however.
3 Scalar/inner/dot product
In three dimensions the scalar product (also called the inner product or dot product) of two
vectors is
u·v = u1 v1 + u2 v2 + u3 v3 = ||u|| ||v|| cos θuv
(2)
where θuv is the angle between the vectors. The scalar product readily generalizes to n-dimensional vectors as
u·v = Σ (i=1 to n) ui vi
(3)
In n dimensions we take
u·v = ||u|| ||v|| cos θuv
(4)
as the definition of the angle between two vectors, via
cos θuv = u·v / (||u|| ||v||) = û·v̂
(5)
Two vectors with a zero scalar product are said to be orthogonal. Since cos 90° = 0, this means that the vectors form a right angle; they are perpendicular.
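A Python sketch of the scalar product, the angle formula, and the orthogonality test:

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

u, v = [1, 0, 0], [1, 1, 0]
theta = math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))
print(theta)                # 45.0 degrees between u and v
print(dot([1, 0], [0, 5]))  # 0: the vectors are orthogonal
```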
4 Vector/cross product
In three dimensions the vector product (also called the cross product) of two vectors is
w = u × v = [u2 v3 - u3 v2; u3 v1 - u1 v3; u1 v2 - u2 v1]
(6)
The vector product is specific to three dimensions; it does not readily generalize to n dimensions. It is very important in many applications. For example, the torque about the origin is the cross product of position and force. The magnitude of the cross product is
||w|| = ||u|| ||v|| sin θuv
(7)
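A Python sketch of definition (6); the unit vectors along the coordinate axes give the familiar right-hand-rule result:

```python
def cross(u, v):
    """Cross product of three-dimensional vectors, per definition (6)."""
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

print(cross([1, 0, 0], [0, 1, 0]))  # [0, 0, 1]: x-hat cross y-hat = z-hat
```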
5 Matrix-vector product
Suppose we have two, two-dimensional vectors
u = [u1; u2],  v = [v1; v2]
(8)
Suppose we want to express some vector y as a linear combination of u and v with scalar coefficients x1 and x2:
x1 [u1; u2] + x2 [v1; v2] = [y1; y2]
(9)
In terms of components
u1 x1 + v1 x2 = y1
u2 x1 + v2 x2 = y2
(10)
This can be written compactly as
[u1 v1; u2 v2] [x1; x2] = [y1; y2]
(11)
where the two-dimensional array is a matrix, the columns of which are the vectors u and v.
Thinking of this array as a single entity A we can specify its elements using two indices
[a11 a12; a21 a22] [x1; x2] = [y1; y2]
(12)
so that our system takes the form
A x = y
(13)
with
A = [a11 a12; a21 a22],  x = [x1; x2],  y = [y1; y2]
(14)
More generally, for m equations in n unknowns,
A = [a11 a12 ... a1n; a21 a22 ... a2n; ...; am1 am2 ... amn]
(15)
x = [x1; x2; ...; xn]
(16)
Fig. 5 The product Ax=y can be thought of as summing scaled versions of the column vectors of A.
y = [y1; y2; ...; ym]
(17)
with components
yi = Σ (j=1 to n) aij xj
(18)
j =1
Note that the product Ax as defined by (18) only works if the number of columns of A is equal to the number of elements of x (in this case both are n); each column of A gets multiplied by the corresponding element of x.
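Definition (18) is a couple of lines of Python (a sketch; the row-by-row sums here are equivalent to summing scaled columns of A):

```python
def matvec(A, x):
    """y_i = sum over j of A[i][j] * x[j]."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

A = [[1, 2],
     [3, 4]]
print(matvec(A, [2, 1]))  # [4, 10] = 2*(first column) + 1*(second column)
```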
We are most often (but not always) interested in the case m=n where the matrix A is square. In
any case we can visualize the linear system
Ax=y
(19)
as (Fig. 5)
In this visualization we assume that x is known and we want to calculate y. Often we are faced
with the inverse problem where y is known and we want to calculate x. We formally write
x = A^(-1) y
(20)
where A^(-1) is the inverse of matrix A. Solving this problem will be the topic of the next lecture.
For now we want to motivate our study of linear systems of equations by considering two
important problems which give rise to such systems.
Static equilibrium requires that the member, reaction, and applied forces sum to zero
(21)
at each joint.
To be specific let's take the system illustrated in Fig. 7. There are four joints located at (xk, yk), k = 1, 2, 3, 4 and five members with compression forces uk, k = 1, 2, 3, 4, 5. An external force is applied at joint 4 with components (f4x, f4y). A reaction force (due to the mounting of the system) is applied at joint 1 with components (r1x, r1y) and a reaction force with y component r2y is applied at joint 2. Assuming the force (f4x, f4y) is known, there are eight unknowns which form the components of an eight-dimensional vector:
Fig. 7: Geometry of the truss problem. Member mi has compression force ui and connects to other members at some joints jk. Applied forces f and reaction forces r also act on two or more joints.
u = [u1; u2; u3; u4; u5; u6 = r1x; u7 = r1y; u8 = r2y]
(22)
The direction of the member connecting joints 2 and 3, for example, satisfies
tan θ23 = (y2 - y3)/(x2 - x3)
(23)
and in general
tan θij = (yi - yj)/(xi - xj)
(24)
At j2 the force-balance conditions are
u1 cos θ21 + u3 cos θ23 + u5 cos θ24 = 0
u1 sin θ21 + u3 sin θ23 + u5 sin θ24 + u8 = 0
(26)
with similar pairs of equations (27) and (28) at j3 and j4. Collecting all eight equations into matrix form gives
A u = b
(29)
with
A = [ cos θ12  cos θ13  0        0        0        1  0  0;
      sin θ12  sin θ13  0        0        0        0  1  0;
      cos θ21  0        cos θ23  0        cos θ24  0  0  0;
      sin θ21  0        sin θ23  0        sin θ24  0  0  1;
      0        cos θ31  cos θ32  cos θ34  0        0  0  0;
      0        sin θ31  sin θ32  sin θ34  0        0  0  0;
      0        0        0        cos θ43  cos θ42  0  0  0;
      0        0        0        sin θ43  sin θ42  0  0  0 ]
(30)
and
b = [0; 0; 0; 0; 0; 0; f4x; f4y]
(31)
Notice that the applied forces show up in the b vector (the knowns) while the member and reaction forces form the u vector (the unknowns).
From a programming perspective, the challenge would be to generalize this process to allow the
solution of an arbitrary truss problem. This would mostly involve figuring out a systematic way
to do the bookkeeping involved in forming the A matrix and the b vector.
∂²f/∂x² + ∂²f/∂y² = 0
(32)
On a uniform grid the Laplacian can be approximated by finite differences, leading to the five-point formula
f(i,j) = (1/4)[f(i+1,j) + f(i-1,j) + f(i,j+1) + f(i,j-1)]
(33)
or equivalently
4 f(i,j) - f(i+1,j) - f(i-1,j) - f(i,j+1) - f(i,j-1) = 0
We assume the f values on the boundary are specified. These form the boundary conditions. Our
task is then to calculate the interior f values such that Laplace's equation is satisfied.
To use our matrix-vector formalism (29) the unknowns need to be arranged in a one-dimensional
column vector. As illustrated (Fig. 8), one way to do this is to number the interior points from 1
to 16 as unknowns
uk = f(i,j) where k = 4(j-2) + (i-1)
Just running through the 16 points and considering (33) we can, by inspection, obtain a 16-by-16
linear system of the form (29) with
A = [ T  -I   0   0;
     -I   T  -I   0;
      0  -I   T  -I;
      0   0  -I   T ]
(34)
written here in 4-by-4 blocks, where T is the tridiagonal matrix
T = [ 4 -1  0  0;
     -1  4 -1  0;
      0 -1  4 -1;
      0  0 -1  4 ]
and I is the 4-by-4 identity matrix.
and
b = [ f(1,2)+f(2,1); f(3,1); f(4,1); f(5,1)+f(6,2);
      f(1,3); 0; 0; f(6,3);
      f(1,4); 0; 0; f(6,4);
      f(1,5)+f(2,6); f(3,6); f(4,6); f(5,6)+f(6,5) ]
(35)
The kth row of this system is a statement of (33) for unknown uk. This illustrates a few things.
First, the dimension n of our linear system is determined by the number of unknowns, not by the
2 or 3 dimensions of physical space. This number can easily be very large. Suppose we wanted to
have a 100-by-100 two-dimensional grid of unknown field values. This is not actually very large,
after all, a 100-by-100 pixel image is essentially a thumbnail. Yet this results in n=10,000
unknowns and a matrix A that is 10,000-by-10,000 in size. In three dimensions the problem dimension would be n = 100^3 = 1,000,000, and our matrix would have (10^6)^2, or one trillion, entries! Yet these are often the size of problems we need to solve in engineering applications.
Second, and fortunately for us, a glance at (34) and consideration of the way it was built using (33) reveals that the great majority of entries in A will be zeros. We say that A is a sparse matrix. So, even if it does contain a trillion elements, only a tiny fraction are non-zero. Sparse matrix techniques exploit this fact to store and manipulate such matrices using orders-of-magnitude less resources than would be needed for dense matrices, and they enable us to solve physically significant problems using available computing power. We will take a look at sparse matrix techniques in a later lecture.
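To make the sparsity concrete, a Python sketch that assembles a 4-by-4-grid Laplacian matrix (under the sign convention 4 on the diagonal, -1 for each interior neighbor) and counts its non-zero entries:

```python
n = 4          # interior grid is n-by-n
N = n * n      # total number of unknowns
A = [[0.0] * N for _ in range(N)]
for j in range(n):
    for i in range(n):
        k = j * n + i          # row index for unknown at grid point (i, j)
        A[k][k] = 4.0
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ii, jj = i + di, j + dj
            if 0 <= ii < n and 0 <= jj < n:   # neighbor is an interior point
                A[k][jj * n + ii] = -1.0

nnz = sum(1 for row in A for a in row if a != 0.0)
print(nnz, N * N)  # 64 non-zeros out of 256 entries
```

Only a quarter of the entries are non-zero even for this tiny grid, and the fraction shrinks rapidly as the grid grows.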
Scott Hudson
2015-08-18
Lecture 11
Linear systems of equations I
1 Introduction
In this lecture we consider ways to solve a linear system Ax=b for x when given A and b.
Writing out the components our system has the form
[a11 a12 a13 ... a1n; a21 a22 a23 ... a2n; a31 a32 a33 ... a3n; ...; an1 an2 an3 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
(1)
Each row of A corresponds to a single linear equation in the unknown x values. The ith equation
is
ai1 x1 + ai2 x2 + ... + ain xn = bi
and the jth equation is
aj1 x1 + aj2 x2 + ... + ajn xn = bj
We will use the following facts to transform the system (1) into a form in which the solution is
trivially apparent or at least can be easily calculated.
Fact 1: We can scale any equation by a non-zero constant (c ≠ 0) without changing the solution
ai1 x1 + ai2 x2 + ... + ain xn = bi  →  c ai1 x1 + c ai2 x2 + ... + c ain xn = c bi
Fact 2: We can replace an equation by its sum or difference with another equation without changing the solution
ai1 x1 + ai2 x2 + ... + ain xn = bi  →  (ai1 + aj1) x1 + (ai2 + aj2) x2 + ... + (ain + ajn) xn = bi + bj
2 Gauss-Jordan elimination
If A were the identity matrix, the system would read
[1 0 0 ... 0; 0 1 0 ... 0; 0 0 1 ... 0; ...; 0 0 0 ... 1] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
(2)
and the solution would simply be xi = bi.
Gauss-Jordan elimination is a process to convert an arbitrary system (1) into the trivial system
(2). Since we want to end up with a11 = 1, we use Fact 1 to multiply the first row of A and b by 1/a11
a1j ← a1j/a11, j = 1, 2, ..., n,  b1 ← b1/a11
to obtain the form
[1 a12 a13 ... a1n; a21 a22 a23 ... a2n; a31 a32 a33 ... a3n; ...; an1 an2 an3 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
(note a12, a13, etc. will have changed values). Now we use Fact 2 to eliminate the elements a21, a31, ..., an1 in the first column by subtracting ai1 times the first row from the ith row
aij ← aij - ai1 a1j, j = 1, 2, ..., n,  bi ← bi - ai1 b1
for i = 2, 3, ..., n resulting in
[1 a12 a13 ... a1n; 0 a22 a23 ... a2n; 0 a32 a33 ... a3n; ...; 0 an2 an3 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
The first column is now in the desired form. Let's move to the second column. Since we want to
end up with a22 = 1, use Fact 1 to multiply the second row by 1/a22
a2j ← a2j/a22, j = 2, 3, ..., n,  b2 ← b2/a22
Note that we don't bother with j = 1 since a21 = 0. Then we use Fact 2 to eliminate all elements ai2, i ≠ 2
aij ← aij - ai2 a2j, j = 2, 3, ..., n,  bi ← bi - ai2 b2
Again we don't bother with j = 1 since a21 = 0. We end up with
[1 0 a13 ... a1n; 0 1 a23 ... a2n; 0 0 a33 ... a3n; ...; 0 0 an3 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
We now move on to the third column and so on, continuing until we have the form shown in (2).
Notice that the element bi is transformed in the same manner as the elements aij. This suggests that we form an augmented matrix as A plus b as an extra column and then simply transform the augmented elements as a whole. Let's call the augmented matrix Ã. Then
Ã = [a11 a12 a13 ... a1n b1; a21 a22 a23 ... a2n b2; a31 a32 a33 ... a3n b3; ...; an1 an2 an3 ... ann bn]
  = [a11 a12 a13 ... a1n a1,n+1; a21 a22 a23 ... a2n a2,n+1; a31 a32 a33 ... a3n a3,n+1; ...; an1 an2 an3 ... ann an,n+1]
where ai,n+1 = bi.
As an example, take
A = [1 2; 3 4],  b = [4; 10]
The augmented matrix is
Ã = [1 2 4; 3 4 10]
The first pivot, a11, is already 1. We now subtract 3 times row 1 from row 2
Ã = [1 2 4; 0 -2 -2]
Dividing row 2 by the pivot a22 = -2 gives
Ã = [1 2 4; 0 1 1]
Finally, subtracting 2 times row 2 from row 1 produces
Ã = [1 0 2; 0 1 1]
The last column now contains the solution: x1 = 2, x2 = 1.
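The full procedure is short in any language. A Python sketch of Gauss-Jordan elimination without pivoting, applied to a 2-by-2 system:

```python
def gauss_jordan(A, b):
    """Solve Ax = b by Gauss-Jordan elimination (no pivoting)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        pivot = M[k][k]
        for j in range(k, n + 1):          # normalize row k so M[k][k] = 1
            M[k][j] /= pivot
        for i in range(n):                 # eliminate column k in other rows
            if i != k:
                factor = M[i][k]
                for j in range(k, n + 1):
                    M[i][j] -= factor * M[k][j]
    return [M[i][n] for i in range(n)]     # last column is the solution

print(gauss_jordan([[1.0, 2.0], [3.0, 4.0]], [4.0, 10.0]))  # [2.0, 1.0]
```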
To find the inverse A^(-1), note that its columns x1, x2, ..., xn satisfy
A x1 = [1; 0; 0; ...; 0],  A x2 = [0; 1; 0; ...; 0],  ...,  A xn = [0; 0; 0; ...; 1]
Then
A (x1, x2, x3, ..., xn) = [1 0 0 ... 0; 0 1 0 ... 0; 0 0 1 ... 0; ...; 0 0 0 ... 1]
that is, A times the matrix whose columns are the xk equals the identity matrix. We therefore form the augmented matrix
Ã = [a11 a12 a13 ... a1n 1 0 0 ... 0; a21 a22 a23 ... a2n 0 1 0 ... 0; a31 a32 a33 ... a3n 0 0 1 ... 0; ...; an1 an2 an3 ... ann 0 0 0 ... 1]
and perform Gauss-Jordan elimination on Ã to effectively solve these n problems in parallel. When complete, the last n columns of Ã will be the inverse A^(-1). In the Appendix, program linearGaussJordanInverse gives a Scilab implementation of this algorithm. Note the slight differences between this and linearGaussJordan.
Notice that the Gauss-Jordan elimination algorithm consists of three nested levels of for loops. Each runs, roughly speaking, over on the order of n values. It follows that the total number of operations is on the order of n·n·n = n^3. This gives a measure of the computational complexity and therefore of the amount of cpu time required for the algorithm. For this reason you will often see statements of the form "matrix inversion is an O(n^3) operation," where the big O represents "order of."
3 Gaussian elimination
Gauss-Jordan elimination is a logical way to solve Ax=b or to find A1 . However, there are
faster and more robust methods. In particular we can solve Ax=b with only about 1/3 the
number of operations using the so-called LU decomposition that we will develop in the next
lecture. The price we pay is that getting the solution is more convoluted than it is for Gauss-Jordan elimination; it involves various substitution operations. Here we'll introduce this idea by considering the so-called Gaussian elimination algorithm.
In Gauss-Jordan elimination we zero-out all elements of A except those on the diagonal, and
we normalize the diagonal elements to 1. In Gaussian elimination we don't bother with the
elements above the diagonal; we only zero-out the elements below the diagonal. We also don't
bother to normalize the diagonal elements to 1. The result is a system in the form
[a11 a12 a13 ... a1n; 0 a22 a23 ... a2n; 0 0 a33 ... a3n; ...; 0 0 0 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
(3)
The matrix is in upper-triangular form; all elements below the diagonal are zero. The algorithm
to achieve this is an obvious variation of Gauss-Jordan elimination.
Gaussian elimination algorithm
Form the augmented matrix Ã = (A, b)
for k = 1, 2, ..., n-1
  for i = k+1, k+2, ..., n
    for j = n+1, n, n-1, ..., k
      aij ← aij - akj aik/akk
Since it has three nested for loops, each running over on the order of n values, this is also an O(n^3) process. A problem with (3) is that the solution is not obvious (as it is for (2)) except for the last row, which gives us
ann xn = bn  →  xn = bn/ann
So, xn is easy to get. The next-to-last equation is
a(n-1),(n-1) x(n-1) + a(n-1),n xn = b(n-1)
But, we already know xn, so we can solve for
x(n-1) = (1/a(n-1),(n-1)) (b(n-1) - a(n-1),n xn)
Likewise
x(n-2) = (1/a(n-2),(n-2)) (b(n-2) - [a(n-2),(n-1) x(n-1) + a(n-2),n xn])
and so on, back up to x1.
Back-substitution algorithm
If A is upper-triangular, the solution to Ax = b is given by
for i = n, n-1, n-2, ..., 1
  xi = (1/aii) (bi - Σ (j=i+1 to n) aij xj)
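A Python sketch of back-substitution for an upper-triangular system:

```python
def back_substitute(A, b):
    """Solve Ax = b when A is upper-triangular."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):         # last unknown first
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# x3 = 6/3 = 2, then x2 = (5 - 2)/1 = 3, then x1 = (4 - 6 - 2)/2 = -2
U = [[2.0, 2.0, 1.0],
     [0.0, 1.0, 1.0],
     [0.0, 0.0, 3.0]]
print(back_substitute(U, [4.0, 5.0, 6.0]))  # [-2.0, 3.0, 2.0]
```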
Consider again the full system
[a11 a12 a13 ... a1n; a21 a22 a23 ... a2n; a31 a32 a33 ... a3n; ...; an1 an2 an3 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
Each row represents a single linear equation in the unknowns. In what way will the solution
change if we swap, say, the first and third rows of both A and b?
[a31 a32 a33 ... a3n; a21 a22 a23 ... a2n; a11 a12 a13 ... a1n; ...; an1 an2 an3 ... ann] [x1; x2; x3; ...; xn] = [b3; b2; b1; ...; bn]
The answer is that it won't. We've just rearranged the same n equations in n unknowns. It doesn't
matter in what order we write them; the solution will remain the same.
Fact 3: Any two rows of the augmented matrix Ã can be swapped without changing the solution vector x.
What about swapping columns? Say we swap the first and third columns of A.
[a13 a12 a11 ... a1n; a23 a22 a21 ... a2n; a33 a32 a31 ... a3n; ...; an3 an2 an1 ... ann] [x1; x2; x3; ...; xn] = [b1; b2; b3; ...; bn]
Thinking of Ax as a linear combination of the columns of A, the ith column of A gets multiplied by xi. Swapping the columns is equivalent to relabeling the two corresponding components of x. In other words if we write the system as
[a13 a12 a11 ... a1n; a23 a22 a21 ... a2n; a33 a32 a31 ... a3n; ...; an3 an2 an1 ... ann] [x3; x2; x1; ...; xn] = [b1; b2; b3; ...; bn]
then the solution is unchanged; only the labels of the unknowns x1 and x3 have been exchanged.
5 References
1. Golub, G.H. and C.F. Van Loan, Matrix Computations, Johns Hopkins University Press,
1983, ISBN: 0-8018-3011-7.
//////////////////////////////////////////////////////////////////////
// linearGaussJordan.sci
// 2014-06-23, Scott Hudson, for pedagogic purposes
// Solves Ax=b for x using Gauss-Jordan elimination.
// No pivoting is performed.
//////////////////////////////////////////////////////////////////////
function x=linearGaussJordan(A, b)
n = length(b);
A = [A,b]; //form augmented matrix
for k=1:n //A(k,k) is the pivot
for j=n+1:-1:k //normalize row k so A(k,k)=1
A(k,j) = A(k,j)/A(k,k);
end
for i=1:n //eliminate a(i,k) for all i~=k
if (i~=k) //a Pivot does not eliminate itself
for j=n+1:-1:k
A(i,j) = A(i,j)-A(k,j)*A(i,k);
end
end
end //i loop
end //k loop
x = A(:,n+1); //last column of augmented matrix is now x
endfunction
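For readers more comfortable with Python, here is a rough transcription of linearGaussJordan above (an illustrative sketch; like the Scilab original it performs no pivoting):

```python
def gauss_jordan_solve(A, b):
    """Solve Ax = b by Gauss-Jordan elimination without pivoting.
    A is a list of n row-lists, b a list of length n; both are left unchanged."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # form augmented matrix
    for k in range(n):                  # M[k][k] is the pivot
        p = M[k][k]
        for j in range(n, k - 1, -1):   # normalize row k so M[k][k] = 1
            M[k][j] /= p
        for i in range(n):              # eliminate column k in every other row
            if i != k:
                f = M[i][k]
                for j in range(n, k - 1, -1):
                    M[i][j] -= M[k][j] * f
    return [M[i][n] for i in range(n)]  # last column is the solution
```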
//////////////////////////////////////////////////////////////////////
// linearGaussJordanInverse.sci
// 2014-06-23, Scott Hudson, for pedagogic purposes
// Forms inverse of matrix A using Gauss-Jordan elimination.
// No pivoting is performed.
//////////////////////////////////////////////////////////////////////
function Ainv=linearGaussJordanInverse(A)
n = size(A,'r');
A = [A,eye(A)]; //form augmented matrix
for k=1:n //A(k,k) is the pivot
for j=2*n:-1:k //normalize row k so A(k,k)=1
A(k,j) = A(k,j)/A(k,k);
end
for i=1:n //eliminate a(i,k) for all i~=k
if (i~=k) //a Pivot does not eliminate itself
for j=2*n:-1:k
A(i,j) = A(i,j)-A(k,j)*A(i,k);
end
end
end //i loop
end //k loop
Ainv = A(:,n+1:2*n); //last n columns of augmented matrix now form the inverse
endfunction
//////////////////////////////////////////////////////////////////////
// linearGaussian.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes
// Solves Ax=b for x using Gaussian elimination and backsubstitution.
// No pivoting is performed.
//////////////////////////////////////////////////////////////////////
function x=linearGaussian(A, b)
n = length(b);
A = [A,b]; //form augmented matrix
//Gaussian elimination loop
for k=1:n-1 //A(k,k) is the pivot
for i=k+1:n //eliminate a(i,k) for all i>k
for j=n+1:-1:k
A(i,j) = A(i,j)-A(k,j)*A(i,k)/A(k,k);
end
end //i loop
end //k loop
//Backsubstitution loop
x = zeros(n,1);
x(n) = A(n,n+1)/A(n,n);
for i=n-1:-1:1
x(i) = A(i,n+1);
for j=i+1:n
x(i) = x(i)-A(i,j)*x(j);
end
x(i) = x(i)/A(i,i);
end
endfunction
//////////////////////////////////////////////////////////////////////
// linearGaussianPivot.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes
// Solves Ax=b for x using Gaussian elimination and backsubstitution.
// Partial pivoting is performed.
//////////////////////////////////////////////////////////////////////
function x=linearGaussianPivot(A, b)
n = length(b);
A = [A,b]; //form augmented matrix
//Gaussian elimination loop
for k=1:n-1 //A(k,k) is the pivot
//see if there is a larger pivot below this in the kth column
Amax = abs(A(k,k));
imax = k;
for i=k+1:n
if abs(A(i,k))>Amax
Amax = abs(A(i,k));
imax = i;
end
end
if (imax~=k) //we found a larger pivot, swap rows
w = A(k,:); //copy the kth row
A(k,:) = A(imax,:); //replace it with the imax row
A(imax,:) = w; //replace the imax row with the original kth row
end
//pivoting complete
for i=k+1:n //eliminate a(i,k) for all i>k
for j=n+1:-1:k
A(i,j) = A(i,j)-A(k,j)*A(i,k)/A(k,k);
end
end //i loop
end //k loop
//Backsubstitution loop
x = zeros(n,1);
x(n) = A(n,n+1)/A(n,n);
for i=n-1:-1:1
x(i) = A(i,n+1);
for j=i+1:n
x(i) = x(i)-A(i,j)*x(j);
end
x(i) = x(i)/A(i,i);
end
endfunction
Lecture 12
Linear systems of equations II
1 Introduction
We have looked at Gauss-Jordan elimination and Gaussian elimination as ways to solve a linear
system Ax = b. We now turn to the LU decomposition, which is arguably the best way to
solve a linear system. We will then see how to use the backslash operator built into
Scilab/Matlab.
2 LU decomposition
Generally speaking, a square matrix A, for example
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} \quad (1)$$

can be factored into the product A = LU of a unit-lower-triangular matrix L and an upper-triangular matrix U:

$$L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ l_{21} & 1 & 0 & 0 \\ l_{31} & l_{32} & 1 & 0 \\ l_{41} & l_{42} & l_{43} & 1 \end{pmatrix}, \quad U = \begin{pmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{pmatrix} \quad (2)$$

Multiplying out the factors,

$$LU = \begin{pmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ u_{11} l_{21} & u_{22}+u_{12} l_{21} & u_{23}+u_{13} l_{21} & u_{24}+u_{14} l_{21} \\ u_{11} l_{31} & u_{22} l_{32}+u_{12} l_{31} & u_{33}+u_{23} l_{32}+u_{13} l_{31} & u_{34}+u_{24} l_{32}+u_{14} l_{31} \\ u_{11} l_{41} & u_{22} l_{42}+u_{12} l_{41} & u_{33} l_{43}+u_{23} l_{42}+u_{13} l_{41} & u_{44}+u_{34} l_{43}+u_{24} l_{42}+u_{14} l_{41} \end{pmatrix} \quad (3)$$
Comparing (1) and (3) we see immediately from the first row that

u_11 = a_11, u_12 = a_12, u_13 = a_13, u_14 = a_14   (4)

Comparing the first columns then gives the l_i1, and working through the remaining rows and columns in turn determines all of the other u_ij and l_ij. Since each element is used exactly once, the computed values can be stored compactly in place of A:
$$A \leftarrow \begin{pmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ l_{21} & u_{22} & u_{23} & u_{24} \\ l_{31} & l_{32} & u_{33} & u_{34} \\ l_{41} & l_{42} & l_{43} & u_{44} \end{pmatrix}$$
The diagonal values of L (all 1's) are understood. The algorithm to do this can be stated very
concisely as
LU decomposition algorithm
for k = 1, 2, …, n−1
  for i = k+1, k+2, …, n
    a_ik ← a_ik / a_kk
    for j = k+1, k+2, …, n
      a_ij ← a_ij − a_ik a_kj
A Scilab implementation of this appears in the Appendix as linearLU. Note the three nested
for loops, implying that this is an O(n3) process.
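In Python, the same compact in-place algorithm might look like this (a sketch with no pivoting; as in linearLU, A is overwritten with the L and U entries):

```python
def lu_decompose(A):
    """In-place LU decomposition without pivoting: on return A holds U on and
    above the diagonal and the multipliers of L (unit diagonal) below it."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]              # a_ik <- a_ik / a_kk (entry of L)
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return A
```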
Suppose we have the decomposition A = LU. Then Ax = L(Ux) = b. Defining y = Ux, we first solve Ly = b for y:

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ l_{21} & 1 & 0 & 0 \\ l_{31} & l_{32} & 1 & 0 \\ l_{41} & l_{42} & l_{43} & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}$$
The first row gives y_1 = b_1 directly, and in general

y_i = b_i − Σ_{k=1}^{i−1} l_ik y_k

This process used to solve for y is called forward substitution. The complete algorithm is

Forward-substitution algorithm
The solution of Ly = b where L is unit-lower-triangular is
for i = 1, 2, …, n
  y_i = b_i − Σ_{k=1}^{i−1} l_ik y_k
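A Python sketch of forward substitution (`forward_substitute` is an illustrative name; the unit diagonal of L is implied, so only the strictly lower entries are referenced):

```python
def forward_substitute(L, b):
    """Solve Ly = b where L is unit lower triangular; the ones on the
    diagonal are implied, so L[i][i] is never read."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = b[i]
        for k in range(i):          # subtract the already-known terms
            s -= L[i][k] * y[k]
        y[i] = s
    return y
```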
Having found y, we then solve Ux = y:

$$\begin{pmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}$$

and back-substitution is given as

x_i = (1/u_ii) (y_i − Σ_{j=i+1}^{n} u_ij x_j)
Moreover, in a matrix-vector equation of the form Ax=b , typically A represents the geometry,
structure and/or physical properties of the system being analyzed, x is the response of the system
(displacements, temperature, etc.) and b represents the inputs (such as forces or heat flows). It
is not uncommon to want to calculate the system response for several different inputs (different b
vectors). The beauty of the LU decomposition is that it only needs to be performed once at a
computational cost of O( n3) . The solution x for a new input b then simply requires
application of the forward-substitution and back-substitution algorithms, which are only O( n2 ) .
This is a tremendous benefit over repeatedly solving Ax=b for each b.
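The "factor once, solve many times" pattern can be sketched in Python as follows (`lu_factor` and `lu_solve` are illustrative names, not a library API):

```python
def lu_factor(A):
    """LU-factor A in place, no pivoting; L has unit diagonal. O(n^3)."""
    n = len(A)
    for k in range(n - 1):
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return A

def lu_solve(LU, b):
    """Given the factored matrix, solve Ax = b by forward- then
    back-substitution; each solve is only O(n^2)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forward substitution
        y[i] = b[i] - sum(LU[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution
        s = y[i] - sum(LU[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / LU[i][i]
    return x

# Factor once, then solve for two different right-hand sides.
LU = lu_factor([[4.0, 3.0], [6.0, 3.0]])
x1 = lu_solve(LU, [7.0, 9.0])
x2 = lu_solve(LU, [1.0, 0.0])
```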
-->A = [1,2,3;4,5,6;7,8,9]
 A  =
    1.    2.    3.
    4.    5.    6.
    7.    8.    9.
-->P = [0,1,0;0,0,1;1,0,0]
 P  =
    0.    1.    0.
    0.    0.    1.
    1.    0.    0.
-->P*A
 ans  =
    4.    5.    6.
    7.    8.    9.
    1.    2.    3.
P is an identity matrix in which the rows have been swapped. In this case the identity matrix
rows 1,2,3 are reordered 2,3,1. The product PA then simply reorders the rows of A in the same
manner. With row pivoting, we actually compute the LU decomposition of PA. Multiplying both
sides of Ax=b by P we have PAx=Pb , and then
LUx=Pb
This shows that we need only apply the permutation to a b vector and then we can obtain the
correct x using forward- and back-substitution. Instead of generating the matrix P we can form a
permutation vector p which lists the reordering of the rows of A, for example
$$p = \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$$

This tells us to perform the reordering

$$b \rightarrow \begin{pmatrix} b_2 \\ b_3 \\ b_1 \end{pmatrix}$$
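In code, applying a permutation vector is just indexing. A tiny Python illustration (note Python indexes from 0, while Scilab's p is 1-based):

```python
# With row pivoting we factor PA = LU and record only the permutation
# vector p; applying it to b is simple indexing.
p = [1, 2, 0]            # rows reordered 2, 3, 1 (1-based: p = [2, 3, 1])
b = [10.0, 20.0, 30.0]
b_permuted = [b[i] for i in p]
```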
As a check
-->A*x
ans =
1.
2.
3.
3 Matrix determinant
In your linear algebra course you learned about the determinant of a square matrix, which we will
write as det A. The determinant is very important theoretically. Numerically it does not find
much application, but on occasion you may want to compute it. We know from the properties of
the determinant that if PA = LU then det A = ± u_11 u_22 ⋯ u_nn, where the sign is positive for an
even number of row swaps and negative for an odd number.
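A sketch of this computation in Python (`det_from_lu` is an illustrative name; the diagonal of U and the swap count would come from a routine such as linearLUP):

```python
def det_from_lu(U_diag, n_swaps):
    """det A = (-1)**n_swaps * product of U's diagonal, given PA = LU."""
    d = (-1.0) ** n_swaps
    for u in U_diag:
        d *= u
    return d
```

For example, LU-factoring [[1,2],[3,4]] with partial pivoting involves one row swap and leaves diagonal (3, 2/3), recovering det = −2.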
5 Singular matrices
Even with pivoting, all of the methods we have studied may fail to solve Ax=b because there
may be no solution! In fact we know that if det A=0 then A is a singular matrix, A1 fails to
exist, and it is therefore impossible to calculate x=A1 b . Consider the following example
-->A = [1,2,3;4,5,9;6,7,13]
 A  =
    1.    2.     3.
    4.    5.     9.
    6.    7.    13.
-->[A,p] = linearLUP(A)
 p  =
    3.
    1.
    2.
 A  =
    6.           7.           13.
    0.1666667    0.8333333    0.8333333
    0.6666667    0.4          0.
Notice that a 33 =u 33=0 , even with pivoting. It follows that det A=0 and the matrix is singular.
If we try to perform forward- and back-substitution we will fail when we come to the step where
we are supposed to divide by u 33 .
The problem with this A is that the third column is the sum of the first two columns; the three
columns are linearly dependent, and the matrix is singular. There is no solution to Ax=b . If we
ask Scilab to find one we are told
-->x = A\[1;2;3]
Warning :
matrix is close to singular or badly scaled. rcond = 0.0000D+00
computing least squares solution. (see lsq).
 x  =
    0.
    0.7105263
  - 0.1578947
-->A*x
 ans  =
    0.9473684
    2.1315789
    2.9210526
We're warned that the matrix is close to singular and given a least squares solution. Testing
Ax we get something close to, but not equal to, b. There is no true solution, but Scilab gives us
the best that can be achieved in its place.
6 References
1. Golub, G.H. and C.F. Van Loan, Matrix Computations, Johns Hopkins University Press,
1983, ISBN: 0-8018-3011-7.
//////////////////////////////////////////////////////////////////////
// linearLU.sci
// 2014-06-24, Scott Hudson, for pedagogic purposes
// Given an n-by-n matrix A, calculate the LU decomposition
// A = LU where U is upper triangular and L is unit lower triangular.
// A is replaced by the elements of L and U.
// No pivoting is performed.
//////////////////////////////////////////////////////////////////////
function A=linearLU(A)
n = size(A,1); //A is n-by-n
for k=1:n-1
for i=k+1:n
A(i,k) = A(i,k)/A(k,k);
for j=k+1:n
A(i,j) = A(i,j)-A(i,k)*A(k,j);
end
end
end
endfunction
//////////////////////////////////////////////////////////////////////
// linearLUsubstitute.sci
// 2014-06-24, Scott Hudson, for pedagogic purposes
// A has been replaced by its LU decomposition. This function applies
// forward- and back-substitution to solve Ax=b. Note that if LUP
// decomposition was performed then b(p) should be used as the b
// argument where p is the permutation vector.
//////////////////////////////////////////////////////////////////////
function x=linearLUsubstitute(A, b)
n = length(b);
y = zeros(b);
for i=1:n //forward-substitution
y(i) = b(i);
for j=1:i-1
y(i) = y(i)-A(i,j)*y(j); //a(i,j) = l(i,j) for j<i
end
end
x = zeros(b);
for i=n:-1:1 //back-substitution
x(i) = y(i);
for j=i+1:n
x(i) = x(i)-A(i,j)*x(j); //a(i,j) = u(i,j) for j>i
end
x(i) = x(i)/A(i,i);
end
endfunction
Scott Hudson
2015-08-18
9/11
//////////////////////////////////////////////////////////////////////
// linearLUP.sci
// 2014-06-24, Scott Hudson, for pedagogic purposes
// Given an n-by-n matrix A, calculate the LU decomposition
// A = LU where U is upper triangular and L is unit lower triangular.
// A is overwritten by LU. Partial pivoting is performed.
// Vector p lists the rearrangement of the rows of A. Given a
// vector b, the solution to A*x=b is the solution to L*U*x=b(p).
//////////////////////////////////////////////////////////////////////
function [A, p]=linearLUP(A)
n = size(A,1); //a is n-by-n
//Replace A with its LU decomposition
p = [1:n]';
for k=1:n-1 //k indexes the pivot row
//pivoting - find largest abs() in column k
amax = abs(A(k,k));
imax = k;
for i=k+1:n
if abs(A(i,k))>amax
amax = abs(A(i,k));
imax = i;
end
end
if (imax~=k) //we found a larger pivot
w = A(k,:); //copy row k
A(k,:) = A(imax,:); //replace it with row imax
A(imax,:) = w; //replace row imax with original row k
t = p(k); //perform same swap of elements of p
p(k) = p(imax);
p(imax) = t;
end
//pivoting complete, continue with LU decomposition
for i=k+1:n
A(i,k) = A(i,k)/A(k,k);
for j=k+1:n
A(i,j) = A(i,j)-A(i,k)*A(k,j);
end
end
end
endfunction
Lecture 13
Nonlinear systems of equations
1 Introduction
We have investigated the solution of one nonlinear equation in one unknown: f (x )=0 . What
about multiple nonlinear equations in multiple unknowns? To get started, consider one equation
in two unknowns
f (x , y )=0
To be specific, let's take

f(x, y) = x² + y² − 4 = 0
g(x, y) = y − sin(x) = 0

Each equation defines a contour in the x,y plane. An intersection of those contours (Fig. 1) is the solution of the
system of equations

f(x, y) = 0
g(x, y) = 0   (1)
2 Notation
For a general system of n equations in n unknowns it's not convenient to use different letters x,y,z
for the unknowns or f,g,h for the functions. A better notation is to use indices so that the
unknown variables are represented as x_i, 1 ≤ i ≤ n. Our system of 2 equations (1) would then be
written as
f 1 ( x 1 , x 2 )=0
f 2 ( x 1 , x 2 )=0
Fig. 1: Intersections of the f(x,y)=0 and g(x,y)=0 contours define the solutions of the
system of two equations in two unknowns.
More generally, a system of n equations in n unknowns reads

f_1(x_1, x_2, …, x_n) = 0
⋮
f_n(x_1, x_2, …, x_n) = 0   (2)

which we can write in vector notation as

f(x) = 0   (3)

The notation in (3) is simply a shorthand representation for the system of (2), but when written
using vectors it displays the same form as the scalar root-finding problem f(x) = 0.
3 Challenges
In general solving a system of the form (3) is a difficult problem. To quote the classic work
Numerical Recipes in C [1,Section 9.6]
There are no good, general methods for solving systems of more than one nonlinear
equation. Furthermore, it is not hard to see why (very likely) there never will be any good,
general methods.
The single largest problem is that in two or more dimensions we lose the concept of bracketing
a root. Going back to the notation (1), suppose the point a is (x a , y a) and the point b is ( x b , y b )
and
f(x_a, y_a) f(x_b, y_b) < 0
g(x_a, y_a) g(x_b, y_b) < 0
Provided f and g are continuous along a path connecting a and b, on that path there will be a
point c where f (x c , y c )=0 . Likewise there will be a point d where g ( x d , y d )=0 . However we
have no way of knowing if these two points will coincide, as they must at a solution of (1). In
fact, based on Fig. 2 we can see that it is quite unlikely that they will. Thus there is no procedure
analogous to bisection that can guarantee we find a root with any given precision.
Without bracketing and bisection methods we are left with the possibility of implementing some
form of root polishing. Recall that in one-dimension these methods were not guaranteed to find a
solution, even if one or more exists. Typically they need to start reasonably close to a solution to
converge.
The one-dimensional root-polishing methods we investigated were: fixed-point iteration,
Newton's method and the secant method. We will develop multidimensional versions of those
below.
4 Fixed-point iteration
Consider the following equations

0 = f(r)
0 = A f(r)
r = r + A f(r) = g(r)

where r is an n-dimensional column vector that forms a solution of our problem and f is an
n-dimensional column-vector function. The first equation is simply a statement of our root-finding
problem. In the second equation we have multiplied both sides by an n-by-n matrix A. In the last
equation we have added r to both sides and defined the right-hand side as the function g(r).
Now consider the equation

x = g(x)   (4)

This has solution x=r . Following the one-dimensional algorithm we expect that the sequence
of vectors x k where
x k +1=g( x k )
(5)
might converge to r under the right conditions. Note that the subscript in x_k here indexes
members of a sequence of vectors, not the components of a single vector x.
In terms of components (4) gives us n equations
x i=g i ( x 1 , x 2 , x3 , , x n) , i=1,2,3, , n
Let's write
x i=r i+ei
where e i=x i r i is the error in the ith component of x. Supposing that all the e i values are small
enough that first-order Taylor series are accurate, we write (5) as

r_i + e_i = g_i(r) + (∂g_i/∂x_1) e_1 + (∂g_i/∂x_2) e_2 + ⋯ + (∂g_i/∂x_n) e_n   (6)

or

e_i = Σ_{j=1}^{n} (∂g_i/∂x_j) e_j   (7)
since r_i = g_i(r). Now, define the n-by-n matrix J to have the elements

J_ij = ∂g_i/∂x_j

This is called the Jacobian matrix of the vector function g(x). We also define the n-by-1
column vector e to have elements e_i. Then (7) can be written
e k+1=J e k
(8)
where, again, e_k is the kth vector in the sequence of error vectors, and is not to be confused with
the kth element of a particular vector.
Intuitively the sequence (8) will converge provided ‖J e_k‖ ≤ λ‖e_k‖ for some 0 ≤ λ < 1. That is,
the norm of the error vector decreases by a fixed factor at each iteration, so that

‖e_k‖ ≤ λ^k ‖e_0‖ → 0 as k → ∞
In principle it is always possible to find a matrix A so that (8) converges. However it's a difficult
problem since A has n 2 components, and it is rarely practical to do so. Instead, we might try to
manipulate our system of equations into the form (4), in various ways until we find one that
converges.
Consider again

f(x, y) = x² + y² − 4 = 0
g(x, y) = y − sin(x) = 0

Let's write this in the iterative form

x ← x + x² + y² − 4
y ← y + y − sin(x)

From Fig. 1 we see that x=2 , y=1 is near a root. Starting at this point our
iteration gives us

k    x            y
1    3            1.85888
2    11.455435    4.6138744
3    159.97026    8.9794083
which is clearly not converging. However, writing

x = √(4 − y²)
y = sin(x)

we obtain

k    x            y
1    1.7320508    0.9870266
2    1.7394765    0.9858072
3    1.7401679    0.9856909
4    1.7402407    0.9856786

which converges nicely.
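The convergent rearrangement is easy to try in Python (a sketch; `fixed_point` is an illustrative helper, and g updates y using the already-updated x, as in the table above):

```python
import math

def fixed_point(g, x0, n_iters):
    """Iterate x <- g(x) a fixed number of times, starting from x0."""
    x = x0
    for _ in range(n_iters):
        x = g(x)
    return x

# Convergent rearrangement from the example: x = sqrt(4 - y^2), y = sin(x)
g = lambda v: (math.sqrt(4.0 - v[1] ** 2),
               math.sin(math.sqrt(4.0 - v[1] ** 2)))
root = fixed_point(g, (2.0, 1.0), 30)
```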
5 Newton-Raphson method
In one dimension Newton's method updates a root estimate x_k using x_{k+1} = x_k − f(x_k)/f′(x_k).
In the n-dimensional case, if we know f i (x ) , for a small change to x we can approximate the
change in the function by a first-order Taylor series
f_i(x + u) ≈ f_i(x) + (∂f_i/∂x_1) u_1 + (∂f_i/∂x_2) u_2 + ⋯ + (∂f_i/∂x_n) u_n

where the derivatives are evaluated at x and u is the small change in x. Setting this expression
equal to zero we have

(∂f_i/∂x_1) u_1 + (∂f_i/∂x_2) u_2 + ⋯ + (∂f_i/∂x_n) u_n = −f_i(x)
Doing this for i = 1, 2, …, n we get the system of equations

(∂f_1/∂x_1) u_1 + (∂f_1/∂x_2) u_2 + ⋯ + (∂f_1/∂x_n) u_n = −f_1(x)
(∂f_2/∂x_1) u_1 + (∂f_2/∂x_2) u_2 + ⋯ + (∂f_2/∂x_n) u_n = −f_2(x)
⋮
(∂f_n/∂x_1) u_1 + (∂f_n/∂x_2) u_2 + ⋯ + (∂f_n/∂x_n) u_n = −f_n(x)
Defining the Jacobian matrix J to have elements

J_ij = ∂f_i/∂x_j

evaluated at x_k, this system can be written J u = −f(x_k). Returning to our example,

$$f(x) = \begin{pmatrix} x_1^2 + x_2^2 - 4 \\ x_2 - \sin(x_1) \end{pmatrix}$$
The Jacobian matrix is

$$J = \begin{pmatrix} 2 x_1 & 2 x_2 \\ -\cos(x_1) & 1 \end{pmatrix}$$
The iteration
x xJ1 f
starting at x=2 , y=1 produces
-->deff('y=f(x)','y=[x(1)^2+x(2)^2-4;x(2)-sin(x(1))]');
-->deff('y=J(x)','y=[2*x(1),2*x(2);-cos(x(1)),1]');
-->x = [2;1];
-->x = x-J(x)\f(x)
 x  =
1.7415812
1.0168376
-->x = x-J(x)\f(x)
x =
1.7405501
0.9856269
-->x = x-J(x)\f(x)
x =
1.7402407
0.9856787
When started near a root, the Newton-Raphson method converges quadratically. Like all
root-polishing methods there is no guarantee that it will converge, even if a root exists. A Scilab
implementation of the Newton-Raphson method is given in the Appendix as
rootNewtonRaphson.
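As an illustration, here is a small self-contained Python version for the 2-by-2 example, solving J u = −f by Cramer's rule rather than a general linear solver (`newton_raphson` is an illustrative name):

```python
import math

def newton_raphson(f, J, x, tol=1e-12, max_iters=40):
    """Newton-Raphson for a 2-by-2 system: solve J u = -f by Cramer's rule
    and update x <- x + u until the step is small."""
    for _ in range(max_iters):
        f1, f2 = f(x)
        (a, b), (c, d) = J(x)
        det = a * d - b * c
        u1 = (-f1 * d + f2 * b) / det   # Cramer's rule for J u = -f
        u2 = (-a * f2 + c * f1) / det
        x = (x[0] + u1, x[1] + u2)
        if max(abs(u1), abs(u2)) < tol:
            break
    return x

f = lambda v: (v[0] ** 2 + v[1] ** 2 - 4.0, v[1] - math.sin(v[0]))
J = lambda v: ((2.0 * v[0], 2.0 * v[1]), (-math.cos(v[0]), 1.0))
root = newton_raphson(f, J, (2.0, 1.0))
```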
6 Broyden's method
The Newton-Raphson method requires the calculation of the Jacobian matrix, n derivatives of n
functions, at each iteration. It might not be possible to calculate this analytically, or it might not
be convenient to do so. In the one-dimensional problem we avoided calculating derivatives by
approximating the derivative by

f′(x_k) ≈ (f(x_{k+1}) − f(x_k)) / (x_{k+1} − x_k)
This led to the secant method. A similar approach in the multidimensional case leads to
Broyden's method. We approximate the Jacobian J by a matrix B and otherwise follow the
Newton-Raphson method. Assume we start with some x k , f k =f (x k ) and some estimate B k
for the Jacobian. We solve for
u_k = −B_k⁻¹ f_k

update our root estimate

x_{k+1} = x_k + u_k

calculate

f_{k+1} = f(x_{k+1})

and update our Jacobian estimate using Broyden's formula

B_{k+1} = B_k + f_{k+1} u_k^T / ‖u_k‖²

Since (u_k^T / ‖u_k‖²) u_k = 1, we have

B_{k+1} u_k = B_k u_k + f_{k+1} = f_{k+1} − f_k
In summary:
Broyden's method
Given x_k, f_k = f(x_k) and B_k:
solve for u_k = −B_k⁻¹ f_k
update x_{k+1} = x_k + u_k
calculate f_{k+1} = f(x_{k+1})
update B_{k+1} = B_k + f_{k+1} u_k^T / ‖u_k‖²
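A Python sketch of Broyden's method for the same 2-by-2 example (starting from B = I as rootBroyden does; `broyden_2d` is an illustrative name):

```python
import math

def broyden_2d(f, x, tol=1e-10, max_iters=100):
    """Broyden's method for a 2-by-2 system, starting from B = I.
    Solves B u = -f by Cramer's rule, then applies the rank-one update."""
    B = [[1.0, 0.0], [0.0, 1.0]]
    fk = f(x)
    for _ in range(max_iters):
        det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
        u = ((-fk[0] * B[1][1] + fk[1] * B[0][1]) / det,
             (-B[0][0] * fk[1] + B[1][0] * fk[0]) / det)
        if max(abs(u[0]), abs(u[1])) < tol:
            break
        x = (x[0] + u[0], x[1] + u[1])
        fk = f(x)
        norm2 = u[0] ** 2 + u[1] ** 2
        for i in range(2):              # B <- B + f_{k+1} u^T / ||u||^2
            for j in range(2):
                B[i][j] += fk[i] * u[j] / norm2
    return x

f = lambda v: (v[0] ** 2 + v[1] ** 2 - 4.0, v[1] - math.sin(v[0]))
root = broyden_2d(f, (2.0, 1.0))
```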
The only difference is that r, x0 and f are now n-dimensional. If the Jacobian can be explicitly
calculated that can be added as an additional argument
r = fsolve(x0,f,J);
7 References
1. Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T., Numerical Recipes
in C, Cambridge, 1988, ISBN: 0-521-35465-X.
//////////////////////////////////////////////////////////////////////
// rootNewtonRaphson.sci
// 2014-06-04, Scott Hudson, for pedagogic purposes
// Implements Newton-Raphson method for finding a root f(x) = 0
// where f and x are n-by-1 vectors.
// Requires two functions: y=f(x) and y=J(x) where J(x) is the n-by-n
// Jacobian of f(x). Search starts at x0. Root is returned as r,
// niter is number of iterations performed. Termination when change
// in x is less than tol or MAX_ITERS exceeded.
//////////////////////////////////////////////////////////////////////
function [r, nIters]=rootNewtonRaphson(x0, f, J, tol)
MAX_ITERS = 40; //give up after this many iterations
nIters = 1; //1st iteration
r = x0-J(x0)\f(x0); //Newton's formula for next root estimate
while (max(abs(r-x0))>tol) & (nIters<=MAX_ITERS)
nIters = nIters+1; //keep track of # of iterations
x0 = r; //current root estimate is last output of formula
r = x0-J(x0)\f(x0); //Newton's formula for next root estimate
end
endfunction
//////////////////////////////////////////////////////////////////////
// rootBroyden.sci
// 2014-06-12, Scott Hudson, for pedagogic purposes
// Implements Broyden's method for finding a root of f(x)=0
// where f and x are n-by-1 vectors. x0 is initial guess for root
// and tol is termination tolerance for change in x.
//////////////////////////////////////////////////////////////////////
function [r, nIters]=rootBroyden(x0, f, tol)
MAX_ITERS = 40; //give up after this many iterations
xk = x0;
n = length(xk);
fk = f(xk);
Bk = eye(n,n);
uk = -Bk\fk;
nIters = 0;
while (max(abs(uk))>tol) & (nIters<=MAX_ITERS)
xk = xk+uk;
fk = f(xk);
Bk = Bk+fk*(uk')/(uk'*uk);
uk = -Bk\fk;
nIters = nIters+1;
end
r = xk+uk;
endfunction
Lecture 14
Interpolation I
1 Introduction
A common problem faced in engineering is that we have some physical system or process with
input x and output y. We assume there is a functional relation between input and output of the
form y= f ( x ) , but we don't know what f is. For n particular inputs x 1<x 2 <<x n we
experimentally determine the corresponding outputs y i= f ( x i ) . From these we wish to estimate
the output y for an arbitrary input x where x 1x x n . This is the problem of interpolation.
If the experiment is easy to perform then we could just directly measure y= f ( x ) , but typically
this is not practical. Experimental determination of an input-output relation is often difficult,
expensive and time consuming. Or, y= f ( x ) may represent the result of a complex numerical
simulation that is not practical to perform every time we have a new x value. In either case the
only practical solution may be to estimate y by interpolating known values.
To illustrate various interpolation methods, we will use the example of

y = f(x) = 3 e^{−x} sin(x)

sampled at x = 0, 1, 2, 3, 4, 5. These samples are the black dots in Fig. 1.
3 Linear interpolation
An obvious and easy way to interpolate data is to connect the dots with straight lines. This
produces a continuous interpolation (Fig. 1) but which has kinks at the sample points where the
slope is discontinuous. The algorithm is
Linear interpolation
Find k such that x_k < x < x_{k+1}
Set y = y_k + ((y_{k+1} − y_k) / (x_{k+1} − x_k)) (x − x_k)
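A Python sketch of this algorithm (`linear_interp` is an illustrative name; xs must be in ascending order):

```python
def linear_interp(xs, ys, x):
    """Piecewise-linear interpolation of samples (xs[i], ys[i])."""
    k = 0
    while k < len(xs) - 2 and x > xs[k + 1]:   # find k with xs[k] <= x <= xs[k+1]
        k += 1
    slope = (ys[k + 1] - ys[k]) / (xs[k + 1] - xs[k])
    return ys[k] + slope * (x - xs[k])
```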
If the samples are closely spaced, linear interpolation works quite well. In fact it's used by default
for the plot() routine in Scilab/Matlab. However, when sampling is sparse (as in Fig. 1), linear
interpolation is unlikely to give an accurate representation of a smooth function. This
motivates us to investigate more powerful interpolation methods.
4 Polynomial interpolation
Through two points (x_1, y_1), (x_2, y_2) we can draw a unique line, a 1st-order polynomial.
Through three points (x_1, y_1), (x_2, y_2), (x_3, y_3) we can draw a unique parabola, a 2nd-order
polynomial. In general, through n points (x_i, y_i), i = 1, 2, …, n we can draw a unique polynomial
of order (n−1). Although this polynomial is unique, there are different ways to represent and
derive it. We start with the most obvious approach, the so-called monomial basis.
c_1 + c_2 x_n + c_3 x_n² + ⋯ + c_n x_n^{n−1} = y_n

which has the form of n equations in n unknown coefficients c_i. We can express this as the
linear system

A c = y

where

$$A = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & & & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix}, \quad c = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \quad (1)$$
A matrix of the form A, in which each column contains the sample values to some common
power is called a Vandermonde matrix. The coefficient vector c is easily calculated in
Scilab/Matlab as
n = length(x);
A = ones(x);
for k=1:n-1
  A = [A,x.^k];
end
c = A\y;
Here we've assumed that the vector x is a column vector. If it is a row vector replace it by the
transpose (x'). Likewise for the vector y.
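Without Scilab's backslash operator, the same computation can be sketched in Python by building the Vandermonde matrix and solving it with Gaussian elimination (an illustrative sketch, fine for small n):

```python
def monomial_coeffs(x, y):
    """Solve the Vandermonde system A c = y; returns monomial coefficients c."""
    n = len(x)
    A = [[xi ** k for k in range(n)] + [y[i]] for i, xi in enumerate(x)]
    for k in range(n - 1):                  # elimination (no pivoting)
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n + 1):
                A[i][j] -= f * A[k][j]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):          # back-substitution
        s = A[i][n] - sum(A[i][j] * c[j] for j in range(i + 1, n))
        c[i] = s / A[i][i]
    return c
```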
Example 1: Suppose x^T = [0, 1, 2] and y^T = [2, 1, 3]. The coefficients of the 2nd-order
polynomial that passes through these points are found from

$$\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$$

giving

$$c = \begin{pmatrix} 2 \\ -5/2 \\ 3/2 \end{pmatrix}$$

so the interpolating polynomial is
y = 2 − (5/2) x + (3/2) x²

Fig. 2: 5th order polynomial interpolation. Solid dots are samples; squares are actual
function values (for comparison); line is interpolation.
A Scilab program to interpolate n samples appears in Appendix 2 as interpMonomial().
Applying this to the sample data shown in Fig. 1 produces the results shown in Fig. 2. The
resulting interpolation gives a good representation of the underlying function except near x=4.5.
Numerically, that is all there is to polynomial interpolation. However, it does require the solution
of a linear system, which for a high-order polynomial must be done numerically. The monomial
approach starts having problems for very high-order polynomials due to round-off error. We'll
come back to this.
In addition to numerical considerations, there are times when we would like to be able to write
the interpolating polynomial directly, without solving a linear system. This is particularly true
when we use polynomial interpolation as part of some numerical method. In this case we don't
know what the x and y values are because we are developing a method that can be applied to any
data. Therefore we want to express the interpolating polynomial coefficients as some algebraic
function of the sample points. There are two classic ways to do this: Lagrange polynomials and
Newton polynomials.
Consider the data points (x_1, 1), (x_2, 0), (x_3, 0). The quadratic that interpolates them is

(x − x_2)(x − x_3) / ((x_1 − x_2)(x_1 − x_3))

From the form of this quadratic it is immediately clear that p(x_1) = 1 and p(x_2) = p(x_3) = 0.
We'll call this

L_1(x) = (x − x_2)(x − x_3) / ((x_1 − x_2)(x_1 − x_3))

and L_1(x_1) = 1, L_1(x_2) = 0, L_1(x_3) = 0. Now consider the data points (x_1, 0), (x_2, 1), (x_3, 0). By
the same logic the interpolating polynomial must be

L_2(x) = (x − x_1)(x − x_3) / ((x_2 − x_1)(x_2 − x_3))

and L_2(x_1) = 0, L_2(x_2) = 1, L_2(x_3) = 0. Finally consider the data points (x_1, 0), (x_2, 0), (x_3, 1).
These are interpolated by

L_3(x) = (x − x_1)(x − x_2) / ((x_3 − x_1)(x_3 − x_2))
For the Example 1 data, x_1 = 0, x_2 = 1, x_3 = 2, and

L_1(x) = (x−1)(x−2) / ((0−1)(0−2)),  L_2(x) = (x−0)(x−2) / ((1−0)(1−2)),  L_3(x) = (x−0)(x−1) / ((2−0)(2−1))

so

y = (2/2)(x−1)(x−2) − x(x−2) + (3/2) x(x−1)
Expanding out the terms and collecting like powers we obtain
y = 2 − (5/2) x + (3/2) x²
which is the result we obtained in Example 1.
In general, if we have n sample points (x_i, y_i), i = 1, 2, …, n we form the n polynomials

L_k(x) = ∏_{i≠k} (x − x_i) / (x_k − x_i)   (2)

for k = 1, 2, …, n, where the ∏ symbol signifies the product of the n−1 terms indicated. The
interpolating polynomial is then

y = p(x) = Σ_{k=1}^{n} y_k L_k(x)   (3)
Lagrange polynomials are very useful analytically since they can be written down by inspection
without solving any equations. They also have good numerical properties. A Scilab program to
interpolate n samples using the Lagrange method appears in Appendix 2 as
interpLagrange(). It's important to remember that there is a unique polynomial of order
n1 which interpolates a given n points. Whatever method we use to compute this must
produce the same polynomial (to within our numerical limitations).
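A direct Python transcription of formulas (2) and (3) (`lagrange_eval` is an illustrative name):

```python
def lagrange_eval(x, y, xp):
    """Evaluate the Lagrange interpolating polynomial through (x[i], y[i])
    at the point xp, using equations (2) and (3)."""
    n = len(x)
    total = 0.0
    for k in range(n):
        L = 1.0
        for i in range(n):          # product over i != k
            if i != k:
                L *= (xp - x[i]) / (x[k] - x[i])
        total += y[k] * L
    return total
```

Evaluating at a sample point returns the sample value exactly, and at other points it matches the unique interpolating polynomial.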
Setting p_1(x_2) = y_2 gives

c_2 = (y_2 − c_1) / (x_2 − x_1) = (y_2 − y_1) / (x_2 − x_1)

so

p_1(x) = y_1 + ((y_2 − y_1) / (x_2 − x_1)) (x − x_1)
This is roughly analogous to a first-order Taylor series.
Now suppose we add a third point ( x 3 , y 3) . Our interpolating polynomial will now need to be
quadratic, but we want to write it in a way that preserves the fit we've already obtained for the
first two points. Therefore we write
p_2(x) = c_1 + c_2 (x − x_1) + c_3 (x − x_1)(x − x_2)

which guarantees that p_2(x_1) = c_1 = y_1 = p_0(x_1) and p_2(x_2) = c_1 + c_2 (x_2 − x_1) = y_2 = p_1(x_2). The
single new coefficient c_3 is obtained from

p_2(x_3) = y_3 = c_1 + c_2 (x_3 − x_1) + c_3 (x_3 − x_1)(x_3 − x_2)
so that

c_3 = (y_3 − c_1 − c_2 (x_3 − x_1)) / ((x_3 − x_1)(x_3 − x_2))
    = [ (y_3 − y_1)/(x_3 − x_1) − (y_2 − y_1)/(x_2 − x_1) ] / (x_3 − x_2)

The coefficient c_3 has a form that suggests "difference in the derivative of y over difference in
x" — the form of a second derivative. Our interpolating polynomial

p_2(x) = y_1 + ((y_2 − y_1)/(x_2 − x_1)) (x − x_1) + ( [ (y_3 − y_1)/(x_3 − x_1) − (y_2 − y_1)/(x_2 − x_1) ] / (x_3 − x_2) ) (x − x_1)(x − x_2)
is roughly analogous to a second-order Taylor series

f(x) ≈ f(x_1) + f′(x_1)(x − x_1) + (1/2) f″(x_1)(x − x_1)²
Generalizing to n = 4 points, the Newton coefficients satisfy the lower-triangular system

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & (x_2-x_1) & 0 & 0 \\ 1 & (x_3-x_1) & (x_3-x_1)(x_3-x_2) & 0 \\ 1 & (x_4-x_1) & (x_4-x_1)(x_4-x_2) & (x_4-x_1)(x_4-x_2)(x_4-x_3) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} \quad (4)$$
Fig. 3: Monomial (thick green line), Lagrange (medium blue line) and Newton (thin dashed red line)
polynomials. At left N=11 points (hence 10th order polynomials). Agreement is good. At right N=13 points (12th
order polynomials). The monomial polynomial fails to interpolate the data. The y data values were chosen
randomly. The x values are 0,1,2,...,N-1.
We can solve (4) by elimination on the augmented matrix. Start with

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 1 & (x_2-x_1) & 0 & 0 & y_2 \\ 1 & (x_3-x_1) & (x_3-x_1)(x_3-x_2) & 0 & y_3 \\ 1 & (x_4-x_1) & (x_4-x_1)(x_4-x_2) & (x_4-x_1)(x_4-x_2)(x_4-x_3) & y_4 \end{array}\right)$$

Subtracting row 1 from rows 2, 3 and 4 gives

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & (x_2-x_1) & 0 & 0 & y_2-y_1 \\ 0 & (x_3-x_1) & (x_3-x_1)(x_3-x_2) & 0 & y_3-y_1 \\ 0 & (x_4-x_1) & (x_4-x_1)(x_4-x_2) & (x_4-x_1)(x_4-x_2)(x_4-x_3) & y_4-y_1 \end{array}\right)$$
Dividing rows 2, 3 and 4 by (x_2−x_1), (x_3−x_1) and (x_4−x_1), respectively:

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & 1 & 0 & 0 & \frac{y_2-y_1}{x_2-x_1} \\ 0 & 1 & (x_3-x_2) & 0 & \frac{y_3-y_1}{x_3-x_1} \\ 0 & 1 & (x_4-x_2) & (x_4-x_2)(x_4-x_3) & \frac{y_4-y_1}{x_4-x_1} \end{array}\right)$$
Note the cancellation of the term (x_3−x_1) in row 3 and (x_4−x_1) in row 4. Now subtract row 2
from rows 3 and 4.
$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & 1 & 0 & 0 & \frac{y_2-y_1}{x_2-x_1} \\ 0 & 0 & (x_3-x_2) & 0 & \frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1} \\ 0 & 0 & (x_4-x_2) & (x_4-x_2)(x_4-x_3) & \frac{y_4-y_1}{x_4-x_1}-\frac{y_2-y_1}{x_2-x_1} \end{array}\right)$$
Dividing rows 3 and 4 by (x_3−x_2) and (x_4−x_2):

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & 1 & 0 & 0 & \frac{y_2-y_1}{x_2-x_1} \\ 0 & 0 & 1 & 0 & \left(\frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_3-x_2) \\ 0 & 0 & 1 & (x_4-x_3) & \left(\frac{y_4-y_1}{x_4-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_4-x_2) \end{array}\right)$$
Subtracting row 3 from row 4:

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & 1 & 0 & 0 & \frac{y_2-y_1}{x_2-x_1} \\ 0 & 0 & 1 & 0 & \left(\frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_3-x_2) \\ 0 & 0 & 0 & (x_4-x_3) & \left(\frac{y_4-y_1}{x_4-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_4-x_2)-\left(\frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_3-x_2) \end{array}\right)$$
Finally, dividing row 4 by (x_4−x_3):

$$\left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & y_1 \\ 0 & 1 & 0 & 0 & \frac{y_2-y_1}{x_2-x_1} \\ 0 & 0 & 1 & 0 & \left(\frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_3-x_2) \\ 0 & 0 & 0 & 1 & \left[\left(\frac{y_4-y_1}{x_4-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_4-x_2)-\left(\frac{y_3-y_1}{x_3-x_1}-\frac{y_2-y_1}{x_2-x_1}\right)/(x_3-x_2)\right]/(x_4-x_3) \end{array}\right)$$
The 5th column now contains the coefficients of the Newton polynomial. Following through these
steps we arrive at the very compact algorithm for calculating the Newton coefficients.
Calculating Newton coefficients
c = y(:); //c is a column vector of the y values
for i=1:n-1
for j=i+1:n
c(j) = (c(j)-c(i))/(x(j)-x(i));
end
end
This is used in the Scilab program interpNewton which appears in Appendix 2. This
interpolates n sample points using the Newton method.
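The same divided-difference loop carries over to Python directly (`newton_coeffs` is an illustrative name):

```python
def newton_coeffs(x, y):
    """Compute Newton-polynomial coefficients by the compact
    divided-difference loop from the text."""
    n = len(x)
    c = list(y)                     # c starts as a copy of the y values
    for i in range(n - 1):
        for j in range(i + 1, n):
            c[j] = (c[j] - c[i]) / (x[j] - x[i])
    return c
```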
For equally spaced samples with spacing h it is convenient to change variables to

t = (x − x_1) / h

so that

x = x_1 + h t

Written at (t_i, y_i) samples, our data have the form

(0, y_1), (1, y_2), …, (n−1, y_n)

Suppose we fit a polynomial y = p(t) through these data. Then the polynomial

p( (x − x_1)/h )

interpolates the (x_i, y_i) samples. We limit consideration to the Lagrange basis, as it is the most
useful theoretically. Here we list the p(t) polynomials of orders 0,1,2,3,4 which interpolate
n = 1,2,3,4,5 sample points. These can easily be derived from formulas (2) and (3).
For n=1

$$ p_0(t) = y_1 $$

For n=2

$$ p_1(t) = y_2\,t - y_1\,(t-1) $$

For n=3

$$ p_2(t) = \frac{1}{2} y_3\,t(t-1) - y_2\,t(t-2) + \frac{1}{2} y_1\,(t-1)(t-2) $$

For n=4

$$ p_3(t) = \frac{1}{6} y_4\,t(t-1)(t-2) - \frac{1}{2} y_3\,t(t-1)(t-3) + \frac{1}{2} y_2\,t(t-2)(t-3) - \frac{1}{6} y_1\,(t-1)(t-2)(t-3) $$

For n=5

$$ p_4(t) = \frac{1}{24} y_5\,t(t-1)(t-2)(t-3) - \frac{1}{6} y_4\,t(t-1)(t-2)(t-4) + \frac{1}{4} y_3\,t(t-1)(t-3)(t-4) - \frac{1}{6} y_2\,t(t-2)(t-3)(t-4) + \frac{1}{24} y_1\,(t-1)(t-2)(t-3)(t-4) $$
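These formulas can be spot-checked numerically; this small Python snippet (illustrative, not from the notes) confirms that $p_2(t)$ reproduces the three samples at $t = 0, 1, 2$:

```python
def p2(t, y1, y2, y3):
    # order-2 uniform-grid Lagrange polynomial p_2(t) from the list above
    return 0.5*y3*t*(t - 1) - y2*t*(t - 2) + 0.5*y1*(t - 1)*(t - 2)
```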
//////////////////////////////////////////////////////////////////////
// interpMonomial.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes only
// Given n samples x(i),y(i), in the column vectors x,y
// calculate the coefficients c(i) of the (n-1) order
// monomial interpolating polynomial and evaluate at points xp.
//////////////////////////////////////////////////////////////////////
function yp = interpMonomial(x, y, xp)
  n = length(x); //x and y must be column vectors of length n
  A = ones(x); //build up the Vandermonde matrix A
  for k=1:n-1
    A = [A,x.^k]; //each column is a power of the column vector x
  end
  c = A\y; //solve for coefficients
  yp = ones(xp)*c(1); //evaluate polynomial at desired points
  for k=2:n
    yp = yp+c(k)*xp.^(k-1);
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpLagrange.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes only
// Given n samples x(i),y(i), in the column vectors x,y
// evaluate the Lagrange interpolating polynomial at points xp.
//////////////////////////////////////////////////////////////////////
function yp = interpLagrange(x, y, xp)
  n = length(x);
  yp = zeros(xp);
  for k=1:n //form Lagrange polynomial L_k
    L = 1;
    for i=1:n
      if (i~=k)
        L = L.*(xp-x(i))/(x(k)-x(i));
      end
    end
    yp = yp+y(k)*L;
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpNewton.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes
// Given n-dimensional vectors x and y, compute the coefficients
// c(1), c(2), ..., c(n) of the Newton interpolating polynomial y=p(x)
// and evaluate at points xp.
//////////////////////////////////////////////////////////////////////
function yp = interpNewton(x, y, xp)
  n = length(y);
  c = y(:);
  for i=1:n-1
    for j=i+1:n
      c(j) = (c(j)-c(i))/(x(j)-x(i));
    end
  end
  yp = ones(xp)*c(1);
  u = ones(xp);
  for i=2:n
    u = u.*(xp-x(i-1));
    yp = yp+c(i)*u;
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpNewtonCoeffs.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes only
// Given n samples x(i),y(i), in the column vectors x,y
// calculate the coefficients c(i) of the
// (n-1) order interpolating Newton polynomial
//////////////////////////////////////////////////////////////////////
function c = interpNewtonCoeffs(x, y)
  n = length(y);
  c = y(:);
  for j=2:n
    for i=n:-1:j
      c(i) = (c(i)-c(i-1))/(x(i)-x(i-j+1));
    end
  end
endfunction
Lecture 15
Interpolation II
1 Introduction
In the previous lecture we focused primarily on polynomial interpolation of a set of n points. A difficulty we observed is that when n is large, our polynomial has to be of high order, namely n-1. Unfortunately, high-order polynomials tend to suffer from "wiggle," and this limits their practical usefulness for interpolation. In this lecture we will explore how we can use polynomials of moderate order to achieve smooth interpolations while avoiding the problems associated with high-order polynomials.
Fig. 1: Piecewise interpolation: linear (left) and cubic (right). The sample points at which
pieces join (or tie together) are called knots.
The piecewise linear interpolation through the ith interval can be written

$$ y = y_i + y_i'\,(x - x_i) $$

This trivially satisfies $y = y_i$ when $x = x_i$. For $x = x_{i+1}$ the requirement $y_{i+1} = y_i + y_i'(x_{i+1} - x_i)$ gives the slope

$$ y_i' = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} $$

Equivalently, in Lagrange form,

$$ y = y_i\,\frac{x - x_{i+1}}{x_i - x_{i+1}} + y_{i+1}\,\frac{x - x_i}{x_{i+1} - x_i} $$
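The slope formula above is all that a piecewise-linear interpolator needs; a minimal Python sketch (illustrative; assumes ascending x values and uses the standard-library `bisect` to locate the interval):

```python
import bisect

def interp_linear(x, y, xp):
    """Piecewise-linear interpolation: y_i + slope_i*(xp - x_i)."""
    i = bisect.bisect_right(x, xp) - 1      # locate the interval
    i = max(0, min(i, len(x) - 2))          # clamp to valid segments
    slope = (y[i+1] - y[i]) / (x[i+1] - x[i])
    return y[i] + slope * (xp - x[i])
```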
We now extend this idea of piecewise interpolation to polynomials. While a piecewise linear
interpolation is continuous, the derivative is clearly not continuous at the sample points. Suppose
now that for each x i we know both the function value y i and the function slope y i . Let's build
a piecewise polynomial interpolation that has the specified function and slope values at the knots.
These polynomial pieces are known as splines. This term comes from the practice of bending strips of wood or plastic to form smooth curves, a technique often used in shipbuilding and pre-computer-era drafting.
For each segment we have four equations to satisfy, the two endpoint function values and the two
endpoint slope values. Our interpolation function must therefore have four unknown coefficients.
Since a 3rd order polynomial has four coefficients we write (Fig. 1)
$$ f(x) = a_i x^3 + b_i x^2 + c_i x + d_i \quad \text{if } x_i \le x < x_{i+1} $$
In each interval we have four unknowns $a_i, b_i, c_i, d_i$ satisfying four equations

$$
\begin{aligned}
y_i &= a_i x_i^3 + b_i x_i^2 + c_i x_i + d_i \\
y_i' &= 3 a_i x_i^2 + 2 b_i x_i + c_i \\
y_{i+1} &= a_i x_{i+1}^3 + b_i x_{i+1}^2 + c_i x_{i+1} + d_i \\
y_{i+1}' &= 3 a_i x_{i+1}^2 + 2 b_i x_{i+1} + c_i
\end{aligned}
\tag{1}
$$

The algebra is simplified by using the normalized variable

$$ u = \frac{x - x_i}{x_{i+1} - x_i} = \frac{x - x_i}{h_i} \tag{2} $$
By the chain rule

$$ y' = \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx} = \frac{1}{h_i}\frac{dy}{du} $$

or

$$ \frac{dy}{du} = h_i\, y' $$

We now write

$$ y = a + b u + c u^2 + d u^3, \qquad \frac{dy}{du} = b + 2 c u + 3 d u^2 $$

The four conditions become

$$
\begin{aligned}
y_i &= a \\
h_i\, y_i' &= b \\
y_{i+1} &= a + b + c + d \\
h_i\, y_{i+1}' &= b + 2c + 3d
\end{aligned}
\tag{3}
$$
The simplification from (1) is significant. These can easily be solved to give

$$
\begin{aligned}
a &= y_i \\
b &= h_i\, y_i' \\
c &= 3(y_{i+1} - y_i) - h_i (2 y_i' + y_{i+1}') \\
d &= -2(y_{i+1} - y_i) + h_i (y_i' + y_{i+1}')
\end{aligned}
$$
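The coefficient formulas can be checked directly: a cubic Hermite segment built this way must reproduce the endpoint values and slopes, and (being cubic) must reproduce any cubic exactly. A Python sketch for one segment (illustrative names, not from the notes):

```python
def hermite_segment(xi, xi1, yi, yi1, di, di1):
    """Cubic y(u) = a + b*u + c*u^2 + d*u^3, u = (x-xi)/h, with the
    coefficients solved from Eq. (3): endpoint values yi, yi1 and
    endpoint slopes di, di1."""
    h = xi1 - xi
    a = yi
    b = h * di
    c = 3*(yi1 - yi) - h*(2*di + di1)
    d = -2*(yi1 - yi) + h*(di + di1)
    def f(x):
        u = (x - xi) / h
        return a + b*u + c*u**2 + d*u**3
    return f
```

Feeding it samples of $y = x^3$ on $[1,2]$ with the exact slopes $3x^2$ reproduces the cubic exactly, e.g. at the midpoint.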
4/12
Fig. 2: y=3 e sin x . solid circles: sample points, squares: function values, dashed
line: linear interpolation, solid line: Hermite-spline interpolation. Derivative values
were estimated numerically.
If we had samples of the form $(x_i, y_i, y_i', y_i'')$ we could find 5th order interpolation polynomials for each interval and so on, in principle, for any number of known derivatives at each sample point. If we have function and derivative values up to $d^m y / dx^m$, the two endpoints of each interval will provide $2(m+1)$ equations. A polynomial with this many coefficients has order $n = 2m + 1$.
3 Cubic splines
If we know function and derivative values at n points, we can interpolate each interval with
Hermite splines. Often, however, we only know the function values and not the derivative values.
This provides only enough information to uniquely determine a piecewise-linear interpolation.
But the smoothness of a piecewise-cubic interpolation is highly desirable, and we would like to
find a way to keep that property even when we lack derivative information. We will refer to
piecewise cubic interpolation without specific derivative values as cubic splines.
One approach would be to treat the $y_i'$ as unknowns and find the values that optimize some desirable property of the curve. Smoothness is an intuitively appealing property to have. A smooth curve is one in which the slope does not change rapidly. A sudden change in slope produces a "kink" in the curve, which is about as unsmooth as you can get. Therefore the second derivative (the rate of change of the slope) should be small. Let's write the integral of the square of the second derivative of $S(x)$ as a function of the unknown $y_i'$
values:

$$ w(y_1', y_2', \ldots, y_n') = \int_{x_1}^{x_n} \left[ S''(x) \right]^2 dx \tag{4} $$

Changing variables to $t = (x - x_i)/h_i$ in each interval, this becomes

$$ w = \sum_{i=1}^{n-1} \frac{1}{h_i^3} \int_0^1 \left[ \frac{d^2}{dt^2} S_i(t) \right]^2 dt $$
For evenly spaced samples where $h_i = h$ we show in Appendix 1 that minimizing $w$ leads to the equations

$$ 2 y_1' + y_2' = \frac{3}{h}(y_2 - y_1), \qquad y_{n-1}' + 2 y_n' = \frac{3}{h}(y_n - y_{n-1}) \tag{5} $$

$$ y_{i-1}' + 4 y_i' + y_{i+1}' = \frac{3}{h}(y_{i+1} - y_{i-1}), \quad i = 2, 3, \ldots, n-1 \tag{6} $$

for the $y_i'$ values. The case of nonuniform samples is similar but a bit messier because we have to keep track of different h values.
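The equations above form a tridiagonal linear system for the slopes, which is cheap to solve. A Python sketch using the Thomas algorithm (illustrative; for linear data the slopes must all come out equal to the data's slope, which makes a convenient check):

```python
def natural_spline_slopes(y, h):
    """Solve the tridiagonal system for the slopes y' of the natural
    cubic spline on a uniform grid: endpoint rows 2y'_1+y'_2 and
    y'_{n-1}+2y'_n, interior rows y'_{i-1}+4y'_i+y'_{i+1}."""
    n = len(y)
    diag = [2.0] + [4.0]*(n - 2) + [2.0]
    sub = [1.0]*(n - 1)
    sup = [1.0]*(n - 1)
    rhs = [3*(y[1] - y[0])/h]
    rhs += [3*(y[i+1] - y[i-1])/h for i in range(1, n - 1)]
    rhs += [3*(y[n-1] - y[n-2])/h]
    # forward elimination (Thomas algorithm)
    for i in range(1, n):
        m = sub[i-1]/diag[i-1]
        diag[i] -= m*sup[i-1]
        rhs[i] -= m*rhs[i-1]
    # back substitution
    s = [0.0]*n
    s[n-1] = rhs[n-1]/diag[n-1]
    for i in range(n - 2, -1, -1):
        s[i] = (rhs[i] - sup[i]*s[i+1])/diag[i]
    return s
```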
The natural end conditions are obtained by setting $S''(x) = 0$ at the endpoints $x_1, x_n$; in other words, we let the interpolation go straight at both ends. This leads to equations (5). Therefore a cubic spline interpolation with natural end conditions is precisely the optimally smooth Hermite spline interpolation we derived above.
Another option is to specify the end-point slopes $y_1', y_n'$. These are called the fixed-slope end conditions. If we have a good estimate of these slopes then this makes sense. Otherwise the choice is arbitrary.
Finally we can choose the so-called not-a-knot conditions where we require the third derivative
of the interpolation to be continuous at the first and last knots. At these knots, therefore, the
cubic functions and the first, second and third derivatives are continuous. But cubics that agree in
this manner are simply the same cubic; there is no other possibility. So what used to be the first
and last knots are no longer knots, hence the name not-a-knot. For the uniformly sampled case
these equations read
Fig. 3 A case where natural conditions produce a more accurate interpolation than not-a-knot conditions.
$$ y_1' - y_3' = \frac{2}{h}(2 y_2 - y_1 - y_3), \qquad y_{n-2}' - y_n' = \frac{2}{h}(2 y_{n-1} - y_{n-2} - y_n) \tag{7} $$
Which end conditions should we choose? The natural conditions are attractive because of their maximally smooth feature. However, in many cases the not-a-knot conditions provide a more accurate interpolation. It depends on the underlying function $f(x)$ (see Fig. 3 and Fig. 4). Actual functions are not necessarily "as smooth as possible"! Common practice is to use the not-a-knot conditions. In practice the two end conditions produce very similar results except, possibly, in the first and last intervals.
It can be shown that if $f(x)$ is any interpolating function with a continuous second derivative, then

$$ \int_{x_1}^{x_n} \left[ S''(x) \right]^2 dx \le \int_{x_1}^{x_n} \left[ f''(x) \right]^2 dx $$
This is the sense in which we can say that no function provides a smoother interpolation of a
set of data points than does the natural cubic spline. However, as shown in Fig. 3 and Fig. 4,
smoother is not necessarily better.
Fig. 4 A case where not-a-knot conditions produce a more accurate interpolation than
natural conditions.
Fig. 5: Ten randomly selected points (dots). Thick (green) line: ninth-order polynomial.
Thin (blue) line: not-a-knot spline. Thin dashed (red) line: natural spline.
Fig. 8: Convolution of impulses with triangle function. Thick gray curve is sum of all
triangle functions and interpolates the data points.
The triangle function is defined as

$$ \Lambda(x/h) = \begin{cases} 1 - |x|/h & |x| \le h \\ 0 & |x| > h \end{cases} $$

and the convolution sum

$$ f(x) = \sum_{i=1}^{n} y_i\, \Lambda\!\left(\frac{x - x_i}{h}\right) $$
produces the interpolation shown in Fig. 8. We recognize this as the piecewise linear
interpolation of the data with the addition of linear extrapolations at the two ends. This naturally
leads us to wonder if using a different impulse response function might produce a better
interpolation.
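The triangle-kernel sum can be written out directly; between sample points only two triangles overlap, so it reduces to linear interpolation there. A Python sketch (names illustrative):

```python
def tri(u):
    # triangle kernel: 1-|u| for |u| <= 1, else 0
    return max(0.0, 1.0 - abs(u))

def interp_triangle_kernel(x, y, h, xp):
    """Convolution-style interpolation: sum of y_i * tri((xp-x_i)/h)."""
    return sum(yi * tri((xp - xi) / h) for xi, yi in zip(x, y))
```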
Thinking of x as time and y as the amplitude of an audio signal, there is a remarkable theorem
due to Nyquist which says that provided: 1) the signal from which the audio samples were drawn contains frequency components only within a limited range, and 2) the sample separation h is properly chosen, then a convolution interpolation using the sinc function (pronounced "sink") will exactly recreate the original function $f(x)$. Mathematically

$$ y = f(x) = \sum_{i=-\infty}^{\infty} y_i\, \mathrm{sinc}\!\left(\frac{x - x_i}{h}\right) $$

where the sinc function is defined as

$$ \mathrm{sinc}(x) = \frac{\sin \pi x}{\pi x} $$

and is plotted in Fig. 9. Note that $\mathrm{sinc}\,0 = 1$ and $\mathrm{sinc}\,n = 0$ for n a non-zero integer.
Unfortunately the sinc function extends to $x \to \pm\infty$, although the amplitude of the "bumps" drops off as $1/|x|$. If we are interpolating many points, we'll have to add a contribution from each point. A compromise proposed by Lanczos is to "window" the sinc function by another sinc function to produce the Lanczos kernel

$$ L(x) = \begin{cases} \mathrm{sinc}(x)\, \mathrm{sinc}(x/a) & |x| \le a \\ 0 & |x| > a \end{cases} $$

where a is typically chosen to be a small integer (most often 2 or 3). This is plotted in Fig. 10.
The Lanczos interpolation is

$$ y = \sum_{j=i+1-a}^{i+a} y_j\, L\!\left(\frac{x - x_j}{h}\right) $$
where i is the index such that $x_i \le x < x_{i+1}$. In this expression only the $2a$ nearest sample points contribute to the interpolation at a given value of x. An example of Lanczos interpolation (with a=3) is shown in .
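The kernel itself is easy to implement; the key property to verify is that $L(0)=1$ and $L(n)=0$ at all other integers, so that Lanczos interpolation reproduces the sample values exactly. A Python sketch (illustrative; guards the removable singularity at x=0):

```python
import math

def lanczos(x, a=3):
    """Lanczos kernel L(x) = sinc(x)*sinc(x/a) for |x| < a, else 0,
    with sinc(x) = sin(pi x)/(pi x)."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    return (math.sin(math.pi*x)/(math.pi*x)) * \
           (math.sin(math.pi*x/a)/(math.pi*x/a))
```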
//////////////////////////////////////////////////////////////////////
// interpHermite.sci
// 2014-06-25, Scott Hudson, for pedagogic purposes
// Given n samples x(i),y(i),y1(i) in the column vectors x,y,y1
// where y(i)=f(x(i)) and y1(i) is the derivative of f(x) at x(i),
// interpolate at points xp using Hermite splines.
// Note: x and xp values must be in ascending order.
//////////////////////////////////////////////////////////////////////
function yp=interpHermite(x, y, y1, xp)
  n = length(x);
  m = length(xp);
  yp = zeros(xp);
  i = 1; //start linear search at first element
  for j=1:m
    while (xp(j)>x(i+1)) //find i so that x(i)<=xp(j)<=x(i+1)
      i = i+1;
    end
    h = x(i+1)-x(i);
    t = (xp(j)-x(i))/h;
    yp(j) = (t-1)^2*(y(i)*(2*t+1)+y1(i)*h*t) ..
      +t^2*(y(i+1)*(3-2*t)+y1(i+1)*h*(t-1));
  end
endfunction
Appendix 1: Derivation of equations (5) and (6)

For evenly spaced samples ($h_i = h$) the smoothness measure (4) becomes

$$ w = \frac{1}{h^3} \sum_{i=1}^{n-1} \int_0^1 \left[ \frac{d^2}{dt^2} S_i(t) \right]^2 dt $$

Since $y_i'$ appears only in the two segments adjacent to $x_i$,

$$ \frac{\partial w}{\partial y_i'} = \frac{1}{h^3} \int_0^1 \frac{\partial}{\partial y_i'} \left[ \frac{d^2}{dt^2} S_{i-1}(t) \right]^2 dt + \frac{1}{h^3} \int_0^1 \frac{\partial}{\partial y_i'} \left[ \frac{d^2}{dt^2} S_i(t) \right]^2 dt $$

Writing each segment as $S_i(t) = a_i + b_i t + c_i t^2 + d_i t^3$ with the coefficients found above and carrying out the integrals, setting $\partial w / \partial y_i' = 0$ for $i = 2, 3, \ldots, n-1$ gives

$$ y_{i-1}' + 4 y_i' + y_{i+1}' = \frac{3}{h}(y_{i+1} - y_{i-1}) \tag{8} $$

Setting $\partial w / \partial y_1' = 0$ gives us

$$ 2 y_1' + y_2' = \frac{3}{h}(y_2 - y_1) \tag{9} $$

and $\partial w / \partial y_n' = 0$ gives

$$ y_{n-1}' + 2 y_n' = \frac{3}{h}(y_n - y_{n-1}) \tag{10} $$
Lecture 16
Interpolation in 2D
1 Introduction
The two-dimensional (2D) interpolation problem is as follows. We are given n samples
( x i , y i , z i) assumed to be drawn from some function z= f ( x , y) . How can we best estimate
z= f ( x , y) for arbitrary x and y values?
In many practical cases our samples are drawn from a uniform rectangular grid which allows us
to separate the bookkeeping for x and y values. A digital photograph, with its distinct rows and
columns, is an example. We will limit consideration to this case and restate the problem as
follows. Given $n \cdot m$ samples of a 2D function

$$ z_{ij} = f(x_i, y_j), \quad i = 1, 2, \ldots, m,\; j = 1, 2, \ldots, n \tag{1} $$

on the uniform grid

$$ x_i = x_1 + (i-1)\Delta x, \qquad y_j = y_1 + (j-1)\Delta y \tag{2} $$

where $\Delta x$ and $\Delta y$ are the sample spacings, how do we estimate $f(x,y)$ at arbitrary points?
2 Bookkeeping
For a uniform rectangular grid a point $(x, y)$ will fall inside of a single rectangle $x_i \le x < x_{i+1}$, $y_j \le y < y_{j+1}$ as shown in Fig. 1. We will call each of these rectangles a "unit cell." Even more than in the 1D case, it is convenient to define normalized coordinates

$$ u = \frac{x - x_i}{x_{i+1} - x_i}, \qquad v = \frac{y - y_j}{y_{j+1} - y_j} \tag{3} $$
Fig. 1 Left: A point $(x, y)$ will fall within a unit cell, in the case shown cell i=3, j=2. Position in the cell is specified by $(u, v)$. Right: The integer part of $(x - x_1)/\Delta x$ determines i while the fractional part determines u.
In terms of the grid spacing we can write

$$ x - x_1 = (i-1)\Delta x + u\,\Delta x \tag{4} $$

Rearranging we have

$$ (i-1) + u = \frac{x - x_1}{\Delta x} \tag{5} $$

where $(i-1)$ is an integer and $0 \le u < 1$ is a fractional value. In Scilab the command int(z) returns the integer part of a real number. Given an x value we can solve for i and u using

$$ i = 1 + \mathrm{int}\!\left(\frac{x - x_1}{\Delta x}\right), \qquad u = \frac{x - x_1}{\Delta x} + 1 - i \tag{6} $$
Similarly,

$$ j = 1 + \mathrm{int}\!\left(\frac{y - y_1}{\Delta y}\right), \qquad v = \frac{y - y_1}{\Delta y} + 1 - j \tag{7} $$

A function of the form

$$ z = f(u, v) = a + b u + c v \tag{8} $$

is linear in u and v
and defines a plane in 3D space. A plane cannot pass through four arbitrary points, so a function
of this form cannot represent f ( x , y ) over a unit cell with four corners. However, we can
divide a unit cell into two triangles, as shown in Fig. 2. Each of those triangles has three vertices
through which we can pass a plane. For the lower triangle ($v \le u$) this requires
$$ z_{i,j} = a, \qquad z_{i+1,j} = a + b, \qquad z_{i+1,j+1} = a + b + c \tag{9} $$

with solution

$$ a = z_{i,j} \tag{10} $$

$$ b = z_{i+1,j} - z_{i,j} \tag{11} $$

$$ c = z_{i+1,j+1} - z_{i+1,j} \tag{12} $$
5 Bilinear interpolation
The linear function (8) cannot pass through the four corners of a unit cell. However a bilinear
function
z= f ( u , v)=a+b u+c v +d uv
can with proper choice of the coefficients a,b,c,d. The equations are

$$ z_{i,j} = a, \quad z_{i+1,j} = a + b, \quad z_{i,j+1} = a + c, \quad z_{i+1,j+1} = a + b + c + d \tag{13} $$

with solution

$$ a = z_{i,j} \tag{14} $$

$$ b = z_{i+1,j} - z_{i,j} \tag{15} $$

$$ c = z_{i,j+1} - z_{i,j} \tag{16} $$

$$ d = z_{i+1,j+1} + z_{i,j} - z_{i+1,j} - z_{i,j+1} \tag{17} $$
Indeed one way to think of bilinear interpolation is illustrated in Fig. 3. First perform linear interpolation in u along the bottom and top sides of the cell to get the values

$$ \bar{z}_j = z_{i,j} + (z_{i+1,j} - z_{i,j})\,u, \qquad \bar{z}_{j+1} = z_{i,j+1} + (z_{i+1,j+1} - z_{i,j+1})\,u \tag{18} $$

at the locations marked X. Now perform linear interpolation in v between those values to get

$$ z = \bar{z}_j + (\bar{z}_{j+1} - \bar{z}_j)\,v \tag{19} $$
Scott Hudson
2015-08-18
5/12
Substituting (18) into (19) and doing a bit of algebra results in the bilinear interpolation formula. This idea is easily extended into 3 or more dimensions, and Scilab provides a function for performing this calculation. In 2D we execute

[xp,yp] = ndgrid(xx,yy);
zp = linear_interpn(xp,yp,x,y,z);

where x and y are the arrays of sample $x_i, y_j$ values and z is the array of sample $z_{i,j}$ values. The (two-dimensional) arrays xp,yp are the coordinates of the points where we want interpolated values and zp is the array of those values. In this example we used the ndgrid function to convert one-dimensional arrays xx and yy into two-dimensional arrays xp and yp. Function interpBilinear in the appendix implements this algorithm.
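Within a single unit cell the coefficient formulas (14)-(17) give a four-line interpolator; a Python sketch (illustrative names), whose defining property is that it reproduces the four corner values:

```python
def bilinear(z00, z10, z01, z11, u, v):
    """Bilinear interpolation within a unit cell, Eqs. (13)-(17):
    z00=z(i,j), z10=z(i+1,j), z01=z(i,j+1), z11=z(i+1,j+1)."""
    a = z00
    b = z10 - z00
    c = z01 - z00
    d = z11 + z00 - z10 - z01
    return a + b*u + c*v + d*u*v
```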
6 Bicubic interpolation
In the 1D case piecewise cubic interpolation offered dramatic improvements over piecewise
linear interpolation at the expense of extra calculation and bookkeeping. We might expect a
similar improvement in the 2D case. In 2D we have to consider bicubic functions over rectangles
in the x,y plane. Just as a bilinear expression has a cross term uv, a bicubic expression has
various cross terms of the form

$$
\begin{matrix}
1 & v & v^2 & v^3 \\
u & u v & u v^2 & u v^3 \\
u^2 & u^2 v & u^2 v^2 & u^2 v^3 \\
u^3 & u^3 v & u^3 v^2 & u^3 v^3
\end{matrix}
$$

There are sixteen such terms, and each requires its own coefficient to form the bicubic interpolation

$$ z = f(u, v) = \sum_{k=1}^{4} \sum_{l=1}^{4} c_{kl}\, u^{k-1} v^{l-1} \tag{20} $$
There are four corners to a rectangle (Fig. 2), so we need four equations at each corner to get a total of sixteen equations in sixteen unknowns. One way to obtain these equations is to specify the four values

$$ z, \quad \frac{\partial z}{\partial x}, \quad \frac{\partial z}{\partial y}, \quad \frac{\partial^2 z}{\partial x\, \partial y} $$

at each sample point. Although the bookkeeping is messy, it's then straightforward to set up equations to solve for the coefficients $c_{kl}$ for each rectangle.
As in 1D, in 2D we most often do not have known derivative values at the sample points, so we
must estimate these somehow. A simple approach is to approximate the derivatives using central
differences of the z values (we will study central differences in the numerical derivatives lecture).
The x and y partial derivatives are approximated by

$$ \frac{\partial z}{\partial x} \approx \frac{z(x + \Delta x, y) - z(x - \Delta x, y)}{2 \Delta x} = \frac{z_{i+1,j} - z_{i-1,j}}{2 \Delta x} \tag{21} $$
and

$$ \frac{\partial z}{\partial y} \approx \frac{z(x, y + \Delta y) - z(x, y - \Delta y)}{2 \Delta y} = \frac{z_{i,j+1} - z_{i,j-1}}{2 \Delta y} \tag{22} $$
The mixed partial derivative is approximated by

$$ \frac{\partial^2 z}{\partial x\, \partial y} \approx \frac{z_{i+1,j+1} - z_{i+1,j-1} - z_{i-1,j+1} + z_{i-1,j-1}}{(2 \Delta x)(2 \Delta y)} \tag{23} $$
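The mixed-partial estimate is easy to validate on a function whose cross derivative is known exactly, e.g. $z = x y$ with $\partial^2 z/\partial x \partial y = 1$. A Python sketch (illustrative; 0-based indexing here):

```python
def d2z_dxdy(z, i, j, dx, dy):
    """Central-difference estimate of the mixed partial, Eq. (23).
    z is a nested list indexed z[i][j] (0-based)."""
    return (z[i+1][j+1] - z[i+1][j-1]
            - z[i-1][j+1] + z[i-1][j-1]) / (4*dx*dy)
```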
Another approach is to pass 1D splines through the z data in various ways and use those splines
to estimate the derivative values. This leads to the idea of bicubic splines. In Scilab we perform
bicubic spline interpolation as follows
[xp,yp] = ndgrid(xx,yy);
zp = interp2d(xp,yp,x,y,splin2d(x,y,z));
The x,y,z,xp,yp,zp arrays are the same as in linear_interpn above. The splin2d
function calculates the c kl coefficients for each rectangle which then becomes the last argument
of the interp2d function.
7 Lanczos interpolation
The Lanczos interpolation method readily translates from 1D to 2D. In 2D we write

$$ z = \sum_{k=i+1-a}^{i+a} \; \sum_{l=j+1-a}^{j+a} z_{k,l}\, L\!\left(\frac{x - x_k}{\Delta x}\right) L\!\left(\frac{y - y_l}{\Delta y}\right) \tag{24} $$

where as before

$$ L(x) = \begin{cases} \mathrm{sinc}(x)\, \mathrm{sinc}(x/a) & |x| \le a \\ 0 & |x| > a \end{cases} \tag{25} $$

and

$$ \mathrm{sinc}(x) = \frac{\sin \pi x}{\pi x} \tag{26} $$
Typically a=3 is used. In (24) we take z k , l=0 if either k or l falls outside the grid. Function
interpLanczos2D in the appendix implements this algorithm.
The various 2D interpolation methods we have looked at are commonly used for image resizing
(resampling is just a form of interpolation) in graphics manipulation programs such as
Photoshop and Gimp. Fig. 4 shows a screen shot of the Interpolation menu of Gimp. Finally
Fig. 5 compares interpolation performed by the various methods we have been discussing.
//////////////////////////////////////////////////////////////////////
// interpUnitCell.sci
// 2014-11-07, Scott Hudson, for pedagogic purposes
// Given sample locations x(i),y(j) 1<=i<=m , 1<=j<=n
// with x and y sorted in ascending order x(1)<x(2)<x(3) etc.
// find the unit cell (i,j) containing the point (xs,ys) and the
// normalized coordinates (u,v) within it.
// (The original listing was cut off; this body is a reconstruction
// from Eqs. (6) and (7).)
//////////////////////////////////////////////////////////////////////
function [i, j, u, v]=interpUnitCell(x, y, xs, ys)
  m = length(x);
  n = length(y);
  Dx = (x(m)-x(1))/(m-1);
  Dy = (y(n)-y(1))/(n-1);
  i = 1+int((xs-x(1))/Dx);
  u = (xs-x(1))/Dx+1-i;
  j = 1+int((ys-y(1))/Dy);
  v = (ys-y(1))/Dy+1-j;
endfunction
//////////////////////////////////////////////////////////////////////
// interpNeighbor2D.sci
// 2014-11-07, Scott Hudson, for pedagogic purposes
// Given samples (x(i),y(j),z(i,j)) 1<=i<=m , 1<=j<=n
// use nearest-neighbor interpolation to estimate
// zp(k,l) = f(xp(k),yp(l)) 1<=k<=p , 1<=l<=q
// x and y must be sorted in ascending order x(1)<x(2)<x(3) etc.
//////////////////////////////////////////////////////////////////////
function zp=interpNeighbor2D(x, y, z, xp, yp);
  m = length(x);
  n = length(y);
  p = length(xp);
  q = length(yp);
  zp = zeros(p,q);
  Dx = (x(m)-x(1))/(m-1);
  Dy = (y(n)-y(1))/(n-1);
  for k=1:p
    for l=1:q
      [i,j,u,v] = interpUnitCell(x,y,xp(k),yp(l));
      if (u<=0.5)
        kk = i;
      else
        kk = i+1;
      end
      if (v<=0.5)
        ll = j;
      else
        ll = j+1;
      end
      zp(k,l) = z(kk,ll);
    end
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpTriangle.sci
// 2014-11-07, Scott Hudson, for pedagogic purposes
// Given samples (x(i),y(j),z(i,j)) 1<=i<=m , 1<=j<=n
// use linear interpolation over triangles to estimate
// zp(k,l) = f(xp(k),yp(l)) 1<=k<=p , 1<=l<=q
// x and y must be sorted in ascending order x(1)<x(2)<x(3) etc.
//////////////////////////////////////////////////////////////////////
function zp=interpTriangle(x, y, z, xp, yp);
  m = length(x);
  n = length(y);
  p = length(xp);
  q = length(yp);
  zp = zeros(p,q);
  Dx = (x(m)-x(1))/(m-1);
  Dy = (y(n)-y(1))/(n-1);
  for k=1:p
    for l=1:q
      [i,j,u,v] = interpUnitCell(x,y,xp(k),yp(l));
      if ((i>=1)&(i<=m-1)&(j>=1)&(j<=n-1))
        if (v<=u)
          zp(k,l) = z(i,j)+u*(z(i+1,j)-z(i,j))+v*(z(i+1,j+1)-z(i+1,j));
        else
          zp(k,l) = z(i,j)+v*(z(i,j+1)-z(i,j))+u*(z(i+1,j+1)-z(i,j+1));
        end
      end
    end
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpBilinear.sci
// 2014-11-07, Scott Hudson, for pedagogic purposes
// Given samples (x(i),y(j),z(i,j)) 1<=i<=m , 1<=j<=n
// use bilinear interpolation to estimate
// zp(k,l) = f(xp(k),yp(l)) 1<=k<=p , 1<=l<=q
// x and y must be sorted in ascending order x(1)<x(2)<x(3) etc.
//////////////////////////////////////////////////////////////////////
function zp=interpBilinear(x, y, z, xp, yp);
  m = length(x);
  n = length(y);
  p = length(xp);
  q = length(yp);
  zp = zeros(p,q);
  Dx = (x(m)-x(1))/(m-1);
  Dy = (y(n)-y(1))/(n-1);
  for k=1:p
    for l=1:q
      [i,j,u,v] = interpUnitCell(x,y,xp(k),yp(l));
      if ((i>=1)&(i<=m-1)&(j>=1)&(j<=n-1))
        zp(k,l) = z(i,j)+u*(z(i+1,j)-z(i,j))+v*(z(i,j+1)-z(i,j))..
          +u*v*(z(i+1,j+1)+z(i,j)-z(i+1,j)-z(i,j+1));
      end
    end
  end
endfunction
//////////////////////////////////////////////////////////////////////
// interpLanczos2D.sci
// 2014-11-07, Scott Hudson, for pedagogic purposes
// Given samples (x(i),y(j),z(i,j)) 1<=i<=m , 1<=j<=n
// use Lanczos3 interpolation to estimate
// zp(k,l) = f(xp(k),yp(l)) 1<=k<=p , 1<=l<=q
// x and y must be sorted in ascending order x(1)<x(2)<x(3) etc.
//////////////////////////////////////////////////////////////////////
function zp=interpLanczos2D(x, y, z, xp, yp);
  m = length(x);
  n = length(y);
  p = length(xp);
  q = length(yp);
  zp = zeros(p,q);
  Dx = (x(m)-x(1))/(m-1);
  Dy = (y(n)-y(1))/(n-1);
  a = 3;
  function w=L(z) //Lanczos kernel
    if abs(z)<1e-6
      w = 1;
    elseif (abs(z)>=a)
      w = 0;
    else
      w = sin(%pi*z)*sin(%pi*z/a)/(%pi^2*z^2/a);
    end
  endfunction
  for k=1:p
    for l=1:q
      [i,j,u,v] = interpUnitCell(x,y,xp(k),yp(l));
      for kk=i+1-a:i+a
        if ((kk>=1)&(kk<=m))
          hx = L((xp(k)-x(kk))/Dx);
          for ll=j+1-a:j+a
            if ((ll>=1)&(ll<=n))
              hy = L((yp(l)-y(ll))/Dy);
              zp(k,l) = zp(k,l)+z(kk,ll)*hx*hy;
            end
          end
        end
      end
    end
  end
endfunction
Lecture 17
Optimization in one dimension
1 Introduction
Optimization is the process of finding the "best" of a set of possible alternatives. If the alternatives are described by a single continuous variable x, and the "goodness" of an alternative is given by the value of a function y = f(x), then optimization is the process of finding the value $x = x_0$ where f(x) takes on its maximum value. In many applications f(x) will measure the "badness" of an alternative (error, cost, etc.), and then our goal is to find where f(x) takes on its minimum value. Since finding the minimum of g(x) is equivalent to finding the maximum of $-g(x)$, a simple sign change converts a minimization problem into a maximization problem and conversely. Therefore, we can focus on minimization alone with no loss of generality.
From calculus we know that at the extreme values of a continuous, differentiable function f(x) the derivative $f'(x)$ is zero. This suggests that we might simply apply our root-finding methods to $f'(x)$ in order to locate a minimum. However, a root of $f'(x)$ need not be a minimum: if $f''(x) < 0$ the point is a maximum, and if $f''(x) = 0$ it may be an inflection point. To uniquely identify a minimum we must have two conditions satisfied: $f'(x) = 0$, $f''(x) > 0$. Therefore a useful algorithm must do more than just find a root of $f'(x)$. Nevertheless, as we
Fig. 1 The condition $f'(x) = 0$ can correspond to (left point) a maximum, (middle point)
will see, there are many commonalities between optimization and root-finding algorithms.
One difficulty with optimization is illustrated in Fig. 2. The condition $f'(x) = 0$, $f''(x) > 0$ tells us only that x is a local minimum of f(x), not if it is the global minimum of f(x). Unfortunately there are no good, general techniques for finding a global minimum. In calculus the algorithm given for finding a global minimum is typically to first find all local minimum values and then identify the least of those as the global minimum. For the same reason that it is not numerically feasible to find all zeros of an arbitrary function, it is not feasible to find every local minimum of an arbitrary function. Therefore we will focus on trying to find a single local
Fig. 4 Bracketing a minimum. If these are samples of a continuous function there must be a
minimum in [a , b] .
minimum.
2 Graphical solution
Similar to root finding, simply plotting y = f(x) and visually identifying the minimum is typically the easiest and most intuitive approach to optimization. This is illustrated in Fig. 3. However we often need an automated way to optimize a function. We now turn to the optimization version of the bisection method for root finding, the so-called golden search method.
3 Golden search
The slow-but-sure bisection method for root finding relies on the idea of bracketing a zero.
Recall that for a continuous function, if the signs of f (a ) , f (b) are different then there must be
a zero in the interval [a , b] . To bracket a minimum we need three points a<b<c (or a>b>c )
such that f (b)< f (a) and f (b)< f (c ) , as illustrated in Fig. 4. If f ( x ) is continuous over
[a , c] it cannot go down and come back up without passing through a minimum value of
f (x ) . There may be more than one local minimum, but there has to be at least one.
The golden search method is a way to shrink the interval [a, c] while maintaining three points a < b < c that bracket a minimum. When $|c - a|$ is less than some tolerance we can report our minimum as x = b with an uncertainty of $\pm\max(|b-a|, |b-c|)$.
The algorithm for shrinking the interval is illustrated in Fig. 5. We might expect that b should be the midpoint of the interval [a, c]. But if it were we would have to arbitrarily choose in which of the two equal-length subintervals [a, b], [b, c] to sample f(x). The most efficient strategy is to have $|b-a| < |c-b|$ and then sample f(x) in the larger interval [b, c] at

$$ x = b + R(c - b) $$

where R is some constant. We will find either $f(x) < f(b)$ or $f(x) \ge f(b)$. Depending on which of these occurs we relabel the a,b,c values as follows (Fig. 5)

if $f(x) < f(b)$ set $a \leftarrow b$, $b \leftarrow x$
if $f(x) \ge f(b)$ set $a \leftarrow x$, $c \leftarrow a$
Fig. 5 Left: $f(x) < f(b)$, $a \leftarrow b$, $b \leftarrow x$; right: $f(b) < f(x)$, $a \leftarrow x$, $c \leftarrow a$.
This gives us a new and smaller bracket. In the second case we change direction with a on the
right and c on the left. This process is most efficient if the ratio of the resulting large and small
intervals is always the same, that is
|cb| | c x| | ba|
=
=
| ba| | xb| | xb|
The value of R that gives this property is
R=
3 5
=0.382 ...
2
and is related to the golden ratio of antiquity, hence the name golden search.
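The bracketing-and-relabeling procedure described above can be sketched in Python (illustrative; the notes' Scilab version, optimGolden, appears in the Appendix):

```python
def golden_search(f, a, b, c, tol=1e-8):
    """Golden search: shrink the bracket (a,b,c) with trial points
    x = b + R*(c-b), R = (3-sqrt(5))/2, relabeling as in the text."""
    R = (3 - 5**0.5) / 2
    fb = f(b)
    while abs(c - a) > tol:
        x = b + R*(c - b)
        fx = f(x)
        if fx < fb:
            a, b, fb = b, x, fx     # new bracket (b, x, c)
        else:
            a, c = x, a             # reverse direction: a on the right
    return b
```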
4 Parabolic interpolation
In the secant method for root finding, given two points (x 1 , y 1) and ( x 2 , y 2 ) we draw a line
through them to approximate the function y= f ( x ) and find the root of that line as our next
approximation to the root of f (x ) . Given three points ( x i , y i ) , i=1,2,3 we can draw a unique
parabola through them as an approximation of f ( x ) . We can then take the minimum of that
parabola as our next approximation of the minimum of f ( x ) . This is the idea behind parabolic
interpolation.
We start with the Lagrange interpolating polynomial for our three points

$$ y = y_1 \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + y_2 \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + y_3 \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} $$
Setting the derivative of this parabola to zero and solving for x gives

$$ x = \frac{1}{2} \cdot \frac{y_1 (x_2^2 - x_3^2) - y_2 (x_1^2 - x_3^2) + y_3 (x_1^2 - x_2^2)}{y_1 (x_2 - x_3) - y_2 (x_1 - x_3) + y_3 (x_1 - x_2)} $$

Rearranging terms we obtain

$$ x = x_2 - \frac{1}{2} \cdot \frac{(x_2 - x_1)^2 (y_2 - y_3) - (x_2 - x_3)^2 (y_2 - y_1)}{(x_2 - x_1)(y_2 - y_3) - (x_2 - x_3)(y_2 - y_1)} $$

The term on the right is the displacement of the new estimate from the previous estimate $x_2$. This should shrink to zero as the method converges. Parabolic interpolation converges superlinearly, $\epsilon_{k+1} \propto \epsilon_k^q$, with $q \approx 1.32$. A Scilab implementation is given in the Appendix as optimParabolic.
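One step of the parabolic update is easy to check: if the three points already lie on a parabola, the formula must return that parabola's vertex exactly. A Python sketch of a single step (illustrative name):

```python
def parabolic_step(x1, x2, x3, y1, y2, y3):
    """Vertex of the parabola through three points, written as a
    displacement from x2 (the second formula above)."""
    num = (x2 - x1)**2 * (y2 - y3) - (x2 - x3)**2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    return x2 - 0.5 * num / den
```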
$$ x = \frac{\sqrt{3} + 1}{2} = 1.3660254\ldots $$

we can see that for $x \gtrsim 2$, $f''(x) < 0$, so we might expect problems when using a method which models the function as an upwardly curved parabola.
5 Newton's method
For root finding we saw that Newton's method gave quadratic convergence at the price of having to explicitly calculate the derivative $f'(x)$. We can apply Newton's method to solve $f'(x) = 0$. We find the root of the first-order Taylor series

$$ f'(x_k + h) \approx f'(x_k) + f''(x_k)\, h = 0 $$

to be

$$ h = -\frac{f'(x_k)}{f''(x_k)} $$

so that

$$ x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)} $$

To apply this we need to explicitly calculate both first and second derivatives. Essentially what we are doing is to approximate the function by the second-order Taylor series

$$ f(x_k + h) \approx f(x_k) + f'(x_k)\, h + \frac{1}{2} f''(x_k)\, h^2 $$

and setting h to correspond to the minimum of this parabola. Of course this requires that $f''(x_k) > 0$, otherwise the parabola will have a maximum, not a minimum. This should be the case, provided we start close enough to the actual minimum of f(x). We could avoid explicitly calculating the derivatives by approximating $f''(x)$ (and possibly $f'(x)$ also) using
function values alone, analogous to what we did in the tangent method. The result would be a
quasi-Newton method. These often form a primary component of a state-of-the-art optimization
routine.
6 Hybrid methods
As with root finding, optimization presents us a tradeoff. Sure-fire methods (golden search) are
relatively slow while faster methods can be unstable and fail to find a solution. For particular
functions one or the other may be preferable. For a general-purpose optimization routine, a good
strategy is to combine slow-but-sure and fast-but-unstable methods into a hybrid method. Brent's
method [1] is a good example. This algorithm first attempts to use parabolic interpolation, but
includes tests to indicate if this is converging in a desirable manner. If it isn't, the algorithm falls
back on the golden search for one or more iterations before trying parabolic interpolation again.
Other hybrid algorithms employ quasi-Newton methods in an attempt to achieve rapid convergence when possible and slow-but-guaranteed convergence otherwise. These include the built-in optimization routines in Scilab and Matlab.
Here f(x) is the function to be minimized, x0 is an initial guess at the minimum, xopt is the
computed minimum and fopt is the minimum function value. The list(NDcost,f)
statement takes care of providing numerical derivatives.
Matlab provides the function fminunc for optimization. Its basic syntax is
[xopt,fopt] = fminunc(f,x0);
As always, there are many options that can be set, and the help browser provides complete
documentation.
8 Constrained optimization
What we have covered so far is more precisely referred to as unconstrained optimization. We
are free to test any values of x in our search for a minimum. In a constrained optimization
problem only x values that satisfy one or more given constraints are valid candidates for a
minimum. A simple example would be
min (x+1)^2   s.t.   x ≥ 0

Fig. 10: Minimization without constraint (solid dot) and with constraint x ≥ 0 (open dot).

Here we want to find the minimum of f(x) = (x+1)^2 but subject to the constraint that x is
non-negative. This is illustrated in Fig. 10. With no constraint our solution would simply be the
bottom of the parabola. But with the constraint the best we can do is x=0 . Implementing
constrained optimization can be tricky, depending on the complexity of the constraints. A simple
workaround that allows us to implement constrained optimization using unconstrained
optimization algorithms employs the idea of a penalty function. Instead of minimizing f(x), we
minimize f(x) + p(x), where the penalty function p(x) is large for values of x that violate the
constraints and zero otherwise. For example, adding

p(x) = 0 for x ≥ 0 ,   p(x) = −4x for x < 0

to f(x) = (x+1)^2 results in the function shown in Fig. 11. Now an unconstrained minimization
of f(x) + p(x) produces the same solution as the original constrained minimization problem.
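A minimal Python sketch of the penalty idea (the helper names are my own, and a simple golden-section search stands in for whatever unconstrained routine is at hand):

```python
# Penalty-function sketch: turn the constrained problem
#   min (x+1)^2  s.t. x >= 0
# into an unconstrained one by adding p(x) = -4x for x < 0.
def f(x):
    return (x + 1.0)**2

def p(x):
    return -4.0*x if x < 0 else 0.0   # large for infeasible x, zero otherwise

def golden_min(g, a, b, tol=1e-8):
    """Golden-section search for the minimum of a unimodal g on [a, b]."""
    R = (3 - 5**0.5)/2
    x1, x2 = a + R*(b - a), b - R*(b - a)
    while b - a > tol:
        if g(x1) < g(x2):
            b, x2 = x2, x1
            x1 = a + R*(b - a)
        else:
            a, x1 = x1, x2
            x2 = b - R*(b - a)
    return (a + b)/2

xmin = golden_min(lambda x: f(x) + p(x), -2.0, 2.0)
print(abs(xmin) < 1e-6)   # → True: unconstrained min of f+p is the constrained solution x = 0
```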
9 References
1. Brent, Richard P., Algorithms for Minimization Without Derivatives, Dover Publications, 2013
(originally published 1973). Kindle edition, ASIN B00CRW5ZTK.
//////////////////////////////////////////////////////////////////////
// optimGolden.sci
// 2014-10-31, Scott Hudson, for pedagogic purposes.
// Implements golden search for minimization of y=f(x).
// (a,b,c) must bracket a minimum, i.e., f(b)<f(a) & f(b)<f(c)
// with a<b<c or a>b>c. Function terminates when minimum has been
// estimated to within +-tol
//////////////////////////////////////////////////////////////////////
function [xmin, fmin]=optimGolden(a, b, c, fa, fb, fc, f, tol)
R = (3-sqrt(5))/2; //golden ratio
while (abs(c-b)>tol) //abs(c-b)>abs(b-a) is upper bound on error
x = b+R*(c-b);
fx = f(x);
if (fx<fb)
a = b;
fa = fb;
b = x;
fb = fx;
else
c = a;
fc = fa;
a = x;
fa = fx;
end
end
xmin = b;
fmin = fb;
endfunction
//////////////////////////////////////////////////////////////////////
// optimBracket.sci
// 2014-10-31, Scott Hudson, for pedagogic purposes.
// Given intial values x1,x2 and a function y=f(x), attempts to
// follow the function downhill until a minimum has been bracketed
// f(b)<f(a) & f(b)<f(c) with a<b<c or a>b>c.
//////////////////////////////////////////////////////////////////////
function [a, b, c, fa, fb, fc]=optimBracket(x1, x2, f)
MAX_ITERS = 20; //give up after this many attempts
a = x1;
b = x2;
fa = f(a);
fb = f(b);
if (fa<fb) //going uphill, go other way by switching a & b
c = a;
//save a
fc = fa;
a = b;
//a<-b
fa = fb;
b = c;
//b<-old a (saved in c)
fb = fc;
end
R = (3-sqrt(5))/2; //golden ratio
step = (1-R)/R;
done = 0;
iter = 0;
while (~done)
c = b+(b-a)*step;
fc = f(c);
if (fc>fb)
//we're now going uphill, bracket found
done = 1;
else
//still going down hill
a = b;
fa = fb;
b = c;
fb = fc;
iter = iter+1;
end
if (iter>MAX_ITERS)
error('optimBracket: MAX_ITERS reached');
end
end
endfunction
//////////////////////////////////////////////////////////////////////
// optimParabolic.sci
// 2014-10-31, Scott Hudson, for pedagogic purposes.
// Uses parabolic interpolation to estimate the minimum of y=f(x).
// Last three estimates are retained for interpolation.
//////////////////////////////////////////////////////////////////////
function [xmin, fmin]=optimParabolic(a, b, c, fa, fb, fc, f, tol)
MAX_ITERS = 20;
x = [a,b,c];
y = [fa,fb,fc];
iter = 1;
while ((max(x)-min(x))>2*tol)
N = (y(3)-y(2))*x(1)^2+(y(1)-y(3))*x(2)^2+(y(2)-y(1))*x(3)^2;
D = (y(3)-y(2))*x(1) +(y(1)-y(3))*x(2) +(y(2)-y(1))*x(3) ;
x(1) = x(2);
y(1) = y(2);
x(2) = x(3);
y(2) = y(3);
x(3) = N/(2*D);
y(3) = f(x(3));
iter = iter+1;
if (iter>MAX_ITERS)
error('optimParabolic: MAX_ITERS reached');
end
end
xmin = x(3);
fmin = y(3);
endfunction
Lecture 18
Optimization in n dimensions
1 Introduction
We now consider the problem of minimizing a single scalar function of n variables, f(x),
where x = [x_1, x_2, …, x_n]^T. The 2D case can be visualized as finding the lowest point of a
surface z = f(x, y) (Fig. 1).
A necessary condition for a minimum is that ∂f/∂x_i = 0 for all 1 ≤ i ≤ n. The partial derivative
∂f/∂x_i is the ith component of the gradient of f, denoted ∇f, so at a minimum we must have

∇f = 0    (1)

In the 2D case this implies we've "bottomed out" at the lowest point of a valley. The gradient
also vanishes at a maximum, so this is a necessary but not sufficient condition for a minimum.
2 Quadratic functions
Quadratic functions of several variables come up in many applications. A quadratic function of n
variables x_i has the form

f(x_1, x_2, …, x_n) = c + Σ_{i=1}^{n} b_i x_i + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} a_{ij} x_i x_j    (2)

Since

(1/2) a_{ij} x_i x_j + (1/2) a_{ji} x_j x_i = (1/2)(a_{ij} + a_{ji}) x_i x_j    (3)

the coefficients a_{ij}, a_{ji} only appear as the sum a_{ij} + a_{ji}. Without loss of generality, therefore, we
can take a_{ij} = a_{ji}.
By differentiation we have

f(0) = c ,  ∂f/∂x_i |_{x=0} = b_i ,  ∂²f/(∂x_i ∂x_j) = a_{ij}    (4)

which allows us to interpret the form (2) as a multivariable Taylor series of an arbitrary function.
The conditions for a minimum (or maximum) are (for k = 1, 2, …, n)

∂f/∂x_k = b_k + (1/2) Σ_{i=1}^{n} a_{ik} x_i + (1/2) Σ_{j=1}^{n} a_{kj} x_j = b_k + Σ_{j=1}^{n} a_{kj} x_j = 0    (5)

or, in matrix form,

b + A x = 0    (6)

where b_i are the components of b, and a_{ij} are the components of the symmetric matrix A. The
solution is

x = −A^{-1} b    (7)
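The result x = −A⁻¹b can be checked numerically; in this sketch A and b are arbitrary made-up values, with A symmetric positive definite so a unique minimum exists:

```python
import numpy as np

# Minimum of the quadratic f(x) = c + b.x + (1/2) x^T A x sits at x = -A^{-1} b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])   # symmetric, positive definite
b = np.array([1.0, 2.0])

xmin = -np.linalg.solve(A, b)   # solve A x = -b rather than forming A^{-1}

# check: the gradient b + A x vanishes at the minimizer
grad = b + A @ xmin
print(np.allclose(grad, 0.0))   # → True
```

Solving the linear system directly is preferred over computing A⁻¹ explicitly, for both speed and numerical accuracy.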
3 Line minimization
We know how to go about minimizing a function of one variable. If we start at a point x 0 and
move only in the direction of a vector u (Fig. 2) then the f (x) values we can sample form a
function of a single variable
g(t) = f(x_0 + t u)    (8)
Here the variable t is the distance we move in the direction u. We can use any of our 1D
minimization methods on this function. Of course this is not likely to find the minimum of
f (x) . However, suppose we start at x 0 and move along the direction u1 to a minimum. Call
this new point x 1=x 0+t 1 u1 . Then move along another direction u 2 to find a minimum at
x 2=x1+t 2 u 2 and so on. This process should eventually find a local minimum (if one exists).
The process of minimizing the function f (x) along a single line defined by some vector u is
called line minimization. The algorithm is quite simple
Successive line minimization algorithm
  start with initial guess x_0 and search directions u_i
  iterate until converged
    for i = 1, 2, …, n
      find t that minimizes g(t) = f(x_0 + t u_i)
      set x_0 ← x_0 + t u_i
An obvious set of directions is the coordinate axes

u_1 = [1, 0, …, 0]^T ,  u_2 = [0, 1, …, 0]^T ,  … ,  u_n = [0, 0, …, 1]^T    (9)
In this case the algorithm simply minimizes f (x) with respect to x 1 , then with respect to x 2
and so on. The algorithm will find a minimum (if one exists), but in many cases it can be very
slow to do so. An example is shown in Fig. 3 where we minimize the quadratic function
f(x, y) = ((x − y)/4)^2 + (x + y − 2)^2    (10)
starting at x= y=0 . The minimum is at x= y=1 . From the contour plot we see that the
valley of this function is narrow and oriented 45° to the coordinate axes. Since we are limited
to moving in only the x or y direction at any one time, the algorithm ends up taking many,
progressively smaller, zig-zag steps down the valley. The net movement is in a diagonal direction
along the valley floor. If that direction was one of our ui directions then we might be able to
take one big step directly to the minimum. This motivates the development of direction set
methods which attempt to adapt the ui directions to the geometry of the function being
minimized.
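The zig-zag behavior can be reproduced in a Python sketch of the algorithm (illustrative only; a crude golden-section line search stands in for the routines in the Appendix, and function (10) is taken as f(x,y) = ((x−y)/4)² + (x+y−2)²):

```python
# Successive line minimization along the coordinate axes,
# applied to f(x,y) = ((x-y)/4)^2 + (x+y-2)^2 (minimum at x = y = 1).
def f(v):
    x, y = v
    return ((x - y)/4.0)**2 + (x + y - 2.0)**2

def line_min(g, a=-10.0, b=10.0, tol=1e-10):
    """Golden-section search for the minimum of a unimodal g on [a, b]."""
    R = (3 - 5**0.5)/2
    x1, x2 = a + R*(b - a), b - R*(b - a)
    while b - a > tol:
        if g(x1) < g(x2):
            b, x2 = x2, x1
            x1 = a + R*(b - a)
        else:
            a, x1 = x1, x2
            x2 = b - R*(b - a)
    return (a + b)/2

v = [0.0, 0.0]
for _ in range(200):                # many sweeps: coordinate descent zig-zags
    for i in range(2):              # minimize along each coordinate axis in turn
        t = line_min(lambda t: f([v[j] + (t if j == i else 0.0) for j in range(2)]))
        v[i] += t
print([round(c, 3) for c in v])     # slowly approaches [1.0, 1.0]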
Consider the quadratic function (2). This can be written as

f = c + x^T b + (1/2) x^T A x    (11)
Fig. 3: Successive line minimization along the coordinate axes. Left: contours of quadratic function.
Right: Progressive results of line minimization.
The gradient is

∇f = b + A x    (12)
Suppose we have found the minimum of g_1(t) = f(x_0 + t u_1) at x_1. At this point the gradient
of f must be orthogonal to u_1; otherwise we could move along u_1 to lower values of f.
Therefore

u_1^T b + u_1^T A x_1 = 0    (13)
Now we find the minimum of g_2(t) = f(x_1 + t u_2) at x_2 = x_1 + t_m u_2. At this new x value we
want both u_1 and u_2 to be orthogonal to the gradient. This ensures that the new point remains a
minimum along the u_1 direction as well as the u_2 direction. This requires

u_1^T b + u_1^T A x_2 = 0    (14)

Subtracting (13) from (14), and using x_2 − x_1 = t_m u_2, leaves

u_1^T A u_2 = 0    (15)

Directions satisfying this condition are said to be conjugate with respect to A.
4 Powell's method
Powell showed that a simple addition to the successive line minimization algorithm enables it to
find conjugate directions and minimize an arbitrary quadratic function of n variables in n
iterations. After completing the line minimizations of the for loop, we form a new direction v,
which is the net direction x_0 moved due to the n line minimizations. We then perform a single
line minimization along the direction v. Finally, we discard the first search direction, u_1, left-
shift the other n−1 directions (u_i ← u_{i+1}) and make v the new u_n direction. It turns out any
quadratic function will be minimized by n iterations of this procedure. The algorithm is

Powell's method
  start with initial guess x_0 and search directions u_i
  iterate until converged
    save current estimate x_0^old ← x_0
    for i = 1, 2, …, n
      find t that minimizes f(x_0 + t u_i)
      set x_0 ← x_0 + t u_i
    u_i ← u_{i+1} for i = 1, …, n−1
    v ← (x_0 − x_0^old)/‖x_0 − x_0^old‖ ;  u_n ← v
    find t that minimizes f(x_0 + t v) ;  set x_0 ← x_0 + t v
Fig. 4: Powell's method. Left: contours of quadratic function; Right: progressive results of each line
minimization. The minimum is found in n=2 iterations (6 line minimizations total).
Fig. 5: The "banana" function. This is actually the negative of the function, so that the valley appears as a
hill. Values less than −100 have been chopped off to show greater detail.
A Scilab version of Powell's method is given in the Appendix. Applying this to function (10),
starting at x= y=0 , we obtain the results shown in Fig. 4. In two iterations of three line
minimizations each (Powell's method adds one line minimization after the for loop) we arrive at
the minimum. Powell's method has figured out the necessary diagonal direction.
A more challenging test is given by the Rosenbrock function
f(x, y) = (1 − x)^2 + 100 (y − x^2)^2    (16)
shown in Fig. 5. Because of its shape it is sometimes called the banana function. The minimum
is f(1,1) = 0. Unlike the function of Fig. 4, the valley of this function twists and the algorithm
must follow this changing direction. Starting at x = y = 0, Powell's method gives the results
shown in Fig. 6. We can see how the algorithm tracks the twisting valley and arrives at the
minimum after only a few iterations.
5 Newton's method
Earlier we saw that the gradient of the quadratic function

f(x_1, x_2, …, x_n) = c + Σ_{i=1}^{n} b_i x_i + (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} a_{ij} x_i x_j    (17)

vanishes at

x = −A^{-1} b    (18)

Near any point, an arbitrary function can be approximated by a quadratic Taylor series with

b_i = ∂f/∂x_i ,  a_{ij} = a_{ji} = ∂²f/(∂x_i ∂x_j)    (19)
The vector b is the gradient of f. The matrix of second derivatives A is called the Hessian of f.
We solve for the minimum of this quadratic Taylor series and take that to be our new x. We
continue until the method has converged.
Newton's method
iterate until converged
at x evaluate the gradient b and the Hessian A
set x ← x − A^{-1} b
Let's apply Newton's method to the banana function

f(x, y) = (1 − x)^2 + 100 (y − x^2)^2    (20)

The gradient is

b = [ ∂f/∂x ]   [ −2(1 − x) − 400 x (y − x^2) ]
    [ ∂f/∂y ] = [ 200 (y − x^2)               ]    (21)

The Hessian is

A = [ ∂²f/∂x²   ∂²f/∂x∂y ]   [ 2 + 1200 x² − 400 y   −400 x ]
    [ ∂²f/∂y∂x  ∂²f/∂y²  ] = [ −400 x                  200  ]    (22)
Starting at (x, y) = (0, 0) we have

b = [ −2 ]   A = [ 2    0  ]   A^{-1} = (1/400) [ 200  0 ]   A^{-1} b = [ −1 ]
    [  0 ] ,     [ 0   200 ] ,                  [  0   2 ] ,            [  0 ]    (23)

so the first Newton step gives

[ x ]   [ 0 ]   [ −1 ]   [ 1 ]
[ y ] = [ 0 ] − [  0 ] = [ 0 ]    (24)

At (1, 0) we have

b = [  400 ]   A = [ 1202  −400 ]   A^{-1} = (1/80400) [ 200   400  ]   A^{-1} b = [  0 ]
    [ −200 ] ,     [ −400   200 ] ,                    [ 400   1202 ] ,            [ −1 ]    (25)

and the second Newton step gives

[ x ]   [ 1 ]   [  0 ]   [ 1 ]
[ y ] = [ 0 ] − [ −1 ] = [ 1 ]    (26)

which is the exact minimum: from this starting point Newton's method minimizes the banana
function in just two iterations.
Here f(x) is the function to be minimized, x0 is an initial guess at the minimum, xopt is the
computed minimum and fopt is the minimum function value. By default optim assumes the
function f provides both function and gradient values. If f returns only a function value, the
list(NDcost,f) statement takes care of providing numerical derivatives. Here's this
function applied to the banana function.
x0 = [0;0];
function z = f(x)
z = (1-x(1))^2+100*(x(2)-x(1)^2)^2;
endfunction
[fopt,xopt] = optim(list(NDcost,f),x0);
disp(xopt);
//////////////////////////////////////////////////////////////////////
// optimPowell.sci
// 2014-10-31, Scott Hudson, for pedagogic purposes.
// Implements Powell's method for minimizing a function of
// n variables.
//////////////////////////////////////////////////////////////////////
function [xmin, fmin]=optimPowell(fun, x0, h, tol)
n = length(x0); //# of variables
searchDir = eye(n,1); //direction for current search
searchDirs = eye(n,n); //set of n search directions
function s=gfun(t)
//local scalar function to pass to 1D
s = fun(x0+t*searchDir) //optimization routines
endfunction
done = 0;
while(~done)
x0old = x0; //best solution so far
for i=1:n //minimize along each of n directions
searchDir = searchDirs(:,i);
[a,b,c,fa,fb,fc] = optimBracket(-h,h,gfun);
[tmin,gmin] = optimGolden(a,b,c,fa,fb,fc,gfun,tol/10);
x0 = x0+tmin*searchDir; //minimum along this direction
end
for i=1:n-1 //update search directions
searchDirs(:,i) = searchDirs(:,i+1);
end
v = x0-x0old; //new search direction
searchDirs(:,n) = v/sqrt(v'*v); //add new search dir unit vector
searchDir = searchDirs(:,n); //minimize along new direction
[a,b,c,fa,fb,fc] = optimBracket(-h,h,gfun);
[tmin,gmin] = optimGolden(a,b,c,fa,fb,fc,gfun,tol/10);
x0 = x0+tmin*searchDir;
xChange = sqrt(sum((x0-x0old).^2));
if (xChange<tol)
done = 1;
end
end //while
xmin = x0;
fmin = fun(xmin);
endfunction
Lecture 19
Curve fitting I
1 Introduction
Suppose we are presented with eight points of measured data (x_i, y_i). As shown in Fig. 1 on the
left, we could represent the underlying function of which these data are samples by
interpolating between the data points using one of the methods we have studied previously.
Fig. 1: Measured data with: (left) spline interpolation, (right) line fit.
However, maybe the data are samples of the response of a process that we know, in theory, is
supposed to have the form y= f ( x )=a x+b where a,b are constants. Maybe we also know that
y is a very weak signal and the sensor used to measure it is noisy, that is, it adds its own
(random) signal in with the true y data. Given this it makes no sense to interpolate the data
because in part we'll be interpolating noise, and we know that the real signal should have the
form y=ax+b . In a situation like this we prefer to fit a line to the data rather than perform an
interpolation (Fig. 1 at right). If done correctly this can provide a degree of immunity against the
effects of measurement errors and noise. More generally we want to develop curve fitting
techniques that allow theoretical curves, or models, with unknown parameters (such as a and b in
the line case) to be fit to n data points.
we take measurements with all of them and average the results we should get a better estimate of
the true mass than by relying on the measurement from a single scale. Our results might look
something like shown in Fig. 2. Let the measurement of the ith scale be m_i; then the average
measurement is given by

\bar{m} = (1/n) Σ_{i=1}^{n} m_i    (1)

where n is the number of measurements. This is what we should use for our best estimate of
the true mass. Averaging is a very basic form of curve fitting.
For the line-fit model y = a x + b, define the residual of the ith data point as

r_i = y_i − (a x_i + b)    (2)

A perfect fit would give r_i = 0 for all i. The residual can be positive or negative, but what we are
most concerned with is its magnitude. Let's define the mean squared error (MSE) as

MSE = (1/n) Σ_{i=1}^{n} r_i^2 = (1/n) Σ_{i=1}^{n} (y_i − (a x_i + b))^2    (3)
We now seek the values of a and b that minimize the MSE. These will satisfy

∂MSE/∂a = 0  and  ∂MSE/∂b = 0    (4)

The b derivative is

∂MSE/∂b = −(2/n) Σ_{i=1}^{n} (y_i − (a x_i + b)) = 0    (5)

Dividing out the factor −2, this is

(1/n) Σ_{i=1}^{n} y_i − a (1/n) Σ_{i=1}^{n} x_i − b = 0    (6)

Defining the sample means

\bar{y} = (1/n) Σ_{i=1}^{n} y_i ,  \bar{x} = (1/n) Σ_{i=1}^{n} x_i    (7)

equation (6) reads

\bar{y} − a \bar{x} − b = 0    (8)

or

a \bar{x} + b = \bar{y}    (9)

The a derivative is

∂MSE/∂a = −(2/n) Σ_{i=1}^{n} (y_i − (a x_i + b)) x_i = 0    (10)

or

(1/n) Σ_{i=1}^{n} x_i y_i − a (1/n) Σ_{i=1}^{n} x_i^2 − b (1/n) Σ_{i=1}^{n} x_i = 0    (11)

that is,

\overline{xy} − a \overline{x^2} − b \bar{x} = 0    (12)

with the additional definitions

\overline{xy} = (1/n) Σ_{i=1}^{n} x_i y_i ,  \overline{x^2} = (1/n) Σ_{i=1}^{n} x_i^2    (13)

so that

a \overline{x^2} + b \bar{x} = \overline{xy}    (14)

Equations (9) and (14) are two linear equations in the two unknowns a and b. Solving (9) for b gives

b = \bar{y} − a \bar{x}    (16)

and substituting this into (14) and solving for a gives

a = (\overline{xy} − \bar{x} \bar{y}) / (\overline{x^2} − \bar{x}^2)    (18)
Equations (18) and (16) provide the best-fit values of a and b. Because we obtained these
parameters by minimizing the sum of squared residuals, this is called a least-squares line fit.
Example. The code below generates six points on the line y = 1 − x and adds
normally-distributed noise of standard deviation 0.1 to the y values. Then (18)
and (16) are used to calculate the best-fit values of a and b. The data and fit line
are plotted in Fig. 3. The true values are a = −1, b = 1. The fit values are
a = −0.91, b = 1.02.
-->x = [0:0.2:1]';
-->y = 1-x+rand(x,'normal')*0.1;
-->a = (mean(x.*y)-mean(x)*mean(y))/(mean(x.^2)-mean(x)^2)
a =
- 0.9103471
-->b = mean(y)-a*mean(x)
b =
1.0191425
4 Linear least-squares
The least-squares idea can be applied to a linear combination of any m functions
f_1(x), f_2(x), …, f_m(x). Our model has the form

y = Σ_{j=1}^{m} c_j f_j(x)    (19)

For example, taking f_1(x) = 1 and f_2(x) = x gives the model

y = c_1 + c_2 x    (20)

which is just the linear case we've already dealt with. If we add f_3(x) = x^2 then the model is

y = c_1 + c_2 x + c_3 x^2    (21)
In any case we'll continue to define the residuals as the difference between the observed and the
modeled y values

r_i = y_i − Σ_{j=1}^{m} c_j f_j(x_i)    (23)

The mean squared error is

MSE = (1/n) Σ_{i=1}^{n} r_i^2 = (1/n) Σ_{i=1}^{n} [ y_i − Σ_{j=1}^{m} c_j f_j(x_i) ]^2    (24)

Expanding the square,

MSE = (1/n) Σ_{i=1}^{n} y_i^2 − (2/n) Σ_{i=1}^{n} y_i Σ_{j=1}^{m} c_j f_j(x_i) + (1/n) Σ_{i=1}^{n} [ Σ_{j=1}^{m} c_j f_j(x_i) ]^2    (25)

Call

(1/n) Σ_{i=1}^{n} y_i^2 = \overline{y^2}

and

(2/n) Σ_{i=1}^{n} y_i Σ_{j=1}^{m} c_j f_j(x_i) = Σ_{j=1}^{m} b_j c_j    (26)

with

b_j = (2/n) Σ_{i=1}^{n} y_i f_j(x_i)    (27)
For the third term,

[ Σ_{j=1}^{m} c_j f_j(x_i) ]^2 = Σ_{j=1}^{m} Σ_{k=1}^{m} c_j f_j(x_i) c_k f_k(x_i)    (28)

Therefore

(1/n) Σ_{i=1}^{n} [ Σ_{j=1}^{m} c_j f_j(x_i) ]^2 = (1/n) Σ_{i=1}^{n} Σ_{j=1}^{m} Σ_{k=1}^{m} c_j f_j(x_i) c_k f_k(x_i) = (1/2) Σ_{j=1}^{m} Σ_{k=1}^{m} a_{jk} c_j c_k    (29)

with

a_{jk} = a_{kj} = (2/n) Σ_{i=1}^{n} f_j(x_i) f_k(x_i)    (30)

Putting the three terms together,

MSE = \overline{y^2} − Σ_{i=1}^{m} b_i c_i + (1/2) Σ_{i=1}^{m} Σ_{j=1}^{m} a_{ij} c_i c_j    (31)

This shows that the MSE is a quadratic function of the unknown coefficients. In the lecture
"Optimization in n dimensions" we calculated the solution to a system of this form, except that
the second term (with the b coefficients) had a plus rather than minus sign. Defining the m×1
column vectors b and c and the m×m matrix A as

b = [b_j] ,  c = [c_j] ,  A = [a_{jk}]    (32)

the condition for a minimum is (with the minus sign for the b coefficients)

−b + A c = 0    (33)

and

c = A^{-1} b    (34)
Another way to arrive at this result is to define the n×1 column vector

y = [y_i]    (35)

and the n×m matrix F with elements

F_{ij} = f_j(x_i)    (36)

If the model were exact we would have

y = F c    (37)

Multiplying both sides by F^T gives the m×m linear system

F^T F c = F^T y    (38)

whose solution is

c = (F^T F)^{-1} F^T y    (39)

This is the same as our previous result, since

[F^T F]_{jk} = Σ_{i=1}^{n} F_{ij} F_{ik} = (n/2) a_{jk}    (40)

[F^T y]_j = Σ_{i=1}^{n} F_{ij} y_i = (n/2) b_j    (41)

so that (39) is equivalent to A c = b. The linear system (38) is called the normal equation, and we
have the following algorithm

Linear least squares fit
  Given n samples (x_i, y_i) and a model y = Σ_{j=1}^{m} c_j f_j(x)
  form the n×m matrix F with F_{ij} = f_j(x_i)
  compute c = (F^T F)^{-1} F^T y

Since F is an n×m matrix with n > m, we cannot solve

y = F c    (43)

by writing

c = F^{-1} y    (44)

because F does not have an inverse. However, as we've seen, we can compute

c = (F^T F)^{-1} F^T y    (45)

and this c will come as close as possible (in a least-squares sense) to solving (43). This leads us
to define the pseudoinverse of F as the m×n matrix

F⁺ = (F^T F)^{-1} F^T    (46)

In Scilab/Matlab the backslash operator

c = F\y    (47)

returns the least-squares solution. We do not have to explicitly form the normal equations.
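The whole procedure can be sketched in Python (the data here are made up; numpy's lstsq plays the role of the backslash operator and avoids forming (FᵀF)⁻¹ explicitly):

```python
import numpy as np

# Linear least-squares fit sketch: fit y = c1 + c2*x + c3*x^2 to noisy samples
# by building the n-by-m matrix F with F[i,j] = f_j(x_i).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = x + x**2 + rng.normal(0.0, 0.05, x.size)      # "true" signal x + x^2 plus noise

F = np.column_stack([np.ones_like(x), x, x**2])   # f1 = 1, f2 = x, f3 = x^2
c, *_ = np.linalg.lstsq(F, y, rcond=None)         # least-squares solution of F c = y

print(np.round(c, 2))   # coefficients near [0, 1, 1]
```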
5 Goodness of fit
Once we've fit a model to data we may wonder if the fit is good or not. It would be helpful to
have a measure of goodness of fit. Doing this rigorously requires details from probability theory.
We will present the following results without derivation.
Assume our y values are of the form

y_i = s_i + ε_i

where s_i is the signal that we are trying to model and ε_i is noise. If our model were to perfectly
capture the signal, the residuals

r_i = y_i − Σ_{j=1}^{m} c_j f_j(x_i)    (48)

would simply be noise, r_i = ε_i. We can quantify the goodness of fit by comparing the statistics of
our residuals to the (assumed known) statistics of the noise. Specifically, for large n − m, and
normally distributed noise, a good fit will result in the number

σ̂ = sqrt( (1/(n−m)) Σ_{i=1}^{n} r_i^2 )    (49)
being equal, on average, to the standard deviation of the noise, where n is the number of data and
m is the number of model coefficients. If σ̂ is significantly larger than this, it indicates that the
model is not accounting for all of the signal; a fractional change of about sqrt(2/(n−m)) is
statistically significant. For example, sqrt(2/50) = 0.2 means that a change of around 20% is
statistically significant. If the noise standard deviation is σ = 0.1, a σ̂ larger than about
0.1(1.2) = 0.12 implies the signal is not being fully modeled. The following example illustrates
the use of this goodness-of-fit measure.
Example. The following code was used to generate 50 samples of the function
f(x) = x + x^2 over the interval 0 ≤ x ≤ 1 with normally distributed noise of
standard deviation 0.05 added to each sample.
n = 50;
rand('seed',2);
x = [linspace(0,1,n)]';
y = x+x.^2+rand(x,'normal')*0.05;
These data were then fit by the four models y = c_1, y = c_1 + c_2 x,
y = c_1 + c_2 x + c_3 x^2 and y = c_1 + c_2 x + c_3 x^2 + c_4 x^3. The resulting values were
σ̂_0 = 0.6018, σ̂_1 = 0.0864, σ̂_2 = 0.0506 and σ̂_3 = 0.0504. Since sqrt(2/50) = 0.2, a
change of about 20% is statistically significant. The fits improved significantly
until the last model. The data therefore support the model y = c_1 + c_2 x + c_3 x^2 but
not the cubic model. The fits are shown in Fig. 5.
Fig. 5: Data set fit by polynomials. Top-left: y = c_1, σ̂_0 = 0.6018. Top-right: y = c_1 + c_2 x, σ̂_1 = 0.0864.
Bottom-left: y = c_1 + c_2 x + c_3 x^2, σ̂_2 = 0.0506. Bottom-right: y = c_1 + c_2 x + c_3 x^2 + c_4 x^3, σ̂_3 = 0.0504.
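The σ̂ comparison in this example can be mimicked in a Python sketch (my own data and helper function, not the author's code):

```python
import numpy as np

# Goodness-of-fit sketch: sigma_hat = sqrt(sum(r_i^2)/(n-m)).
# Fit a line and a quadratic to noisy samples of x + x^2 and compare.
rng = np.random.default_rng(1)
n = 50
x = np.linspace(0.0, 1.0, n)
y = x + x**2 + rng.normal(0.0, 0.05, n)

def sigma_hat(powers):
    """Least-squares fit with basis x^p for p in powers; return sigma_hat."""
    F = np.column_stack([x**p for p in powers])
    c, *_ = np.linalg.lstsq(F, y, rcond=None)
    r = y - F @ c
    return np.sqrt(np.sum(r**2)/(n - len(powers)))

s_line = sigma_hat([0, 1])       # c1 + c2 x: misses the x^2 part of the signal
s_quad = sigma_hat([0, 1, 2])    # c1 + c2 x + c3 x^2: should match the noise level ~0.05
print(s_line > s_quad)   # → True
```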
Lecture 20
Curve fitting II
1 Introduction
In the previous lecture we developed a method to solve the general linear least-squares problem.
Given n samples (x_i, y_i), the coefficients c_j of a model

y = f(x) = Σ_{j=1}^{m} c_j f_j(x)    (1)

are chosen to minimize the mean squared error

MSE = (1/n) Σ_{i=1}^{n} [ y_i − f(x_i) ]^2    (2)

The MSE is a quadratic function of the c_j, and the best-fit coefficients are the solution to a system of
linear equations.
In this lecture we consider the non-linear least-squares problem. We have a model of the form

y = f(x; c_1, c_2, …, c_m)    (3)

where the c_j are general parameters of the function f, not necessarily coefficients. An example
is fitting an arbitrary sine wave to data where the model is

y = f(x; c_1, c_2, c_3) = c_1 sin(c_2 x + c_3)

The mean-squared error

MSE(c_1, …, c_m) = (1/n) Σ_{i=1}^{n} [ y_i − f(x_i; c_1, …, c_m) ]^2    (4)

will no longer be a quadratic function of the c_j, and the best-fit c_j will no longer be given as
the solutions of a linear system. Before we consider this general case, however, let's look at a
special situation in which a non-linear model can be linearized.
2 Linearization
In some cases it is possible to transform a nonlinear problem into a linear problem. For example,
the model

y = c_1 e^{c_2 x}    (5)

is nonlinear in the parameters. Taking the logarithm of both sides, however, gives

ln y = ln c_1 + c_2 x    (6)

If we define ŷ = ln y and ĉ_1 = ln c_1 then our model has the linear form

ŷ = ĉ_1 + c_2 x    (7)
Fig. 1: Dashed line: y = e^{−2x}. Dots: ten samples with added noise. Solid line: fit of the
model y = c_1 e^{c_2 x} obtained by fitting the linear model ln y = ĉ_1 + c_2 x and then
calculating c_1 = e^{ĉ_1}.

Once we've solved for ĉ_1 and c_2 we can calculate c_1 = e^{ĉ_1}.
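A Python sketch of the linearization (simulated data with true values c₁ = 1, c₂ = −2; the multiplicative noise model is my own assumption):

```python
import numpy as np

# Linearization sketch: fit y = c1 * exp(c2*x) by fitting the straight line
# ln y = ln c1 + c2 x to the logged data.
rng = np.random.default_rng(3)
x = np.linspace(0.1, 1.0, 10)
y = np.exp(-2.0*x) * (1.0 + rng.normal(0.0, 0.02, x.size))   # small multiplicative noise

yh = np.log(y)                          # y-hat = ln y
xm, ym = x.mean(), yh.mean()
c2 = ((x*yh).mean() - xm*ym)/((x**2).mean() - xm**2)   # slope, as in the line fit
c1 = np.exp(ym - c2*xm)                 # c1 = exp(c1-hat)

print(round(float(c1), 1), round(float(c2), 1))   # close to 1.0 and -2.0
```

Note that fitting in log space implicitly reweights the data; for strongly varying noise a direct nonlinear fit may be preferable.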
1
MSE (c 1 , , c m)= [ y i f ( x i ; c 1 , , c m )]
n i=1
(8)
This is simply the optimization in n dimensions problem that we dealt with in a previous
lecture. We can use any of those techniques, such as Powell's method, to solve this problem. It is
convenient, however, to have a front end function that forms the MSE given the data (x i , y i )
and the function f ( x ; c 1 , c 2 , , c m ) and passes that to our minimization routine of choice. The
function fitLeastSquares in the Appendix is an example of such a front end.
The following example illustrates the use of fitLeastSquares.
The following code solves the problem in the previous example using datafit.
rand('seed',2);
x = [0:0.1:1]';
y = 2*cos(6*x+0.5)+rand(x,'normal')*0.1;
c0 = [1;5;0];
xyArray = [x,y]';
function r = residual(c,xy)
x = xy(1);
y = xy(2);
r = y-c(1)*cos(c(2)*x+c(3));
endfunction
c = datafit(residual,xyArray,c0);
disp(c);
In Matlab the function lsqcurvefit can be used to implement a least-squares fit. The first
step is to create a file specifying the model function in terms of the parameter vector c and the x
data. In this example the file is named fMod.m
function yMod = fMod(c,x)
yMod = c(1)*cos(c(2)*x+c(3));
Then, in the main program we pass the function fMod as the first argument to lsqcurvefit,
along with the initial estimate of the parameter vector c0 and the x and y data.
x = [0:0.1:1]';
y = 2*cos(6*x+0.5)+randn(size(x))*0.1;
c0 = [1;5;0];
c = lsqcurvefit(@fMod,c0,x,y);
disp(c);
//////////////////////////////////////////////////////////////////////
// fitLeastSquares.sci
// 2014-11-11, Scott Hudson, for pedagogic purposes
// Given n data points x(i),y(i) and a function
// fct(x,c) where c is a vector of m parameters, find c values that
// minimize sum over i (y(i)-fct(x(i),c))^2 using Powell's method.
// c0 is initial guess for parameters. cStep is initial step size
// for parameter search.
//////////////////////////////////////////////////////////////////////
function [c,fctMin] = fitLeastSquares(xData,yData,fct,c0,cStep,tol)
nData = length(xData);
function w=fMSE(cTest)
w = 0;
for i=1:nData
w = w+(yData(i)-fct(xData(i),cTest))^2;
end
w = w/nData;
endfunction
[c,fctMin] = optimPowell(fMSE,c0,cStep,tol);
endfunction
Lecture 21
Numerical differentiation
1 Introduction
We can analytically calculate the derivative of any elementary function, so there might seem to
be no motivation for calculating derivatives numerically. However we may need to estimate the
derivative of a numerical function, or we may only have a fixed set of sampled function values.
In these cases we need to estimate the derivative numerically.
2 Finite differences
The derivative of f(x) is defined as

f′(x) = lim_{h→0} [f(x+h) − f(x)]/h    (1)

An obvious numerical approximation is therefore

df/dx ≈ [f(x+h) − f(x)]/h    (2)

where the step size h is small but not zero. This is called a finite difference. Specifically it's a
forward difference, because we compare the function value at x with its value at a point
forward of this along the x axis, x + h.
How small should h be? Because of round-off error, smaller is not always better. Let's use Scilab
to estimate

d/dx e^x |_{x=0} = 1  ≈  (e^h − e^0)/h    (3)

for various h values. The absolute error in the estimate vs. h is graphed in Fig. 1. As h decreases
from 10^{-1} down to 10^{-8} the error decreases also. However for h = 10^{-9} and smaller the error
actually increases! The culprit is round-off error in the form of the small difference of large
numbers. Double precision arithmetic provides about 16 digits of precision. If h ≈ 10^{-16} then
e^h ≈ e^0 to 16 digits, and the difference e^h − e^0 will be very inaccurate. When h = 10^{-8} the
difference e^h − e^0 will be accurate to about 8 digits, or about 10^{-8}, the point at which the
theoretical improvement in numerical accuracy is offset by higher round-off error. We typically
ignore round-off error when estimating numerical accuracy, but it needs to be kept in mind
when implementing any algorithm.
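The tradeoff is easy to reproduce; this sketch repeats the experiment in Python rather than Scilab:

```python
import math

# Round-off tradeoff sketch: forward-difference error for d/dx e^x at x = 0
# (exact value 1) as the step h shrinks. Truncation error falls with h, but
# below about h = 1e-8 round-off in e^h - 1 dominates and the error grows.
errors = {}
for k in range(1, 13):
    h = 10.0**(-k)
    est = (math.exp(h) - 1.0)/h       # forward difference (e^h - e^0)/h
    errors[k] = abs(est - 1.0)

print(errors[8] < errors[1])    # → True: error shrank from h = 0.1 to h = 1e-8
print(errors[12] > errors[8])   # → True: round-off makes very small h worse
```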
Let's investigate how the numerical accuracy of our estimate varies with step size h. Assume we
want to estimate the derivative of f(x) at x = 0. Write f as a power series

f(x) = f(0) + Σ_{n=1}^{∞} (1/n!) f^{(n)}(0) x^n    (4)

Then

[f(h) − f(0)]/h = Σ_{n=1}^{∞} (1/n!) f^{(n)}(0) h^{n−1} = f′(0) + (1/2) f″(0) h + (1/6) f‴(0) h^2 + ⋯    (5)

Keeping only the lowest-order error term,

[f(h) − f(0)]/h = f′(0) + (1/2) f″(0) h + ⋯    (6)

and we say that the approximation is first-order accurate, since the error (the second term on the
right) varies as the first power of h. Decreasing h by a factor 1/10 will decrease the error by
1/10. However, as we see in Fig. 1, this is only true up to the point that round-off error begins to
be significant. For double precision h ≈ 10^{-8} is optimal. We can shift (6) along the x axis and
rearrange to obtain the general forward-difference formula

f′(x) = [f(x+h) − f(x)]/h + O(h)    (7)
3 Higher-order formulas
The forward-difference approximation (6) uses two samples of the function, namely f(0) and
f(h). Using three samples we might be able to get a better estimate of f′(0). Suppose we
have the samples f(−h), f(0), f(h). In terms of the power series representation of the
function these are
f(−h) = f(0) − f′(0) h + (1/2) f″(0) h^2 − (1/3!) f^{(3)}(0) h^3 + (1/4!) f^{(4)}(0) h^4 − (1/5!) f^{(5)}(0) h^5 + ⋯    (8)

f(h) = f(0) + f′(0) h + (1/2) f″(0) h^2 + (1/3!) f^{(3)}(0) h^3 + (1/4!) f^{(4)}(0) h^4 + (1/5!) f^{(5)}(0) h^5 + ⋯    (9)

We form the estimate f′(0) ≈ a f(−h) + b f(0) + c f(h),
where a, b, c are unknown coefficients that we will choose to get the best possible estimate.
The sum a f(−h) + b f(0) + c f(h) will include a term with a factor of f(0). We want this to
vanish. This requires

(a + b + c) f(0) = 0  →  a + b + c = 0    (10)

which is one equation in three unknowns. Terms with a factor of f′(0) should combine to give
f′(0) itself:

(−a + c) f′(0) h = f′(0)  →  −a + c = 1/h    (11)

We now have two equations in three unknowns. To get a third equation we can require the next
term, which contains a factor of f″(0) h^2, to vanish. This gives us the equation

(a + c) (1/2) f″(0) h^2 = 0  →  a + c = 0    (12)

In matrix form our three equations are

[  1  1  1 ] [a]   [  0  ]
[ −1  0  1 ] [b] = [ 1/h ]
[  1  0  1 ] [c]   [  0  ]

The solution is

a = −1/(2h) ,  b = 0 ,  c = 1/(2h)

so that

[f(h) − f(−h)]/(2h) = f′(0) + (1/6) f^{(3)}(0) h^2 + ⋯
so this approximation is second-order accurate. Decreasing h by a factor of 1/10 should
decrease the numerical error by a factor of 1/100 . Rearranging and writing this for an arbitrary
value of x we have the formula
f′(x) = [f(x+h) − f(x−h)]/(2h) + O(h^2)    (13)
This type of finite difference is called a central difference, since it uses both the forward sample
f(x+h) and the backward sample f(x−h). Scilab code is given in the Appendix.
The error in the central-difference approximation

d/dx e^x |_{x=0} = 1  ≈  (e^h − e^{−h})/(2h)

is plotted in Fig. 2. Note how the error reduces more rapidly with decreasing h. This allows the
approximation to reach a greater accuracy before round-off error starts to become significant.
With h = 10^{-5} the error is only about 10^{-11}.
We extend this idea by using even more function samples. If we have the five samples
f(−2h), f(−h), f(0), f(h), f(2h) we can form an estimate

f′(0) ≈ a f(−2h) + b f(−h) + c f(0) + d f(h) + e f(2h)    (14)

This has five unknowns, so we need to form five equations. In terms of the Taylor series
representation of f(x) our five samples have the form
f(−2h) = f(0) − 2 f′(0) h + 2 f″(0) h^2 − (8/3!) f^{(3)}(0) h^3 + (16/4!) f^{(4)}(0) h^4 − (32/5!) f^{(5)}(0) h^5 + ⋯
f(−h) = f(0) − f′(0) h + (1/2) f″(0) h^2 − (1/3!) f^{(3)}(0) h^3 + (1/4!) f^{(4)}(0) h^4 − (1/5!) f^{(5)}(0) h^5 + ⋯
f(0) = f(0)
f(h) = f(0) + f′(0) h + (1/2) f″(0) h^2 + (1/3!) f^{(3)}(0) h^3 + (1/4!) f^{(4)}(0) h^4 + (1/5!) f^{(5)}(0) h^5 + ⋯
f(2h) = f(0) + 2 f′(0) h + 2 f″(0) h^2 + (8/3!) f^{(3)}(0) h^3 + (16/4!) f^{(4)}(0) h^4 + (32/5!) f^{(5)}(0) h^5 + ⋯    (15)

Requiring the f(0) terms to vanish gives

a + b + c + d + e = 0    (16)

Requiring the f′(0) terms to combine to f′(0) gives

−2a − b + d + 2e = 1/h    (17)

The remaining three equations are obtained by requiring the f″(0), f^{(3)}(0) and f^{(4)}(0) terms
to vanish:

(4a + b + d + 4e) (1/2) f″(0) h^2 = 0  →  4a + b + d + 4e = 0
(−8a − b + d + 8e) (1/3!) f^{(3)}(0) h^3 = 0  →  −8a − b + d + 8e = 0
(16a + b + d + 16e) (1/4!) f^{(4)}(0) h^4 = 0  →  16a + b + d + 16e = 0

Our five equations in five unknowns form the system

[  1   1  1  1   1 ] [a]   [  0  ]
[ −2  −1  0  1   2 ] [b]   [ 1/h ]
[  4   1  0  1   4 ] [c] = [  0  ]
[ −8  −1  0  1   8 ] [d]   [  0  ]
[ 16   1  0  1  16 ] [e]   [  0  ]    (18)
The solution is

a = 1/(12h) ,  b = −8/(12h) ,  c = 0 ,  d = 8/(12h) ,  e = −1/(12h)    (19)

giving the estimate

f′(0) ≈ [f(−2h) − 8 f(−h) + 8 f(h) − f(2h)]/(12h)    (20)

The leading error comes from the f^{(5)}(0) terms:

(1/(12h)) (1/5!) (−32 + 8 + 8 − 32) f^{(5)}(0) h^5 = −(1/30) f^{(5)}(0) h^4    (21)

so

[f(−2h) − 8 f(−h) + 8 f(h) − f(2h)]/(12h) = f′(0) − (1/30) f^{(5)}(0) h^4 + ⋯    (22)

and this approximation is fourth-order accurate. Written for an arbitrary value of x,

f′(x) = [f(x−2h) − 8 f(x−h) + 8 f(x+h) − f(x+2h)]/(12h) + O(h^4)    (23)
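A quick Python sketch confirms that the five-point formula is far more accurate than the O(h) forward difference for the same step:

```python
import math

# Five-point formula check:
# f'(x) ≈ [f(x-2h) - 8 f(x-h) + 8 f(x+h) - f(x+2h)]/(12h), O(h^4) accurate.
def fivept(f, x, h):
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h))/(12*h)

h = 0.1
err5 = abs(fivept(math.exp, 0.0, h) - 1.0)    # ~ h^4/30 ≈ 3e-6
err1 = abs((math.exp(h) - 1.0)/h - 1.0)       # ~ h/2 ≈ 0.05, forward difference

print(err5 < 1e-5 < err1)   # → True
```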
Scilab provides the built-in function numderivative for estimating derivatives numerically.
If f is a function of several variables f(x), then numderivative will return the gradient of f. It
is also possible to specify the step size h and the order of the approximation (1, 2 or 4):
fp = numderivative(f,x0,h,order);
The default is second order (central difference) and Scilab chooses an optimal value of h.
Some examples:
-->deff('y=f(x)','y=exp(-x)');
-->numderivative(f,1) //default central difference with optimal h
ans =
- 0.3678794 //exact value is -exp(-1)=-0.3678794...
-->(f(1.1)-f(1))/0.1 //forward difference h=0.1
ans =
- 0.3500836
-->numderivative(f,1,0.1,1) //forward difference h=0.1
ans =
- 0.3500836
-->(f(1.1)-f(0.9))/0.2 //central difference h=0.1
ans =
- 0.3684929
-->numderivative(f,1,0.1,2) //central difference h=0.1
ans =
- 0.3684929
5 Second derivative
The second derivative can be approximated in the same way. Using three samples we form the estimate

f''(0) ≈ a f(−h) + b f(0) + c f(h)     (24)

Expanding the samples in Taylor series

f(±h) = f(0) ± f'(0) h + (1/2) f''(0) h^2 ± (1/3!) f^(3)(0) h^3 + ⋯     (25)

we would require

a + b + c = 0 ,  (−a + c) h = 0     (26)

and

(1/2)(a + c) h^2 = 1     (27)

The solution is

a = c = 1/h^2 ,  b = −2/h^2     (28)

and we find

[ f(−h) − 2 f(0) + f(h) ] / h^2 = f''(0) + (1/12) f^(4)(0) h^2 + ⋯     (29)

For arbitrary x the second-order accurate approximation is

f''(x) ≈ [ f(x−h) − 2 f(x) + f(x+h) ] / h^2     (30)
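A quick Python check of the second-derivative formula (illustrative; the test function sin x, whose second derivative is −sin x, is an assumption for this example):

```python
import math

def d2_central(f, x, h):
    # Central-difference second derivative, eq. (30)
    return (f(x - h) - 2*f(x) + f(x + h)) / (h * h)

approx = d2_central(math.sin, 1.0, 1e-3)   # exact value is -sin(1)
exact = -math.sin(1.0)
print(abs(approx - exact))
```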
6 Partial derivatives
For a function f(x, y) the partial derivative ∂f/∂x can be defined as

∂f/∂x = lim_{h→0} [ f(x+h, y) − f(x, y) ] / h

that is, we hold y fixed and compute the derivative as if f were only a function of x. A central-difference approximation is

∂f/∂x ≈ [ f(x+h, y) − f(x−h, y) ] / (2h)     (31)

Likewise

∂f/∂y ≈ [ f(x, y+h) − f(x, y−h) ] / (2h)     (32)

and

∂²f/∂x² ≈ [ f(x+h, y) − 2 f(x, y) + f(x−h, y) ] / h^2     (33)

Mixed partial derivative approximations such as ∂²f/∂y∂x can be developed in steps such as

∂²f/∂y∂x ≈ ( [∂f/∂x]_{y+h} − [∂f/∂x]_{y−h} ) / (2h)
         ≈ ( [f(x+h, y+h) − f(x−h, y+h)]/(2h) − [f(x+h, y−h) − f(x−h, y−h)]/(2h) ) / (2h)
         = [ f(x+h, y+h) − f(x−h, y+h) − f(x+h, y−h) + f(x−h, y−h) ] / (4h^2)     (34)
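The four-sample mixed-partial formula can be sketched in Python (the test function x^2·y^3 is an assumption chosen for illustration; its exact mixed partial is 6xy^2):

```python
def fxy(x, y):
    # assumed test function: f(x,y) = x^2 * y^3, mixed partial = 6*x*y^2
    return x**2 * y**3

def d2_mixed(f, x, y, h):
    # Central-difference mixed partial, eq. (34)
    return (f(x+h, y+h) - f(x-h, y+h) - f(x+h, y-h) + f(x-h, y-h)) / (4 * h * h)

approx = d2_mixed(fxy, 1.0, 1.0, 1e-3)   # exact value at (1,1) is 6
print(approx)
```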
7 Differential equations
7.1 Ordinary differential equations
An ordinary differential equation (ODE) relates a single independent variable, e.g., x, to a
function f(x) and its derivatives f'(x), f''(x), …. Most physical laws are expressed in terms
of differential equations, hence their great importance. Certain classes of ODEs can be solved
analytically but many cannot. In either case our derivative formulas can be used to develop
numerical solutions.
Suppose a physical problem is described by a differential equation of the form
f'' + 2f' + 17f = 0     (35)

It can be verified that

f(x) = e^−x cos(4x)     (36)

solves (35) by taking derivatives and substituting into the equation. A numerical approximation
to (35) is given by (using (30) and (13))

[ f(x+h) − 2 f(x) + f(x−h) ] / h^2 + 2 [ f(x+h) − f(x−h) ] / (2h) + 17 f(x) = 0     (37)

Multiplying through by h^2 and solving for f(x+h) gives

f(x+h) = [ (2 − 17h^2) f(x) − (1−h) f(x−h) ] / (1+h)     (38)
Let's use this to calculate f(x) for x = 0, h, 2h, 3h, …. To get started we need the first two
values
f(x1 = 0) = 1 ,  f(x2 = h) = e^−h cos(4h)     (39)
Then we can apply (38) to get f(x3 = x2 + h), f(x4 = x3 + h) and so on as long as we wish. In
Scilab this looks something like
h = 0.1;
x = 0:h:5;
n = length(x);
y = zeros(n,1);
y(1) = 1;
y(2) = exp(-h)*cos(4*h);
for i=2:n-1
  y(i+1) = ((2-17*h^2)*y(i)-(1-h)*y(i-1))/(1+h);
end
The resulting numerical solution and the exact solution are shown in Fig. 4. The agreement is
excellent.
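For readers working outside Scilab, here is an equivalent Python sketch (illustrative) of the same marching scheme (38), checked against the exact solution e^−x cos(4x):

```python
import math

h = 0.1
n = 51                                   # x = 0, 0.1, ..., 5.0
x = [i * h for i in range(n)]
y = [0.0] * n
y[0] = 1.0                               # f(0) = 1
y[1] = math.exp(-h) * math.cos(4 * h)    # f(h), taken from the exact solution
for i in range(1, n - 1):
    # marching formula (38) for f'' + 2f' + 17f = 0
    y[i + 1] = ((2 - 17 * h**2) * y[i] - (1 - h) * y[i - 1]) / (1 + h)

exact = [math.exp(-xi) * math.cos(4 * xi) for xi in x]
max_err = max(abs(a - b) for a, b in zip(y, exact))
print(max_err)
```

The maximum deviation from the exact solution stays small over the whole interval, consistent with the "excellent agreement" seen in Fig. 4.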
Function odeCentDiff in the Appendix uses this idea to numerically solve a second-order
equation of the form
y'' + p(x) y' + q(x) y = r(x)

given an initial x value x1, a step size h, and the two function values y(x1) and y(x1+h).
Fig. 4 compares the numerical solution of (35) using odeCentDiff with the exact solution
y = f(x) = e^−x cos(4x)     (40)
7.2 Partial differential equations

Partial differential equations can be treated in a similar way. Laplace's equation in two dimensions is

∂²f/∂x² + ∂²f/∂y² = 0     (41)

Using (33) for each second derivative,

∂²f/∂x² + ∂²f/∂y² ≈ [ f(x+h, y) − 2 f(x, y) + f(x−h, y) ] / h^2 + [ f(x, y+h) − 2 f(x, y) + f(x, y−h) ] / h^2
                  = [ f(x+h, y) + f(x−h, y) + f(x, y+h) + f(x, y−h) − 4 f(x, y) ] / h^2     (42)

The last expression is zero when

f(x, y) = (1/4) [ f(x+h, y) + f(x−h, y) + f(x, y+h) + f(x, y−h) ]     (43)
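The averaging property (43) is the basis of simple relaxation solvers for Laplace's equation: repeatedly replace each interior grid value by the average of its four neighbors until the values stop changing. A Python sketch (the grid size and boundary values are assumptions chosen for illustration):

```python
# Relaxation solution of Laplace's equation using eq. (43): replace each
# interior value by the average of its four neighbors until convergence.
# Boundary condition (assumed for illustration): f = 1 on one edge, 0 elsewhere.
n = 20
f = [[0.0] * n for _ in range(n)]
for j in range(n):
    f[0][j] = 1.0                        # "hot" edge

for sweep in range(2000):                # enough sweeps to converge on this grid
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            f[i][j] = 0.25 * (f[i+1][j] + f[i-1][j] + f[i][j+1] + f[i][j-1])

print(f[n // 2][n // 2])                 # interior value, between 0 and 1
```

At convergence every interior point satisfies (43) to machine precision.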
//////////////////////////////////////////////////////////////////////
// derivSecondOrder.sci
// 2014-11-15, Scott Hudson, for pedagogic purposes
// Numerical estimation of derivative of f(x) using 2nd-order
// accurate central difference and "optimum" step size.
//////////////////////////////////////////////////////////////////////
function yp=derivSecondOrder(f, x)
h = 1e-5*(1+abs(x)); //step size scales with x, no less than 1e-5
yp = (f(x+h)-f(x-h))/(2*h);
endfunction
//////////////////////////////////////////////////////////////////////
// derivFourthOrder.sci
// 2014-11-15, Scott Hudson, for pedagogic purposes
// Numerical estimation of derivative of f(x) using 4th-order
// accurate central difference and "optimum" step size.
//////////////////////////////////////////////////////////////////////
function yp=derivFourthOrder(f, x)
h = 1e-3*(1+abs(x)); //step size scales with x, no less than 1e-3
yp = (f(x-2*h)-8*f(x-h)+8*f(x+h)-f(x+2*h))/(12*h);
endfunction
//////////////////////////////////////////////////////////////////////
// odeCentDiff.sci
// 2014-11-15, Scott Hudson, for pedagogic purposes
// Uses 2nd-order accurate central difference approximation to
// derivatives to solve ode y''+p(x)y'+q(x)y=r(x)
// approximations are
// y' = (y(x+h)-y(x-h))/(2h) and y'' = (y(x+h)-2y(x)+y(x-h))/h^2
// p,q,r are functions, x1 is the initial x value, h is step size,
// n is number of points to solve for, y1=y(x1), y2=y(x1+h).
//////////////////////////////////////////////////////////////////////
function [x, y]=odeCentDiff(p, q, r, x1, h, n, y1, y2)
x = zeros(n,1);
y = zeros(n,1);
x(1) = x1;
x(2) = x(1)+h;
y(1) = y1;
y(2) = y2;
h2 = h*h;
for i=2:n-1
hp = h*p(x(i));
x(i+1) = x(i)+h;
y(i+1) = (2*h2*r(x(i))+(4-2*h2*q(x(i)))*y(i)+(hp-2)*y(i-1))/(2+hp);
end
endfunction
Lecture 22
Numerical integration
1 Introduction
The derivative of any elementary function can be calculated explicitly as an elementary function.
However, the anti-derivative of an elementary function may not be expressible as an elementary
function. Therefore situations arise where the value of a definite integral
I = ∫_a^b f(x) dx     (1)

can only be estimated numerically. In Scilab one way to do this is with the command

I = intsplin(x,y);
where x,y are the arrays of x and y samples, the integration is from x1 to xn, and I is the
estimate of the integral. As an example, the following integral can be calculated exactly
I = ∫_0^3 e^−x sin(πx) dx = π (1 + e^−3) / (π^2 + 1) = 0.3034152…     (2)
Eleven samples of f ( x ) (Fig. 2) passed to intsplin estimated I with an error of less than
1%.
deff('y=f(x)','y=exp(-x).*sin(%pi*x)');
n = 11;
x = linspace(0,3,n);
y = f(x);
I = intsplin(x,y);
disp('I = '+string(I));
I = 0.3057043
Fig. 2: Eleven samples of f(x) = e^−x sin(πx) over 0 ≤ x ≤ 3 used with the intsplin command.
3 Midpoint rule

The midpoint rule samples f(x) at the midpoint of each of n subintervals of width

h = (b−a)/n     (3)

The midpoint rule is conceptually simple. It is nothing more than a Riemann sum such as is
typically used in calculus textbooks to define a definite integral. It has the advantage that f(x)
is not evaluated at x = a, b, so it can be applied to functions which are singular at one or both
endpoints, such as

∫_0^1 dx/√x     (4)
4 Trapezoid rule
The trapezoid rule approximates f (x ) using linear interpolation (Fig. 4). The integral is then
the sum of areas of trapezoids. If the left and right heights of a trapezoid are f (x i ) and
f (x i+h) then the trapezoid's area is
I_i = (h/2) [ f(x_i) + f(x_i + h) ]     (5)
(the average height times the width). Adding up all these areas we have
I(h) = (h/2)[ f(a) + f(a+h) ] + (h/2)[ f(a+h) + f(a+2h) ] + (h/2)[ f(a+2h) + f(a+3h) ]
       + ⋯ + (h/2)[ f(a+(n−1)h) + f(b) ]     (6)
Notice that except for f (a ), f (b) , all the function values appear twice in the sum. Therefore
I(h) = h [ (1/2) f(a) + f(a+h) + f(a+2h) + ⋯ + f(a+(n−1)h) + (1/2) f(b) ]     (7)
or
I(h) = h [ (f(a) + f(b))/2 + Σ_{i=1}^{n−1} f(a+ih) ]     (8)
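Formula (8) translates directly into code. A Python sketch (illustrative; the notes' own implementation is the Scilab function intTrapezoid in the Appendix):

```python
import math

def trapezoid(f, a, b, n):
    # Composite trapezoid rule, eq. (8)
    h = (b - a) / n
    s = (f(a) + f(b)) / 2
    for i in range(1, n):
        s += f(a + i * h)
    return h * s

# Same test integral as the text; exact value is pi*(1+e^-3)/(pi^2+1)
exact = math.pi * (1 + math.exp(-3)) / (math.pi**2 + 1)
approx = trapezoid(lambda x: math.exp(-x) * math.sin(math.pi * x), 0, 3, 128)
print(abs(approx - exact))
```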
Splitting the sum in (8) into even and odd values of i,

Σ_{i=1}^{n−1} f(a+ih) = Σ_{i even} f(a+ih) + Σ_{i odd} f(a+ih)     (9)

When the step size is halved,
the sum of even samples has already been calculated in the previous iteration. We only need to
multiply the previous iteration value by 1/2 (since h is being halved) and add in the new (odd)
samples
I(h) = (1/2) I(2h) + h Σ_{i odd} f(a+ih)     (10)

In an iterative implementation each refinement therefore takes the form

I = (1/2) I_old + h Σ_{i odd} f(a+ih)
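The sample-reuse idea of (10) leads to the following Python sketch of successive interval halving (a simplified, illustrative analogue of the Scilab function intTrapezoid):

```python
import math

def trapezoid_refine(f, a, b, tol=1e-8):
    # Trapezoid rule with successive interval halving, eq. (10):
    # I(h) = I(2h)/2 + h * (sum over the new, odd-index samples)
    n = 1
    I = (b - a) / 2 * (f(a) + f(b))
    while True:
        n *= 2
        h = (b - a) / n
        I_new = I / 2 + h * sum(f(a + i * h) for i in range(1, n, 2))
        if abs(I_new - I) <= tol:
            return I_new
        I = I_new

exact = math.pi * (1 + math.exp(-3)) / (math.pi**2 + 1)
err = abs(trapezoid_refine(lambda x: math.exp(-x) * math.sin(math.pi * x), 0, 3) - exact)
print(err)
```

Each refinement reuses every previously computed sample, so no function evaluation is wasted.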
Example 1. I = ∫_0^3 e^−x sin(πx) dx estimated with intTrapezoid gave
I = 0.3032642
The integration is indeed accurate to three decimal places. This required 129
function evaluations.
5 Simpson's rule
Simpson's rule integrates a quadratic interpolation of groups of three sampled function values.
Suppose we want to estimate
I = ∫_a^b f(x) dx     (11)
using h = (b−a)/2 and the three samples y1 = f(a), y2 = f(a+h), y3 = f(a+2h) = f(b). Let
x = a + th. Then a ≤ x ≤ b corresponds to 0 ≤ t ≤ 2 and dx = h dt so that
I = h ∫_0^2 f(a+th) dt     (12)

Interpolating the three samples with a quadratic in t and integrating,

I ≈ h ∫_0^2 [ (1/2) y3 t(t−1) − y2 t(t−2) + (1/2) y1 (t−1)(t−2) ] dt = (h/3) (y1 + 4 y2 + y3)     (13)
To apply this result in general (Fig. 6) we arrange our samples x1, x2, x3, x4, … into adjacent
groups of three

(x1, x2, x3), (x3, x4, x5), (x5, x6, x7), …     (14)
(this only works if we have an odd number of samples, which implies an even number of
intervals). We then apply (13)
I = (h/3) [ (y1 + 4y2 + y3) + (y3 + 4y4 + y5) + (y5 + 4y6 + y7) + ⋯ + (y_{n−2} + 4y_{n−1} + y_n) ]     (15)
Notice that samples at group boundaries, such as y 3 and y 5 , appear twice in the summation.
Therefore
I = (h/3) [ y1 + 4y2 + 2y3 + 4y4 + 2y5 + 4y6 + 2y7 + ⋯ + 2y_{n−2} + 4y_{n−1} + y_n ]     (16)
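Composite Simpson's rule (16) in Python (an illustrative sketch; the number of intervals n must be even so there is an odd number of samples):

```python
import math

def simpson(f, a, b, n):
    # Composite Simpson's rule, eq. (16); n (number of intervals) must be even
    assert n % 2 == 0
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return h * s / 3

exact = math.pi * (1 + math.exp(-3)) / (math.pi**2 + 1)
err = abs(simpson(lambda x: math.exp(-x) * math.sin(math.pi * x), 0, 3, 10) - exact)
print(err)
```

Because the rule integrates quadratic interpolants exactly, it is also exact for cubics.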
Simpson's rule is fourth-order accurate (error varies as h^4).
Applying the trapezoid rule to just the two samples y1 = f(a) and y3 = f(b), with step size b−a, gives

I1 = [(b−a)/2] [ y1 + y3 ]     (17)
Suppose we add a sample between the other two: y2 = f((a+b)/2). The trapezoid rule with
h = (b−a)/2 applied to these three samples gives us
I2 = [(b−a)/4] [ y1 + 2y2 + y3 ]     (18)
Now

(4 I2 − I1) / (4 − 1) = [(b−a)/3] [ y1 + 2y2 + y3 − (1/2)(y1 + y3) ]
                      = [(b−a)/6] [ y1 + 4y2 + y3 ]     (19)
is Simpson's rule with h = (b−a)/2. We see that if we have one trapezoid-rule estimate I1
using a step size 2 h and a second I 2 using step size h, then Simpson's rule with step size h can
be calculated as
I = (4 I2 − I1) / 3     (20)
Simpson's rule can be thought of as a weighted combination of trapezoid rules with different step
sizes. This idea is generalized by Romberg integration.
6 Romberg integration
Neglecting round-off error, the trapezoid rule would (in principle) produce an exact result in the
limit h → 0. Let's call the exact result I0. For arbitrary h let's denote the trapezoid-rule estimate
by I(h). Then I0 = I(0). For small but finite h, I(h) will equal the exact result plus some
error, and the error will be a function of the step size h. We can write
I(h) = I0 + a h^2 + b h^4 + ⋯     (21)
(It can be shown that the error is an even function of h and therefore involves only even powers.)
Romberg integration is a technique that allows us to subtract off the error terms a h^2, b h^4, ….
Applying the trapezoid rule with step size h /2 we get
I(h/2) = I0 + a h^2/4 + b h^4/16 + ⋯     (22)
We don't know the value of the coefficient a, so we don't know the first error terms in (21) and
(22). However, we do know that for any value of a
4 (a h^2/4) = a h^2     (23)

Therefore

[ 4 I(h/2) − I(h) ] / (4 − 1) = I0 − b h^4/4 + ⋯     (24)
We have just removed the h^2 error term! Two second-order accurate trapezoid-rule calculations
have been combined to produce a fourth-order accurate result. In fact, as we saw above, this is
just Simpson's rule.
Now run the trapezoid rule with step size h /4 to get
I(h/4) = I0 + a h^2/16 + b h^4/256 + ⋯     (25)
Once again the h^2 error term is 1/4 the value of the previous iteration, and we can calculate
[ 4 I(h/4) − I(h/2) ] / (4 − 1) = I0 − b h^4/64 + ⋯     (26)
Now we have two results, (24) and (26), that are fourth-order accurate (both are Simpson's rule
calculations). Furthermore, notice that although we don't know the value of the coefficient b, we
do know that
16 (b h^4/64) = b h^4/4     (27)
Therefore
[ 4^2 (I0 − b h^4/64 + ⋯) − (I0 − b h^4/4 + ⋯) ] / (4^2 − 1) = I0 + ⋯     (28)
and we have eliminated both the h^2 and h^4 error terms! This result is sixth-order accurate. We
can continue on in this manner to produce a result accurate to as high an order as we wish.
Here is a useful notation that will allow us to easily code Romberg integration. Define
R(j, 1) = I( (b−a) / 2^(j−1) )     (29)

so that R(1,1) = I(b−a), R(2,1) = I((b−a)/2), R(3,1) = I((b−a)/4) and so on. Stack these
in the first column of an array

[ R(1,1) ]
[ R(2,1) ]
[ R(3,1) ]     (30)

Now calculate

R(2,2) = [ 4 R(2,1) − R(1,1) ] / (4 − 1)     (31)

and

R(3,2) = [ 4 R(3,1) − R(2,1) ] / (4 − 1)     (32)

Just as for (24) and (26), R(2,2) and R(3,2) will be fourth-order accurate results, lacking the
h^2 error term. Place these in the second column of our array

[ R(1,1)    0    ]
[ R(2,1)  R(2,2) ]
[ R(3,1)  R(3,2) ]     (33)
Now calculate
R(3,3) = [ 4^2 R(3,2) − R(2,2) ] / (4^2 − 1)     (34)
As for (28) this will be sixth-order accurate, lacking both the h^2 and h^4 terms. Place this in the
third column of our array
[ R(1,1)    0       0    ]
[ R(2,1)  R(2,2)    0    ]
[ R(3,1)  R(3,2)  R(3,3) ]     (35)
The relation between an element in the kth column and the elements in the previous column is
R(j, k) = [ 4^(k−1) R(j, k−1) − R(j−1, k−1) ] / [ 4^(k−1) − 1 ]     (36)
Suppose we want an eighth-order accurate result. Calculate R(4,1) = I((b−a)/8) and then use
formula (36) to calculate R( 4,2) , R(4,3), R( 4,4) to obtain
[ R(1,1)    0       0       0    ]
[ R(2,1)  R(2,2)    0       0    ]
[ R(3,1)  R(3,2)  R(3,3)    0    ]
[ R(4,1)  R(4,2)  R(4,3)  R(4,4) ]     (37)
Our eighth-order accurate estimate is R(4,4). We can continue to add rows in this manner as
many times as desired. The difference R(4,4) − R(3,3) provides an error estimate. Adding these
calculations to the trapezoid-rule algorithm results in
I = (1/2) I_old + h Σ_{i odd} f(a+ih)

j → j+1 ,  R(j,1) = I

for k = 2, 3, …, j :
    R(j, k) = [ 4^(k−1) R(j, k−1) − R(j−1, k−1) ] / [ 4^(k−1) − 1 ]
Example 2. I = ∫_0^3 e^−x sin(πx) dx was estimated with both intTrapezoid and
intRomberg. The trapezoid rule gave I = 0.3034151 and required 4097 function calls,
while Romberg integration gave I = 0.3034152 using only 65. The trapezoid rule error
was 1.5×10^−7 while the Romberg integration error was 7.2×10^−11.
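The whole Romberg procedure fits in a few lines. A Python sketch (illustrative; the notes' Scilab version intRomberg appears in the Appendix):

```python
import math

def romberg(f, a, b, levels=7):
    # Build the Romberg table: column 0 holds trapezoid estimates with
    # successively halved h; later columns apply eq. (36).
    R = [[0.0] * levels for _ in range(levels)]
    R[0][0] = (b - a) / 2 * (f(a) + f(b))
    n = 1
    for j in range(1, levels):
        n *= 2
        h = (b - a) / n
        R[j][0] = R[j-1][0] / 2 + h * sum(f(a + i * h) for i in range(1, n, 2))
        for k in range(1, j + 1):
            w = 4**k
            R[j][k] = (w * R[j][k-1] - R[j-1][k-1]) / (w - 1)
    return R[levels-1][levels-1]

exact = math.pi * (1 + math.exp(-3)) / (math.pi**2 + 1)
err = abs(romberg(lambda x: math.exp(-x) * math.sin(math.pi * x), 0, 3) - exact)
print(err)
```

With levels = 7 this uses 65 function evaluations, the same count quoted in Example 2.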
7 Gaussian quadrature
Quadrature is an historic term used to describe integration. So far we've assumed f ( x ) is
uniformly sampled along the x axis. The idea behind Gaussian quadrature is to consider
arbitrarily placed x samples. To see why this is a good idea consider the following function
f(x) = a + b x^2 + c x^4     (38)
Its integral over −1 ≤ x ≤ 1 is

I = ∫_{−1}^{1} f(x) dx = 2a + (2/3) b + (2/5) c     (39)
(Note: the integral of an odd power of x over −1 ≤ x ≤ 1 vanishes, hence we don't bother to
include odd powers in f(x).) Suppose we are allowed to estimate I using three samples of
f (x ) . We could use Simpson's rule to get
I_Simpson = (1/3)[ f(−1) + 4 f(0) + f(1) ] = (1/3)[ (a+b+c) + 4a + (a+b+c) ] = 2a + (2/3) b + (2/3) c     (40)
The a and b terms are correct but the c term is not. This is not surprising since Simpson's rule
interpolates the three samples with a quadratic. This is exact for a quadratic function, but the
presence of the x 4 results in error. Now consider the following combination of three f (x )
samples
I_Gauss = (1/9) [ 5 f(−√(3/5)) + 8 f(0) + 5 f(√(3/5)) ]
        = (1/9) [ 5 (a + (3/5)b + (9/25)c) + 8a + 5 (a + (3/5)b + (9/25)c) ]
        = 2a + (2/3) b + (2/5) c     (41)
This result is exact, I_Gauss = I, even though it required only three samples. It turns out that if you
properly choose the n sample points x_i and corresponding weights w_i you can make

Σ_{i=1}^{n} w_i f(x_i) = ∫_{−1}^{1} f(x) dx

exact for f(x) an arbitrary polynomial of order 2n−1. The x_i turn out to be roots of certain
polynomials, and the formulas for the x_i and w_i are fairly involved.
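The three-point rule used in (41) is the n = 3 Gauss-Legendre rule. A Python sketch (illustrative) verifying its exactness through degree 2n−1 = 5:

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1]: nodes 0 and ±sqrt(3/5),
# weights 8/9 and 5/9, exact for polynomials up to degree 2n-1 = 5.
nodes = [-math.sqrt(3/5), 0.0, math.sqrt(3/5)]
weights = [5/9, 8/9, 5/9]

def gauss3(f):
    return sum(w * f(x) for w, x in zip(weights, nodes))

print(gauss3(lambda x: x**4))   # integral of x^4 over [-1,1] is 2/5 = 0.4
```

It is exact for x^4 and x^5 but not for x^6, exactly as the 2n−1 rule predicts.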
In Scilab the integral can be estimated using

I = intg(a,b,f);
Optional output variable err is an estimate of the absolute error. In Matlab the corresponding
function is
I = quadgk(f,a,b);
To use intg in the console you first need to define a function f ( x ) . For example
-->deff('y=f(x)','y=exp(-x)*sin(%pi*x)');
-->I = intg(0,3,f)
I =
0.3034152
Example 3. I = ∫_0^3 e^−x sin(πx) dx was estimated in Scilab using
deff('y=f(x)','y=exp(-x)*sin(%pi*x)');
I = intg(0,3,f,1e-6)
I =
0.3034152
Only 21 function calls were required to produce a result with error of 0, i.e.,
the exact result and the intg result were equal to within double precision
accuracy.
The integrate function conveniently allows you to skip the deff statement, as it accepts the
function and variable of integration as string arguments
-->I = integrate('exp(-x)*sin(%pi*x)','x',0,3)
I =
0.3034152
9 Improper integrals
An integral is improper if the integrand has a singularity within the integration interval. For
example
∫_0^1 (sin x / x) dx     (42)
Here the integrand is undefined at x = 0, where it has a 0/0 form. In fact lim_{x→0} (sin x)/x = 1, so
one fix is to define

f(x) = (sin x)/x for x ≠ 0 ,  f(0) = 1     (43)

Another possibility is that the integrand becomes infinite. For example, the integrand of

∫_0^1 dx/√(1−x^2) = π/2     (44)
becomes infinite as x → 1. In either case a solution would be an integration technique that avoids
evaluating the function at the endpoints. The midpoint rule is a simple example of this type of so-
called open integration formula. The Gauss-Kronrod quadrature method used by intg and
quadgk is also an open formula and will work for functions with singularities at one or both
endpoints. For example
-->integrate('sin(x)/x','x',0,1)
ans =
0.9460831
-->integrate('1/sqrt(1-x^2)','x',0,1)
ans =
1.5707963
For a singularity at x = c, a < c < b, we can break the integral into two:

∫_a^b f(x) dx = ∫_a^c f(x) dx + ∫_c^b f(x) dx
Another type of improper integral is one with an infinite limit of integration, such as
∫_0^∞ x^3 e^−x dx = 6     (45)

One way to treat an integral of this type is by using a change of variable such as

x = −ln(1−u)     (46)

for which

dx = du / (1−u)     (47)

Since e^−x = 1−u, the integral becomes

∫_0^1 (−ln(1−u))^3 du     (48)
This is also improper because as u → 1, x = −ln(1−u) → ∞, but an open integration formula can
evaluate it
-->integrate('(-log(1-u))^3','u',0,1)
ans =
6.
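An open formula needs no function values at the endpoints, which is why it can handle the transformed integral (48). A Python sketch using the midpoint rule (illustrative; convergence near the logarithmic singularity is slow):

```python
import math

def midpoint(f, a, b, n):
    # Open (midpoint) rule: f is never evaluated at the endpoints a, b
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Integral (48): the integrand blows up as u -> 1, but midpoints avoid u = 1
I = midpoint(lambda u: (-math.log(1 - u))**3, 0, 1, 100000)
print(I)   # slowly approaches the exact value 6 as n grows
```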
//////////////////////////////////////////////////////////////////////
// intTrapezoid.sci
// 2014-11-15, Scott Hudson, for pedagogic purposes
// Trapezoid rule estimation of integral of f(x) from a to b
// Estimated error is <= tol
//////////////////////////////////////////////////////////////////////
function I=intTrapezoid(f, a, b, tol)
n = 1;
h = (b-a);
I = (h/2)*(f(a)+f(b));
converged = 0;
while (~converged)
Iold = I;
n = 2*n;
h = (b-a)/n;
I = 0;
for i=1:2:n-1 //i=1,3,5,... odd values
I = I+f(a+i*h);
end
I = 0.5*Iold+h*I;
if (abs(I-Iold)<=tol)
converged = 1;
end
end
endfunction
//////////////////////////////////////////////////////////////////////
// intRomberg.sci
// 2014-11-15, Scott Hudson, for pedagogic purposes
// Romberg integration of f(x) from a to b
// Estimated error is <= tol
//////////////////////////////////////////////////////////////////////
function I=intRomberg(f, a, b, tol)
n = 1;
h = (b-a);
I = (h/2)*(f(a)+f(b));
j = 1;
R(j,j) = I;
converged = 0;
while (~converged)
Iold = I;
n = 2*n;
h = (b-a)/n;
I = 0;
for i=1:2:n-1 //i=1,3,5,... odd values
I = I+f(a+i*h);
end
I = 0.5*Iold+h*I;
j = j+1;
R(j,1) = I;
for k=2:j
w = 4^(k-1);
R(j,k) = (w*R(j,k-1)-R(j-1,k-1))/(w-1);
end
if (abs(R(j,j)-R(j-1,j-1))<=tol)
converged = 1;
end
end
I = R(j,j);
endfunction
Lecture 23
Random numbers
1 Introduction
1.1 Motivation
Scientifically an event is considered random if it is unpredictable. Classic examples are a coin
flip, a die roll and a lottery ticket drawing. For a coin flip we can associate 1 with heads and 0
with tails, and a sequence of coin flips then produces a random sequence of binary digits.
These can be taken to describe a random integer. For example, 1001011 can be interpreted as the
binary number corresponding to
1·2^6 + 0·2^5 + 0·2^4 + 1·2^3 + 0·2^2 + 1·2^1 + 1·2^0 = 75     (1)
For a lottery drawing we can label n tickets with the numbers 0, 1, 2, …, n−1. A drawing then
produces a random integer i with 0 ≤ i ≤ n−1. In these and other ways random events can be
associated with corresponding random numbers.
Conversely, if we are able to generate random numbers we can use those to represent random
events or phenomena, and this can be very useful in engineering analysis and design. Consider the
design of a bridge structure. We have no direct control over what or how many vehicles drive
across the bridge or what environmental conditions it is subject to. A good way to uncover
unforeseen problems with a design is to simulate it being subject to a large number of random
conditions, and this is one of the many motivations for developing random number generators.
How can we be sure our design will function properly? By the way, if you doubt this is a real
problem, read up on the Tacoma Narrows bridge which failed spectacularly due to wind forces
alone [1].
In the past sequences of random numbers, generated by some random physical process, were
published as books. In fact the book A Million Random Digits with 100,000 Normal Deviates,
produced by the Rand Corporation in 1955, is still in print (628 pages of "can't put it down"
reading!) as of 2014. It is still the case that if a sequence of truly random numbers is desired they
must be obtained from some physical random process. One can buy plug-in cards that sample
voltages generated by thermal noise in an electronic circuit (such as described at onerng.info),
and the website random.org generates random numbers by sampling radio-frequency noise in the
atmosphere.
Consider the sequence

0 1 2 3 4 5 6 7

There appears to be nothing random about this sequence; it follows an obvious pattern of
incrementing by one. Of course it's possible that a random sequence of numbers just happened to
form this pattern by chance, but intuitively that's not very likely. On the other hand the sequence
1 6 7 4 5 2 3 0
which contains the same eight digits does look somewhat random, although it doesn't take long to
notice an add-one and subtract-three pattern in the last seven digits. In fact it was generated by an
algorithm which is every bit as deterministic as "increment by one", and to which we now turn.
y = x (mod m)     (2)

read "y equals x modulo m", means that y is the remainder when x is divided by m. For example

3 = 11 (mod 4)     (3)

because

11/4 = (8+3)/4 = 2 remainder 3     (4)
which subtracts off an integer multiple of m from x, leaving the remainder. More directly
-->modulo(11,4) //Scilab
ans =
3.
>> mod(11,4) %Matlab
ans =
3
Another way to think of this is that x (mod m) is the least-significant digit of x expressed in
base m. For example 127 (mod 10) = 7, and 13 (mod 8) = 5 because 13 = 1·8^1 + 5·8^0 = 15 in base 8.
If x (mod m) = y (mod m) we say x is congruent to y modulo m and we write

x ≡ y (mod m)     (5)
A linear congruential generator (LCG) is a simple method for generating a permutation of the
integers 0 ≤ i ≤ m−1 using modular arithmetic. Starting with a seed value x0, with 0 ≤ x0 < m, an
LCG generates a sequence x1, x2, …, xm with

x_{n+1} = (a x_n + c) (mod m)     (6)
Provided the constants a and c are properly chosen this will be a permutation of the integers
0 ≤ i ≤ m−1. If m is a power of 2, then c must be odd and a must be one more than a multiple of
4. For example, if m = 2^3 = 8, a = 5, c = 1 and x0 = 0, then x1, x2, …, x8 is the permutation
1 6 7 4 5 2 3 0
because
(5·0+1) mod 8 = 1 ,  (5·1+1) mod 8 = 6 ,  (5·6+1) mod 8 = 31 mod 8 = 7 ,
(5·7+1) mod 8 = 36 mod 8 = 4 ,  (5·4+1) mod 8 = 21 mod 8 = 5 ,
(5·5+1) mod 8 = 26 mod 8 = 2 ,  (5·2+1) mod 8 = 11 mod 8 = 3 ,
and (5·3+1) mod 8 = 16 mod 8 = 0
Since x8 = x0 = 0 the permutation will then repeat with x9 = x1 = 1 and so on. If we used a seed
value of x0 = 4 then we would get the sequence
5 2 3 0 1 6 7 4
which is the same sequence starting from a different digit. An LCG with given a and c values will
always produce the same sequence, and using a different seed value will simply start us off at a
different location in the sequence.
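The example generator is easy to reproduce. A Python sketch (illustrative):

```python
def lcg_sequence(a, c, m, x0, count):
    # Linear congruential generator, eq. (6): x_{n+1} = (a*x_n + c) mod m
    xs = []
    x = x0
    for _ in range(count):
        x = (a * x + c) % m
        xs.append(x)
    return xs

# The permutation from the text: m = 8, a = 5, c = 1, seed x0 = 0
print(lcg_sequence(5, 1, 8, 0, 8))   # [1, 6, 7, 4, 5, 2, 3, 0]
```

Seeding with x0 = 4 instead yields the same cycle entered at a different point.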
A repeating sequence of eight numbers is probably not of much use, but in practice we use a
large value of m, quite commonly m = 2^32, corresponding to a 32-bit unsigned integer, which
is conveniently implemented in digital hardware. Fig. 1 shows plots of sequences generated
with m = 2^10 = 1024 and parameters a = 5, c = 1 (left) and a = 9, c = 1 (right). Ten samples from
the latter sequence
Fig. 1: Output of linear congruential generator (x_n vertical vs. n horizontal) with m = 1024, starting with seed
value x0 = 0. (left) a = 5, c = 1; (right) a = 9, c = 1. Repeated patterns are a strong indication that these are not
truly random sequences.
532 693 94 847 456 9 82 739 508 477
don't display an immediately obvious sequential relationship. In Fig. 1 we plot both sequences in
full. In the left graph some of the samples form a very noticeable diagonal line pattern which is a
strong indication that the data are almost certainly not truly random. The right plot lacks such a
glaring red flag. However, on closer inspection one can see repeated patterns throughout the
image (indicated by polygons). It is extremely unlikely that a truly random sequence of numbers
would exhibit such a pattern, and an LCG is unacceptable for a demanding PRNG application
such as cryptography.
In engineering applications we usually want random real numbers x instead of random integers i.
We can easily generate real numbers 0 ≤ x < 1 from integers 0 ≤ i ≤ m−1 by calculating

x = i/m
Of course there will only be a discrete set of m such real numbers, with spacing 1/m, but if m is
large enough this can provide a good approximation to a real random number generator. The command

x = rand();

in both Scilab and Matlab generates a random real number 0 ≤ x < 1. Subsequent calls return the
next number in the pseudo-random sequence. To generate an array with dimensions m×n use
x = rand(m,n);
On start up both Scilab and Matlab initialize the PRNG to a specific state. Therefore you will
always get the same sequence of numbers. In my current version of Scilab, on start up I will
always obtain the initial result
-->rand(3,1)
ans =
0.2113249
0.7560439
0.0002211
while in my current version of Matlab, on start up I will always obtain the initial result
>> rand(3,1)
ans =
0.8147
0.9058
0.1270
To start off at a different place in the sequence you can seed the PRNG as follows
-->rand('seed',i0); //Scilab
>> rand('twister',i0); %Matlab
where i0 is an integer in the range 0 ≤ i0 < 2^32. It can actually be useful to generate the same
random number on separate occasions because it allows interesting simulation results to be
repeated. However, sometimes you want pseudo-random numbers that are different each time
you open and run a program. One way to get a unique seed each time you run a program is to
generate it from the system clock. Recommended ways to do this are
-->rand('seed',getdate('s')); //Scilab
>> rand('twister',sum(100*clock)); %Matlab
A call such as grand(3,1,'def') can be used to produce a 3-by-1 array. The 'def' string indicates that you want the returned value to be from
the default distribution which is uniform over 0x<1 . To seed this PRNG use the command
-->grand('setsd',i0);
As with the rand() function you can use the system clock as an ever-changing seed
-->grand('setsd',getdate('s'));
Prob{ x1 ≤ x ≤ x2 } = ∫_{x1}^{x2} f_x(x) dx
So far we have used the rand() and grand() functions to produce x values uniformly
distributed over 0 ≤ x < 1. The ideal uniform pdf is

f_x(x) = 1 for 0 ≤ x < 1 ,  f_x(x) = 0 otherwise     (7)
Let's test this by generating a large number of random values and then plotting a histogram of the
data. To generate a histogram we first divide the x interval of interest into a number of
subintervals or bins. Let's take our bins to be 0x<0.05 , 0.05 x<0.10 , 0.10x<0.15 and
so on up to 0.95 x<1.00 . We then count how many x samples fall in each bin. This number
divided by the total number of samples is an estimate of ∫ f_x(x) dx over the bin. Dividing by
the width of the bin (0.05 in our case) we get an estimate of the average value of f_x(x) for that
bin.
Generating a histogram plot in Scilab is as simple as
histplot(nbins,x);
where nbins is the number of equal-sized bins we want in the interval [x_min, x_max]. As the
number of samples increases a histogram should give a progressively better estimate of the
underlying pdf of the random (or pseudo-random) process. Running the commands
x = grand(1e3,1,'def');
histplot(20,x);
and
x = grand(1e6,1,'def');
histplot(20,x);
produced the results shown in Fig. 3. We see that the pdf does approach the ideal uniform
distribution as the number of samples increases. Another pdf of great importance is the normal
(or Gaussian) distribution

f_y(y) = [ 1/(σ√(2π)) ] e^{−(1/2)((y−μ)/σ)^2} ,  −∞ < y < ∞

The normal distribution is specified by two parameters: μ is the mean value (average) of y and σ
is the standard deviation. The Central Limit Theorem tells us that any process that is the sum
or average of a large number of independent, identically distributed processes will be normally
distributed. Since so many natural phenomena have this property, the normal distribution finds
wide application. In particular noise in measurements is often assumed to be normally
distributed.
Other distributions also arise in applications. For example, the power P received over a fading
wireless channel is often modeled with the exponential pdf

f_P(P) = (1/P_av) e^{−P/P_av} ,  P ≥ 0     (8)
Here P av is the average received power. To simulate a wireless communication channel we need
to be able to generate exponentially distributed random values to model fading effects.
As another example, the speed v of a molecule in a gas at temperature T follows the Maxwell
distribution

f_v(v) = (4/√π) (v^2 / v_p^3) e^{−(v/v_p)^2} ,  v ≥ 0     (9)

where

v_p = √(2kT/m)     (10)
is the most likely velocity (the peak of the pdf). Here m is the molecular mass and k is
Boltzmann's constant. We need to generate samples from this distribution if we wish to perform
molecular dynamics simulations.
Fig. 4: The transformation y = g(x), for which f_x(x) Δx ≈ f_y(y) Δy.
Suppose x is uniformly distributed over 0 ≤ x < 1 and we generate y values by the transformation

y = g(x)     (11)

for some function g to be determined. This is illustrated in Fig. 4. We want to find the pdf of y over the interval a ≤ y ≤ b. An x interval
of width Δx will correspond to a y interval of width Δy, and y will fall in the y interval if
and only if x falls in the x interval. Therefore we can equate the probabilities

f_x(x) Δx ≈ f_y(y) Δy     (13)

In the limit of small intervals, and using f_x(x) = 1, this becomes

dx = f_y(y) dy     (14)
Integrating this from 0 to x on the left and correspondingly from a to y on the right, we have
x = ∫_0^x dx = ∫_a^y f_y(y) dy = F_y(y)

where F_y(y) is called the cumulative distribution function (cdf) of the random variable y.
Inverting this relation

x = F_y(y)     (15)

we get y = g(x) = F_y^{−1}(x).
For example, the exponential distribution (8) has cdf

F_y(y) = ∫_0^y (1/P_av) e^{−P/P_av} dP = 1 − e^{−y/P_av}     (16)

Solving x = 1 − e^{−y/P_av} for y gives us

y = −P_av ln(1−x)     (17)
Fig. 5: Histograms of exponentially distributed random values generated by y = −4 ln(1−x) and the ideal pdf.

In Fig. 5 we show histograms of y = −4 ln(1−x) where x is a uniform random variable. Given
enough sample points the histogram approximates the ideal pdf very well.
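Equation (17) is all that is needed to turn a uniform generator into an exponential one. A Python sketch (illustrative; P_av = 4 as in Fig. 5, and the seed value is arbitrary):

```python
import math
import random

def exponential_sample(p_av, rng):
    # Inverse-transform sampling, eq. (17): y = -P_av * ln(1 - x)
    return -p_av * math.log(1 - rng.random())

rng = random.Random(12345)               # fixed (arbitrary) seed for repeatability
samples = [exponential_sample(4.0, rng) for _ in range(100000)]
mean = sum(samples) / len(samples)
print(mean)                              # should be close to P_av = 4
```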
Unfortunately, for some pdfs it is not possible to calculate the cdf in closed form. Arguably the
most important example is the normal distribution. The integral

F_y(y) = [ 1/(σ√(2π)) ] ∫_{−∞}^{y} e^{−(1/2)((t−μ)/σ)^2} dt     (18)

cannot be expressed in terms of elementary functions, so the inversion method cannot be applied
directly. The Box-Muller transform (used for Fig. 6) instead converts a pair of independent
uniform values x1, x2 into a pair of independent normal values with μ = 0, σ = 1:

y1 = √(−2 ln x1) cos(2π x2) ,  y2 = √(−2 ln x1) sin(2π x2)     (19)
Fig. 6: Histograms of zero-mean, unit-variance, normally distributed random values generated by the
Box-Muller transform and ideal pdf.
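A Python sketch of the Box-Muller transform (illustrative), checking that the generated values have mean near 0 and variance near 1:

```python
import math
import random

def box_muller(rng):
    # Box-Muller transform: two uniform values -> two independent N(0,1) values
    x1 = 1 - rng.random()                # maps [0,1) to (0,1], avoiding log(0)
    x2 = rng.random()
    r = math.sqrt(-2 * math.log(x1))
    return r * math.cos(2 * math.pi * x2), r * math.sin(2 * math.pi * x2)

rng = random.Random(1)                   # fixed (arbitrary) seed
ys = []
for _ in range(50000):
    ys.extend(box_muller(rng))
mean = sum(ys) / len(ys)
var = sum(y * y for y in ys) / len(ys) - mean**2
print(mean, var)                         # should be near 0 and 1
```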
In Matlab the randn() function is similar to the rand() function but generates normal
random values with μ = 0, σ = 1. In Scilab the grand() function can generate arrays of normal
random variables with specified μ, σ when called as follows
Y = grand(m, n, 'nor', Av, Sd);
For example
-->grand(2,3,'nor',2,1)
 ans =
    1.5434479   0.7914453   3.5310142
    1.459521    1.172681    3.5340618
8 References
1. Tacoma Narrows bridge failure: https://www.youtube.com/watch?v=j-zczJXSxnw
2. http://en.wikipedia.org/wiki/Diehard_tests
3. http://csrc.nist.gov/groups/ST/toolkit/random_number.html
4. http://en.wikipedia.org/wiki/Random_number_generation
5. http://en.wikipedia.org/wiki/Mersenne_twister
6. http://en.wikipedia.org/wiki/List_of_probability_distributions
//////////////////////////////////////////////////////////////////////
// randLCG.sci
// 2014-12-08, Scott Hudson, for pedagogic purposes
// 32-bit linear congruential generator for generating uniformly
// distributed real numbers 0<=x<1.
//////////////////////////////////////////////////////////////////////
global randLCGseed
randLCGseed = uint32(0);
function x=randLCG(seed)
global randLCGseed
[nargout,nargin] = argn();
if (nargin==1) //if there is an argument, use it as the seed
randLCGseed = uint32(seed);
end
randLCGseed = uint32(1664525)*randLCGseed+uint32(1013904223);
x = double(randLCGseed)/4294967296.0;
endfunction
//////////////////////////////////////////////////////////////////////
// randMWC.sci
// 2014-12-08, Scott Hudson, for pedagogic purposes
// Multiply-with-carry algorithm for pseudo-random number generation.
// Returns uniformly distributed real number 0<x<1.
// Reference: http://en.wikipedia.org/wiki/Random_number_generation
//////////////////////////////////////////////////////////////////////
global randMWCs1 randMWCs2
randMWCs1 = uint32(1);
randMWCs2 = uint32(2);
function x=randMWC(seed1,seed2)
global randMWCs1 randMWCs2
[nargout,nargin] = argn();
if (nargin==2) //if there are arguments, use as seeds
randMWCs1 = uint32(seed1); //should not be zero!
randMWCs2 = uint32(seed2); //should not be zero!
end
s = uint32(2^16);
randMWCs1 = 36969*modulo(randMWCs1,s)+randMWCs1/s;
randMWCs2 = 18000*modulo(randMWCs2,s)+randMWCs2/s;
x = double(randMWCs1*s+randMWCs2)/4294967296.0;
endfunction