
How to normalize an image to zero mean and unit variance

From: Rob Gens

Date: 7 Apr, 2009 10:46:02

Message: 10 of 13

> NormalizedArray = (TheArray-mean(TheArray(:))) ./ var(TheArray(:));


>
> To normalize to 0 mean, subtract off the mean. To normalize to unit variance, divide by
> the variance. The variance is independent of the mean, so there is no need to take
> the variance -after- subtracting the mean: the variance of the original array will be
> identical.

I'm afraid that is incorrect. You should divide by the standard deviation which is the square root of the variance.

NormalizedArray = (TheArray-mean(TheArray(:))) ./ sqrt(var(TheArray(:)));
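A quick numerical check of the corrected formula is sketched below. The array values are made up for illustration; any non-constant array should behave the same way.

```matlab
% Hypothetical test array of made-up intensities (not from the thread)
TheArray = [12 40 7 99 53; 8 61 30 22 75; 90 14 66 3 48; 27 81 5 36 58];

% Subtract the mean, then divide by the standard deviation
NormalizedArray = (TheArray - mean(TheArray(:))) ./ std(TheArray(:));

% The mean should be ~0 and the variance ~1 (up to floating-point error)
fprintf('mean = %g, variance = %g\n', ...
    mean(NormalizedArray(:)), var(NormalizedArray(:)));

% Dividing by the variance instead leaves a variance of 1/var(TheArray(:))
WrongArray = (TheArray - mean(TheArray(:))) ./ var(TheArray(:));
fprintf('variance when dividing by var: %g\n', var(WrongArray(:)));
```

Here std(x) is the same quantity as sqrt(var(x)), so either form can be used.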

> The function 'randn' only gives random numbers with mean = 0, variance = 1
> and S.D. = 1.
>
> But I need to vary the variance of this function from 1 to 2, 3, ..., 10 to
> simulate Gaussian white noise. Could some expert teach me how to do it?

Here's a simple example:

meanx = 3;
stdx = 5;
x = meanx + stdx*randn(50,1);
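Since the question asks for a given variance rather than a given standard deviation, the scale factor has to be the square root of the target variance. A minimal sketch, with the sample size chosen arbitrarily for illustration:

```matlab
nSamples = 100000;   % assumed sample size, chosen only for illustration

for targetVar = 1:10
    % Scale by the standard deviation, i.e. the square root of the variance
    noise = sqrt(targetVar) * randn(nSamples, 1);
    fprintf('target variance %2d, sample variance %.3f\n', ...
        targetVar, var(noise));
end
```

The sample variance will only be close to the target, not equal to it, because it is estimated from a finite number of samples.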

14. Normal Probability Distributions


The Normal Probability Distribution is very common in the field of statistics. Whenever you measure things like people's height, weight,
salary, opinions or votes, the graph of the results is very often a normal curve.

The Normal Distribution


On this page...

• Properties of a normal distribution


• Area under the normal curve
• Standard normal distribution
• Percentages of the area under standard normal curve
• The z-Table
• Application - stock market

A random variable X whose distribution has the shape of a normal curve is called a normal random variable.

Normal Curve

This random variable X is said to be normally distributed with mean μ and standard deviation σ if its probability distribution is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Properties of a Normal Distribution


1. The normal curve is symmetrical about the mean μ;
2. The mean is at the middle and divides the area into halves;
3. The total area under the curve is equal to 1;
4. It is completely determined by its mean μ and standard deviation σ (or variance σ²)

Note:

In a normal distribution, only 2 parameters are needed, namely μ and σ².

Area Under the Normal Curve using Integration


The probability that a continuous normal variable X is found in a particular interval [a, b] is the area under the curve bounded by x = a and x = b, and is given by

$$P(a < X < b) = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$$

and the area depends upon the values of μ and σ.

[See Area under a Curve for more information on using integration to find areas under curves. Don't worry - we don't have to perform
this integration - we'll use the computer to do it for us.]

The Standard Normal Distribution


It makes life a lot easier for us if we standardize our normal curve, with a mean of zero and a standard deviation of 1 unit.

If we have the standardized situation of μ = 0 and σ = 1, then we have:

Standard Normal Curve μ = 0, σ = 1

We can transform all the observations of any normal random variable X with mean μ and variance σ² to a new set of observations of
another normal random variable Z with mean 0 and variance 1 using the following transformation:

$$Z = \frac{X - \mu}{\sigma}$$

We can see this in the following example.

Example
Say μ = 2 and σ = 1/3 in a normal distribution.

The graph of the normal distribution is as follows:

μ = 2, σ = 1/3

The following graph represents the same information, but it has been standardized so that μ = 0 and σ = 1:
μ = 0, σ = 1

The two graphs have different μ and σ, but have the same shape (if we tweak the axes).

The new distribution of the normal random variable Z with mean 0 and variance 1 (or standard deviation 1) is called a standard normal
distribution. Standardizing the distribution like this makes it much easier to calculate probabilities.

If we have mean μ and standard deviation σ, then

$$Z = \frac{X - \mu}{\sigma}$$

Since all the values of X falling between x1 and x2 have corresponding Z values between z1 and z2, it means:

The area under the X curve between X = x1 and X = x2 equals:

The area under the Z curve between Z = z1 and Z = z2.

Hence, we have the following equivalent probabilities:

P(x1 < X < x2) = P(z1 < Z < z2)

Example
Considering our example above where μ = 2, σ = 1/3, then

One-half standard deviation = σ/2 = 1/6, and

Two standard deviations = 2σ = 2/3

So ½ s.d. to 2 s.d. to the right of μ = 2 will be represented by the area from x1 = 2 + 1/6 = 13/6 ≈ 2.167 to x2 = 2 + 2/3 = 8/3 ≈ 2.667. This area is graphed as
follows:

μ = 2, σ = 1/3
The area above is exactly the same as the area

z1 = 0.5 to z2 = 2

in the standard normal curve:

μ = 0, σ = 1
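The equality of the two areas can be checked numerically. The sketch below assumes a MATLAB release that provides the integral function; the variable names are illustrative only.

```matlab
mu = 2;
sigma = 1/3;

% Normal density with mean mu and standard deviation sigma
f = @(x) exp(-(x - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));

% Standard normal density (mean 0, standard deviation 1)
phi = @(z) exp(-z.^2 / 2) ./ sqrt(2*pi);

% Half a standard deviation to two standard deviations right of the mean
x1 = mu + 0.5*sigma;
x2 = mu + 2*sigma;

areaX = integral(f, x1, x2);      % area under the X curve
areaZ = integral(phi, 0.5, 2);    % area under the Z curve

fprintf('area under X curve: %.6f\n', areaX);
fprintf('area under Z curve: %.6f\n', areaZ);
```

The two printed values should agree to within the tolerance of the numerical integration.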

Percentages of the Area Under the Standard Normal Curve


A graph of this standardized (mean 0 and variance 1) normal curve is shown.

In this graph, we have indicated the areas between the regions as follows:

-1 ≤ Z ≤ 1 68.27%

-2 ≤ Z ≤ 2 95.45%

-3 ≤ Z ≤ 3 99.73%

This means that 68.27% of the scores lie within 1 standard deviation of the mean.

This comes from:

$$\frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-z^2/2}\,dz \approx 0.6827$$


Also, 95.45% of the scores lie within 2 standard deviations of the mean.

This comes from:

$$\frac{1}{\sqrt{2\pi}} \int_{-2}^{2} e^{-z^2/2}\,dz \approx 0.9545$$

Finally, 99.73% of the scores lie within 3 standard deviations of the mean.

This comes from:

$$\frac{1}{\sqrt{2\pi}} \int_{-3}^{3} e^{-z^2/2}\,dz \approx 0.9973$$

The total area from -∞ < z < ∞ is 1.
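These percentages can be reproduced on a computer. The sketch below uses the identity P(-k ≤ Z ≤ k) = erf(k/√2) for a standard normal variable Z, which is a standard result rather than something stated on this page.

```matlab
for k = 1:3
    % Probability that a standard normal value lies within k s.d. of 0
    p = erf(k / sqrt(2));
    fprintf('within %d s.d.: %.2f%%\n', k, 100*p);
end
```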

The z-Table
The areas under the curve bounded by the ordinates z = 0 and any positive value of z are found in the z-Table. From this table the area
under the standard normal curve between any two ordinates can be found by using the symmetry of the curve about z = 0. We can also
use Scientific Notebook, as we shall see.


EXAMPLE 1
Find the area under the standard normal curve for the following, using the z-table. Sketch each one.

(a) between z = 0 and z = 0.78

(b) between z = -0.56 and z = 0

(c) between z = -0.43 and z = 0.78

(d) between z = 0.44 and z = 1.50

(e) to the right of z = -1.33.
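As an alternative to the printed table, the same areas can be computed directly. This is a sketch only: the helper Phi is an assumed name for the standard normal cumulative distribution, written with erfc so that no toolbox is needed.

```matlab
% Standard normal cumulative distribution, P(Z < z)
Phi = @(z) 0.5 * erfc(-z / sqrt(2));

areaA = Phi(0.78) - Phi(0);       % (a) between z = 0 and z = 0.78
areaB = Phi(0) - Phi(-0.56);      % (b) between z = -0.56 and z = 0
areaC = Phi(0.78) - Phi(-0.43);   % (c) between z = -0.43 and z = 0.78
areaD = Phi(1.50) - Phi(0.44);    % (d) between z = 0.44 and z = 1.50
areaE = 1 - Phi(-1.33);           % (e) to the right of z = -1.33

fprintf('%.4f\n', [areaA areaB areaC areaD areaE]);
```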


EXAMPLE 2
Find the following probabilities:

(a) P(Z > 1.06)

(b) P(Z < -2.15)


(c) P(1.06 < Z < 4.00)

(d) P(-1.06 < Z < 4.00)


EXAMPLE 3
It was found that the mean length of 100 parts produced by a lathe was 20.05 mm with a standard deviation of 0.02 mm. Find the
probability that a part selected at random would have a length

(a) between 20.03 mm and 20.08 mm

(b) between 20.06 mm and 20.07 mm

(c) less than 20.01 mm

(d) greater than 20.09 mm.
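One way to work this example is to standardize each length with z = (x − μ)/σ and then look up the area. The sketch below does both steps; Phi and toZ are assumed helper names, not part of the original page.

```matlab
mu = 20.05;      % mean length in mm
sigma = 0.02;    % standard deviation in mm

% Standard normal cumulative distribution, P(Z < z)
Phi = @(z) 0.5 * erfc(-z / sqrt(2));
% Standardize a length x to a z-value
toZ = @(x) (x - mu) / sigma;

pA = Phi(toZ(20.08)) - Phi(toZ(20.03));   % (a) between 20.03 and 20.08 mm
pB = Phi(toZ(20.07)) - Phi(toZ(20.06));   % (b) between 20.06 and 20.07 mm
pC = Phi(toZ(20.01));                     % (c) less than 20.01 mm
pD = 1 - Phi(toZ(20.09));                 % (d) greater than 20.09 mm

fprintf('%.4f\n', [pA pB pC pD]);
```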


EXAMPLE 4
A company pays its employees an average wage of $3.25 an hour with a standard deviation of 60 cents. If the wages are approximately
normally distributed, determine

a. the proportion of the workers getting wages between $2.75 and $3.69 an hour;
b. the minimum wage of the highest 5%.
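Part (b) needs the inverse lookup: the z-value with 95% of the area to its left. A sketch of both parts follows, using erfcinv for the inverse; Phi and PhiInv are assumed helper names.

```matlab
mu = 3.25;      % mean wage in dollars per hour
sigma = 0.60;   % standard deviation in dollars per hour

% Standard normal cumulative distribution and its inverse
Phi = @(z) 0.5 * erfc(-z / sqrt(2));
PhiInv = @(p) -sqrt(2) * erfcinv(2*p);

% (a) proportion of wages between $2.75 and $3.69
propA = Phi((3.69 - mu)/sigma) - Phi((2.75 - mu)/sigma);

% (b) wage exceeded by only the highest 5% of workers
minTopWage = mu + sigma * PhiInv(0.95);

fprintf('proportion between $2.75 and $3.69: %.4f\n', propA);
fprintf('minimum wage of the highest 5%%: $%.2f\n', minTopWage);
```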


EXAMPLE 5
The average life of a certain type of motor is 10 years, with a standard deviation of 2 years. If the manufacturer is willing to replace only
3% of the motors that fail, how long a guarantee should he offer? Assume that the lives of the motors follow a normal distribution.
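This is again an inverse lookup: we want the lifetime below which only 3% of motors fall. A minimal sketch, where PhiInv is an assumed helper name for the inverse of the standard normal cumulative distribution:

```matlab
mu = 10;      % mean motor life in years
sigma = 2;    % standard deviation in years

% Inverse of the standard normal cumulative distribution
PhiInv = @(p) -sqrt(2) * erfcinv(2*p);

% z-value with 3% of the area to its left (negative, left of the mean)
z = PhiInv(0.03);

% Convert back from z to years using x = mu + sigma*z
guaranteeYears = mu + sigma * z;

fprintf('z = %.4f, guarantee = %.2f years\n', z, guaranteeYears);
```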

Application - The Stock Market
Sometimes, stock markets follow an uptrend (or downtrend) within 2 standard deviations of the mean. This is called moving within the
linear regression channel.

Here is a chart of the Australian index (the All Ordinaries) from 2003 to Sep 2006.

Image source: incrediblecharts.com.

The upper gray line is 2 standard deviations above the mean and the lower gray line is 2 standard deviations below the mean.

Notice in April 2006 that the index went above the upper edge of the channel and a correction followed (the market dropped).

But interestingly, the latter part of the chart shows that the index only went down as far as the bottom of the channel and then recovered
to the mean, as you can see in the zoomed view below. Such analysis helps traders make money (or not lose money) when investing.

Image source: incrediblecharts.com.

I – Gaussian Noise – Linear Filtering


Gaussian noise is another type of noise commonly encountered in image processing. It gets this name because the distribution of the noise values (i.e., a
histogram of just the image noise over a blank background) is a Gaussian/normal distribution, as shown below.

Probability density function - Gaussian distribution (mean = 0, variance = 1)

A normalized Gaussian distribution has a mean (average) value of zero (0), and a variance (spread) of one (1). The mean essentially
points out where the peak of the curve is with respect to the x-axis, and the variance tells us how wide or spread-out the values are from
this point. Consider the curves below, where we play around with the variance and mean of a normalized curve.
Gaussian distribution (mean = 10, variance = 0.5)

Notice how the peak is at the mean value (10), and the smaller variance has caused the curve to be compressed in the horizontal axis
(less spread).
Gaussian distribution (mean = -10, variance = 5)

Now consider the example above. The curve’s peak is centered at the mean (-10), while the increased variance has caused the curve to
spread-out more along the x-axis.

Now, with respect to image processing, the addition of Gaussian noise essentially means that the input image has been altered due to the
addition of a Gaussian random variable with mean μ and variance σ², where σ is the square root of the variance, also known as the
standard deviation. So, for each pixel in the original source image, the output after the addition of noise is the sum
of the original pixel value and a random value that usually lies within a few standard deviations of the mean of the noise being added.

Before we continue, we should ask ourselves this: how can we generate Gaussian noise to apply to an image? MATLAB comes with
tools for this (randn draws normally distributed random numbers, and normpdf from the Statistics Toolbox evaluates the normal density), but we’d prefer to do everything from the
ground-up so that we can learn more from this. One method for generating a Gaussian distribution with a mean of zero (0) and a
variance of one (1) is the Box-Muller method as described in [1].

First, consider two random variables, U1 and U2, which are independent and uniformly distributed on the interval (0,1). From these
values we can generate two separate random variables, Z0 and Z1, each of which follows a normal distribution with a mean of zero (0) and a variance
of one (1), by using the following formulae:

$$Z_0 = \sqrt{-2\ln U_1}\,\cos(2\pi U_2)$$

$$Z_1 = \sqrt{-2\ln U_1}\,\sin(2\pi U_2)$$


Once we’ve generated enough Z-values using these formulae (using different random values for U1 and U2 each time, of course), a histogram of the Z-values
will closely approximate a normalized Gaussian/normal distribution. Note that this does not rely on the central limit theorem [2], which concerns sums of random variables: each Z-value
produced by the transformation is itself normally distributed, and drawing more values simply fills in the histogram. Now, we’d like
to be able to change the mean and variance of this distribution so that we can experiment with Gaussian distributions other than just the
normalized Gaussian distribution (i.e., something other than just mean = 0, variance = 1).

This is actually a very simple operation, since we already have a normalized Gaussian distribution. If we add a constant to each Z-value
we generated, we shift the mean by that constant. As for the variance, if we multiply each Z-value by a constant, the
variance is multiplied by the square of that constant. So if we want to take a normalized Gaussian distribution (mean = 0, variance = 1), and
we’d like to change the mean to μ and the variance to σ², we simply perform the following operation on each Z-value we generated
previously:

$$X = \mu + \sigma Z$$

A MATLAB function for generating ‘nVals’ data points that form a Gaussian distribution with a mean of ‘meanVal’ and a variance of
‘varVal’ is shown below, demonstrating how I have implemented this approach to creating arbitrary Gaussian distributions.

%==========================================================================
function output_data = genNormDist(meanVal, varVal, nVals)
% genNormDist - Generates a normal/Gaussian distribution with mean
% of 'meanVal', variance of 'varVal', and 'nVals'
% discrete values in the distribution.
% This method uses the Box-Muller method to create
% samples from a normal distribution with mean of
% '0' and a variance of '1'. The data set is then
% multiplied by sqrt(varVal) to change the variance
% to 'varVal', and then has meanVal added to each
% term to adjust the mean, respectively
% output_data - The processed data (1 x nVals row vector)
% meanVal - The mean
% varVal - The variance
% nVals - The number of values in the distribution
% (C) 2010 Matthew Giassa, <teo@giassa.net> www.giassa.net
%==========================================================================
%Make sure variance is strictly positive
if(varVal <= 0)
    error('Can not proceed: Variance cannot equal zero or be negative!')
end

%Create a data buffer. Box-Muller produces values in pairs, so round the
%length up to an even number and trim the extra value afterwards
nPairs = ceil(nVals/2);
tempVector = zeros(1,2*nPairs);

%Use the Box-Muller method to generate pairs of random variables, and
%store them in the tempVector buffer
for counter = 1:2:2*nPairs
    %Generate two independent uniform random variables on (0,1)
    U1=rand(1);
    U2=rand(1);
    %Calculate our Z-values: both share the radius from U1 and use the
    %cosine and sine of the same angle from U2
    R=sqrt(-2*log(U1));
    Z0=R*cos(2*pi*U2);
    Z1=R*sin(2*pi*U2);
    tempVector(counter) = Z0;
    tempVector(counter+1) = Z1;
end

%Drop the surplus value when nVals is odd
tempVector = tempVector(1:nVals);

%Adjust the mean and variance
tempVector = tempVector.*sqrt(varVal) + meanVal;

%==========================================================================
% Completed
%==========================================================================
output_data = tempVector;
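
The same idea can be written without a loop and checked numerically. The following is a vectorized sketch rather than the function above; the sample size and target values are arbitrary, and only one of the two Box-Muller outputs is used to keep it short.

```matlab
nVals = 100000;    % assumed sample size, chosen only for illustration
meanVal = 10;      % target mean
varVal = 0.5;      % target variance

% Independent uniform random numbers on (0,1)
U1 = rand(1, nVals);
U2 = rand(1, nVals);

% Box-Muller: each element of Z has mean 0 and variance 1
Z = sqrt(-2*log(U1)) .* cos(2*pi*U2);

% Scale by the standard deviation, then shift by the mean
X = meanVal + sqrt(varVal) * Z;

fprintf('sample mean %.3f, sample variance %.3f\n', mean(X), var(X));
```

The printed values should be close to the targets, with small differences due to the finite sample size.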

Now that we have a method for generating a normal distribution, we can apply Gaussian noise to an image by simply “adding” the
distribution to the original source image. One simple approach to this is to generate a Gaussian distribution with a specific mean and
variance, with as many elements as there are pixels in the source image. The values are generated independently of one another, so they are
already in random order and can be added directly as an array to the image buffer. The sample code below demonstrates this in action.

%==========================================================================
function output_image = addGaussianNoise(input_image, meanNoise, varNoise )
% addGaussianNoise - Adds Gaussian noise to an image
% output_image - The processed image
% input_image - The source image data (RGB)
% meanNoise - The mean of the noise to add
% varNoise - The variance of the noise to add
% (C) 2010 Matthew Giassa, <teo@giassa.net> www.giassa.net
%==========================================================================
%Make a grayscale copy of our input image
I = double(rgb2gray(input_image));

%Determine input image dimensions
[j, k] = size(I);

%Create a Gaussian random variable distribution, one value per pixel
randVals = genNormDist(meanNoise, varNoise, j*k);

%Flatten the image into a 1D row vector to match randVals
I = reshape(I,1,[]);

%Add the Gaussian noise
I = I + randVals;

%Display the mean and variance of the noisy image (diagnostic output)
mean(I)
var(I)

%Revert the image back from a 1D vector to a 2D image
I = reshape(I,j,k);

%==========================================================================
% Completed
%==========================================================================
output_image = I;

A few sample images are provided below with different means and variances assigned to them using the code above. It should give you
a visual representation of how changing these parameters will affect the final result.

Before and after (mean = 0, variance = 50)


Before and after (mean = 100, variance = 1)
Before and after (mean = 100, variance = 50)

In case you haven’t guessed it by now, changing the mean will change the average brightness of the image, while increasing the variance
will make the image noisier (i.e., the random “speckles” everywhere become more noticeable).

Now that we’ve learned what Gaussian noise is with respect to image processing, along with a means of generating it in arbitrary
amounts, we need a way to remove it. Fortunately, we already know how! We know from the earlier section on image normalization that
we can deal with changes to the average brightness of an image simply by normalizing it. That takes care of the new mean intensity
value. As for the “speckles”, we can reduce those by blurring the image with a simple convolution mask. The code for these algorithms
can be found in previous sections of these tutorials, so it will not be reproduced here. An example run-through of this process is
shown below.

You will notice that the input image and the final result are not identical. You could use varying sizes of convolution masks
for blurring the noisy image in the final step, and calculate the error against the original for each case to see which one yields the smallest
value. In most cases, a convolution mask larger than 5×5 will tend to blur the image too much and decrease the quality of the
output. As is the case in practically any engineering problem, there are always trade-offs. In this case, the question is how much noise we can remove
without decreasing the output image quality too much.
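
The mask-size comparison described above can be sketched as follows. Everything here is an assumption made for illustration: the test image is a synthetic gray ramp, the noise comes from the built-in randn for brevity, and the mask is a plain averaging mask applied with conv2. A smooth ramp has no edges, so it will not show the over-blurring that larger masks cause on real images.

```matlab
% Hypothetical clean image: a 64x64 horizontal gray ramp from 50 to 200
imgSize = 64;
cleanImg = repmat(linspace(50, 200, imgSize), imgSize, 1);

% Add Gaussian noise with mean 0 and variance 50
noisyImg = cleanImg + sqrt(50) * randn(imgSize);

% RMS error of the unfiltered noisy image
noisyErr = noisyImg - cleanImg;
rmsNoisy = sqrt(mean(noisyErr(:).^2));
fprintf('no filtering: RMS error %.2f\n', rmsNoisy);

% Compare 3x3 and 5x5 averaging masks
maskSizes = [3 5];
rmsErr = zeros(size(maskSizes));
for idx = 1:numel(maskSizes)
    m = maskSizes(idx);
    mask = ones(m) / m^2;                      % averaging mask, sums to 1
    blurred = conv2(noisyImg, mask, 'same');   % blur the noisy image
    % Skip the border, where zero padding darkens the result
    b = (m - 1) / 2;
    inner = (1 + b):(imgSize - b);
    err = blurred(inner, inner) - cleanImg(inner, inner);
    rmsErr(idx) = sqrt(mean(err(:).^2));
    fprintf('%dx%d mask: RMS error %.2f\n', m, m, rmsErr(idx));
end
```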
Simple method for removing Gaussian noise from an image

References
[1] “Box-Muller Transformation — from Wolfram MathWorld.” Wolfram MathWorld: The Web’s Most Extensive Mathematics
Resource. Web. 26 Apr. 2010. <http://mathworld.wolfram.com/Box-MullerTransformation.html>.

[2] “Central Limit Theorem — from Wolfram MathWorld.” Wolfram MathWorld: The Web’s Most Extensive Mathematics Resource.
Web. 26 Apr. 2010. <http://mathworld.wolfram.com/CentralLimitTheorem.html>.
