
ESTIMATOR & TYPES OF ESTIMATORS
ABOUT ESTIMATOR
• An estimator is a statistical function of observable sample data that is used to estimate an unknown parameter (called the estimand).
• The result of applying the function to a particular sample of data is called an estimate.
• It is possible to construct many estimators for a given parameter.
• The performance of an estimator may be evaluated using loss functions.
HOW TO ESTIMATE A PARAMETER?
• To estimate a parameter (e.g. a population mean), the usual procedure is as follows:
• Select a random sample from the population of interest.
• Calculate the point estimate of the parameter.
• Calculate a measure of variability, often a confidence interval.
• Associate this measure of variability with the estimate (see the sketch below).
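
A minimal sketch of this procedure in Python, using NumPy; the population, sample size, and normal approximation for the confidence interval are illustrative assumptions, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of interest (assumed here for illustration).
population = rng.normal(loc=170.0, scale=10.0, size=100_000)

# 1. Select a random sample from the population.
sample = rng.choice(population, size=50, replace=False)

# 2. Calculate the point estimate of the parameter (here, the sample mean).
point_estimate = sample.mean()

# 3. Calculate a measure of variability: an approximate 95% confidence interval.
std_error = sample.std(ddof=1) / np.sqrt(len(sample))
ci = (point_estimate - 1.96 * std_error, point_estimate + 1.96 * std_error)

print(f"point estimate: {point_estimate:.2f}, 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```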
POINT ESTIMATION
• Point estimation uses sample data to calculate a single value which serves as a best guess for an unknown parameter.
• Point estimation should be contrasted with Bayesian methods of estimation, where the goal is usually to compute the posterior distributions of parameters and other quantities of interest.
• The contrast here is between estimating a single point and estimating a weighted set of points (a probability density function).
BAYESIAN METHODS
• Bayesian methods use aspects of the scientific method, which involves collecting evidence that is meant to be consistent or inconsistent with a given hypothesis. As evidence accumulates, the degree of belief in a hypothesis ought to change.
• Hypotheses with very high support should be accepted as true, and those with low support should be rejected.
METHODS TO DERIVE POINT ESTIMATES DIRECTLY
• Maximum likelihood (ML)
• Method of moments
• Minimum mean squared error (MMSE)
• Minimum variance unbiased estimator (MVUE)
• Best linear unbiased estimator (BLUE)

• Now we discuss each of them in detail.
MAXIMUM LIKELIHOOD (MLE)
• Maximum likelihood is a popular statistical method for fitting a statistical model to data and providing estimates for the model's parameters.
• For example, suppose you are interested in the heights of Americans. You have a sample of some number of Americans, but not the entire population, and record their heights.
• Further, you are willing to assume that heights are normally distributed with some unknown mean and variance. The sample mean is then the maximum likelihood estimator of the population mean, and the sample variance is a close approximation to the maximum likelihood estimator of the population variance.
MAXIMUM LIKELIHOOD (cont…)
• For a fixed set of data and underlying probability model, maximum likelihood picks the values of the model parameters that make the data "more likely" than any other parameter values would.
• Maximum likelihood estimation gives a unique and easy way to determine the solution in the case of the normal distribution and many other problems, although in very complex problems this may not be the case.
• If a uniform prior distribution is assumed over the parameters, the maximum likelihood estimate coincides with the most probable values of the parameters (a sketch of the heights example follows below).
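
A short sketch of the heights example above, assuming normally distributed data; the simulated sample is an illustrative assumption. Note that the maximum likelihood estimate of the variance divides by n rather than n - 1:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical sample of heights (cm); in practice the true mean and variance are unknown.
heights = rng.normal(loc=175.0, scale=7.0, size=200)

# Maximum likelihood estimates under a normal model:
mu_mle = heights.mean()        # the sample mean
var_mle = heights.var(ddof=0)  # divides by n (the MLE), not by n - 1

print(f"MLE of mean: {mu_mle:.2f}")
print(f"MLE of variance: {var_mle:.2f} (unbiased sample variance: {heights.var(ddof=1):.2f})")
```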
METHOD OF MOMENTS
• This is a method of estimating population parameters such as the mean, variance, and median by equating sample moments with the (unobservable) population moments and then solving those equations for the quantities to be estimated (see the sketch below).
• Estimates obtained by the method of moments may be used as a first approximation to the solutions of the likelihood equations, and successively improved approximations may then be found by the Newton-Raphson method.
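
As an illustration (not taken from the slides), here is a method-of-moments sketch for a gamma distribution, equating the sample mean and variance to their population counterparts; the true shape and scale used to simulate the data are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# Simulated data from a gamma distribution with shape k = 2.0 and scale theta = 3.0.
data = rng.gamma(shape=2.0, scale=3.0, size=1000)

# Population moments: mean = k * theta, variance = k * theta**2.
# Equate them to the sample moments and solve for k and theta.
m1 = data.mean()
m2 = data.var(ddof=0)

theta_hat = m2 / m1      # theta = variance / mean
k_hat = m1 / theta_hat   # k = mean / theta

print(f"method-of-moments estimates: k = {k_hat:.2f}, theta = {theta_hat:.2f}")
```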
METHOD OF MOMENTS (cont…)
• In some cases, infrequently with large samples but not so infrequently with small samples, the estimates given by the method of moments fall outside the parameter space; it does not make sense to rely on them then.
• Also, estimates given by the method of moments are not necessarily sufficient statistics, i.e. they sometimes fail to take into account all relevant information in the sample.
MEAN SQUARE ERROR
• The MSE of an estimator is one of many ways to quantify the difference between an estimator and the true value of the quantity being estimated.
• The MSE is the second moment (about the origin) of the error, and thus incorporates both the variance of the estimator and its bias. For an unbiased estimator, the MSE is the variance.
• In analogy to the standard deviation, taking the square root of the MSE yields the root mean squared error (RMSE).
• For an unbiased estimator, the RMSE is called the standard error.
MEAN SQUARE ERROR
• The MSE of an estimator θ^ of a parameter θ is defined as MSE(θ^) = E[(θ^ - θ)^2].
• Since the MSE is an expectation, it is a scalar and not a random variable. It may be a function of the unknown parameter θ, but it does not depend on any random quantities (see the sketch below).
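
A small simulation sketch that checks the decomposition MSE = variance + bias^2; the estimator (the sample mean), the true parameter value, and the sample size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
true_mean = 5.0
n, trials = 20, 10_000

# Repeatedly draw samples and record the sample mean as the estimator.
estimates = np.array([rng.normal(true_mean, 2.0, n).mean() for _ in range(trials)])

mse = np.mean((estimates - true_mean) ** 2)
bias = estimates.mean() - true_mean
variance = estimates.var()

print(f"MSE = {mse:.4f}, variance + bias^2 = {variance + bias**2:.4f}")
print(f"RMSE = {np.sqrt(mse):.4f}")
```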
MEAN SQUARE ERROR (cont…)
• An MSE of zero, meaning that the estimator predicts observations of the parameter θ with perfect accuracy, is the ideal and forms the basis for the least squares method of regression analysis.
• While particular values of the MSE other than zero are meaningless in and of themselves, they may be used for comparative purposes.
• The unbiased model with the smallest MSE is generally interpreted as best explaining the variability in the observations.
MEAN SQUARE ERROR (cont…)
• Minimizing the MSE is a key criterion in selecting estimators. Among unbiased estimators, minimizing the MSE is equivalent to minimizing the variance, and the estimator that achieves this is the MVUE (minimum variance unbiased estimator).
• Like the variance, the mean squared error has the disadvantage of heavily weighting outliers. This is a result of the squaring of each term, which effectively weights large errors more heavily than small ones. This property, undesirable in many applications, has led researchers to use alternatives such as the mean absolute error, or measures based on the median.
MEAN ABSOLUTE ERROR
• In statistics, the mean absolute error (MAE) is a quantity used to measure how close forecasts or predictions are to the eventual outcomes. The mean absolute error is given by
  MAE = (1/n) Σ |f_i - y_i|
• As the name suggests, the mean absolute error is an average of the absolute errors e_i = f_i - y_i, where f_i is the prediction and y_i the true value.
• The MAE and the RMSE can be used together to diagnose the variation in the errors in a set of forecasts. The RMSE will always be larger than or equal to the MAE; the greater the difference between them, the greater the variance in the individual errors in the sample. If the RMSE equals the MAE, then all the errors are of the same magnitude (see the sketch below).
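
A quick sketch comparing MAE and RMSE on a set of forecast errors; the forecasts and observed values are made up for illustration:

```python
import numpy as np

# Hypothetical forecasts f_i and observed values y_i.
forecasts = np.array([2.5, 0.0, 2.1, 7.8])
actuals = np.array([3.0, -0.5, 2.0, 7.0])

errors = forecasts - actuals
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))

print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")  # RMSE >= MAE always holds
```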
MINIMUM MEAN SQUARE ERROR
• In statistics and signal processing, an MMSE estimator is one that minimizes the mean square error, a common measure of estimator quality.
• Let X be an unknown random variable, and let Y be a known random variable (the measurement).
• An estimator X^(Y) is any function of the measurement Y, and its MSE is given by
  MSE = E[(X^ - X)^2]
• The MMSE estimator is defined as the estimator achieving the minimal MSE (see the sketch below).
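
For the classical jointly Gaussian case (X with prior variance sigma_x^2 observed through additive noise of variance sigma_w^2), the MMSE estimator is the conditional mean E[X | Y], which is linear in Y. A sketch, with the variances assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma_x, sigma_w = 2.0, 1.0          # prior and noise standard deviations (assumed)
n = 100_000

x = rng.normal(0.0, sigma_x, n)      # unknown random variable X
y = x + rng.normal(0.0, sigma_w, n)  # measurement Y = X + noise

# For jointly Gaussian X and Y, E[X | Y] is linear in Y:
gain = sigma_x**2 / (sigma_x**2 + sigma_w**2)
x_hat = gain * y

mse_mmse = np.mean((x_hat - x) ** 2)
mse_raw = np.mean((y - x) ** 2)      # using the measurement directly as the estimate
print(f"MSE of MMSE estimator: {mse_mmse:.3f}  vs  MSE of raw measurement: {mse_raw:.3f}")
```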
MINIMUM VARIANCE UNBIASED ESTIMATOR (MVUE)
• The MVUE has lower variance than any other unbiased estimator for all possible values of the parameter.
• An efficient estimator need not exist, but if it does, it is the MVUE, because the MSE is the sum of the variance and the squared bias of an estimator.
• The MVUE minimizes the MSE among unbiased estimators. In some cases biased estimators have lower MSE, because they have a smaller variance than does any unbiased estimator (see the sketch below).
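
A simulation sketch of the last point: for normally distributed data, the biased variance estimator that divides by n has lower MSE than the unbiased one that divides by n - 1. The sample size and true variance are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
true_var = 4.0
n, trials = 10, 20_000

samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))

unbiased = samples.var(axis=1, ddof=1)  # divides by n - 1 (unbiased)
biased = samples.var(axis=1, ddof=0)    # divides by n (biased, but smaller variance)

print(f"MSE of unbiased estimator: {np.mean((unbiased - true_var) ** 2):.4f}")
print(f"MSE of biased estimator:   {np.mean((biased - true_var) ** 2):.4f}")
```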
BEST LINEAR UNBIASED ESTIMATOR
• It frequently occurs that the MVU estimator, even if it exists, cannot be found. For example, if the PDF is not known, the theory of sufficient statistics cannot be applied. Also, even if the PDF is known, this does not ensure that a minimum variance estimator can be found.
• In such cases, we have to resort to a suboptimal estimator approach. We can restrict the estimator to a linear form that is unbiased and that has minimum variance among such estimators.
• An example of this approach is the best linear unbiased estimator (BLUE).
BEST LINEAR UNBIASED ESTIMATOR (cont…)
• The BLUE applies to a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances (see the sketch below).
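
Under these Gauss-Markov assumptions (zero-mean, uncorrelated, equal-variance errors), the BLUE of the coefficients in a linear model is the ordinary least-squares estimator. A minimal sketch with simulated data; the coefficients and noise level are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
true_beta = np.array([1.5, -0.7])  # assumed coefficients

X = np.column_stack([np.ones(n), rng.normal(size=n)])  # design matrix with intercept
y = X @ true_beta + rng.normal(scale=0.5, size=n)       # zero-mean, equal-variance errors

# Ordinary least squares (the BLUE under the Gauss-Markov assumptions):
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("BLUE / OLS estimate:", beta_hat)
```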
METHODS TO DERIVE POINT ESTIMATES VIA BAYESIAN ANALYSIS
• Maximum a posteriori (MAP)
• Wiener filter
• Kalman filter
• Particle filter
• Markov chain Monte Carlo (MCMC)
MAXIMUM A POSTERIORI
• Sometimes we have prior information about the PDF of the parameter to be estimated.
• Let θ be a random variable; the probabilities associated with it are called prior probabilities.
• Bayes' theorem shows the way to incorporate prior information into the estimation process:
  p(θ | x) = p(x | θ) p(θ) / p(x)
• The term on the left-hand side is the posterior; the numerator is the product of the likelihood term and the prior term; the denominator serves as a normalization term so that the posterior PDF integrates to unity.
MAXIMUM A POSTERIORI (cont…)
• In Bayesian statistics, the MAP estimate is the mode of the posterior distribution.
• Bayesian inference can thus produce a maximum a posteriori (MAP) point estimate (see the sketch below).
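
A small sketch of a MAP estimate for a coin-flip probability with a Beta prior; the prior parameters and observed counts are illustrative assumptions. The MAP estimate is the mode of the posterior Beta distribution:

```python
# Beta(alpha, beta) prior on the success probability p, with k successes in n Bernoulli trials.
alpha, beta = 2.0, 2.0  # prior pseudo-counts (assumed)
k, n = 7, 10            # observed data (assumed)

# The posterior is Beta(alpha + k, beta + n - k); its mode is the MAP estimate.
p_map = (alpha + k - 1) / (alpha + beta + n - 2)

# The maximum likelihood estimate ignores the prior.
p_mle = k / n

print(f"MAP estimate: {p_map:.3f}, MLE: {p_mle:.3f}")
```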
WIENER FILTER
• The Wiener filter reduces the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal.
• Since the filter assumes that its inputs are stationary, it is not an adaptive filter.
• Wiener filters are characterized by the following:
• Assumption: the signal and the (additive) noise are stationary linear stochastic processes with known spectral characteristics or known autocorrelation and cross-correlation.
• Requirement: the filter must be physically realizable, i.e. causal.
• Performance criterion: minimum mean-square error (MMSE).
WIENER FILTER MODEL
• The input to the filter is assumed to be a signal s(t) corrupted by additive noise n(t). The output s^(t) is calculated by means of a filter g(t) using the convolution
  s^(t) = g(t) * (s(t) + n(t))
  where g(t) is the Wiener filter's impulse response.
• The error is defined as e(t) = s(t + α) - s^(t), where α is the delay of the Wiener filter (since it is causal).
• In other words, the error is the difference between the estimated signal and the true signal shifted by α.
WIENER FILTER
• Clearly the squared error is given by e(t)^2 = s(t + α)^2 - 2 s(t + α) s^(t) + s^(t)^2, where s(t + α) is the desired output of the filter and e(t) is the error.
• Depending on the value of α, the name of the problem changes:
• If α > 0, the problem is that of prediction (the error is reduced when s^(t) is similar to a later value of s).
• If α = 0, the problem is that of filtering (the error is reduced when s^(t) is similar to s(t)).
• If α < 0, the problem is that of smoothing (the error is reduced when s^(t) is similar to an earlier value of s).
WIENER FILTER (cont…)
• The Wiener filter problem has solutions for three possible cases:
• the case where a non-causal filter is acceptable (requiring an infinite amount of both past and future data);
• the case where a causal filter is desired (using an infinite amount of past data); and
• the FIR case, where only a finite amount of past data is used.
• The first case is simple to solve but is not suited for real-time applications (a sketch of it is given below).
• Wiener's main accomplishment was solving the case where the causality requirement is in effect.
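
A simplified frequency-domain sketch of the non-causal filtering case (α = 0). The Wiener assumption is that the signal and noise power spectra are known; here, as an illustrative shortcut, they are computed from the simulated signal and noise themselves:

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.linspace(0, 1, 1024, endpoint=False)

s = np.sin(2 * np.pi * 5 * t)           # desired signal s(t) (assumed for illustration)
n = rng.normal(scale=0.5, size=t.size)  # additive noise n(t)
x = s + n                               # observed input s(t) + n(t)

# "Known" power spectra (here taken from the simulated s and n).
S = np.abs(np.fft.fft(s)) ** 2
N = np.abs(np.fft.fft(n)) ** 2

# Non-causal Wiener filter in the frequency domain: G = S / (S + N).
G = S / (S + N)
s_hat = np.real(np.fft.ifft(G * np.fft.fft(x)))

print(f"MSE before filtering: {np.mean((x - s) ** 2):.4f}")
print(f"MSE after filtering:  {np.mean((s_hat - s) ** 2):.4f}")
```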
MARKOV CHAIN MONTE CARLO METHODS (MCMC)
• A major limitation to more widespread implementation of Bayesian approaches is that obtaining the posterior distribution often requires the integration of high-dimensional functions.
• This can be computationally very difficult at times.
• MCMC approaches are so named because one uses the previous sample value to randomly generate the next sample value, thus generating a Markov chain (see the sketch below).
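
A minimal Metropolis-Hastings sketch (one common MCMC algorithm, chosen here as an illustration) that samples the posterior of a normal mean with a normal prior; the data, prior, and proposal scale are assumptions:

```python
import numpy as np

rng = np.random.default_rng(8)
data = rng.normal(3.0, 1.0, size=30)  # observed data (simulated for illustration)

def log_posterior(mu):
    log_prior = -0.5 * mu**2 / 10.0            # N(0, 10) prior on the mean
    log_lik = -0.5 * np.sum((data - mu) ** 2)  # N(mu, 1) likelihood
    return log_prior + log_lik

samples, mu = [], 0.0
for _ in range(10_000):
    proposal = mu + rng.normal(scale=0.5)  # propose the next value from the previous one
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(mu):
        mu = proposal                      # accept; otherwise keep the current value
    samples.append(mu)

posterior = np.array(samples[2000:])       # discard burn-in
print(f"posterior mean estimate: {posterior.mean():.2f}")
```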