Outline
1. Point Estimation of Parameters
2. Confidence Interval
3. Testing Hypothesis & Decision
4. Goodness of Fit, Chi-Square Test
5. Non Parametric Test
6. Linear Regression Analysis
7. Correlation
1. Point Estimation of Parameters
• What?
• Decision?
• Why?
• To make estimation about the population
• Where?
• Everywhere, where decision is to be made
• Who?
• Managers
• When?
• On demand
• How?
• By estimation and defining the interval
1. Point Estimation of Parameters
• Types
• Point Estimation
• Interval Estimation
• Point Estimation
• A single value used to estimate a population parameter.
• Interval Estimate
• Range of values of population parameters
• Confidence Interval
2. Confidence Interval
• It can be constructed in two ways
• By z Statistics (for larger data size)
• By t Statistics (for smaller data size)
Z Statistics                                           T Statistics
For n > 30                                             For n < 30
Uses the normal distribution curve with values of z    Uses the t transformation and degrees of freedom
2. Confidence Interval
Z Statistics                                           T Statistics
Values estimated within the confidence level           By increasing samples, the values will match the standard normal curve.
2. Confidence Interval
• Example (z statistics):
• A researcher has taken a random sample of size 70 from a population
with a sample mean of 35 and a population standard deviation of
4.62. Construct a 90% confidence interval to estimate the population
mean.
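A minimal Python sketch of this example, assuming the usual interval x̄ ± z·σ/√n. The function name `z_confidence_interval` is hypothetical and only used for illustration; the numbers are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Return (lower, upper) for a mean when the population sigma is known."""
    alpha = 1 - confidence
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Values from the example: n = 70, sample mean = 35, sigma = 4.62, 90% level.
lower, upper = z_confidence_interval(35, 4.62, 70, 0.90)
print(f"90% confidence interval: ({lower:.2f}, {upper:.2f})")
```

With these inputs the interval comes out at roughly 34.09 to 35.91.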
2. Confidence Interval
• Example (t statistics):
• The personnel department of an organization wants to apply cost-cutting measures to improve
efficiency. As the first step, the personnel department wants to curtail telephone expenses
incurred by employees. For this, the personnel department took a random sample of 10 employees
and gathered the following data on telephone expenses (in thousands) in the previous year:
• 10, 12, 24, 23, 11, 14, 15, 34, 16, 23
• Construct a 95% confidence interval to estimate the average telephone expenses of the employees
in the population.
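The same example can be sketched in Python, assuming the usual interval x̄ ± t·s/√n with n − 1 = 9 degrees of freedom. The critical value 2.262 is taken from a standard t table (0.025 in each tail, 9 degrees of freedom) and hardcoded, since the Python standard library has no t distribution; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Telephone expenses (in thousands) for the 10 sampled employees.
expenses = [10, 12, 24, 23, 11, 14, 15, 34, 16, 23]

# Assumption: t table value for a 95% two-sided interval with 9 degrees of freedom.
T_CRITICAL = 2.262

n = len(expenses)
x_bar = mean(expenses)   # sample mean
s = stdev(expenses)      # sample standard deviation (divides by n - 1)
margin = T_CRITICAL * s / sqrt(n)

t_lower, t_upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.2f}, s = {s:.3f}")
print(f"95% confidence interval: ({t_lower:.2f}, {t_upper:.2f})")
```

For this sample the mean is 18.2 and the interval is roughly 12.76 to 23.64 (in thousands).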
2. Confidence Interval
• In Excel:
• CONFIDENCE.NORM
• By entering
• Alpha
• Std. Dev.
• Sample Size
• Gives the margin of error of the confidence interval (the amount added to and subtracted from the sample mean)
3. Testing Hypothesis & Decision
• Assumption about an unknown parameter
• The process helps us decide whether to accept or reject the hypothesis
• Process: Step 1: Set Null and alternative hypothesis
3. Testing Hypothesis & Decision
• Test of Hypothesis
• Two tailed
• One tailed
• Rejection in one tail
• Z Statistics:
• Testing for large samples (n >= 30) is based on the assumption that the population from which the sample
is drawn has a normal distribution.
• z formula for a single population mean: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• µ = population mean
• σ = population standard deviation
• n = sample size
• x̄ = sample mean
3. Testing Hypothesis & Decision
• Example
• A marketing research firm conducted a survey 10 years ago and found that the average household income of a particular
geographic region is ₹10,000. Mr. Gupta, who has recently joined the firm as vice president, has expressed doubt about the
accuracy of the data. To verify the data, the firm has decided to take a random sample of 200 households, which yields a
sample mean (for household income) of ₹11,000. Assume that the population standard deviation of the household
income is ₹1,200. Verify Mr. Gupta’s doubts using the seven steps of hypothesis testing. Let α = 0.05
Step 1: Set Null and alternative hypothesis
Step 7: Arrive at a statistical conclusion and business implication
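The calculation behind this example can be sketched in Python. This assumes a two-tailed test (H0: µ = 10,000 against H1: µ ≠ 10,000), which the slide does not state explicitly, and uses the z statistic for a single population mean; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 10_000    # hypothesized population mean (H0)
x_bar = 11_000   # sample mean
sigma = 1_200    # population standard deviation
n = 200          # sample size
alpha = 0.05

# Test statistic: z = (x_bar - mu) / (sigma / sqrt(n))
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))

# Assumption: two-tailed test, so alpha is split between both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z_stat) > z_crit

print(f"z = {z_stat:.3f}, critical value = ±{z_crit:.2f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Here z is about 11.79, far beyond ±1.96, so under these assumptions the null hypothesis is rejected and Mr. Gupta's doubt is supported by the sample.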
3. Testing Hypothesis & Decision
• t Statistics
• For a small random sample (n < 30) used to estimate the population mean µ, when the population standard deviation is
unknown and the population is normally distributed, the t-test can be applied.
• Example:
• Royal Tyres has launched a new brand of tyres for tractors and claims that under normal circumstances the average life of
a tyre is 40,000 km. A retailer wants to test this claim and has taken a random sample of 8 tyres. He tests the life of the tyres
under normal circumstances. The results obtained are presented in the table below.
Tyres 1 2 3 4 5 6 7 8
km 35,000 38,000 42,000 41,000 39,000 41,500 43,000 38,500
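A short Python sketch of the t-test for this data. The slide does not give a significance level for this example, so α = 0.05 and a two-tailed test are assumptions here; the critical value 2.365 is the standard t table value for 7 degrees of freedom with 0.025 in each tail.

```python
from math import sqrt
from statistics import mean, stdev

# Tyre life in km for the 8 sampled tyres.
tyre_life = [35_000, 38_000, 42_000, 41_000, 39_000, 41_500, 43_000, 38_500]
mu_claimed = 40_000  # manufacturer's claimed average life (H0)

# Assumption: alpha = 0.05, two-tailed, 7 degrees of freedom (t table value).
T_CRITICAL = 2.365

n = len(tyre_life)
x_bar = mean(tyre_life)
s = stdev(tyre_life)  # sample standard deviation (divides by n - 1)

# Test statistic: t = (x_bar - mu) / (s / sqrt(n))
t_stat = (x_bar - mu_claimed) / (s / sqrt(n))
reject_claim = abs(t_stat) > T_CRITICAL

print(f"mean = {x_bar:.0f} km, s = {s:.1f} km, t = {t_stat:.3f}")
print("Reject H0" if reject_claim else "Fail to reject H0")
```

The sample mean is 39,750 km and t is about −0.27, well inside ±2.365, so under these assumptions the sample gives no reason to reject the claimed average of 40,000 km.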
3. Testing Hypothesis & Decision
• Solution
Step 1: Set Null and alternative hypothesis
4. Goodness of Fit / χ² Test (Chi-Square Test):
• Example:
• A company is concerned about the increasing violent altercations between its employees. The number of
violent incidents recorded by the management during six randomly selected months is given in the table. Use
α = 0.05 to determine whether the data fits a uniform distribution.
Months                        Jan   Feb   March   April   May   June
Number of violent incidents   55    65    68      72      80    85
4. Goodness of Fit / χ² Test (Chi-Square Test):
• Solution using MS Excel:
• Functions used:
• To get Probability: Formula > Functions > Statistics > CHISQ.TEST
• To get Final Result: Formula > Functions > Statistics > CHISQ.INV.RT
Months   fo    fe         X^2
Jan      55    70.83333   3.539216
Feb      65    70.83333   0.480392
March    68    70.83333   0.113333
April    72    70.83333   0.019216
May      80    70.83333   1.186275
June     85    70.83333   2.833333
Total    425   425        8.171765
Probability (CHISQ.TEST) = 0.14701995
X^2 (CHISQ.INV.RT) = 8.171765
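The table above can be reproduced with a few lines of Python. This is a sketch that assumes α = 0.05 and k − 1 = 5 degrees of freedom, with the critical value 11.070 taken from a standard chi-square table; it computes only the test statistic, not the Excel probability.

```python
# Observed number of violent incidents per month (fo).
observed = {"Jan": 55, "Feb": 65, "March": 68, "April": 72, "May": 80, "June": 85}

# Under a uniform distribution every month has the same expected count (fe).
total = sum(observed.values())
expected = total / len(observed)

# Chi-square statistic: sum over months of (fo - fe)^2 / fe
chi_square = sum((fo - expected) ** 2 / expected for fo in observed.values())

# Assumption: table value for alpha = 0.05 with 5 degrees of freedom.
CHI_SQUARE_CRITICAL = 11.070
reject_uniform = chi_square > CHI_SQUARE_CRITICAL

print(f"fe = {expected:.5f}, chi-square = {chi_square:.6f}")
print("Reject uniform fit" if reject_uniform else "Fail to reject uniform fit")
```

The statistic is about 8.17, below 11.070, which agrees with the Excel probability of about 0.147 being larger than 0.05: the data do not contradict a uniform distribution.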
5. Non Parametric Test
• Distribution free test
• Valid for any distribution
• Used in cases when the kind of distribution is unknown
• Tests to be discussed here:
• Sign Test for Median
• Test of Arbitrary Trend
• Sign Test for Median
• A median of the population is a solution x = µ’ of the equation F(x) = 0.5, where F is the distribution function of the population.
• Steps:
• 1. Tests one population median, η
• 2. Corresponds to the t-test for one mean
• 3. Assumes the population is continuous
• 4. Small-sample test statistic: number of sample values above (or below) the median
• 5. Can use the normal approximation if n ≥ 10
5. Non Parametric Test
• Example:
• Suppose that eight radio operators were tested, first in rooms without air-conditioning and then in
air-conditioned rooms over the same period of time, and the differences of errors (unconditioned
minus conditioned) were
9 4 0 6 4 0 7 11
• Test the hypothesis µ’ = 0 (that is, air-conditioning has no effect) against the alternative µ’ > 0 (that is,
inferior performance in unconditioned rooms).
Solution:
Here α = 5%
P+ = P-; P = 0.5
X = number of positive values among n values
The sample has 8 values; removing the zeros leaves 6 values, all of them positive.
$P(X = 6) = \binom{6}{6} (0.5)^6 (0.5)^0 = (1)(0.015625)(1) = 0.0156$
= 1.56% < 5%. Therefore µ’ > 0.
The number of errors made in unconditioned rooms is significantly higher, so the installation of air conditioning
should be considered.
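A minimal Python sketch of this sign test, assuming zeros are discarded and the number of positive differences follows a binomial distribution with p = 0.5 under the null hypothesis. Variable names are illustrative.

```python
from math import comb

# Differences of errors (unconditioned minus conditioned) for the 8 operators.
differences = [9, 4, 0, 6, 4, 0, 7, 11]
alpha = 0.05

# Zeros carry no sign information, so they are dropped.
nonzero = [d for d in differences if d != 0]
n = len(nonzero)
positives = sum(1 for d in nonzero if d > 0)

# One-sided p-value: P(X >= positives) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(positives, n + 1)) * 0.5 ** n
reject_no_effect = p_value < alpha

print(f"n = {n}, positives = {positives}, p-value = {p_value:.6f}")
```

With 6 positive signs out of 6 the p-value is 0.015625, matching the hand calculation.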
5. Non Parametric Test
• Test of Arbitrary Trend
• Example:
• A certain machine is used for cutting lengths of wire. Five successive pieces had the lengths
29 31 28 30 32
• Using this sample, test the hypothesis that there is no trend, that
is, the machine does not have the tendency to produce longer and
longer pieces or shorter and shorter pieces. Assume that the type
of machine suggests the alternative that there is positive trend,
that is, there is the tendency of successive pieces to get longer.
Orderings of five values, grouped by the number of transpositions T:
T = 0: 12345
T = 1: 12354 12435 13245 21345
T = 2: 12453 12534 13254 13425 14235 21354 21435 23145 31245
T = 3: 12543 13452 13524 14253 14325 15234 21453 21534 23154 23415 24135 31254 31425 32145 41235
Etc.
From this we obtain, out of 5! = 120 equally likely orderings: P(T = 0) = 1/120, P(T = 1) = 4/120, P(T = 2) = 9/120, P(T = 3) = 15/120.
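The counting in this example can be checked with a short Python sketch. It assumes the test statistic T is the number of transpositions (pairs where an earlier piece is longer than a later one) and that, under the no-trend hypothesis, all 5! orderings are equally likely; a small T supports a positive trend. The 5% level used for the decision is an assumption, since this slide does not state one.

```python
from itertools import permutations


def count_transpositions(seq):
    """Count pairs (i, j) with i < j where the earlier value is larger."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


lengths = [29, 31, 28, 30, 32]
t_observed = count_transpositions(lengths)

# Exact null distribution: enumerate every ordering of five distinct values.
all_orders = list(permutations(range(len(lengths))))
at_most = sum(1 for p in all_orders if count_transpositions(p) <= t_observed)
p_value = at_most / len(all_orders)

alpha = 0.05  # assumed significance level
reject_no_trend = p_value < alpha

print(f"T = {t_observed}, P(T <= {t_observed}) = {at_most}/{len(all_orders)} = {p_value:.4f}")
```

The sample has T = 3 and P(T ≤ 3) = 29/120, about 0.24, so at the assumed 5% level the no-trend hypothesis is not rejected.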
6. Simple Linear Regression
Analysis (SLRA)
• Determining the equation:
• SLRA is based on the slope-intercept equation of a line: y = ax + b
• b = y intercept of the line
• a = slope
• SLRA with respect to population parameters β0 & β1 can be given as
• y = β0 + β1x
• β0 = population y intercept, which represents the average value of the dependent variable when x = 0
• β1 = slope of the regression line, which indicates the expected change in the value of y per unit
change in the value of x
• For an individual value of the dependent variable, the model on which the least squares criterion is applied is
• y = β0 + β1x + εi
• εi = random error
• b0 = sample y intercept, which represents the average value of the dependent variable when x = 0
• b1 = slope of the sample regression line
• After b0 & b1 are determined, the researcher can plot the graph and compare it with the
original data.
6. Simple Linear Regression
Analysis (SLRA)
Regression Model
• Sample layout: paired observations (x1, y1), (x2, y2), ..., (xn, yn)
• Sample statistics: b0, b1 & ŷ are computed
• Estimated regression equation
• b0 & b1 provide estimates of the population parameters β0 & β1
6. Simple Linear Regression
Analysis (SLRA)
• Example
• A cable wire company has spent heavily on advertisements. The sales and advertisement expenses (in
thousand rupees) for the 12 randomly selected months are given in table. Develop a regression model to
predict the impact of advertisement on sales.
Months      Advertisement (in thousand rupees)   Sales (in thousand rupees)
January     92                                   930
February    94                                   900
March       97                                   1020
April       98                                   990
May         100                                  1100
June        102                                  1050
July        104                                  1150
August      105                                  1120
September   105                                  1130
October     107                                  1200
November    107                                  1250
December    110                                  1220
6. Simple Linear Regression
Analysis (SLRA)
• Solution:
• Step 1
Months      Advertisement (in thousand rupees) x   Sales (in thousand rupees) y   x^2      xy
January     92                                     930                            8464     85560
February    94                                     900                            8836     84600
March       97                                     1020                           9409     98940
April       98                                     990                            9604     97020
May         100                                    1100                           10000    110000
June        102                                    1050                           10404    107100
July        104                                    1150                           10816    119600
August      105                                    1120                           11025    117600
September   105                                    1130                           11025    118650
October     107                                    1200                           11449    128400
November    107                                    1250                           11449    133750
December    110                                    1220                           12100    134200
Total       1221                                   13060                          124581   1335420
• Step 2
b1 = 19.07044
b0 = -852.084
ŷ = -852.084 + 19.07 x
This indicates that for each unit increase in x, y is predicted to increase by 19.07 units.
b0 indicates the value of y when x = 0.
When there is no expenditure on advertisement, the predicted sales value is -852.08 thousand rupees.
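The two steps can be sketched in Python. This assumes the standard least squares formulas b1 = (Σxy − ΣxΣy/n) / (Σx² − (Σx)²/n) and b0 = ȳ − b1·x̄, which reproduce the coefficients shown on the slide; variable names are illustrative.

```python
# Advertisement (x) and sales (y), both in thousand rupees, for the 12 months.
advertisement = [92, 94, 97, 98, 100, 102, 104, 105, 105, 107, 107, 110]
sales = [930, 900, 1020, 990, 1100, 1050, 1150, 1120, 1130, 1200, 1250, 1220]

n = len(advertisement)

# Step 1: column totals of the working table.
sum_x = sum(advertisement)
sum_y = sum(sales)
sum_x2 = sum(x * x for x in advertisement)
sum_xy = sum(x * y for x, y in zip(advertisement, sales))

# Step 2: least squares slope and intercept.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

print(f"sum x = {sum_x}, sum y = {sum_y}, sum x^2 = {sum_x2}, sum xy = {sum_xy}")
print(f"y_hat = {b0:.3f} + {b1:.5f} x")
```

Running it gives b1 ≈ 19.07044 and b0 ≈ −852.084, and the column total Σxy = 1,335,420.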
6. Simple Linear Regression
Analysis (SLRA)
• Solution:
[Figure: "Regression Analysis", scatter plot of sales (y) against advertisement (x) with the fitted line f(x) = 19.07x - 852.08; x axis from 90 to 115, y axis from 800 to 1300]
7. Correlation
• Correlation measures the degree of association between two variables
• We will determine the correlation between two variables using Karl Pearson’s coefficient of correlation.
• Karl Pearson’s coefficient of correlation:
• $r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$
• r lies between -1 and +1
• Relationship details are shown below,
7. Correlation
• Example:
• Table shows the sales revenue and advertisement expenses of a company for the past 10 months. Find the
coefficient of correlation between sales and advertisement.
Month       Sales (x)   Advertisement (y)
January     110         10
February    120         11
March       115         12
April       128         13
May         137         11
June        145         10
July        150         9
August      130         10
September   120         11
October     115         14
7. Correlation
• Solution:
Coefficient of Correlation :
Month      Sales (x)   Advertisement (y)   xy     x^2     y^2
January    110         10                  1100   12100   100
February   120         11                  1320   14400   121
March      115         12                  1380   13225   144
April      128         13                  1664   16384   169
May        137         11                  1507   18769   121
June       145         10                  1450   21025   100
July       150         9                   1350   22500   81
August     130         10                  1300   16900   100
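A short Python sketch that completes the calculation for all 10 months of the example. It assumes the computational form of Karl Pearson's coefficient built from the column totals Σx, Σy, Σxy, Σx² and Σy², which matches the columns of the working table; variable names are illustrative.

```python
from math import sqrt

# Sales (x) and advertisement (y) for the 10 months in the example.
sales_x = [110, 120, 115, 128, 137, 145, 150, 130, 120, 115]
advert_y = [10, 11, 12, 13, 11, 10, 9, 10, 11, 14]

n = len(sales_x)
sum_x = sum(sales_x)
sum_y = sum(advert_y)
sum_xy = sum(x * y for x, y in zip(sales_x, advert_y))
sum_x2 = sum(x * x for x in sales_x)
sum_y2 = sum(y * y for y in advert_y)

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"sum xy = {sum_xy}, sum x^2 = {sum_x2}, sum y^2 = {sum_y2}")
print(f"r = {r:.4f}")
```

For this data r is about −0.52, a moderate negative association between sales and advertisement in these 10 months.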
Thank You
• Open for discussion