Econometrics Project Iorganda Beatrice Cristina 133 Revised

- The Academy of Economic Studies The Faculty of Economic Studies in Foreign Languages
Influential factors of the price of nail polish
- Econometrics Project -
Student: Iorganda Beatrice Cristina
Coordinator: Prof. Dr. Daniela Serban
Group 133, Series A
January, 2014
Introduction
The subject of this project arose from a personal curiosity to determine whether the price of a
product is related to its durability. The product chosen is one that women use frequently: nail
polish. This product was chosen because women are respondents who usually take the time to
answer questionnaires and who report their actual consumption truthfully.
Fig.1. Nail polish brands

The methodology consists of simple and multiple linear regressions, as well as hypothesis
testing. Hypothesis testing is an integral part of producing quality research and provides a
reliable basis for deciding whether an effect has occurred in response to a cause.
The results of this work can be used in further studies for retailers, because the questionnaire
provides information on the age of the women who use a specific brand, their average
expenditure on nail polish, and how long they keep their hands under water. These indicators
can help companies develop innovative products suited to the modern woman.
Database description
The data selected from the questionnaire cover the three most important factors on which this
econometrics project is built, two of which influence the third:
Durability - x1
Time (drying time) - x2
Price - y
This case study uses a sample of 37 observations.
Hypothesis testing
Based on the chosen model, we will conduct two hypothesis tests that examine particular
features of, and assumptions about, our data.
First category of hypothesis testing
First, I test a claim about the most used brand. According to the questionnaire, most of the
women use Flormar nail polish because of its good quality-price ratio, and it seems to last
longer than the other brands mentioned. The average durability reported by Flormar users is
4.428571 minutes. Sample results for 23 observations show that the average durability of the
other brands (excluding Flormar) is 4.152173913 minutes, with a standard deviation of
1.76734767 minutes. We use hypothesis testing to see whether this result supports the belief
of the respondents, women with an average age of 22 years.
Survey data:
$\bar{x}_1 = 4.42$ (Flormar)
$\bar{x}_2 = 4.15$ (other brands)
$n_1 = 14$
$n_2 = 23$
$s_1^2 = 7.76$
$s_2^2 = 3.12$
The computations will be made in minutes.
Step 1
Initial assumption: Women with an average age of 22 years believe that Flormar nail polish
lasts longer than other brands.
Alternative hypothesis: These women are wrong and Flormar nail polish does not last longer than
other brands.
Step 2
$H_0: \mu = 4.428$
$H_1: \mu \neq 4.428$
Step 3
This is a two-sided test on the mean, because of the form of the alternative hypothesis.
Step 4
The significance level chosen is $\alpha = 5\%$, and therefore the rejection region is
$(-\infty, -1.96) \cup (1.96, \infty)$.
Step 5
$$Z_{calc} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{4.42 - 4.15}{\sqrt{\dfrac{7.76}{14} + \dfrac{3.12}{23}}} = \frac{0.27}{\sqrt{0.69}} \approx 0.33$$
Step 6
As $Z_{calc}$ does not fall into the rejection region, we cannot reject $H_0$.
We do not have enough sample evidence to infer that $H_1$ is true.
At the 5% significance level, we cannot say for certain that Flormar nail polish lasts longer
than other brands, nor can we say that it lasts less.
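The statistic of Step 5 can be checked with a few lines of Python. This is a minimal sketch, assuming that the two groups are independent, that 7.76 and 3.12 are sample variances, and that the group sizes are 14 (Flormar) and 23 (other brands); the function name `two_sample_z` is illustrative and not part of the original project.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for the difference between two independent sample means."""
    # Each variance is divided by its own sample size before summing.
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error


# Assumed inputs: means and variances from the survey data,
# group sizes 14 (Flormar) and 23 (other brands).
z = two_sample_z(4.42, 4.15, 7.76, 3.12, 14, 23)

# Two-sided critical value at the 5% significance level.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic stays well inside the interval (-1.96, 1.96), so the decision not to reject the null hypothesis does not depend on small rounding differences in the inputs.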
Second category of hypothesis testing

According to the questionnaire, 37% of the women who answered prefer Flormar polish to other
brands because of its good quality-price ratio (the quality being reflected in the durability
of the product). A sample test was conducted, and out of 31 observations, 14 women believe
that Flormar exceeds the other brands in durability. We need to verify whether this result
supports the view of the women who took part in this study.
Survey data:
$p = \dfrac{14}{31} \approx 0.451 = 45.15\%$
$n = 31$
The computations will be made in percentages.

Step 1
Initial assumption: The questionnaire indicates that 37% of women prefer Flormar nail polish
because of its good price-quality ratio (it has a high durability).
Alternative hypothesis: The questionnaire is wrong and fewer women prefer Flormar nail polish,
so that the true proportion of women who prefer it is less than 37%.
Step 2
$H_0: \pi = 37\%$
$H_1: \pi < 37\%$
Step 3
This is a left-tailed test on the proportion, because of the form of the alternative
hypothesis.
Step 4
The significance level chosen is $\alpha = 5\%$, and therefore the rejection region is
$(-\infty, -1.645)$.
Step 5
$$Z_{calc} = \frac{p - \pi_0}{\sqrt{\dfrac{\pi_0 (100 - \pi_0)}{n}}} = \frac{45.15 - 37}{\sqrt{\dfrac{37 (100 - 37)}{31}}} = \frac{8.15}{\sqrt{75.19}} \approx 0.94$$
Step 6
As $Z_{calc}$ does not fall into the rejection region $(-\infty, -1.645)$, we cannot reject
$H_0$. At the 5% significance level, the sample does not provide enough evidence to support
$H_1$.
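The proportion test can be reproduced in the same way. The sketch below assumes the usual large-sample normal approximation and works with proportions on a 0 to 1 scale rather than in percentages; the function name `one_proportion_z` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Z statistic for a sample proportion against a hypothesised value p0."""
    p_hat = successes / n
    # The standard error is computed under the null hypothesis.
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# 14 of the 31 respondents, tested against the claimed 37%.
z = one_proportion_z(14, 31, 0.37)

# Left-tailed critical value at the 5% significance level (about -1.645).
z_crit = NormalDist().inv_cdf(0.05)
reject_h0 = z < z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Because the sample proportion lies above the hypothesised 37%, the statistic is positive and cannot fall into a rejection region located in the left tail.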
SIMPLE LINEAR REGRESSION MODEL
We first analyze the influence of durability upon price. This is a model with one regressor.
Consider the general form of the simple linear regression function:
$$Y_i = \beta_1 + \beta_2 X_2 + \varepsilon_i$$
The variables of this model are $Y_i$ and $X_2$:
$Y_i$ = value of the dependent variable, price
$X_2$ = value of the independent variable, durability
$\varepsilon_i$ = residual term, which gathers the influences on price that are not captured by the regressor
The specific model for this sample is: Price = 6.89 + 1.511 Durability + $\varepsilon$
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.2587
R Square             0.06693
Adjusted R Square    0.04027
Standard Error       12.4494
Observations         37

ANOVA
              df    SS        MS          F         Significance F
Regression    1     389.08    389.0798    2.51041   0.122091661
Residual      35    5424.5    154.9865
Total         36    5813.6

              Coefficients   Standard Error   t Stat     P-value   Lower 95%      Upper 95%
Intercept     6.89038        4.5474           1.515252   0.13869   -2.34122916    16.12199564
Durability    1.51147        0.954            1.584428   0.12209   -0.42515703    3.448088081
The strength of the correlation between the variables is shown by Multiple R. In this case it
is 0.25, which lies outside the interval [0.75, 1] that would indicate a strong correlation.
This shows a low level of correlation between the variables.
To interpret the coefficients, we look first at the intercept. It represents the predicted
price if durability were 0. However, since the regressor cannot be 0, the intercept has no
meaningful interpretation.
The slope is 1.511. This shows a positive relationship between Price and Durability: each
additional unit of Durability is associated with an increase of 1.511 units in Price.
To test the validity of the model, we hypothesize that the regressor has no explanatory
power, so that all expected values of Price are the same (equivalently, $\beta_2 = 0$).
$H_0: Price_1 = Price_2 = Price_3 = \dots = Price_{37}$
$H_1: Price_i \neq Price_j$ for at least one pair $i \neq j$
To test this claim, we can compare the calculated F with the critical F for this model, or
compare Significance F with $\alpha = 5\%$. Significance F (0.12) > 0.05, therefore we cannot
reject $H_0$, and at the 5% significance level we conclude that the model is not valid.
To make an inference about the slope, we examine its 95% confidence interval, which is
(-0.425, 3.44). This interval contains the value 0, so the slope may not differ from zero. We
confirm this by comparing the p-value (0.12) to $\alpha$ (0.05). The p-value is higher than
0.05, therefore the slope is not statistically significant.
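The figures reported for the simple regression can be cross-checked from the ANOVA sums of squares and the slope estimate. The sketch below is only an arithmetic check; the critical value 2.0301 (two-sided 5% point of Student's t with 35 degrees of freedom) is hard-coded as an assumption rather than computed, and all variable names are illustrative.

```python
# Sums of squares and sample size taken from the regression output.
ss_regression = 389.08
ss_residual = 5424.5
n_obs = 37
k_regressors = 1

ss_total = ss_regression + ss_residual
df_residual = n_obs - k_regressors - 1  # 35

# Goodness of fit and overall F statistic.
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
f_stat = (ss_regression / k_regressors) / (ss_residual / df_residual)

# Inference on the slope.
slope = 1.51147
slope_se = 0.954
t_stat = slope / slope_se
t_crit = 2.0301  # assumed: two-sided 5% critical value, Student's t, 35 df
ci_low = slope - t_crit * slope_se
ci_high = slope + t_crit * slope_se

print(f"R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}, F = {f_stat:.3f}")
print(f"t = {t_stat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

In a regression with a single regressor the F statistic equals the square of the slope's t statistic, which is why Significance F and the slope's p-value coincide at 0.122.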
[Figure: Normal Probability Plot, Price against Sample Percentile]
Residual Analysis - Violation of assumptions

The errors are distributed as follows, showing a skewness to the right:
[Figure: Durability Residual Plot, Residuals against Durability]
[Figure: Durability Line Fit Plot, Price and Predicted Price against Durability]
From the residual plot above, we can see that the errors are randomly scattered, which
suggests that the errors are not correlated with one another. From the line fit plot, it can
be seen that the errors are not equally spread around the mean, which suggests that the model
is heteroskedastic.
Finally, I conducted a Durbin-Watson test for this model. In the Excel file, I calculated the
statistic d, which is 2.22 for the simple regression. The critical values dL and dU are 1.217
and 1.322 respectively; since d is higher than dU, there is no statistical evidence that the
errors are positively autocorrelated.
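For reference, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The sketch below shows the calculation and the decision rule for positive autocorrelation; the residual values are hypothetical (the project's residuals are not listed here), and the function names are illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def positive_autocorrelation_verdict(d, d_lower, d_upper):
    """Decision rule of the test for positive first-order autocorrelation."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


# Hypothetical residuals, used only to illustrate the formula.
example_residuals = [2.0, -1.5, 0.5, -1.0, 1.5, -0.5]
d_example = durbin_watson(example_residuals)

# Decision for the values reported in the text: d = 2.22, dL = 1.217, dU = 1.322.
verdict = positive_autocorrelation_verdict(2.22, 1.217, 1.322)
print(d_example, verdict)
```

Values of d close to 2 indicate little first-order autocorrelation, values near 0 indicate positive autocorrelation, and values near 4 indicate negative autocorrelation.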
MULTIPLE LINEAR REGRESSION MODEL
We now add another independent variable to our model, time.
$$Y_i = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon_i$$
The variables of this model are $Y_i$, $X_2$ and $X_3$:
$Y_i$ = value of the dependent variable, price
$X_2$ = value of the independent variable, durability
$X_3$ = value of the independent variable, time
$\varepsilon_i$ = residual term, which gathers the influences on price that are not captured by the regressors
The specific model for our sample is: Price = 6.89 + 1.38 Durability - 0.17 Time + $\varepsilon$
To analyze the correlation between the variables, we look at Multiple R. In this case it is
0.26, which lies outside the interval [0.75, 1] that would indicate a strong correlation. This
shows a low level of correlation between the variables, only slightly improved by adding the
extra regressor.
To interpret the coefficients, we look first at the intercept. It represents the predicted
price if both regressors were 0. However, since the two regressors cannot be 0, the intercept
has no meaningful interpretation.
The first slope is 1.38. This shows a positive relationship between Price and Durability: each
additional unit of Durability is associated with an increase of 1.38 units in Price, holding
Time constant.
The second slope is -0.17. This shows a negative relationship between Price and Time: each
additional unit of Time is associated with a decrease of 0.17 units in Price, holding
Durability constant.
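The interpretation of the two slopes can be illustrated by evaluating the fitted equation at a few points. This is a sketch only: the function name `predict_price` and the input values (durability 5, time 20) are hypothetical and chosen purely for illustration.

```python
def predict_price(durability, time):
    """Predicted price from the fitted model: 6.89 + 1.38*Durability - 0.17*Time."""
    return 6.89 + 1.38 * durability - 0.17 * time


# Hypothetical baseline product.
base = predict_price(durability=5, time=20)

# One extra unit of durability, time held constant.
more_durable = predict_price(durability=6, time=20)

# One extra unit of time, durability held constant.
slower = predict_price(durability=5, time=21)

print(f"baseline: {base:.2f}")
print(f"effect of +1 durability: {more_durable - base:+.2f}")
print(f"effect of +1 time: {slower - base:+.2f}")
```

The two differences reproduce the slope coefficients, which is exactly what the "holding the other regressor constant" interpretation means.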
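To test the validity of the model, we hypothesize that the regressors have no explanatory
power, so that all expected values of Price are the same (equivalently,
$\beta_2 = \beta_3 = 0$).
$H_0: Price_1 = Price_2 = Price_3 = \dots = Price_{37}$
$H_1: Price_i \neq Price_j$ for at least one pair $i \neq j$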
To test this claim, we can compare the calculated F with the critical F for this model, or
compare Significance F with $\alpha = 5\%$. Significance F (0.28) > 0.05, therefore we cannot
reject $H_0$, and at the 5% significance level we conclude that the model is not valid.
To make an inference about each slope, we examine its 95% confidence interval.
The first confidence interval is (-0.684, 3.44). This interval contains the value 0, so the
slope may not differ from zero. We confirm this by comparing the p-value (0.18) to $\alpha$
(0.05). The p-value is higher than 0.05, therefore this slope is not statistically
significant.
[Figure: Normal Probability Plot, Price against Sample Percentile]
[Figure: Time Residual Plot, Residuals against Time]
The second confidence interval is (-1.01, 0.67). This interval contains the value 0, so the
slope may not differ from zero. We confirm this by comparing the p-value (0.68) to $\alpha$
(0.05). The p-value is higher than 0.05, therefore this slope is not statistically
significant.
[Figure: Time Line Fit Plot, Price and Predicted Price against Time]
[Figure: Durability Line Fit Plot, Price and Predicted Price against Durability]
[Figure: Durability Residual Plot, Residuals against Durability]
From the normal probability plot, we can see that the distribution of the errors is skewed to
the right. Both residual plots show random scattering of the errors, which suggests that the
errors are not correlated with one another. Also, in both line fit plots, the errors are
unevenly dispersed around the mean, which suggests that both models are heteroskedastic.
I conducted a Durbin-Watson test for this model as well, and the results were the same as for
the simple regression: d is higher than dU, meaning there is no statistical evidence that the
errors are positively autocorrelated.
Finally, I analyzed the two independent variables in order to find their coefficient of
correlation, which in this case is -0.31. This shows that the two variables, durability and
time, are weakly negatively correlated. It also suggests that multicollinearity is not a
problem in this model.
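A common way to quantify this is the variance inflation factor (VIF). With only two regressors, the VIF of each equals 1 / (1 - r^2), where r is their correlation coefficient. The sketch below computes it for r = -0.31; the helper names are illustrative, and the thresholds mentioned afterwards are a rule of thumb, not part of the original analysis.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(var_x * var_y)


def variance_inflation_factor(r):
    """VIF for a model with exactly two regressors whose correlation is r."""
    return 1 / (1 - r ** 2)


# Correlation between durability and time reported in the text.
vif = variance_inflation_factor(-0.31)
print(f"VIF = {vif:.3f}")
```

A VIF this close to 1 means the variance of each slope estimate is barely inflated by the other regressor; values above roughly 5 to 10 are often taken as a warning sign.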
Conclusion
Based on the limited sample evidence and the low correlation between the variables, the test
must be repeated, because we cannot be sure that the drying time and the durability of the
nail polish on the nail are the only factors that influence the price of such a product.