
- The Academy of Economic Studies The Faculty of Economic Studies in Foreign Languages

Influential factors of the price of nail polish
- Econometrics Project -

Student: Iorganda Beatrice Cristina

Coordinator: Prof. Dr. Daniela Serban

Group 133, Series A

January, 2014

Introduction

The subject of this project comes from a personal curiosity: does the price of a product
reflect its durability? The product chosen is one that women use frequently, nail polish.
This type of product was chosen because women are the respondents who usually take the
time to answer questionnaires and report their actual consumption truthfully.

Fig.1. Nail polish brands


The methodology consists of simple and multiple linear regressions, as well as hypothesis
testing. Hypothesis testing is an integral part of producing quality research and gives a
principled basis for deciding whether an effect has occurred in response to a cause.
The results of this work could be used in further studies for retailers, because the
questionnaire provides information on the age of the women who use a specific brand, the
average amount spent, and how long women keep their hands under water. These indicators
can help companies develop innovative products suited to the modern woman.

Database description
From the questionnaire I selected the three variables on which this econometrics project
is built; two of them are assumed to influence the third.
Durability - x1
Time - x2
Price - y
The sample used in this case study consists of 37 observations.

Hypothesis testing

Based on the chosen model, we will conduct 2 hypothesis tests that examine certain
features and assumptions of our data.
Ist category of hypothesis testing
First, I test a claim about the most used brand. According to the questionnaire, most of
the women use Flormar nail polish because of its good quality-price ratio, and it seems to
last longer than the other brands mentioned. The average durability reported by Flormar
users is 4.428571 minutes. Sample results for 23 observations show that the average
durability of the other brands (all except Flormar) is 4.152173913 minutes, with a
standard deviation of 1.76734767 minutes. We use hypothesis testing to see whether the
sample supports the belief of the respondents, women with an average age of 22.
Survey data:

$\bar{x}_1 = 4.42$
$\bar{x}_2 = 4.15$
$n_1 = 23$
$n_2 = 14$
$s_1^2 = 7.76$
$s_2^2 = 3.12$
The computations will be made in minutes.
Step 1
Initial assumption: Women with an average age of 22 believe that Flormar nail polish
lasts longer than other brands.
Alternative hypothesis: Those women are wrong and Flormar nail polish does not last longer
than other brands.
Step 2

$H_0: \mu = 4.428$
$H_1: \mu \neq 4.428$

Step 3
We are in the case of a two-sided test on the mean, because of the alternative
hypothesis.
Step 4
The significance level chosen is $\alpha = 5\%$ and therefore the rejection region is
$(-\infty, -1.96) \cup (1.96, \infty)$.

Step 5

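$$Z_{calc} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{4.42 - 4.15}{\sqrt{\frac{7.76}{23} + \frac{3.12}{14}}} = \frac{0.27}{\sqrt{0.56}} \approx 0.36$$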

Step 6
As $Z_{calc}$ does not fall into the rejection region, we cannot reject $H_0$.
We do not have enough sample evidence to infer that $H_1$ is true.
At the 5% significance level we cannot say for certain that Flormar nail polish lasts
longer than other brands, nor can we say that it lasts less.
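
The test on the two means can be reproduced with a short Python sketch. It assumes that 7.76 and 3.12 in the survey table are the sample variances and that each variance is divided by its own sample size, as in the general formula; the function and variable names are chosen here for illustration only.

```python
import math

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Z statistic for the difference between two sample means."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Values from the survey table; 7.76 and 3.12 are read as variances (assumption).
z = two_sample_z(4.42, 4.15, 7.76, 3.12, 23, 14)

# Two-sided test at alpha = 5%: reject H0 only if |z| exceeds 1.96.
reject = abs(z) > 1.96
print(round(z, 2), reject)  # prints: 0.36 False
```

With these inputs the statistic is about 0.36, well inside the interval (-1.96, 1.96), so the null hypothesis is not rejected.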

IInd category of hypothesis testing


According to the questionnaire, 37% of the women who answered prefer Flormar polish to
other brands because of its good quality-price ratio (quality being measured by the
durability of the product). A sample test was conducted and, out of 31 observations, 14
women believe that Flormar exceeds the other brands in durability. We need to verify
whether this result supports the claim made by the women who took part in the survey.
Survey data:

$p = \frac{14}{31} \approx 45.15\%$
$n = 31$

The computations will be made in percentages.


Step 1
Initial assumption: According to the questionnaire, 37% of women prefer Flormar nail
polish because of its good price-quality ratio (it has a high durability).
Alternative hypothesis: The questionnaire is wrong, and the true proportion of women who
prefer Flormar nail polish is less than 37%.
Step 2
$H_0: \pi = 37\%$
$H_1: \pi < 37\%$
Step 3
We are in the case of a left-tailed test on the proportion, because of the alternative
hypothesis.
Step 4
The significance level chosen is $\alpha = 5\%$ and therefore the rejection region is
$(-\infty, -1.645)$.
Step 5

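$$Z_{calc} = \frac{p - \pi_0}{\sqrt{\frac{\pi_0 (100 - \pi_0)}{n}}} = \frac{45.15 - 37}{\sqrt{\frac{37 (100 - 37)}{31}}} = \frac{8.15}{\sqrt{75.19}} \approx 0.94$$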

Step 6
As $Z_{calc}$ does not fall into the rejection region $(-\infty, -1.645)$, we
cannot reject $H_0$. At the 5% significance level, the sample does not provide enough
evidence to support $H_1$.
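
The test on the proportion can be checked in the same way. This is a minimal Python sketch that works with proportions on the 0 to 1 scale rather than in percent and takes the square root of the variance term in the denominator; the names used are illustrative.

```python
import math

def one_proportion_z(p_hat, p0, n):
    """Z statistic for a sample proportion against a hypothesized value p0."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

p_hat = 14 / 31          # 14 of the 31 respondents
z = one_proportion_z(p_hat, 0.37, 31)

# Left-tailed test at alpha = 5%: reject H0 only if z is below -1.645.
reject = z < -1.645
print(round(z, 2), reject)  # prints: 0.94 False
```

The statistic is positive (about 0.94), so it cannot fall in a left-tail rejection region and the null hypothesis stands.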

SIMPLE LINEAR REGRESSION MODEL

We will first analyze the influence of durability upon price. This is a model with 1 regressor.
Consider the general form of the simple linear regression function:

$$Y_i = \beta_1 + \beta_2 X_2 + \varepsilon_i$$

The variables of this model are $Y_i$ and $X_2$:

$Y_i$ = Value of the dependent variable, price
$X_2$ = Value of the independent variable, durability
$\varepsilon_i$ = Residuals that do not have a significant influence upon price

The specific model for this sample is: Price = 6.89 + 1.511 Durability + $\varepsilon$

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.2587
R Square            0.06693
Adjusted R Square   0.04027
Standard Error      12.4494
Observations        37

ANOVA
             df    SS        MS         F         Significance F
Regression   1     389.08    389.0798   2.51041   0.122091661
Residual     35    5424.5    154.9865
Total        36    5813.6

             Coefficients   Standard Error   t Stat     P-value   Lower 95%     Upper 95%
Intercept    6.89038        4.5474           1.515252   0.13869   -2.34122916   16.122
Durability   1.51147        0.954            1.584428   0.12209   -0.42515703   3.448088
The strength of the linear correlation between the variables is shown by Multiple R. In this
case it is 0.25, which does not belong to the interval [0.75, 1]. This shows a low level of
correlation between the variables.
To interpret the coefficients, we look first at the intercept. It represents the predicted
price if durability were 0. However, since the regressor cannot be 0, the intercept has no
meaningful interpretation.
The slope is 1.511. This shows a positive relationship between Price and Durability: each
additional unit of Durability is associated with a 1.511-unit increase in Price.
In order to test the validity of the model, we hypothesize that all values of the Price are
the same.
$H_0: Price_1 = Price_2 = Price_3 = \dots = Price_{37}$
$H_1: Price_i \neq Price_j$ for some $i \neq j$


To test this claim, we can compare the calculated F with the critical F for this model,
or equivalently compare Significance F with $\alpha = 5\%$. Significance F (0.12) > 0.05,
therefore we cannot reject $H_0$: at the 95% confidence level the model is NOT valid.

To make inference on the slope, we look at its confidence interval.
The 95% confidence interval is (-0.425, 3.44). This interval contains the value 0, so we check
the significance of the slope by comparing the p-value (0.12) to $\alpha$ (0.05). The p-value
is higher than 0.05, therefore the slope is not statistically significant.
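
The figures in the summary output are linked by simple identities, which can be verified with a short Python sketch. The sums of squares, coefficient and standard error are taken from the output above; the critical value 2.0301 (Student t, 35 degrees of freedom, 2.5% in each tail) is a table value supplied here as an assumption, not something reported in the project file.

```python
# Quantities reported in the regression output for the simple model.
ss_regression, ss_residual = 389.08, 5424.5
df_regression, df_residual = 1, 35
slope, slope_se = 1.51147, 0.954

# R Square is the share of the total variation explained by the regression.
r_squared = ss_regression / (ss_regression + ss_residual)

# F is the ratio of the two mean squares.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# With one regressor, the squared t statistic of the slope equals F.
t_stat = slope / slope_se

# 95% confidence interval for the slope; T_CRIT is an assumed table value.
T_CRIT = 2.0301
ci = (slope - T_CRIT * slope_se, slope + T_CRIT * slope_se)

print(round(r_squared, 4), round(f_stat, 2), round(t_stat, 3))
print(tuple(round(bound, 3) for bound in ci))
```

The interval obtained this way matches the reported one up to rounding of the standard error, and it contains 0, which agrees with the p-value being above 0.05.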

Normal Probability Plot (Price against Sample Percentile)

Residual Analysis - Violation of assumptions


The errors are distributed as follows, showing a skewness to the right :

Durability Residual Plot (Residuals against Durability)

Durability Line Fit Plot (Price and Predicted Price against Durability)

From the residual plot above, we can see that the errors are randomly scattered, which suggests
there is no correlation between the errors. From the line fit plot, it can be noticed that the
errors are not equally spread around the mean, which suggests the model is heteroskedastic.
Finally, I conducted a Durbin-Watson test for this model. In the Excel file, I calculated
the statistic d, which is 2.22 for the simple regression. dL and dU are 1.217 and 1.322
respectively; since d is higher than dU, there is no statistical evidence that the errors are
positively autocorrelated.
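
For reference, the Durbin-Watson statistic and the decision rule used above can be sketched in Python as follows. The residual list in the example is purely hypothetical and only demonstrates the calculation; the project's own residuals are in the Excel file and are not reproduced here.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def positive_autocorrelation_check(d, d_lower, d_upper):
    """Decision rule for the test against positive autocorrelation."""
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    return "no evidence of positive autocorrelation"

# Hypothetical residuals, used only to show how d is computed.
example_d = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(example_d)  # prints: 3.0

# Decision for the values reported in the project (d = 2.22, dL = 1.217, dU = 1.322).
decision = positive_autocorrelation_check(2.22, 1.217, 1.322)
print(decision)
```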

MULTIPLE LINEAR REGRESSION MODEL

We will add another independent variable to our model, time.

$$Y_i = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon_i$$

The variables of this model are $Y_i$, $X_2$ and $X_3$:

$Y_i$ = Value of the dependent variable, price
$X_2$ = Value of the independent variable, durability
$X_3$ = Value of the independent variable, time
$\varepsilon_i$ = Residuals that do not have a significant influence upon price

The specific model for our sample is: Price = 6.89 + 1.38 Durability - 0.17 Time + $\varepsilon$

To analyze the correlation between the variables, we look at Multiple R. In this case it is
0.26, which does not belong to the interval [0.75, 1]. This shows a low level of correlation
between the variables, only slightly improved by adding an extra regressor.
To interpret the coefficients, we first look at the intercept. It represents the predicted
price if the 2 regressors were 0. However, since the 2 regressors cannot be 0, the intercept
has no meaningful interpretation.
The first slope is 1.38. This shows a positive relationship between Price and Durability:
holding Time constant, each additional unit of Durability is associated with a 1.38-unit
increase in Price.
The second slope is -0.17. This shows a negative relationship between Price and Time:
holding Durability constant, each additional unit of Time is associated with a 0.17-unit
decrease in Price.
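
The interpretation of the two slopes can be illustrated with a small Python sketch of the fitted equation. The input values (durability 5 and 6, time 10 and 11) are hypothetical and chosen only to show the effect of a one-unit change in each regressor.

```python
def predicted_price(durability, time):
    """Fitted multiple regression: Price = 6.89 + 1.38 Durability - 0.17 Time."""
    return 6.89 + 1.38 * durability - 0.17 * time

base = predicted_price(5, 10)

# One more unit of durability, time held constant: price rises by the first slope.
durability_effect = predicted_price(6, 10) - base

# One more unit of time, durability held constant: price falls by the second slope.
time_effect = predicted_price(5, 11) - base

print(round(base, 2), round(durability_effect, 2), round(time_effect, 2))
# prints: 12.09 1.38 -0.17
```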
In order to test the validity of the model, we hypothesize that all values of the Price are
the same.
$H_0: Price_1 = Price_2 = Price_3 = \dots = Price_{37}$
$H_1: Price_i \neq Price_j$ for some $i \neq j$

To test this claim, we can compare the calculated F with the critical F for this model,
or equivalently compare Significance F with $\alpha = 5\%$. Significance F (0.28) > 0.05,
therefore we cannot reject $H_0$: at the 95% confidence level the model is NOT valid.

To make inference on the slopes, we look at their confidence intervals.
The first 95% confidence interval is (-0.684, 3.44). This interval contains the value 0, so we
check the significance of the slope by comparing the p-value (0.18) to $\alpha$ (0.05). The
p-value is higher than 0.05, therefore the first slope is not statistically significant.

Normal Probability Plot (Price against Sample Percentile)

Time Residual Plot (Residuals against Time)

The second 95% confidence interval is (-1.01, 0.67). This interval contains the value 0, so we
check the significance of the slope by comparing the p-value (0.68) to $\alpha$ (0.05). The
p-value is higher than 0.05, therefore the second slope is not statistically significant.

Time Line Fit Plot (Price and Predicted Price against Time)

Durability Line Fit Plot (Price and Predicted Price against Durability)

Durability Residual Plot (Residuals against Durability)

From the normal probability plot, we can see that the distribution of errors is skewed to the
right. Both residual plots present random scattering of the errors, suggesting there is no
correlation between the errors. Also, in both line fit plots the errors are not evenly
dispersed around the mean, suggesting heteroskedasticity.
I conducted a DW test for this model as well, and the result was the same as for the simple
regression: d is higher than dU, meaning there is no statistical evidence that the errors are
positively autocorrelated.
Finally, I analyzed the two independent variables in order to see their coefficient of
correlation, which in this case is -0.31. This shows that the two variables, durability and
time, are weakly negatively correlated. It also indicates that the multicollinearity
phenomenon does not occur.
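
One common way to quantify this is the variance inflation factor (VIF), which for a model with two regressors depends only on their correlation coefficient. The Python sketch below uses the reported value of -0.31; the threshold of 10 is a frequently quoted rule of thumb and is an assumption here, not a figure from the project.

```python
def variance_inflation_factor(correlation):
    """VIF for a model with exactly two regressors: 1 / (1 - r^2)."""
    return 1 / (1 - correlation ** 2)

vif = variance_inflation_factor(-0.31)

# Rule-of-thumb threshold (assumption): values above 10 signal a serious problem.
multicollinearity_concern = vif > 10
print(round(vif, 3), multicollinearity_concern)  # prints: 1.106 False
```

A VIF close to 1 means the variance of the slope estimates is barely inflated by the relationship between durability and time.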

Conclusion
Based on the limited sample evidence and the low correlation between the variables, the study
should be repeated, because we cannot be sure that drying time and the durability of the nail
polish on the nail are the only factors that influence the price of such a product.
