
Significance of P-Value and Box-and-Whisker Plots in Statistical Testing

Hypothesis Test
Hypothesis: a claim about a population that cannot be proved outright, only supported or contradicted by data. For example, one might claim that a new algorithm is better than the current algorithm when both are tested on the same set of data.

A test sets two statements against each other: the null hypothesis, denoted H0, and the alternative hypothesis, denoted H1. The experiment is carried out in an attempt to disprove or reject the null hypothesis. The null hypothesis is given priority: it is not rejected unless the evidence against it is sufficiently strong. For example, H0: there is no difference in mean overlapping fraction between algorithm A1 and algorithm A2, against H1: there is a difference.

The outcome of a hypothesis test is either "Reject H0 in favour of H1" or "Do not reject H0".

P-value
Assume the null hypothesis H0 is true. The probability value (p-value) of a statistical hypothesis test is the probability of getting, by chance alone, a value of the test statistic as extreme as, or more extreme than, the one observed. The p-value is compared with the significance level of the test, which is the probability of wrongly rejecting the null hypothesis when it is in fact true. If the p-value is smaller than the significance level, the result is statistically significant. For example, if the null hypothesis is rejected at the 5% significance level, this is reported as "p < 0.05".

Example: significance level alpha = 5%

H0: there is no difference, on average, between the mean overlapping fraction of algorithm A1 and that of algorithm A2.
H1: there is a difference, on average, between the mean overlapping fraction of algorithm A1 and that of algorithm A2.

If p < 0.05 ==> little overlap between the distributions of mean overlapping fraction of A1 and A2 ==> reject H0 in favour of H1.
If p >= 0.05 ==> high overlap between the distributions of mean overlapping fraction of A1 and A2 ==> do not reject H0.
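The meaning of the significance level can be checked by simulation. The sketch below is an illustration only, not part of the original experiment: it assumes normally distributed data with a known standard deviation, and the function names, trial count, sample size, and seed are all hypothetical choices. Every sample is drawn with H0 true, so each rejection is a wrong one, and the fraction of rejections should come out close to alpha.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test p-value for a sample with known sigma."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_rejection_rate(alpha=0.05, trials=4000, n=20, seed=0):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        # Data are drawn with the H0 mean, so every rejection is a false alarm.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


rate = false_rejection_rate()
print(f"false rejection rate at alpha = 0.05: {rate:.3f}")
```

The printed rate fluctuates around 0.05 from seed to seed; it does not say anything about how often H0 is true in practice.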

P-Value Approach
Assume that the null hypothesis is true. The P-Value is the probability of observing a sample mean that is as extreme as, or more extreme than, the one observed.

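How to compute the P-Value for each type of test

Step 1: Compute the test statistic

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $\mu_0$ is the mean stated in the null hypothesis, $\sigma$ is the population standard deviation, and $n$ is the sample size.

Step 2: Find the P-Value as a tail area under the standard normal curve

Two-tail: P-Value = P(Z < -|z0| or Z > |z0|) = 2P(Z > |z0|)

Right Tail: P-Value = P(Z > z0)

Left Tail: P-Value = P(Z < z0)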
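The two steps can be sketched in Python with the standard library's `statistics.NormalDist`. The function names `z_statistic` and `p_value` and the tail labels are hypothetical, chosen only for this illustration, and the numbers in the usage lines are made up.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(xbar, mu0, sigma, n):
    """Step 1: standardize the sample mean under the null hypothesis."""
    return (xbar - mu0) / (sigma / sqrt(n))


def p_value(z0, tail):
    """Step 2: tail area under the standard normal curve."""
    cdf = NormalDist().cdf  # standard normal: mean 0, sigma 1
    if tail == "left":
        return cdf(z0)
    if tail == "right":
        return 1 - cdf(z0)
    if tail == "two":
        return 2 * (1 - cdf(abs(z0)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Made-up numbers: sample mean 52, hypothesized mean 50, sigma 10, n = 25.
z0 = z_statistic(xbar=52, mu0=50, sigma=10, n=25)
for tail in ("left", "right", "two"):
    print(tail, round(p_value(z0, tail), 4))
```

The same z0 gives three different P-Values, which is why the alternative hypothesis has to be fixed before looking at the data.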

The Decision Rule Using the P-Value Approach

If the P-Value is greater than the significance level α, do not reject the null hypothesis. If the P-Value is smaller than the significance level α, reject the null hypothesis.

Reasoning of the P-Value Approach


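In the P-Value approach, instead of comparing the z-score of the test statistic with the critical value determined by α, we compare the areas (probabilities) associated with each score. Why does this give the same decision? The tail area shrinks as a score moves further into the tail, so a test statistic that lies beyond the critical value always has a tail area smaller than α.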

Decision: Reject the Null Hypothesis

Here $z_{\alpha}$ denotes the critical value with area α to its right under the standard normal curve.

1. Left tail test

If the test statistic is smaller than $-z_{\alpha}$, then the area to the left of the test statistic (P-Value) is smaller than α.

2. Right tail test

If the test statistic is greater than $z_{\alpha}$, then the area to the right of the test statistic (P-Value) is smaller than α.

3. Two-tail test

If the test statistic is smaller than $-z_{\alpha/2}$ or larger than $z_{\alpha/2}$, then the area to the left of $-|z_0|$ combined with the area to the right of $|z_0|$ (P-Value) is smaller than α.

Decision: Do not reject the Null Hypothesis

1. Left tail test

If the test statistic is greater than $-z_{\alpha}$, then the area to the left of the test statistic (P-Value) is greater than α.

2. Right tail test

If the test statistic is less than $z_{\alpha}$, then the area to the right of the test statistic (P-Value) is greater than α.

3. Two-tail test

If the test statistic lies between $-z_{\alpha/2}$ and $z_{\alpha/2}$, then the area to the left of $-|z_0|$ combined with the area to the right of $|z_0|$ (P-Value) is greater than α.
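The claim that both approaches lead to the same decision can be checked numerically. This is a sketch under the same assumptions as the rules above (a z-test with a standard normal test statistic); the function names and the grid of test statistics are hypothetical choices for the illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, sigma 1


def reject_by_critical_value(z0, alpha, tail):
    """Compare the test statistic with the critical value."""
    if tail == "left":
        return z0 < STD_NORMAL.inv_cdf(alpha)  # inv_cdf(alpha) is -z_alpha
    if tail == "right":
        return z0 > STD_NORMAL.inv_cdf(1 - alpha)  # z_alpha
    if tail == "two":
        z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        return z0 < -z_half or z0 > z_half
    raise ValueError("tail must be 'left', 'right' or 'two'")


def reject_by_p_value(z0, alpha, tail):
    """Compare the tail area (P-Value) with alpha."""
    if tail == "left":
        p = STD_NORMAL.cdf(z0)
    elif tail == "right":
        p = 1 - STD_NORMAL.cdf(z0)
    elif tail == "two":
        p = 2 * (1 - STD_NORMAL.cdf(abs(z0)))
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return p < alpha


# Test statistics from -4.0 to 4.0 in steps of 0.1.
grid = [i / 10 for i in range(-40, 41)]
agree = all(
    reject_by_critical_value(z, 0.05, tail) == reject_by_p_value(z, 0.05, tail)
    for tail in ("left", "right", "two")
    for z in grid
)
print("both approaches agree on every grid point:", agree)
```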

Right-Tail Test Example


In 1990, the average farm size in Kansas was 694 acres, according to data obtained from the U.S. Department of Agriculture. A researcher claims that farm sizes are larger now due to consolidation of farms. She obtains a random sample of 40 farms and determines the mean size to be 731 acres. Assume that σ = 212 acres. Test the researcher's claim at the α = 0.05 level of significance.

The hypotheses are H0: μ = 694 against H1: μ > 694.

Step 1: Find the test statistic

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{731 - 694}{212 / \sqrt{40}} \approx 1.10$$

Step 2: Find the P-Value

P(Z > z0) = P(Z > 1.10) = 0.1357

Step 3: Compare the P-Value to α

The P-Value (0.1357) is larger than α = 0.05.

Step 4: State your decision

Do not reject the null hypothesis. There is not sufficient evidence at the α = 0.05 level of significance to support the researcher's claim that farm sizes are larger.
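The farm-size example can be reproduced with a few lines of Python, using `statistics.NormalDist` in place of a z-table. This is only a numerical check of the four steps; the variable names are hypothetical. Because z0 is not rounded to two decimals before the lookup, the P-Value may differ from a table value in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the problem statement.
mu0, xbar, sigma, n, alpha = 694, 731, 212, 40, 0.05

z0 = (xbar - mu0) / (sigma / sqrt(n))  # Step 1: test statistic
p = 1 - NormalDist().cdf(z0)           # Step 2: area to the right of z0
reject = p < alpha                     # Steps 3 and 4: compare and decide

print(f"z0 = {z0:.4f}, P-Value = {p:.4f}, reject H0: {reject}")
```

The P-Value is well above 0.05, so the decision does not depend on the rounding.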

Left-Tail Test Example


An energy official claims that the oil output per well in the United States has declined from the 1998 level of 11.1 barrels per day. He randomly samples 50 wells throughout the United States and determines the mean output to be 10.7 barrels per day. Assume that σ = 1.3 barrels. Test the official's claim at the α = 0.05 level of significance.

The hypotheses are H0: μ = 11.1 against H1: μ < 11.1.

Step 1: Find the test statistic

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{10.7 - 11.1}{1.3 / \sqrt{50}} \approx -2.18$$

Step 2: Find the P-Value

P(Z < z0) = P(Z < -2.18) = 0.0146

Step 3: Compare the P-Value to α

The P-Value (0.0146) is smaller than α = 0.05.

Step 4: State your decision

Reject the null hypothesis. There is sufficient evidence at the α = 0.05 level of significance to support the official's claim that the oil output per well has declined.
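For the oil-output example, the sketch below computes the left-tail P-Value and, alongside it, the critical value for α = 0.05, so the area comparison and the score comparison can be seen side by side. The variable names are hypothetical, and `NormalDist().inv_cdf` stands in for a z-table lookup.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the problem statement.
mu0, xbar, sigma, n, alpha = 11.1, 10.7, 1.3, 50, 0.05
std_normal = NormalDist()

z0 = (xbar - mu0) / (sigma / sqrt(n))   # test statistic
p = std_normal.cdf(z0)                  # area to the left of z0
critical = std_normal.inv_cdf(alpha)    # left-tail critical value, -z_alpha

reject_by_area = p < alpha
reject_by_score = z0 < critical

print(f"z0 = {z0:.4f}, critical value = {critical:.4f}")
print(f"P-Value = {p:.4f}, reject H0: {reject_by_area}")
```

Both comparisons point the same way: z0 lies to the left of the critical value, and the area to its left is smaller than α.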

Two-Tail Test Example


The average daily volume of Dell Computer stock in 2000 was 31.8 million shares according to Yahoo!Finance. Based on a random sample of 35 trading days in 2004, the sample mean of the number of shares traded is found to be 23.5 million. Is the volume of Dell stock different in 2004? Assume that σ = 14.8 million shares. Use the α = 0.05 level of significance.

The hypotheses are H0: μ = 31.8 against H1: μ ≠ 31.8.

Step 1: Find the test statistic

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{23.5 - 31.8}{14.8 / \sqrt{35}} \approx -3.32$$

Step 2: Find the P-Value

P(Z < -|z0| or Z > |z0|) = P(Z < -3.32 or Z > 3.32) = 2P(Z > 3.32) = 0.001

Step 3: Compare the P-Value to α

The P-Value (0.001) is smaller than α = 0.05.

Step 4: State your decision

Reject the null hypothesis. There is sufficient evidence at the α = 0.05 level of significance to conclude that the volume of the Dell stock was different in 2004.
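A short Python check of the two-tail example follows. It is a sketch with hypothetical variable names; the hypothesized mean is set to 31.8, the value that appears in the Step 1 calculation. The two-tail area is obtained by doubling the area beyond |z0|.

```python
from math import sqrt
from statistics import NormalDist

# mu0 = 31.8 is the value used in the Step 1 calculation.
mu0, xbar, sigma, n, alpha = 31.8, 23.5, 14.8, 35, 0.05

z0 = (xbar - mu0) / (sigma / sqrt(n))        # test statistic (negative here)
p = 2 * (1 - NormalDist().cdf(abs(z0)))      # both tails beyond |z0|
reject = p < alpha

print(f"z0 = {z0:.4f}, P-Value = {p:.5f}, reject H0: {reject}")
```

Taking the absolute value first means the same line works whether the sample mean falls below or above the hypothesized mean.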

Example
    x = normrnd(.9, .02, 1, 10);   % normrnd: random array from a normal distribution; mean = .9, sigma = .02, size of x is 1x10
    [h, p] = ttest(x, .89)         % one-sample t-test of H0: the mean of x is .89

    h =
         1
    p =
        0.0289

Because x is random, h and p change from run to run; the values above are from one run.

As p < .05, we reject the null hypothesis. There is sufficient evidence at the α = 0.05 level of significance to conclude that the mean of x differs from .89. The difference is statistically significant.
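The statistic behind a one-sample t-test can be sketched in standard-library Python. Several things here are assumptions made for the illustration: the ten sample values are hypothetical fixed numbers (not output of `normrnd`), the function name is made up, and the decision uses the two-sided 5% critical value for 9 degrees of freedom taken from a t-table (2.262) rather than a computed P-Value, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


# Hypothetical sample of 10 values scattered around 0.9.
x = [0.91, 0.89, 0.93, 0.90, 0.88, 0.92, 0.94, 0.90, 0.91, 0.92]

T_CRITICAL_DF9 = 2.262  # two-sided, alpha = 0.05, 9 degrees of freedom (t-table)

t0 = t_statistic(x, mu0=0.89)
reject = abs(t0) > T_CRITICAL_DF9

print(f"t0 = {t0:.3f}, reject H0 (mean = 0.89): {reject}")
```

Unlike the z-tests above, the standard deviation is estimated from the sample itself, which is why the t distribution with n - 1 degrees of freedom is used in place of the standard normal.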

Box and Whisker Plot


A diagram that summarizes data using the median, the upper and lower quartiles, and the extreme values (the minimum and maximum, with any outliers marked separately). The box shows the middle values of a variable, while the whiskers stretch towards the greatest and lowest values of that variable.

Median - the middle of the data when it is arranged in order from least to greatest.
Lower quartile or 25th percentile - the median of the lower half of the data.
Upper quartile or 75th percentile - the median of the upper half of the data.
Minimum value - the smallest observation value.
Maximum value - the largest observation value.

Example

    x = normrnd(.9, .02, 1, 10);   % normrnd: random array from a normal distribution; mean = .9, sigma = .02, size of x is 1x10
    boxplot(x);

Maximum whisker length w: the default is w = 1.5. Points are drawn as outliers if they are larger than $q_3 + w(q_3 - q_1)$ or smaller than $q_1 - w(q_3 - q_1)$, where $q_1$ and $q_3$ are the 25th and 75th percentiles, respectively.
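The quantities a box plot draws can be computed by hand. The sketch below follows the definitions given above (quartiles as medians of the lower and upper halves) and the outlier rule with w = 1.5. The function name and the ten data values are hypothetical, and a plotting routine may compute percentiles with a different interpolation rule, so its quartiles can differ slightly from these.

```python
from statistics import median


def box_stats(data, w=1.5):
    """Median, quartiles, outlier fences and outliers for a box plot."""
    xs = sorted(data)
    half = len(xs) // 2            # for odd n the middle value is left out
    q1 = median(xs[:half])         # median of the lower half
    q3 = median(xs[-half:])        # median of the upper half
    low_fence = q1 - w * (q3 - q1)
    high_fence = q3 + w * (q3 - q1)
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "min": xs[0], "q1": q1, "median": median(xs), "q3": q3, "max": xs[-1],
        "low_fence": low_fence, "high_fence": high_fence, "outliers": outliers,
    }


# Hypothetical sample: nine values near 0.9 and one unusually large value.
x = [0.88, 0.89, 0.90, 0.90, 0.91, 0.91, 0.92, 0.92, 0.93, 0.99]
stats = box_stats(x)
for name, value in stats.items():
    print(name, value)
```

In this sample only the largest value falls outside the fences, so it would be drawn as a separate point and the upper whisker would stop at the next largest value.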

Box plot
