Descriptive Statistics, Cross Tabulation and Hypothesis Testing

1) Difference between descriptive and inferential statistics
2) Frequency Distribution
3) Statistics Associated with Frequency Distribution
   i. Measures of Location
   ii. Measures of Variability
   iii. Measures of Shape
4) Cross-Tabulations
   i. Two Variable Case
   ii. Three Variable Case
5) Introduction to Hypothesis Testing
6) Procedure for Hypothesis Testing

Internet Usage Data


Respondent                      Internet         Attitude Toward         Usage of Internet
Number      Sex    Familiarity  Usage            Internet   Technology   Shopping   Banking
1           1.00   7.00         14.00            7.00       6.00         1.00       1.00
2           2.00   2.00         2.00             3.00       3.00         2.00       2.00
3           2.00   3.00         3.00             4.00       3.00         1.00       2.00
4           2.00   3.00         3.00             7.00       5.00         1.00       2.00
5           1.00   7.00         13.00            7.00       7.00         1.00       1.00
6           2.00   4.00         6.00             5.00       4.00         1.00       2.00
7           2.00   2.00         2.00             4.00       5.00         2.00       2.00
8           2.00   3.00         6.00             5.00       4.00         2.00       2.00
9           2.00   3.00         6.00             6.00       4.00         1.00       2.00
10          1.00   9.00         15.00            7.00       6.00         1.00       2.00
11          2.00   4.00         3.00             4.00       3.00         2.00       2.00
12          2.00   5.00         4.00             6.00       4.00         2.00       2.00
13          1.00   6.00         9.00             6.00       5.00         2.00       1.00
14          1.00   6.00         8.00             3.00       2.00         2.00       2.00
15          1.00   6.00         5.00             5.00       4.00         1.00       2.00
16          2.00   4.00         3.00             4.00       3.00         2.00       2.00
17          1.00   6.00         9.00             5.00       3.00         1.00       1.00
18          1.00   4.00         4.00             5.00       4.00         1.00       2.00
19          1.00   7.00         14.00            6.00       6.00         1.00       1.00
20          2.00   6.00         6.00             6.00       4.00         2.00       2.00
21          1.00   6.00         9.00             4.00       2.00         2.00       2.00
22          1.00   5.00         5.00             5.00       4.00         2.00       1.00
23          2.00   3.00         2.00             4.00       2.00         2.00       2.00
24          1.00   7.00         15.00            6.00       6.00         1.00       1.00
25          2.00   6.00         6.00             5.00       3.00         1.00       2.00
26          1.00   6.00         13.00            6.00       6.00         1.00       1.00
27          2.00   5.00         4.00             5.00       5.00         1.00       1.00
28          2.00   4.00         2.00             3.00       2.00         2.00       2.00
29          1.00   4.00         4.00             5.00       3.00         1.00       2.00
30          1.00   3.00         3.00             7.00       5.00         1.00       2.00


Frequency Distribution
In a frequency distribution, one variable is considered at a time. A frequency distribution for a variable produces a table of frequency counts, percentages, and cumulative percentages for all the values associated with that variable.


Frequency Distribution of Familiarity with the Internet


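Value label       Value   Frequency (N)   Percentage   Cumulative percentage
Not so familiar   1       0               0.0          0.0
                  2       2               6.7          6.9
                  3       6               20.0         27.6
                  4       6               20.0         48.3
                  5       3               10.0         58.6
                  6       8               26.7         86.2
Very familiar     7       4               13.3         100.0
Missing           9       1               3.3
                  TOTAL   30              100.0

Note: percentages are based on all 30 respondents; cumulative percentages are based on the 29 valid (non-missing) responses.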
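The table above can be reproduced from the raw familiarity ratings. The following is a minimal sketch, not the procedure any particular statistics package uses; the function name `frequency_table` is hypothetical, and treating code 9 as the missing-value marker is an assumption taken from the table's "Missing" row.

```python
from collections import Counter

# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
MISSING = 9  # assumption: 9 codes a missing response


def frequency_table(values, scale=range(1, 8), missing=MISSING):
    """Rows of (value, count, % of all cases, cumulative % of valid cases)."""
    counts = Counter(values)
    n_total = len(values)
    n_valid = n_total - counts[missing]
    rows, running = [], 0
    for v in scale:
        running += counts[v]
        rows.append((v, counts[v],
                     round(100 * counts[v] / n_total, 1),
                     round(100 * running / n_valid, 1)))
    return rows


for row in frequency_table(familiarity):
    print(row)
```

The cumulative column divides by the 29 valid cases rather than by 30, which is why it reaches 100.0 at value 7.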

Frequency Histogram

[Figure: histogram of the familiarity ratings; horizontal axis: Familiarity, vertical axis: Frequency]

Statistics Associated with Frequency Distribution

Measures of Location

The mean, or average value, is the most commonly used measure of central tendency. The mean, $\bar{X}$, is given by

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where
$X_i$ = observed values of the variable $X$
$n$ = number of observations (sample size)

The mode is the value that occurs most frequently. It represents the highest peak of the distribution. The mode is a good measure of location when the variable is inherently categorical or has otherwise been grouped into categories.

The median of a sample is the middle value when the data are arranged in ascending or descending order. If the number of data points is even, the median is usually estimated as the midpoint between the two middle values by adding the two middle values and dividing their sum by 2. The median is the 50th percentile.
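As an illustration, the three measures of location can be computed for the familiarity ratings with Python's standard `statistics` module. This is a sketch under the assumption that code 9 is a missing value and is excluded before computing.

```python
import statistics

# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

# Assumption: 9 codes a missing response, so only 29 values are used.
valid = [x for x in familiarity if x != 9]

mean = statistics.mean(valid)      # sum of the values divided by n
mode = statistics.mode(valid)      # most frequent value
median = statistics.median(valid)  # middle value of the sorted data

print(f"mean={mean:.3f} mode={mode} median={median}")
```

With 29 valid observations the median is simply the 15th value in sorted order, so no averaging of two middle values is needed here.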

Measures of Variability
The range measures the spread of the data. It is simply the difference between the largest and smallest values in the sample:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

The interquartile range is the difference between the 75th and 25th percentiles. For a set of data points arranged in order of magnitude, the pth percentile is the value that has p% of the data points below it and (100 - p)% above it.
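A short sketch of both measures for the familiarity ratings follows. It assumes code 9 is a missing value, and it uses `statistics.quantiles` with the "inclusive" method; percentile conventions differ between packages, so other tools may return slightly different quartiles on other data.

```python
import statistics

# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

# Assumption: 9 codes a missing response and is dropped.
valid = sorted(x for x in familiarity if x != 9)

# Range: largest value minus smallest value.
data_range = max(valid) - min(valid)

# Quartiles: 25th, 50th, and 75th percentiles (one convention among several).
q1, q2, q3 = statistics.quantiles(valid, n=4, method="inclusive")
iqr = q3 - q1

print(f"range={data_range} Q1={q1} Q3={q3} IQR={iqr}")
```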


The variance is the mean squared deviation from the mean. The variance can never be negative. The standard deviation is the square root of the variance:

$$s_x = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

The coefficient of variation is the ratio of the standard deviation to the mean, expressed as a percentage, and is a unitless measure of relative variability:

$$CV = \frac{s_x}{\bar{X}}$$
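The variance, standard deviation, and coefficient of variation can be checked numerically on the familiarity ratings. The sketch below assumes code 9 is a missing value; `statistics.variance` and `statistics.stdev` use the n - 1 divisor, matching the formula for the standard deviation.

```python
import statistics

# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

# Assumption: 9 codes a missing response and is dropped.
valid = [x for x in familiarity if x != 9]

variance = statistics.variance(valid)  # sum of squared deviations / (n - 1)
std_dev = statistics.stdev(valid)      # square root of the variance
cv = std_dev / statistics.mean(valid)  # multiply by 100 to report as a percentage

print(f"variance={variance:.3f} sd={std_dev:.3f} CV={cv:.3f}")
```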

Measures of Shape
Skewness is the tendency of the deviations from the mean to be larger in one direction than in the other. It can be thought of as the tendency for one tail of the distribution to be heavier than the other.

Kurtosis is a measure of the relative peakedness or flatness of the curve defined by the frequency distribution. Measured relative to the normal curve (excess kurtosis), the kurtosis of a normal distribution is zero. If the kurtosis is positive, the distribution is more peaked than a normal distribution; a negative value means that it is flatter than a normal distribution.
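One common way to quantify both shape measures is through central moments. The sketch below uses the simple moment-based definitions; the helper names are hypothetical, code 9 is assumed to be a missing value, and statistical packages often apply small-sample corrections, so their reported values can differ slightly from these.

```python
# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]

# Assumption: 9 codes a missing response and is dropped.
valid = [x for x in familiarity if x != 9]


def central_moment(xs, k):
    """k-th central moment: average of (x - mean) ** k."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness: third moment over second moment ** 1.5."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3, so a normal curve scores zero."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3


print(f"skewness={skewness(valid):.3f} kurtosis={excess_kurtosis(valid):.3f}")
```

For these data both values come out negative: the distribution leans slightly toward the lower ratings in its tail and is flatter than a normal curve.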


Figure 15.2: Skewness of a Distribution

[Figure: panel (a) shows a symmetric distribution and panel (b) a skewed distribution; each panel marks the mean, median, and mode.]


Cross-Tabulation
While a frequency distribution describes one variable at a time, a cross-tabulation describes two or more variables simultaneously. Cross-tabulation results in tables that reflect the joint distribution of two or more variables with a limited number of categories or distinct values.


Gender and Internet Usage


                        Gender
Internet Usage    Male    Female    Row Total
Light (1)         5       10        15
Heavy (2)         10      5         15
Column Total      15      15
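The counts in this cross-tabulation can be rebuilt from the raw Internet usage data. The sketch below rests on two assumptions that the slides do not state explicitly: sex code 1 means male and 2 means female, and a usage value of 5 or less counts as "light". Both assumptions reproduce the table's counts.

```python
from collections import Counter

# Sex codes and Internet usage values for the 30 respondents, in order.
sex = [1, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1]
usage = [14, 2, 3, 3, 13, 6, 2, 6, 6, 15, 3, 4, 9, 8, 5,
         3, 9, 4, 14, 6, 9, 5, 2, 15, 6, 13, 4, 2, 4, 3]

GENDER = {1: "Male", 2: "Female"}  # assumption: coding of the Sex column
LIGHT_MAX = 5                      # assumption: cut-off for "light" usage


def usage_level(value):
    return "Light" if value <= LIGHT_MAX else "Heavy"


# Count respondents in each (usage level, gender) cell.
crosstab = Counter((usage_level(u), GENDER[s]) for s, u in zip(sex, usage))

for level in ("Light", "Heavy"):
    print(level, crosstab[(level, "Male")], crosstab[(level, "Female")])
```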


Internet Usage by Gender


                        Gender
Internet Usage    Male      Female
Light             33.3%     66.7%
Heavy             66.7%     33.3%
Column total      100%      100%


Gender by Internet Usage


                Internet Usage
Gender    Light     Heavy     Total
Male      33.3%     66.7%     100.0%
Female    66.7%     33.3%     100.0%
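Percentages in a cross-tabulation can be computed in either direction, and the choice of base changes what each figure means. The sketch below, with hypothetical function names, computes both from the cell counts of the gender and Internet usage table.

```python
# Cell counts: usage level -> gender -> number of respondents.
counts = {
    "Light": {"Male": 5, "Female": 10},
    "Heavy": {"Male": 10, "Female": 5},
}


def percent_by_gender(counts):
    """Percentages with each gender as the base (each gender sums to 100)."""
    genders = ("Male", "Female")
    totals = {g: sum(counts[u][g] for u in counts) for g in genders}
    return {u: {g: round(100 * counts[u][g] / totals[g], 1) for g in genders}
            for u in counts}


def percent_by_usage(counts):
    """Percentages with each usage level as the base (each level sums to 100)."""
    return {u: {g: round(100 * c / sum(row.values()), 1) for g, c in row.items()}
            for u, row in counts.items()}


print(percent_by_gender(counts))
print(percent_by_usage(counts))
```

Because both margins are 15 and 15 in these data, the two directions happen to give the same numbers; with unequal margins they would differ.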


Purchase of Fashion Clothing by Marital Status


                                Current Marital Status
Purchase of Fashion Clothing    Married    Unmarried
High                            31%        52%
Low                             69%        48%
Column total                    100%       100%
Number of respondents           700        300


Purchase of Fashion Clothing by Marital Status


                                               Sex
                                Male                      Female
Purchase of Fashion Clothing    Married   Not Married     Married   Not Married
High                            35%       40%             25%       60%
Low                             65%       60%             75%       40%
Column totals                   100%      100%            100%      100%
Number of cases                 400       120             300       180
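The three-variable table is consistent with the earlier two-variable table: pooling males and females within each marital status, weighted by the number of cases, recovers the 31% and 52% figures. A minimal sketch of that check, with a hypothetical helper name:

```python
# (marital status, sex) -> (percent with high purchase, number of cases)
cells = {
    ("Married", "Male"): (35, 400),
    ("Married", "Female"): (25, 300),
    ("Not married", "Male"): (40, 120),
    ("Not married", "Female"): (60, 180),
}


def pooled_percent_high(status):
    """Case-weighted percent of high purchasers for one marital status."""
    high = sum(p / 100 * n for (s, _), (p, n) in cells.items() if s == status)
    cases = sum(n for (s, _), (_, n) in cells.items() if s == status)
    return 100 * high / cases


for status in ("Married", "Not married"):
    print(status, round(pooled_percent_high(status), 1))
```

The pooled figures hide the fact that marital status makes a small difference among males (35% versus 40%) and a large one among females (25% versus 60%).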


Ownership of Expensive Automobiles by Education Level


                                        Education
Own Expensive Automobile    College Degree    No College Degree
Yes                         32%               21%
No                          68%               79%
Column totals               100%              100%
Number of cases             250               750


Ownership of Expensive Automobiles by Education Level and Income Levels


                                                        Income
                            Low Income                             High Income
Own Expensive Automobile    College Degree   No College Degree     College Degree   No College Degree
Yes                         20%              20%                   40%              40%
No                          80%              80%                   60%              60%
Column totals               100%             100%                  100%             100%
Number of respondents       100              700                   150              50
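A quick numerical check shows what the income breakdown reveals: within each income group the ownership gap between education levels is zero, yet pooling the income groups reproduces the 32% versus 21% gap of the two-variable table. The sketch below uses hypothetical helper names.

```python
# (income, education) -> (percent owning an expensive automobile, respondents)
cells = {
    ("Low", "College"): (20, 100),
    ("Low", "No college"): (20, 700),
    ("High", "College"): (40, 150),
    ("High", "No college"): (40, 50),
}


def pooled_percent_yes(education):
    """Case-weighted ownership percentage across both income groups."""
    owners = sum(p / 100 * n for (_, e), (p, n) in cells.items() if e == education)
    cases = sum(n for (_, e), (_, n) in cells.items() if e == education)
    return 100 * owners / cases


def gap_within_income(income):
    """College minus no-college ownership percentage inside one income group."""
    return cells[(income, "College")][0] - cells[(income, "No college")][0]


print(round(pooled_percent_yes("College"), 1),
      round(pooled_percent_yes("No college"), 1))
print(gap_within_income("Low"), gap_within_income("High"))
```

The pooled gap arises because college graduates are concentrated in the high-income group (150 of 250), while most respondents without a degree are in the low-income group (700 of 750).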

Desire to Travel Abroad by Age


                                      Age
Desire to Travel Abroad    Less than 45    45 or More
Yes                        50%             50%
No                         50%             50%
Column totals              100%            100%
Number of respondents      500             500


Desire to Travel Abroad by Age and Gender


                                            Sex
                           Male                   Female
                           Age                    Age
Desire to Travel Abroad    < 45      >= 45        < 45      >= 45
Yes                        60%       40%          35%       65%
No                         40%       60%          65%       35%
Column totals              100%      100%         100%      100%
Number of cases            300       300          200       200
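As a worked check, weighting each sex by its number of cases shows how the opposite age patterns for males and females cancel out and give the 50% figures of the two-variable table. The notation $P(\text{Yes} \mid \cdot)$ is introduced here only for illustration.

```latex
\[
P(\text{Yes} \mid \text{age} < 45)
  = \frac{0.60 \times 300 + 0.35 \times 200}{300 + 200}
  = \frac{180 + 70}{500} = 50\%
\]
\[
P(\text{Yes} \mid \text{age} \ge 45)
  = \frac{0.40 \times 300 + 0.65 \times 200}{300 + 200}
  = \frac{120 + 130}{500} = 50\%
\]
```

Among males the desire to travel is 20 points higher in the younger group, while among females it is 30 points lower, so the pooled table shows no difference by age.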


Eating Frequently in Fast-Food Restaurants by Family Size


                                               Family Size
Eat Frequently in Fast-Food Restaurants    Small     Large
Yes                                        65%       65%
No                                         35%       35%
Column totals                              100%      100%
Number of cases                            500       500

Eating Frequently in Fast-Food Restaurants by Family Size and Income

                                                         Income
                                           Low                   High
                                           Family size           Family size
Eat Frequently in Fast-Food Restaurants    Small     Large       Small     Large
Yes                                        65%       65%         65%       65%
No                                         35%       35%         35%       35%
Column totals                              100%      100%        100%      100%
Number of respondents                      250       250         250       250

Steps Involved in Hypothesis Testing

1) Formulate H0 and H1
2) Select Appropriate Test
3) Choose Level of Significance
4) Collect Data and Calculate Test Statistic
5) Determine Critical Value of Test Statistic
6) Reject or Do Not Reject H0 (if the calculated value is less than the critical value, do not reject the null hypothesis; otherwise, reject it)
7) Draw Marketing Research Conclusion
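The steps above can be sketched on the gender and Internet usage cross-tabulation using a chi-square test of association. This is an illustration under stated assumptions, not the deck's own worked example: the 0.05 significance level is chosen here for illustration, and the critical value 3.841 is the standard chi-square table value for 1 degree of freedom at that level.

```python
# Step 1: H0: gender and Internet usage are not associated; H1: they are.
# Step 2: chi-square test for a 2 x 2 cross-tabulation.
# Step 3: significance level (assumption made for this illustration).
ALPHA = 0.05

# Step 4: observed counts (rows: light, heavy; columns: male, female).
observed = [[5, 10],
            [10, 5]]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


statistic = chi_square(observed)

# Step 5: critical value for (2 - 1) * (2 - 1) = 1 degree of freedom.
CRITICAL_VALUE = 3.841

# Step 6: compare the calculated value with the critical value.
decision = "reject H0" if statistic > CRITICAL_VALUE else "do not reject H0"

# Step 7: state the conclusion in terms of the research question.
print(f"chi-square={statistic:.3f}, critical={CRITICAL_VALUE}: {decision}")
```

With these 30 respondents the calculated value falls below the critical value, so at the 0.05 level the sample does not provide enough evidence of an association between gender and usage.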

Hypothesis Tests

Parametric Tests (Metric Tests)
   One Sample
      * t test
      * Z test
   Two or More Samples
      Independent Samples
         * Two-Group t test
         * Z test
      Paired Samples
         * Paired t test

Non-parametric Tests (Nonmetric Tests)
   One Sample
      * Chi-Square
      * K-S
      * Runs
      * Binomial
   Two or More Samples
      Independent Samples
         * Chi-Square
         * Mann-Whitney
         * Median
         * K-S
      Paired Samples
         * Sign
         * Wilcoxon
         * McNemar
         * Chi-Square
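As an illustration of the one-sample t test listed under parametric tests, the sketch below tests whether mean familiarity with the Internet exceeds a hypothesized value. The hypothesized mean of 4.0 and the 0.05 level are assumptions chosen for this example, code 9 is treated as missing, and 1.701 is the standard one-tailed t table value for 28 degrees of freedom.

```python
import math
import statistics

# Familiarity ratings for the 30 respondents, in respondent order.
familiarity = [7, 2, 3, 3, 7, 4, 2, 3, 3, 9, 4, 5, 6, 6, 6,
               4, 6, 4, 7, 6, 6, 5, 3, 7, 6, 6, 5, 4, 4, 3]
valid = [x for x in familiarity if x != 9]  # assumption: 9 = missing

MU_0 = 4.0          # hypothetical value under H0 (H1: mean > 4.0)
T_CRITICAL = 1.701  # one-tailed critical t, 28 df, alpha = 0.05

n = len(valid)
standard_error = statistics.stdev(valid) / math.sqrt(n)
t_stat = (statistics.mean(valid) - MU_0) / standard_error

reject_h0 = t_stat > T_CRITICAL
print(f"t={t_stat:.3f}, df={n - 1}, reject H0: {reject_h0}")
```

Here the calculated t exceeds the critical value, so under these assumptions the null hypothesis that mean familiarity is at most 4.0 would be rejected.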
