This project is for an introduction to statistics class.

It will help me practice the things that I have learned throughout the semester. Using Skittles,
each student in the class counted the number of candies in a snack-size bag (2.17 oz), both in
total and by color. Once this data was collected from the entire class, I calculated the total
number of Skittles for the class and the total of each color. Using those totals, I calculated the
proportion of each color relative to the total number of candies. The table below is only part of
the class data: it shows my personal counts and the class totals, by color and overall.

Number of candies by color

Name         Red     Orange  Yellow  Green   Purple  Total
Samuel       11      14      10      15      10      60
TOTAL        292     307     299     297     317     1512
proportion   0.193   0.203   0.198   0.196   0.210   1.000
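
The proportion calculation can be sketched in a few lines of Python. This is only an illustration using the class totals from the table above; the variable names are made up for the example and are not part of the original project.

```python
# Class totals by color, copied from the table above.
class_totals = {"red": 292, "orange": 307, "yellow": 299, "green": 297, "purple": 317}

# Total number of candies counted by the whole class.
total_candies = sum(class_totals.values())

# Proportion of each color relative to all candies, rounded to three places.
proportions = {
    color: round(count / total_candies, 3)
    for color, count in class_totals.items()
}

print(total_candies)
print(proportions)
```

The rounded values should match the "proportion" row of the table.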

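[Figure, left: Pareto chart "Cumulative Proportions". Bars show the proportion of each color, ordered purple, orange, yellow, green, red; a line shows the cumulative proportion. Vertical axis from 0 to 1.2.]

[Figure, right: pie chart "Skittles Proportions". Red 0.193, orange 0.203, yellow 0.198, green 0.196, purple 0.210.]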

Using the overall data I created a pie chart and a Pareto chart to show the proportion of each
color out of the total. The chart on the right shows the proportions of each color using the class
data. This is roughly what I expected to see. Since there are five different colors of Skittles in
the bag, the expected outcome, if the colors are mixed evenly, is that each color makes up one
fifth of the candies, or 1/5 = 0.20. Looking at the values on the chart, each one is close to 0.20.
From this data I would expect that with an even larger sample of Skittles bags the proportions
would move closer to 0.20, so there is not much of a surprise there. The Pareto chart on the left
shows the proportion of each color individually, again all around 0.20. The black line shows the
cumulative proportion of the candies as you move to the right (i.e., the point directly above a
color adds that color's proportion to those of all colors to its left). Because the individual
proportions are nearly equal, the cumulative proportion increases at a nearly constant rate, which
is why it appears to be a straight line.
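
To make the cumulative line concrete, here is a small sketch of how the Pareto ordering and the running total could be computed from the rounded class proportions. It is an illustration only, not the spreadsheet steps actually used for the chart.

```python
from itertools import accumulate

# Rounded class proportions by color.
proportions = {"red": 0.193, "orange": 0.203, "yellow": 0.198, "green": 0.196, "purple": 0.210}

# A Pareto chart orders the bars from largest to smallest.
ordered = sorted(proportions.items(), key=lambda item: item[1], reverse=True)

# The running total of the ordered proportions gives the cumulative line.
cumulative = [round(value, 3) for value in accumulate(p for _, p in ordered)]

for (color, p), running in zip(ordered, cumulative):
    print(f"{color:<7} {p:.3f} {running:.3f}")
```

Each step adds roughly 0.2, which is why the cumulative line looks almost straight.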
Comparing the class data to my own gives me a table looking like this. From just my own data, the
proportions of each color are more spread out. This shows the importance of large sample sizes
when collecting data for analysis.
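
One way to put a number on "more spread out" is to compare the gap between the largest and smallest color proportion in a single bag with the same gap for the whole class. The sketch below uses the counts from the first table. Using the range of proportions as the measure of spread is an assumption made for this example, not something taken from the original comparison table.

```python
# Counts from the first table: one bag (Samuel) and the class totals.
my_counts = {"red": 11, "orange": 14, "yellow": 10, "green": 15, "purple": 10}
class_totals = {"red": 292, "orange": 307, "yellow": 299, "green": 297, "purple": 317}


def proportions(counts):
    """Return each color's share of the total count."""
    total = sum(counts.values())
    return {color: count / total for color, count in counts.items()}


def spread(props):
    """Gap between the largest and smallest proportion (illustrative measure)."""
    return max(props.values()) - min(props.values())


mine = proportions(my_counts)
whole_class = proportions(class_totals)

print(round(spread(mine), 3))
print(round(spread(whole_class), 3))
```

The single bag ranges over about 8 percentage points, while the class totals range over fewer than 2.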
This chart shows the descriptive statistics for the class data. The sample size is the total
number of bags, which is 25, and the variable is the total number of candies per bag. The mean is
the average number of candies one would expect to find in a 2.17 oz bag of Skittles, and the
standard deviation is 1.76 candies. Using values such as the minimum and the maximum, I will
create a five-number summary, and from that summary a box plot to represent the data.

[Figure: box plot of the number of candies per bag. Five-number summary: Min = 55, Q1 = 60, Med = 61, Q3 = 62, Max = 63.]

From this data and chart we can see that the data is skewed left; it is not symmetrically
distributed. This is also an example of the quantitative data in this project. Quantitative data
is data with numerical values, as opposed to categorical data, which sorts observations into
groups, like the different colors of Skittles. Each color (green, red, yellow, orange, and purple)
is a different category, but each category has its own quantitative data associated with it, such
as the proportion of each color with respect to the total number of candies, which was discussed
earlier.
There are still many things I can do with this data. One of them is to create confidence
intervals. In statistics, a confidence interval is a type of interval estimate of a population
parameter. It is an observed interval, in principle different from sample to sample, that
frequently includes the parameter of interest if the experiment is repeated. Here I will create
three different confidence intervals using the data from my class.

The first confidence interval estimates the proportion of purple candies. To the right is the data
required to create the confidence interval: n = the total number of candies, x = the total number
of purple candies, p̂ = the sample proportion of purple candies, the confidence level, and E = the
margin of error. Using all of this data we get a confidence interval of:
0.18918 < p < 0.23022
This means we are 95% confident that the true proportion of purple candies among all Skittles is
between 18.918% and 23.022%. It is a statement about the overall proportion, not about the
contents of any single bag.
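
As a check on the interval for the proportion of purple candies, here is a minimal sketch assuming the usual normal-approximation interval for a proportion, p̂ ± z·sqrt(p̂(1 − p̂)/n). The project does not name the method it used, so that choice is an assumption; n and x come from the class totals.

```python
from math import sqrt
from statistics import NormalDist

n = 1512           # total number of candies in the class sample
x = 317            # total number of purple candies
confidence = 0.95  # confidence level

p_hat = x / n  # sample proportion of purple candies

# Two-sided critical value from the standard normal distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error E under the normal approximation (assumed method).
margin = z * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(f"{lower:.4f} < p < {upper:.4f}")
```

The result agrees with the reported interval to about four decimal places. Differences in the last digit are what you would expect if the sample proportion was rounded before the margin of error was applied.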
Next is a confidence interval for the mean number of candies in a bag. Again the data is to the
right. Most of the symbols are the same as above; this time, however, x̄ is included, which
represents the mean number of candies in a bag from my class sample. This time n = 25 because we
are talking about bags of Skittles rather than the total number of Skittles. Using this data we
get a confidence interval of:
59.5 < μ < 61.5
This means that, based on the data from this class, we can be 99% confident that the mean number
of candies in a 2.17 oz bag of Skittles is between 59.5 and 61.5. The interval describes the
average bag, not the count in any particular bag.
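
The interval for the mean can be reproduced with a short sketch, assuming a Student t interval, x̄ ± t·s/sqrt(n). Two things here are assumptions rather than values quoted in the text: the sample mean is taken as 1512 candies / 25 bags, and the critical value 2.797 is the standard t-table entry for 24 degrees of freedom at 99% two-sided confidence.

```python
from math import sqrt

n = 25              # number of bags in the class sample
x_bar = 1512 / 25   # sample mean, assumed to be total candies / bags
s = 1.76            # sample standard deviation of candies per bag
t_crit = 2.797      # t table: df = 24, 99% two-sided confidence (assumed method)

# Margin of error E for the mean.
margin = t_crit * s / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.1f} < mu < {upper:.1f}")
```

Rounded to one decimal place, this gives the same bounds as the interval above.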
Finally, there is a confidence interval for the standard deviation of the number of candies in a
bag. As before, the data is to the right. Using this data one can find a confidence interval of:
1.32 < σ < 2.62
This means we can be 98% confident that the standard deviation of the number of candies per bag,
across all bags, falls between 1.32 and 2.62.
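
The interval for the standard deviation can be checked the same way, assuming the chi-square interval sqrt((n − 1)s²/χ²_R) < σ < sqrt((n − 1)s²/χ²_L). The two critical values below are standard chi-square table entries for 24 degrees of freedom at 98% confidence. They are hard-coded here as an assumption about the method, since the project's own worksheet is not shown.

```python
from math import sqrt

n = 25               # number of bags in the class sample
s = 1.76             # sample standard deviation of candies per bag
chi2_right = 42.980  # chi-square table: df = 24, area 0.01 to the right
chi2_left = 10.856   # chi-square table: df = 24, area 0.99 to the right

df = n - 1  # degrees of freedom

# The larger critical value gives the lower bound, and vice versa.
lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)

print(f"{lower:.2f} < sigma < {upper:.2f}")
```

Rounded to two decimal places, the bounds match the interval reported above.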

This project was great. Although I did not complete all the parts, it has helped me see how
applicable statistics is in the real world. If statistics can be applied to Skittles, it can be
applied to anything. Companies can use it to learn about their products, and the makers of
Skittles could use data like this to improve theirs. I have taken two statistics classes in my
past schooling, but I was never required to do a project like this. This project really helped me
see the application of math, especially statistics.