Nicole Matthews
Mrs. Atkinson
Period 6
25 May 2018
In this project, I analyzed the proportions of colors of Skittles per bag. I began
with a pie chart showing the percentage of each color in 31 bags. I then created a Pareto
chart showing the number of Skittles of each color in the 31 bags; the purpose of this was
to organize and display categorical data. Next, I found the average number of Skittles
per bag as well as the standard deviation and 5-number summary, the purpose
confidence intervals for the proportion of purple candies and the average of candies
per bag and tested two claims with hypothesis tests, the purpose being to have an
assurance of my answers.
The charts mostly show what I expected to see. I expected that the
proportion of red skittles would be the largest and that the proportion of purple
skittles would be the smallest. I did not, however, expect to see that there is a larger
proportion of yellow skittles than orange skittles overall. In my own bag, there was
a larger proportion of orange Skittles. I also thought there would be fewer green than
[Figure: chart of the number of Skittles of each color (Red, Yellow, Green, Orange, Purple); vertical axis from 0 to 400]
plot, but skewed slightly right in the frequency histogram. There are 31 bags in the
sample and my bag had 61 skittles. The overall data agrees with the data from my
bag and I thought the number of skittles per bag would be fairly consistent, so the
[Figure: frequency histogram of the number of Skittles per bag; bins 58-59, 60-61, 62-64]
Mean: 59.806
Standard Deviation: 1.138
5-Number Summary: 58, 59, 60, 61, 62
Categorical data are qualities of groups. In this case, the qualities or
categories are the different colors of the skittles. Quantitative data deal with
numerical variables. Variations of pie charts and bar graphs are appropriate for
categorical data because each slice of a pie chart and each bar of a bar graph can
represent a different category. A pie chart would not work with quantitative data
because it shows proportions rather than specific quantities. Line graphs might be
okay to use for categorical data as several different lines can represent the different
categories, but they are probably best used for quantitative data for representing
continuous quantities. Histograms and boxplots are also good for displaying
quantities. A boxplot wouldn’t work for categorical data because it shows the 5-
proportions make sense to use with categorical data to compare the categories to
each other. Calculations involving averages make sense to use with quantitative data
to describe the center and spread of a distribution.
A confidence interval gives a range of values that is likely to contain the true
population value, with a stated level of confidence. I am 95% confident that the true
proportion of purple candies falls between 0.136 and 0.168. I am 99% confident
that the true mean number of candies per bag falls between 59.243 and 60.370. These
results are consistent with the charts and graphs shown earlier. The proportions of
purple candies in both my bag and the overall 31 bags fall within the 95% confidence interval.
However, while the mean of the sample falls into the 99% confidence interval, my
therefore I fail to reject the null hypothesis. There is not sufficient evidence to reject
the claim that 20% of all Skittles are green. The p-value is smaller than the 0.05 significance level;
therefore I reject the null hypothesis. There is sufficient evidence to reject the claim that
the mean number of candies in a bag of Skittles is 56. The results of both tests
coincide with the pie chart of 31 bags that says green make up 20% of the skittles
proportions, the sample has to be reasonably random, no more than 10% of the
population, and large enough to conduct the appropriate tests for confidence interval
To conduct interval estimates and hypothesis tests for population means, the
sample must be reasonably random, the data has to come from a normal
distribution or large sample, you have to know the standard deviation, and the
sample has to be smaller than 10% of the population. My sample met these
conditions.
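As a rough check on the interval estimate and hypothesis test for the mean, the calculation can be sketched in Python using only the summary statistics reported in this project (mean 59.806, standard deviation 1.138, 31 bags). This is a sketch under assumptions, not necessarily the method used for the project: it assumes a t-based procedure with 30 degrees of freedom, a two-sided test of the claimed mean of 56, and critical values copied from a standard t-table. The function names are made up for illustration.

```python
import math

# Summary statistics reported in the project.
n = 31          # number of bags in the sample
mean = 59.806   # sample mean number of candies per bag
sd = 1.138      # sample standard deviation

# Assumption: two-sided t critical values for df = n - 1 = 30,
# hard-coded from a standard t-table so no statistics library is needed.
T_CRIT_99 = 2.750   # 99% confidence level
T_CRIT_95 = 2.042   # 0.05 significance level


def t_interval(mean, sd, n, t_crit):
    """Return (low, high) for a t-based confidence interval for a mean."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin


def t_statistic(mean, sd, n, mu0):
    """Return the one-sample t statistic for the claimed mean mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


low, high = t_interval(mean, sd, n, T_CRIT_99)
t = t_statistic(mean, sd, n, mu0=56)

# Reject the claimed mean when |t| is beyond the critical value.
reject = abs(t) > T_CRIT_95

print(f"99% interval for the mean: ({low:.3f}, {high:.3f})")
print(f"t statistic for a claimed mean of 56: {t:.2f}")
print(f"reject the claim at the 0.05 level: {reject}")
```

Under these assumptions the interval comes out close to the reported (59.243, 60.370), and the test statistic is far beyond the critical value, which matches the decision to reject the claim that the mean is 56.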
Possible errors that may have occurred are that perhaps the mean was too
high or too low due to outliers in the sample and perhaps the sample was not large
enough to provide resiliency and ensure fairly accurate results. Errors could also
include human errors such as calculating the math incorrectly or miscounting the
increasing the sample size (n) or by expanding the sampling area by buying
bags of Skittles from several different stores rather than just one.
In completing this project, I was really able to use and further develop my
problem-solving skills. Statistics has been my hardest class yet and I had a really
hard time trying to complete this project, but my rule number one was to not give
up. I asked several people for help, I consulted the Internet to try and walk myself
through the different sections, and I just took it one step at a time.
I have learned a lot of skills through this project that apply to other areas of my
life. At some point in my life, I do believe I’ll have to make charts or graphs
similar to the ones I made here and I will already know how to make them. While
my chosen career does not specifically use math, knowing these things may help me
with my writing career, for example if I need to write about a character who is a math
genius or if I have to use this knowledge to calculate the way things will work in a
world I create.
The writing portion of this project will also help me in future classes. I had to
look closely and actually observe the data and come to conclusions based on the
data rather than just do the problem. This will help me in every future class I take to
learn better through observation and to build my own unique identity in the way I
I used to think that we only need basic math like addition and multiplication
to get through life managing people and money and things, but now I see that even
the more complicated math has its application to the real world. Even though I
might not specifically apply these things to my life, the trouble I went through to
learn these things will help me learn more in the future and apply myself even when