
MATH 1040 Term Project

Ashren Olarte

Number of candies by color

                       Red    Orange   Yellow   Green   Purple
My data                 15        14       11      13       10
Class data             371       355      409     366      387
Class data (decimal)  .197      .188     .217    .194     .205
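
The decimal row in the table above is each class count divided by the total number of candies counted. A minimal sketch of that arithmetic follows; the variable names are only illustrative and are not part of the original project.

```python
# Class counts per color, taken from the table above.
class_counts = {"red": 371, "orange": 355, "yellow": 409, "green": 366, "purple": 387}

# Total number of candies counted by the class.
total = sum(class_counts.values())

# Proportion of each color, rounded to three decimals as in the table.
proportions = {color: round(count / total, 3) for color, count in class_counts.items()}

for color, p in proportions.items():
    print(f"{color:>6}: {p:.3f}")
```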

Introduction
Every student was required to purchase a 2.17-ounce bag of Skittles and count how many
whole Skittles the bag contained, separating them by color. Each student recorded their data
and submitted their results. The professor then compiled the data in an Excel spreadsheet
and gave it to the students for this project.
Based on the class data, yellow Skittles are the most common, followed by purple, red, and
green, with orange the least common. The pie chart shows how close the colors are to being
equally represented, but the Pareto chart clearly shows that there are more yellow Skittles
than orange ones. My data is a bit different from the class data. The most common color in my
bag was red, which is only the third most common color in the class data. The least common
color in my bag was purple, which is the second most common color in the class data.

Mean: 60.9
Standard Deviation: 2.08
5-Number Summary: 57, 59, 61, 63, 65
In the histogram, the data looks slightly skewed to the left, but one could also say it is
roughly symmetrical, or bell-shaped. The boxplot looks symmetrical, which suggests the data is
symmetrical rather than skewed left. I expected the data to be fairly symmetrical. By
comparison, my own data was more skewed left than symmetrical, although my sample is much
smaller than the entire class's data.

Reflection
Categorical data organizes observations into descriptions and labels. Quantitative data
consists of numerical measurements or counts. Categorical data is best displayed in pie charts
and Pareto charts like the ones used here, because they show how the observations are divided
among the categories. Quantitative data is better displayed in histograms and boxplots like
the ones above, because they show how the numerical values are distributed. The calculations
that make sense for categorical data are counts and proportions, such as how many Skittles of
each color are in a bag. For quantitative data, calculations such as the mean and standard
deviation of the number of Skittles per bag in the sample make sense.

Confidence Interval Estimates


A confidence interval for an unknown parameter is an interval of numbers built around a
point estimate. The level of confidence is the percentage of such intervals, constructed from
randomly drawn samples, that would contain the true parameter.
I'm 99% confident that the true proportion of yellow candies is between .193 and .241.
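
As an illustration of how an interval like this can be computed, here is a minimal sketch using the normal-approximation interval for a proportion and the class counts (409 yellow out of 1888). The function name `proportion_ci` is hypothetical, and the choice of the normal-approximation method is an assumption, since the project does not state which method was used.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a proportion (assumed method)."""
    p_hat = successes / n
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 409 yellow candies out of 1888 candies in the class data.
low, high = proportion_ci(409, 1888, 0.99)
print(f"99% CI for the yellow proportion: ({low:.3f}, {high:.3f})")
```

The result should land close to the interval reported above; a small difference in the last digit can come from rounding the sample proportion before computing the margin.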

I'm 99% confident that the true mean number of candies per bag is between 60.526 and 61.274.

Hypothesis Tests
Hypothesis testing is a procedure, based on sample evidence and probability, used to test
statements regarding a characteristic of one or more populations.
Do not reject the null hypothesis; there is not sufficient evidence to reject the claim that
20% of all Skittles candies are red.
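
A test of this claim can be sketched as a one-proportion z-test on the class counts (371 red out of 1888). The function name `one_proportion_z_test` is hypothetical, and the two-sided alternative is an assumption, since the project does not state the alternative hypothesis or the significance level.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # ASSUMPTION: two-sided alternative (p != p0).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 371 red candies out of 1888, tested against the claimed 20%.
z, p_value = one_proportion_z_test(371, 1888, 0.20)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A p-value well above the common significance levels (0.01, 0.05, 0.10) is consistent with not rejecting the null hypothesis.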

Reject the null hypothesis; there is sufficient evidence to reject the claim that the mean
number of candies per bag is 55.
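
The test statistic behind a decision like this can be sketched from the summary statistics reported earlier (mean 60.9, standard deviation 2.08). The number of bags is not stated in the project; the value 31 used below is an assumption inferred from 1888 total candies divided by the mean of 60.9 per bag.

```python
from math import sqrt

# Summary statistics reported earlier in the project.
sample_mean = 60.9
sample_sd = 2.08
mu0 = 55  # claimed mean number of candies per bag

# ASSUMPTION: number of bags, not stated in the source (inferred from 1888 / 60.9).
n_bags = 31

# One-sample t statistic: distance of the sample mean from mu0 in standard errors.
se = sample_sd / sqrt(n_bags)
t_stat = (sample_mean - mu0) / se
df = n_bags - 1
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A statistic this far from zero exceeds the usual t critical values (roughly 2 to 3 for 30 degrees of freedom), which is consistent with rejecting the null hypothesis.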

Reflection
The purpose of an interval estimate is to give a range that, at a chosen level of confidence,
contains the true parameter. The purpose of a hypothesis test is to decide whether to reject
or not reject the null hypothesis, based on whether the sample provides sufficient evidence
against it. Possible errors include calculation mistakes, too small a sample, or a Type I or
Type II error. The sampling method could be improved by increasing the sample size, which
decreases the margin of error.
