Total marks: 50
Time: 90 min
1. Question 1 [10]
A study was carried out in July 2010 to investigate the profile of soccer enthusiasm
in Bloemfontein. A random sample of 1000 adults in Bloemfontein was drawn.
Among other questions, participants were asked whether or not they had tickets
for at least one of the world cup matches. Participants were cross-classified
according to gender, and whether or not they had at least one world cup ticket.
Of the 329 males in the sample, 159 had a world cup ticket. Of the females, 210
had a world cup ticket.
Summarize the data in a 2 × 2 table and then answer the following questions
about soccer enthusiasm in Bloemfontein:
Table 1: Data
                Have world cup ticket
Gender          Yes      No      Total
Male            159      170       329
Female          210      461       671
Total           369      631      1000
(a) Estimate the probability that a male has a world cup ticket. Is this a
conditional, marginal or joint probability? [2]
Conditional probability: P(Have world cup ticket | male) = 159/329 = 0.483
(b) Estimate the probability that somebody who has no ticket is a female. Is
this a conditional, marginal or joint probability? [2]
Conditional probability: P(Female | Have no ticket) = 461/631 = 0.731
(c) Estimate the probability that an adult in Bloemfontein has a world cup
ticket. Is this a conditional, marginal or joint probability? [2]
Marginal probability: P(Have ticket) = 369/1000 = 0.369
(d) Estimate the probability of being female and having a world cup ticket. Is
this a conditional, marginal or joint probability? [2]
Joint probability: P(Female and Have ticket) = 210/1000 = 0.210
(e) Who was more likely to have a world cup ticket, males or females? Motivate
your answer. [2]
Males were more likely to have a world cup ticket. The conditional probability
of having a world cup ticket given that one is male is estimated as
159/329 = 0.483 (see answer to question 1(a)); the conditional probability of
having a world cup ticket given that one is female is estimated as
210/671 = 0.313, which is less than 0.483.
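The estimates in Question 1 can be checked with a short Python sketch. The counts are those of Table 1; the function names are illustrative and not part of the memorandum.

```python
# Counts from Table 1: (gender, has ticket) -> number of adults.
counts = {
    ("male", "yes"): 159,
    ("male", "no"): 170,
    ("female", "yes"): 210,
    ("female", "no"): 461,
}
n = sum(counts.values())  # total sample size, 1000


def joint(gender, ticket):
    """Joint probability: cell count divided by the grand total."""
    return counts[(gender, ticket)] / n


def marginal_ticket(ticket):
    """Marginal probability: column total divided by the grand total."""
    return sum(c for (g, t), c in counts.items() if t == ticket) / n


def ticket_given_gender(ticket, gender):
    """Conditional probability: cell count divided by the row total."""
    row_total = sum(c for (g, t), c in counts.items() if g == gender)
    return counts[(gender, ticket)] / row_total


def gender_given_ticket(gender, ticket):
    """Conditional probability: cell count divided by the column total."""
    col_total = sum(c for (g, t), c in counts.items() if t == ticket)
    return counts[(gender, ticket)] / col_total


print(round(ticket_given_gender("yes", "male"), 3))    # (a) 0.483
print(round(gender_given_ticket("female", "no"), 3))   # (b) 0.731
print(round(marginal_ticket("yes"), 3))                # (c) 0.369
print(round(joint("female", "yes"), 3))                # (d) 0.21
print(round(ticket_given_gender("yes", "female"), 3))  # (e) 0.313
```

The only difference between the three kinds of probability is the denominator: the grand total for joint and marginal probabilities, a row or column total for conditional ones.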
2. Question 2 [7]
Consider the following Model 2 × 2 table.
Table 2.1: Model 2 × 2 Table
                      Characteristic B
Characteristic A      Present    Absent    Total
Present               n11        n12       n1+
Absent                n21        n22       n2+
Total                 n+1        n+2       n
(a) Under cross-sectional sampling, specify the distribution of the cell frequencies n11 , n12 , n21 and n22 . [1]
Multinomial distribution.
(b) Which total counts are considered fixed under cross-sectional sampling? [1]
Total count n.
(c) Under prospective sampling, specify the distribution of the cell frequency
n21 [1]
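Binomial distribution: Binomial(n2+, π2), where π2 denotes the probability that characteristic B is present given that characteristic A is absent.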
(d) Which total counts are considered fixed under prospective sampling? [1]
Row totals n1+ and n2+ (and therefore also Total count n = n1+ + n2+).
(e) Under retrospective sampling, specify the distribution of the cell frequency
n12 [1]
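Binomial distribution: Binomial(n+2, π2), where π2 here denotes the probability that characteristic A is present given that characteristic B is absent.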
(f) Which total counts are considered fixed under retrospective sampling? [1]
Column totals n+1 and n+2 (and therefore also Total count n = n+1 + n+2 ).
(g) Specify the distribution of the cell frequency n21 when all marginal totals
are fixed. [1]
Hypergeometric distribution
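To make the role of the fixed totals concrete, the following Python sketch simulates a cross-sectional and a prospective sample. It is an illustration only: the cell and row probabilities passed in are hypothetical values chosen for the example, and the function names are not taken from the course material.

```python
import random

CELLS = ["n11", "n12", "n21", "n22"]


def cross_sectional_sample(n, cell_probs, seed=0):
    """Only the total n is fixed: each unit falls into one of the four
    cells, so the cell counts are jointly multinomial."""
    rng = random.Random(seed)
    draws = rng.choices(CELLS, weights=cell_probs, k=n)
    return {cell: draws.count(cell) for cell in CELLS}


def prospective_sample(n1_plus, n2_plus, p1, p2, seed=0):
    """The row totals are fixed: within each row, the count with
    characteristic B present is binomial."""
    rng = random.Random(seed)
    n11 = sum(rng.random() < p1 for _ in range(n1_plus))
    n21 = sum(rng.random() < p2 for _ in range(n2_plus))
    return {"n11": n11, "n12": n1_plus - n11,
            "n21": n21, "n22": n2_plus - n21}


# Hypothetical inputs, for illustration only.
cs = cross_sectional_sample(200, [0.2, 0.3, 0.1, 0.4])
ps = prospective_sample(100, 100, p1=0.4, p2=0.2)

print(cs, sum(cs.values()))                       # total is always 200
print(ps, ps["n11"] + ps["n12"], ps["n21"] + ps["n22"])  # rows are always 100
```

Repeating either call with different seeds changes the cell counts, but the totals that the sampling scheme fixes never change.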
3. Question 3 [4]
Before and during the 2010 World Cup the most convenient way to acquire tickets
for a World Cup match was via the internet, with credit card payment. Therefore,
a study was carried out on 27 June 2010 to investigate whether credit card
ownership was associated with having a ticket for Germany's famous World Cup
victory over England (4:1 !!) in Bloemfontein. That is, the research question
was the following: Is an owner of a credit card more likely to have a ticket and
attend the Germany-England match than somebody who does not own a credit
card? In the stadium during the match, a sample of 100 adult spectators were
asked whether or not they owned a credit card. At the same time, but outside
the stadium in the Waterfront shopping centre, a control sample of 100 adults
were asked the same question.
Was this a prospective, a retrospective or a cross-sectional study? Motivate your
answer. [4]
This was a retrospective study. The explanatory variable is ownership of credit
card, and the outcome variable is attendance at the Germany-England match.
Two samples of fixed size (100 adults each) were taken for the two outcome
categories (100 adults attending the match and 100 adults not attending the
match), and then the presence or absence of the explanatory characteristic was
determined.
4. Question 4 [13]
Let π be the probability that a student registered for STK114 attends class.
A random sample of size n = 120 of students registered for STK114 is taken,
and the random variable X denotes the number of students in the sample who
are actually found to be in class. We assume that X follows the binomial
distribution. The probability that exactly X = n1 members of the sample are
observed to be in class is denoted by Bin(n1, n, π), and is given by the probability
function
\[
\mathrm{Prob}(X = n_1) = \mathrm{Bin}(n_1, n, \pi) = \binom{n}{n_1} \pi^{n_1} (1 - \pi)^{n - n_1} = \frac{n!}{n_1!\,(n - n_1)!}\, \pi^{n_1} (1 - \pi)^{n - n_1}
\]
\[
\sum_{i=0}^{n_1} \mathrm{Bin}(i, n, \pi_0) = \sum_{i=0}^{n_1} \frac{n!}{i!\,(n - i)!}\, \pi_0^{i} (1 - \pi_0)^{n - i} = \sum_{i=0}^{85} \frac{120!}{i!\,(120 - i)!}\, 0.8^{i}\, 0.2^{120 - i}
\]
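The cumulative binomial sum above is tedious to evaluate by hand. A minimal Python sketch, assuming the values n = 120, n1 = 85 and π0 = 0.8 that appear in the sum, evaluates it term by term; the function names are illustrative.

```python
from math import comb


def binom_pmf(i, n, p):
    """Bin(i, n, p): probability of exactly i successes in n trials."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def binom_lower_tail(n1, n, p):
    """Sum of Bin(i, n, p) for i = 0, ..., n1."""
    return sum(binom_pmf(i, n, p) for i in range(n1 + 1))


# Values read off the final sum in the text.
p_lower = binom_lower_tail(85, 120, 0.8)
print(p_lower)
```

Summing over all i from 0 to n returns 1, which is a quick sanity check on the probability function.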
5. Question 5 [16]
A cross-sectional study was carried out to investigate the attitude of students at
the UFS to compulsory class attendance. A random sample of students was drawn
from the total student population, and cross-classified according to whether or
not they were postgraduates, and whether or not they approved of compulsory
class attendance. The data for the sub-sample of students who study Actuarial
Science is as follows:
Table 5.1: Student seniority and approval of compulsory classes
                          Approve compulsory class attendance
Seniority                 Yes      No      Total
Postgraduate student        5       2          7
Undergraduate student       2       8         10
Total                       7      10         17
(7 × 7)/17 = 2.88 < 5
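The expected frequency above is the product of a row total and a column total divided by the grand total. The sketch below computes it for every cell, assuming the margins 7, 10 (rows) and 7, 10 (columns) with n = 17, as implied by the calculation above and by the proportions 5/7 and 2/10 quoted in part (e); the variable names are illustrative.

```python
# Margins of Table 5.1 (assumed from the calculation 7 x 7 / 17 and part (e)).
row_totals = {"postgraduate": 7, "undergraduate": 10}
col_totals = {"yes": 7, "no": 10}
n = 17

# Expected count under independence: row total * column total / n.
expected = {
    (row, col): r_total * c_total / n
    for row, r_total in row_totals.items()
    for col, c_total in col_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 2))

smallest = min(expected.values())
print(round(smallest, 2), smallest < 5)  # 2.88 True
```

Because the smallest expected count falls below 5, the usual large-sample approximation is doubtful and an exact test is the safer choice.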
(c) Specify the 5 easy steps to test a null-hypothesis for Fisher's exact test
for a 2 × 2 table. [5]
i. Null-hypothesis: H0 : No association between row and column variable
ii. Collect data: The data in Table 5.1 above
iii. Test statistic: count n11 (say); exact distribution of n11: Hypergeometric
n11    n12    n21    n22    Probability
 7      0      0     10     0.0001
 6      1      1      9     0.0036
 5      2      2      8     0.0486
 4      3      3      7     0.2160
 3      4      4      6     0.3779
 2      5      5      5     0.2721
 1      6      6      4     0.0756
 0      7      7      3     0.0062
Calculate the exact two-sided P-value and the exact one-sided P-value for
testing the null-hypothesis. [3]
The two-sided P-value is the probability of the observed table (0.0486) plus
the probabilities of the more extreme tables, that is, tables whose
probabilities are smaller than 0.0486.
Thus P = 0.0486 + 0.0036 + 0.0001 + 0.0062 = 0.0585.
The one-sided P-value is the sum of the probabilities of the observed table and
of the tables that are more extreme in the same direction:
P = 0.0486 + 0.0036 + 0.0001 = 0.0523 (0.0522 if the unrounded probabilities
are summed).
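The table probabilities and both P-values can be reproduced from the hypergeometric distribution. The sketch below assumes the margins 7, 10, 7, 10 with n = 17 and an observed count n11 = 5 (the table with probability 0.0486); these values are inferred from the surrounding answers, and the names are illustrative.

```python
from math import comb

ROW1, COL1, N = 7, 7, 17  # first row total, first column total, grand total
OBSERVED = 5              # assumed observed n11


def hypergeom_pmf(k):
    """P(n11 = k) when all marginal totals are fixed."""
    return comb(ROW1, k) * comb(N - ROW1, COL1 - k) / comb(N, COL1)


probs = {k: hypergeom_pmf(k) for k in range(min(ROW1, COL1) + 1)}
for k in sorted(probs, reverse=True):
    print(k, round(probs[k], 4))

p_obs = probs[OBSERVED]
# One-sided: the observed table and tables more extreme in the same direction.
one_sided = sum(p for k, p in probs.items() if k >= OBSERVED)
# Two-sided: all tables at most as probable as the observed one
# (a tiny tolerance guards against floating-point ties).
two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))

print(round(one_sided, 4), round(two_sided, 4))
```

Summing the unrounded probabilities gives about 0.0522 (one-sided) and 0.0584 (two-sided); the small differences from sums of the rounded table entries are rounding effects.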
(e) What can you conclude from the data in Table 5.1 and from the result of
the hypothesis test? [3]
Considering the observed data, there is an apparent association between
seniority and approval of compulsory class attendance: 5/7=71% of postgraduate students, but only 2/10=20% of undergraduate students approve