Module 1
STA60004
Contents
Learning Objectives
Optional Reading
Research Designs
2015 Semester 2
Learning Objectives:
On completion of this topic you will be able to:
1. Explain the characteristics, benefits, and concerns of basic research designs;
2. Explain the difference between different research designs;
3. Understand basic steps involved in developing a survey project;
4. Formulate and clarify research objectives;
5. Outline the difference between explanatory and descriptive research;
6. Explain the ethical principles that should guide survey research design;
7. Describe the main components of an informed consent form.
Optional Reading
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press
(Chapters 1 and 2).
(The book is available at Swinburne Library and Swinburne Bookshop.)
Chapter 1: The nature and process of social research.
http://onlineres.swin.edu.au.ezproxy.lib.swin.edu.au/1134862.pdf
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin
(Chapters 1, 3 and 5).
(The book is available at Swinburne Bookshop, Swinburne Library and as online
resource (E-book*) through Swinburne Library.)
*E-book web-link:
http://www.swin.eblib.com.au.ezproxy.lib.swin.edu.au/patron/FullRecord.aspx?p=1111585&echo=1&userid=Q5n5hJLVgt5fL7c6yQENgA%3d%3d&tstamp=1394251339&id=A0028CD2E072B4F2CD91B85563B3B88863439F44
(Electronic edition: de Vaus, D.A. (2013). Surveys in social research (5th ed.). Taylor
& Francis.)
Research Designs
A research design provides a framework for the collection and analysis of data
(Bryman, 2012). There are four major types of research design:
Experimental Design;
Cross-Sectional Design or Survey Research;
Longitudinal Design;
Case Study Design
The level of smoking in each group was measured before the QUIT program began
and six months after. It was found that in the experimental group ten per cent fewer
smoked by Time 2 and in the control group three per cent fewer smoked by Time 2.
A reduction of three per cent among the control group was likely to be due to factors
not related to the program. The effect of the QUIT program was measured by the
difference in the amount of change between the experimental and the control group.
E1, E2, C1, C2 - the measure of the dependent variable (E experimental group, C control group).
Echange = E2 - E1 = 10%
Cchange = C2 - C1 = 3%
Effect = Echange - Cchange = 7%
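The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name program_effect and the baseline figures (40% of each group smoking at Time 1) are assumptions made for the example, since the notes report only the sizes of the changes. The sketch computes each change as a reduction (Time 1 minus Time 2) so that the figures come out positive, as they are quoted in the notes.

```python
def program_effect(e1, e2, c1, c2):
    """Return the reduction in the experimental group minus the
    reduction in the control group, in percentage points."""
    e_change = e1 - e2  # reduction in the experimental group
    c_change = c1 - c2  # reduction in the control group
    return e_change - c_change


# Hypothetical baselines: 40% of each group smoke at Time 1.
effect = program_effect(e1=40, e2=30, c1=40, c2=37)
print(effect)  # 7
```

Subtracting the control group's change removes the part of the reduction that would probably have happened without the program.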
In a cross-sectional survey, data are collected at one point in time from a sample
selected to describe some larger population at that time.
For the study of smoking behaviour, discussed in the Experimental Design section, we
would ask a sample of people about their level of smoking some time after an anti-smoking campaign.
Effect = E2 - ?
In this case we will not be able to tell anything about the effectiveness of the
campaign. We need to have an empirical frame of reference against which to
compare the 30% figure. Otherwise we cannot say anything about the causal
process. This type of survey research is sometimes referred to as a one-group
post-test-only design.
A better option is to collect measures from two groups of people at one point of time
and to compare the extent to which groups differ on the level of smoking. For
example, we would obtain a random sample of people after the anti-smoking
campaign and ask those people about their current level of smoking and how aware
they were of the anti-smoking campaign. We could then divide people into groups
according to how well aware they had been of the campaign. If the campaign was
successful we would expect that those with the greatest awareness would also have
the lowest level of smoking. We may then conclude that the campaign was effective.
Effect = E2 - C2 = 7%
However, the high and low awareness groups might differ in other ways, not just in
their awareness of the campaign (e.g. high awareness participants might be older, be in
poorer health, etc.).
The experimental design differs from the survey design in that the variation
between the attributes of people is created by the researcher's intervention,
in order to see whether the intervention generates a difference. A survey approach does
not create the variation but relies on naturally occurring variation. The problem
with survey research is that we cannot be sure that the two groups are similar in
other respects, whereas the experimental researcher begins with two similar groups
and the only difference is that only one group receives the treatment. Therefore any
difference in the dependent variable can be attributed to the treatment.
In many cases we do not need to test causal propositions. For example, if we want
to determine voting intentions in the upcoming elections, a cross-sectional survey will
be the best option.
The ultimate goal of survey design is to allow researchers to generalize about a large
population by studying only a small portion of that population. Therefore, if you need
personal, self-reported information that is not available elsewhere and if
generalization of research findings to a larger population is desired, survey research
is the most appropriate method.
Survey research is usually used when there is a need for information about a group
of people or organisations, which is not available from any other source. In some
cases, this is because we want to know facts that are difficult to observe
systematically. For example, some crimes are reported to police, many are not. A
way to estimate the rate at which people are victims of crime is to ask a sample of
people about their victimisation experience. Sometimes we are interested in
measuring phenomena that only individuals themselves can perceive: what people
think or know, or what they feel. Very often the best way to find out what people like
and believe is to ask them. You cannot assume that people think in certain ways
without asking them what they think.
Surveys are used for collecting information from or about people to describe,
compare, predict, or explain people's behaviour, attitudes, opinions and values. In
other words, surveys are used to answer four broad classes of questions:
1. The prevalence of attitudes, beliefs, and behaviour;
2. Changes in attitudes, beliefs, and behaviour over time;
3. Differences between groups of people in their attitudes, beliefs, and behaviour;
and
4. Causal propositions about these attitudes, beliefs, and behaviour.
1. Prevalence of Attitudes, Beliefs, and Behaviour
Surveys are most often used to measure the frequency of certain attitudes, beliefs,
and behaviour. For example, surveys can be used to see what proportion of the
public approves of the Prime Minister's performance (an attitude), what proportion of
the public believes that taxing emissions of carbon dioxide is imperative to reduce
global warming (a belief), and what proportion of the population has been
unemployed and looked for a job during the previous month (a behaviour).
2. Changes of Attitudes, Beliefs, and Behaviour Over Time
Measuring the prevalence of attitudes, beliefs, or behaviour is generally only of
limited interest. The proportions often mean little by themselves. For example, if 20%
of high school students have used drugs in the past month, it is important to know
whether that is higher or lower than in previous years, whether there is an increase
in drug use or whether it is diminishing. Longitudinal surveys are usually used for
measuring change.
separately among men and women. The logical goal of isolating relevant variables
by ruling out the influence of extraneous variables is considered to be the same for
both methods.
Longitudinal Designs
With a longitudinal design a sample is surveyed and then is surveyed again on at
least one further occasion. A longitudinal design allows the analysis of process and
change over time, which is difficult in a cross-sectional survey. Therefore
longitudinal designs may provide a stronger basis for causal inference.
The primary longitudinal designs are panel studies, trend studies and cohort
studies.
Panel Studies
Panel studies involve the collection of data over time from the same sample of
respondents. The sample for such a study is called the panel. For example, the
same people could be interviewed at successive elections to assess changes in
attitude and vote.
For the study of smoking behaviour, we would measure the smoking behaviour of a
representative sample of people before the anti-smoking campaign. After
participating in the QUIT program, participants' level of smoking would be reassessed.
The difference in smoking behaviour between Time 1 and Time 2 would provide a
measure of change over the period.
Effect = E2 - E1
The problem with the panel design is that, in comparison with the experimental
design, we don't know the extent to which comparable smokers who did not
Trend Studies
In trend studies a particular population is sampled and studied at different points in
time. While samples are of the same population, they are not composed of the same
people.
The voting polls conducted over the course of a political campaign are an example of
a trend study. At several times during the course of the campaign, samples of voters
are selected and asked for whom they will vote. By comparing the results of these
several polls, researchers might determine shifts in voting intentions.
For the study of smoking behavior, for example, we would measure smoking
behavior of a representative sample of people before implementing the anti-smoking
campaign. After the campaign we would measure the smoking behavior of another
representative sample.
Effect = E2 - E1
The problem with this design is that we cannot fully match the samples. Therefore
the effect observed between Time 1 and Time 2 might be due to sampling error
(differences between the samples).
Effect = E2 - E1
This type of design is referred to as a retrospective panel design. A problem with this
design is that respondents might not be able to report information accurately. The
farther back they are forced to reach into their memories, the less accurate the
information they provide is likely to be.
with low awareness. We would also try to ensure that both groups are similar with
regard to other variables (age, sex, etc.). If the level of smoking in the high
awareness group dropped more than in the low awareness group we might attribute
this difference to the effect of the campaign.
Echange = E2 - E1 = 10%
Cchange = C2 - C1 = 3%
Effect = Echange - Cchange = 7%
In any particular study different research designs can be used. Causes of industrial
disputes, for example, can be studied with the help of a survey of attitudes of
management and workers, a case study of a particular strike or a particular factory
and an experiment where groups of workers work under different conditions to find
out if this affects the frequency of disputes (de Vaus, p.5).
In many studies it may be appropriate to use a range of research designs. The best
studies are often those that combine more than one design, since each design
provides a different perspective on the subject under study.
Exercise 1
(de Vaus, Chapter 1, Exercise 4, p.8)
Imagine that you believe being unemployed leads to a loss of self-esteem. Briefly
contrast how the case study, the experiment and the survey research would differ in
their basic procedure for testing this proposition.
(Use Discussion Board/ Blackboard to discuss this exercise)
Exercise 2
(de Vaus, Chapter 3, Exercise 5, pp. 39-40)
For each of the following statements of research findings indicate the type of
research design that appears to have been employed and explain what is wrong with
the conclusions that are drawn. Concentrate on problems that arise from research
design problems.
a. Sixty-eight per cent of married people scored high on our index of conservatism
while only 38 per cent of single people scored high. Marriage makes people more
conservative.
c. In the early 1970s, before the end of the Vietnam War, surveys showed that
tertiary students had strong anti-American attitudes. Recent surveys have shown
that these feelings are no longer evident among students. Ending the Vietnam
War certainly improved the attitudes of students to the United States.
d. Old people attend church more often than young people. For example, 58 per
cent of those over 60 attend church regularly while only 22 per cent of those
under 25 do so. From this we can conclude that as people get older they become
more religious.
e. The average number of children per family now is 1.8; families are obviously
getting smaller these days.
f. To test the idea that having children makes people happier, a group of parents
were asked how happy they felt now compared with before they had children.
Eighty-seven per cent said they were happier now than before they had children.
From this we can conclude that having children improves people's happiness.
g. A HEADSTART program (a preschool educational program to help
disadvantaged children have a head start by the time they commence school)
was used to test the effectiveness of HEADSTART. A group of four-year-olds
from disadvantaged backgrounds were chosen to enter the program. IQ tests
were given at the beginning of the program and again at the end. There was an
average gain of ten IQ points over the period of the program. HEADSTART
increases children's IQ.
The terms "questionnaire" and "survey" are very often used interchangeably. Therefore it
is crucial to understand the difference between research design and research
method.
Research Question
  Research Design
    Experiment: Direct Measurement; Questionnaire; Interview; Observation; etc.
    Cross-Sectional Survey: Questionnaire; Interview; etc.
    Longitudinal Survey: Questionnaire; Interview; etc.
    Case Study: Direct Measurement; Questionnaire; Interview; Observation; etc.
From the diagram above you can see that questionnaires, as a method of data
collection, are not limited to survey research.
Example: Use of Questionnaires in Experimental Studies
In experimental studies, questionnaires can be used before, during, and after a
program or intervention. Data collected before the intervention may be used for:
Selecting groups to participate in a program;
Checking the support for a program;
Ensuring the comparability of groups;
Providing a basis for monitoring change.
(Figure 8.1 from Bryman (2012), reproduced with permission from Oxford University Press)
Exercise 3
Read the following article:
Lim, M.S.C., Hellard, M.E., & Aitken, C.K. (2005). The case of the disappearing
teaspoons: Longitudinal cohort study of the displacement of teaspoons in an
Australian research institute. British Medical Journal, 331, 498-500
http://www.biostat.jhsph.edu/courses/bio622/misc/Disappearing_teaspons.pdf
and comment on the importance of the research topic of the study.
(Use Discussion Board/ Blackboard to discuss this exercise)
Defining Terms
When developing a survey, you need to define all abstract, imprecise or ambiguous
terms in the survey objectives. In the previous example the imprecise terms are
"needs", "educational services", "characteristics", and "benefit". These terms are
ambiguous because no standard definition exists for any of them.
Descriptive Research
Descriptive studies deal with research questions of what things are like. Public
opinion polling, voter intention studies, unemployment rate surveys and the census
are examples of descriptive surveys.
Explanatory Research
The first step in explanatory research is to decide whether you are looking for
causes or consequences. For instance, if you are studying the recent increase in the
divorce rate, you may be interested in why it happened, or you may be interested in
the consequences of the increased divorce rate. In the first case, the increase in the
divorce rate is the dependent variable; in the second, it is the independent variable:
? (Independent Variables) -> Increase in divorce rate (Dependent Variable)
Increase in divorce rate (Independent Variable) -> ? (Dependent Variables)
You also can consider intervening variables. For example, you may research how
education affects income level via its effect on job:
Education (Independent Variable) -> Job (Intervening Variable) -> Income (Dependent Variable)
You should also be aware of extraneous variables. An extraneous variable is any
variable other than the independent variable that could cause a change in the
[Diagram: independent variable; dependent variable (child's religiousness); extraneous variable (parental religiousness)]
Cases      Sex      Age   Marital Status   Work Status
Person 1   Male     21    Single           Part time
Person 2   Female   28    Married          Unemployed
Person 3   Female   46    Divorced         Full time
Person 4   Male     34    Single           Full time
Person 5   Female   39    Married          Part time
Person 6   Male     26    Married          Full time
Person 7   Male     52    Separated        Full time
The case in the data grid refers to a unit of measurement or unit of observation.
Unit of measurement is the unit on which the researcher collects data. The cases in
the data grid are not necessarily people. The unit of measurement can be a country,
a year or an organisation.
The unit of analysis is the major entity that is being analysed in the study. The unit
of analysis should not be confused with the unit of observation. For different
analyses in the same study you may have different units of analysis.
Consider the following example. Imagine a researcher collects data on students from
different schools. There are three possible levels of generalization: the student, the
classroom, and the school. If the researcher wants to draw conclusions about
students, the student should be the unit of analysis. If the researcher wants to make
generalizations about schools, then the school should be the unit of analysis, and so on.
For example, if the researcher is comparing children on achievement test scores,
the unit of analysis is the individual student, because there is a score for each student
(see Table 2). If the researcher instead compares average classroom performance, then
the unit of analysis is the classroom. In this case, the data that go into the analysis are
the averages, not the individual students' scores, so the unit of analysis is the group.
Even though the data were collected at the student level, average scores are used in the
analysis (see Table 3). Therefore it is the statistical analysis carried out in the study
that determines the unit of analysis.
Table 2
Cases        Variable: Achievement Test Score
Student 1
Student 2
Student 3
Student 4
...

Table 3
Cases        Variable: Average Achievement Test Score
Class 1
Class 2
Class 3
Class 4
...
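The move from Table 2 to Table 3 can be sketched in Python. This is an illustration only: the scores, the allocation of students to classes, and the function name class_averages are all made up for the example and do not come from the notes.

```python
from statistics import mean

# Hypothetical student-level records: (student, classroom, test score).
records = [
    ("Student 1", "Class 1", 70),
    ("Student 2", "Class 1", 80),
    ("Student 3", "Class 2", 60),
    ("Student 4", "Class 2", 90),
]


def class_averages(rows):
    """Aggregate student scores so that the classroom becomes the unit of analysis."""
    by_class = {}
    for _student, classroom, score in rows:
        by_class.setdefault(classroom, []).append(score)
    return {classroom: mean(scores) for classroom, scores in by_class.items()}


averages = class_averages(records)
print(len(records))   # number of cases when the student is the unit of analysis
print(len(averages))  # number of cases when the classroom is the unit of analysis
```

The same data yield four cases in the first analysis and two in the second, which is why the choice of analysis fixes the unit of analysis.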
Exercise 4
(de Vaus, Chapter 3, Exercise 1, p. 39)
For each of the following statements say what unit of analysis is being used.
a. In the UK in 1998, for every 1000 women aged 20-24 there were 30.4 who had an
abortion.
b. In 1998 in the United States the average family in poverty would require an
additional US$6620 per year to get on or above the poverty line.
e. In the UK the official abortion rate per 1000 women aged 20-24 has changed as
follows:
1968 = 3.4
1970 = 10.5
1975 = 15.1
1980 = 18.7
1985 = 20.4
1990 = 28.1
1995 = 25.5
1998 = 30.4
Ethics in Research
Three important considerations need to be taken into account when developing
surveys:
Technical considerations (sample design, questionnaire construction, etc.);
Practical considerations (budget, deadlines, purpose of the research);
Ethical considerations.
In other words, a survey should be technically correct, practically efficient and
ethically sound (de Vaus, p. 58).
Ethical principles
1. Voluntary participation
2. Informed consent
3. No harm
4. Anonymity
5. Confidentiality
6. Privacy
1. Voluntary Participation
People should not be required to participate in a survey. This should be stated
explicitly. For example:
Although your participation in this survey will be greatly valued, you are not required
to participate. You can stop at any point or choose not to answer any particular
question. (de Vaus, p.60)
Although participation in surveys is voluntary, consider the following practices:
Governments can require by law that citizens participate in census collections
and certain surveys.
Some institutions can require people to complete forms (e.g. universities,
hospitals etc.). Reason: information can be useful for monitoring, planning,
reporting etc.
Sometimes surveys are administered through third parties (e.g., to students through
their teacher, or to employees through their supervisor). In those cases the third
parties must not see the responses before the surveys are returned to the
researcher.
Once data are collected, make sure that confidentiality is maintained. Any identifying
information (e.g., name and address) should be separated from respondents'
answers. This is done by giving cases ID numbers and keeping a separate file
in which these ID numbers are linked with the participants' names. This is usually
done if follow-up is required. Access to the file with respondents' names and
corresponding ID numbers should be restricted. If follow-up is not required then you
don't need to keep any record of participants' names.
The survey data must be confidentialised before publishing results. This can be done
by:
Removing information that could lead to identification;
Collapsing highly specific categories of variables so that individuals are placed into
broad groups.
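The two steps described above (keeping names in a separate, restricted file linked by ID number, and collapsing specific categories before publication) can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function names, the sample records, and the age bands are all assumptions made for the example.

```python
def separate_identifiers(responses):
    """Split responses into an ID-to-name link file and a de-identified data file."""
    link_file, data_file = {}, []
    for case_id, response in enumerate(responses, start=1):
        link_file[case_id] = response["name"]  # kept separately, access restricted
        answers = {k: v for k, v in response.items() if k != "name"}
        data_file.append({"id": case_id, **answers})
    return link_file, data_file


def age_band(age):
    """Collapse an exact age into a broad group (band boundaries are illustrative)."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50 and over"


# Hypothetical responses; the names are placeholders.
responses = [
    {"name": "A. Example", "age": 21, "work_status": "Part time"},
    {"name": "B. Example", "age": 46, "work_status": "Full time"},
]
link_file, data_file = separate_identifiers(responses)
published = [{**row, "age": age_band(row["age"])} for row in data_file]
print(published)
```

If no follow-up is planned, the link file would simply not be kept.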
Informed Consent Example:
Parent-Teacher Interview Satisfaction Survey
March 24, 2013
Dear Parent/Carer,
You have been randomly selected to participate in a survey that aims to
investigate parents' satisfaction level at parent-teacher interviews at XYZ
Secondary School. This study is being conducted by ABC Research
Consultancy, on behalf of XYZ Secondary School. This survey aims to provide
important information about your parent-teacher interview experience and help
guide the future of parent-teacher communication at our school.
You will be asked to complete a 10-minute anonymous questionnaire containing
35 questions about your experience at the parent-teacher interview you have just
attended. Although your participation is greatly valued, you do not have to
complete this survey. You can stop at any time or choose not to answer any
questions. Most questions will ask you to tick the appropriate box while others will
require you to circle a number best relating to your feelings about the parent-teacher interview.
6. Privacy
Privacy means that people can expect to be free from intrusion.
Ethical issues are dealt with in the codes of ethics of professional organisations. For
example,
National Statement on Ethical Conduct in Human Research:
http://www.nhmrc.gov.au/_files_nhmrc/publications/attachments/e72_national_statement_march_2014_140331.pdf
The Belmont Report identifies three fundamental ethical principles for all human
participant research: respect for persons, beneficence, and justice.
The principle of respect for persons includes two moral requirements: the
requirement to acknowledge autonomy and the requirement to protect those with
diminished autonomy. This means that individuals have a right to decide for
themselves whether to participate in research. The researchers should not use
information about participants without first getting their informed consent.
Beneficence involves two principles: (1) do no harm and (2) maximise possible
benefits and minimise possible harms.
Justice requires that participants are selected fairly and that the risks and benefits of
research are distributed equitably. For example, if research supported by the
government leads to the development of therapeutic devices and procedures, justice
demands that these developments will be available not just to those who can afford
them. Such research also should not unduly involve people from groups unlikely to
be among the beneficiaries of subsequent applications of the research.
Bibliography
Babbie, E.R. (2010). The practice of social research (12th ed.). Belmont: Wadsworth
Cengage.
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Fink, A. (2003). The survey kit, Volume 1: The survey handbook (2nd ed.). London:
Sage Publications.
Fink, A. (2003). The survey kit, Volume 6: How to design survey studies (2nd ed.).
London: Sage Publications.
Rea, L.M., & Parker, R.A. (2005). Designing and conducting survey research: A
comprehensive guide (3rd ed.). San Francisco: Jossey-Bass.
Trochim, W. (2000). The research methods knowledge base (2nd ed.). Cincinnati:
Atomic Dog Publishing.
Weisberg, H.F., Krosnick, J.A., & Bowen, B.D. (1996). An introduction to survey
research, polling, and data analysis (3rd ed.). Thousand Oaks: Sage Publications.
Additional Resources
Blackboard/ Learning Material/ Weekly Activities and Notes/ Week 1: Module 1
Topic 1/ Additional Resources
Visit Bryman's Social Research Methods (4th ed.) online resources:
http://www.oup.com/uk/orc/bin/9780199588053/
Contents
Learning Objectives
Optional Reading
Steps in Sampling Process
Sampling Concepts
1. Population (Target Population)
2. Census
3. Sample
4. Sampling Frame
5. Survey Population
6. Probability Sampling
7. Non-Probability Sampling
8. Sampling Error
9. Sampling Bias
10. Error in Survey Research
Probability Sampling
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
Non-Probability Sampling
1. Quota Sampling
2. Purposive or Judgment Sampling
3. Snowball Sampling
4. Availability or Convenience Sampling
Example
Exercise 1
Exercise 2
Exercise 3
Sample Size
Exercise 4
Bibliography
Answers to Selected Exercises
Learning Objectives:
On completion of this topic you will be able to:
1. Understand the differences between a sample and a population;
2. Understand the processes involved in selecting a sample;
3. Understand the differences between probability and non-probability sampling;
4. Understand the difference between simple random sampling, systematic
sampling, stratified sampling, and cluster sampling;
5. Understand the difference between quota sampling, purposive sampling,
snowball sampling and convenience sampling;
6. Explain the concepts of sampling error and non-response.
Optional Reading
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press
Chapter 8
(The book is available at Swinburne Library and Swinburne Bookshop.)
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin
Chapter 6
http://onlineres.swin.edu.au.ezproxy.lib.swin.edu.au/668013.pdf
Sampling Concepts
1. Population (Target Population)
2. Census
3. Sample
4. Sampling Frame
5. Survey Population
6. Probability Sampling
7. Non-Probability Sampling
8. Sampling Error
9. Sampling Bias
10. Error in Survey Research
2. Census
A census is the enumeration of an entire population. A census is obtained by
collecting information about every member of a population.
3. Sample
A sample is a segment or subset of a population. The best sample is representative
of the population. A representative sample is a sample that reflects the population
accurately. A sample is representative if important characteristics (e.g., age, gender,
etc.) of those within the sample are distributed similarly to the way they are
distributed in the population (the profile of the sample is the same as that of the
population).
If a sample is representative of the population, then we can make inferences
(generalizations) about the population based on the known characteristics of the
sample. Therefore, a sample enables us to learn about the characteristics of the
population without surveying every single member of the population.
Selected and Achieved Sample
The selected sample is the subset of the target population that has been chosen to
participate in a survey.
The achieved sample consists of those members of the selected sample who have
completed the questionnaire.
Samples vs. Census
Why should we use samples instead of a census?
Samples are less expensive to obtain;
Samples can be studied more quickly than entire target populations.
4. Sampling Frame
A sampling frame is a list of all members of the population. A sample is selected
from the sampling frame. The sampling frame should not contain:
- duplicate records (no member should appear more than once on the list);
Electoral roll;
Rates registers;
5. Survey Population
Theoretically a sample should be drawn from the target population. However, it is not
always possible to know how, or where, to contact each member of the target
population. In this case we use the survey population: a population which includes those
elements in the target population that can be reached for inclusion in the sample.
6. Probability Sampling
Probability sampling implies the use of random selection. Probability sampling
requires a sampling frame of members of the target population so that members of
the sample can be selected with an equal (or at least known) chance of selection.
Probability sampling allows researchers to utilise tests of statistical significance that
permit inferences to be made about the population from which the sample was
selected.
7. Non-Probability Sampling
This type of sampling is used when sampling frames are not available. Samples are
chosen based on judgment regarding the characteristics of the target population and
the needs of the survey. With non-probability sampling, some units of the target
population have a chance of being selected and others do not. By chance, the
survey's results may not be applicable to the target population at all. Non-probability
methods are usually used for situations in which precise representativeness is not
necessary. Non-probability sampling can be used when you study a relationship or
when developing a theory.
8. Sampling Error
Sampling error results naturally from selecting a sample, rather than measuring the
entire population. As the sample size increases, the sampling error decreases, until
the sample becomes a census in which there is no sampling error. It is important to
understand that probability sampling does not and cannot eliminate sampling error.
Consider Figures 1-5 to appreciate the significance of sampling error.
Example (Bryman (2012), pp. 188-190; Figures 1-5 are reproductions of Figures 8.3-8.7 with permission from Oxford University Press):
Imagine we have a population of 200 people and we want to obtain a sample of 50
people. One of the variables of interest is whether people watch soap operas or not.
Imagine the population is equally divided between those who watch and those who
do not watch soaps (see Figure 1). If the sample is representative of the population
we would expect our sample of 50 to be equally split in terms of the variable of
interest (Figure 2).
If there is a small amount of sampling error, the sample will look like Figure 3. The
sample in Figure 4 has a higher degree of over-representation of those who do not
watch soap operas.
In Figure 5 you can see a very serious over-representation of people who do not
watch soaps (15 people watch soap operas and 35 people do not).
Conceptually, sampling error is the degree to which the sample is not representative
of the population, and it arises naturally as a function of selecting a sample. The
main idea of getting a sample is that you can calculate a statistic and estimate the
corresponding population parameter. Standard error reflects the difference between
an estimate derived from a sample and the real value for the whole population. In
sampling contexts, the standard error is called the sampling error.
For example, you want to know how much time, on average, university students
spend on self-study per day. Imagine you obtain a random sample of university
students and calculate the sample mean. Say, the mean is 2.5 hours. How confident
are you that the mean of 2.5 hours is likely to be found in the population?
If you take an infinite number of samples from the population, the sample estimate of
the mean of the variable of interest (in our case self-study hours) will vary in relation
to the population mean. This variation will take the form of a normal distribution and is
called the sampling distribution (in our case it is the sampling distribution of means).
The standard deviation of the sampling distribution is referred to as the standard error. So,
in a sample, a standard deviation is the spread of the scores around the sample
mean. In a sampling distribution, the standard error is the spread of the sample
means around the population mean (the mean of the sample means).
Note: 95 percent of sample means will lie within the shaded area. SE = standard error of the mean.
(Figure 8.8 from Bryman (2012), reproduced with permission from Oxford University Press)
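The idea of a sampling distribution can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population of 10,000 students, the normally distributed self-study hours (mean 2.5, sd 1.0) and the function name `simulate_sample_means` are all hypothetical and are not taken from Bryman or from the notes.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: daily self-study hours for 10,000 students.
_rng = random.Random(42)
population = [max(0.0, _rng.gauss(2.5, 1.0)) for _ in range(10_000)]

# Take 2,000 samples of 100 students each and record every sample mean.
sample_means = simulate_sample_means(population, sample_size=100, n_samples=2_000)

# The spread of the sample means is the (empirical) standard error.
empirical_se = statistics.stdev(sample_means)
# For comparison: population sd divided by the square root of the sample size.
theoretical_se = statistics.pstdev(population) / 100 ** 0.5

print(f"population mean:      {statistics.mean(population):.3f}")
print(f"mean of sample means: {statistics.mean(sample_means):.3f}")
print(f"empirical SE:         {empirical_se:.3f}")
print(f"sd / sqrt(n):         {theoretical_se:.3f}")
```

The sample means cluster around the population mean, and their standard deviation is close to sd divided by the square root of n, which is what the standard error describes.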
To calculate sampling error we use the standard deviation (sd) of our sample:
$$SE = \frac{sd}{\sqrt{n}} \sqrt{1 - \frac{n}{N}}$$
where n = sample size, N = population size.
From this formula you can see that if the sample size is equal to the population size
(n = N), SE is zero:
$$SE = \frac{sd}{\sqrt{N}} \sqrt{1 - \frac{N}{N}} = 0$$
You can also see that the bigger the sample size, the smaller the sampling error.
A 95% confidence interval (CI) for the population mean is calculated as follows:
$$[\bar{x} - 1.96 \times SE,\ \bar{x} + 1.96 \times SE]$$
where $\bar{x}$ is the sample mean.
In the case of proportions, we use the observed proportion p in the sample:
$$SE = \sqrt{\frac{p(1-p)}{n}} \sqrt{1 - \frac{n}{N}}$$
The term $\sqrt{1 - n/N}$ is the finite population correction (fpc). If the
sample size n is a reasonably large fraction of the population size N, then fpc needs
to be included in the calculation of SEs and CIs. It is recommended that fpc should
be included if the sampling fraction $n/N$ is more than 10%. If the fpc is ignored,
the formula for the standard error of the mean becomes
$$SE = \frac{sd}{\sqrt{n}}$$
and the formula for the standard error of the proportion takes the form
$$SE = \sqrt{\frac{p(1-p)}{n}}$$
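The calculations in this section can be sketched in a few lines of Python. This is a minimal illustration, assuming the finite population correction takes the form sqrt(1 - n/N) and that 1.96 is used as the multiplier for a 95% confidence interval. The function names and the example figures (sd of 1.0 hour, sample of 100, population of 400) are hypothetical.

```python
import math


def se_mean(sd, n, N=None):
    """Standard error of the mean; the fpc is applied only when N is given."""
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt(1 - n / N)  # assumed form of the fpc
    return se


def se_proportion(p, n, N=None):
    """Standard error of a proportion; the fpc is applied only when N is given."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt(1 - n / N)  # assumed form of the fpc
    return se


def ci95(estimate, se):
    """95% confidence interval: estimate plus or minus 1.96 standard errors."""
    return (estimate - 1.96 * se, estimate + 1.96 * se)


# Hypothetical example: sample mean of 2.5 hours, sd of 1.0, n = 100.
se_no_fpc = se_mean(1.0, 100)           # fpc ignored
se_with_fpc = se_mean(1.0, 100, N=400)  # sampling fraction is 25%
low, high = ci95(2.5, se_no_fpc)

print(f"SE without fpc: {se_no_fpc:.4f}")
print(f"SE with fpc:    {se_with_fpc:.4f}")
print(f"95% CI without fpc: ({low:.3f}, {high:.3f})")
```

When the sample is a census (n equal to N) the correction term is zero, so `se_mean(1.0, 100, N=100)` returns 0.0.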
Draw an initial sample that is bigger than needed (be careful to pay attention
to the costs);
Send reminders to recipients of mailed and internet surveys and make repeat
phone calls to potential phone survey respondents;
Provide incentives.
Selecting an initial sample that is larger than needed may not solve the
problem of bias, however. Those who agree to participate may differ in various
ways from non-respondents. Non-response may introduce bias into a survey's
results because of the differences between respondents and non-respondents in
attitudes, patterns of behaviour and other potentially important factors.
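Response rate refers to the percentage of a sample that agrees to participate.
However, the calculation of the response rate is a bit more complicated:
$$\text{Response Rate} = \frac{\text{number of usable questionnaires}}{\text{total sample} - \text{unsuitable or uncontactable members of the sample}} \times 100$$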
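As a rough sketch, the response rate can be computed as below. It assumes the formula given by Bryman (2012): usable questionnaires divided by the total sample minus unsuitable or uncontactable members, multiplied by 100. The function name and the example counts are hypothetical.

```python
def response_rate(usable, total_sample, unsuitable_or_uncontactable=0):
    """Response rate as a percentage of the eligible, contactable sample."""
    eligible = total_sample - unsuitable_or_uncontactable
    if eligible <= 0:
        raise ValueError("no eligible sample members left")
    return 100 * usable / eligible


# Hypothetical survey: 500 people selected, 40 unsuitable or uncontactable,
# 276 usable questionnaires returned.
rate = response_rate(usable=276, total_sample=500, unsuitable_or_uncontactable=40)
print(f"Response rate: {rate:.1f}%")
```

Dividing by the full selected sample of 500 instead would understate the rate, because people who could never have responded would be counted as non-respondents.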
(Figure 8.9 from Bryman (2012), reproduced with permission from Oxford University Press)
Probability Sampling
Types of probability sampling:
1. Simple random sampling;
2. Systematic sampling;
3. Stratified sampling;
4. Cluster sampling.
2. Systematic sampling
Systematic sampling is similar to simple random sampling.
The advantage of systematic sampling is that it is mechanically easier to carry out. With
this type of sampling, you select units directly from the sampling frame, without using
a table of random numbers.
Steps in selecting a systematic sample:
Obtain a sampling frame;
Determine the population size (e.g., 200);
Decide on the sample size (e.g., 50);
Calculate a sampling fraction: divide the population size by the sample size
(200/50=4);
Select a starting point by choosing a number that falls within the sampling fraction
(a number between 1 and 4, e.g., select number 3);
Use the sampling fraction to select every nth case. (In our example, select every
4th case and obtain 50 cases. So the sequence will be: number 3, number 7, 11,
15, ...).
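The steps above can be sketched as a short Python function. This is an illustration only: the function name, the `start` parameter and the numbered frame are hypothetical. The code calls the value 200/50 = 4 the interval, which is what these notes refer to as the sampling fraction.

```python
import random


def systematic_sample(frame, sample_size, start=None, seed=None):
    """Select every k-th unit from the frame, beginning at a random start."""
    interval = len(frame) // sample_size  # e.g. 200 // 50 = 4
    if interval < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    if start is None:
        # Random starting point between 1 and the interval (1-based position).
        start = random.Random(seed).randint(1, interval)
    positions = range(start, start + interval * sample_size, interval)
    return [frame[pos - 1] for pos in positions]


# Hypothetical sampling frame: units numbered 1 to 200.
frame = list(range(1, 201))

# Reproduce the example in the text: start at 3, then take every 4th case.
example = systematic_sample(frame, 50, start=3)
print(example[:4], "...", example[-1])
```

With a starting point of 3 the selected positions are 3, 7, 11, 15 and so on, giving 50 cases in total.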
Problems with systematic sampling:
Similar to problems associated with simple random sampling.
Additional problem:
Periodicity of sampling frames: a certain type of person may reoccur at regular
intervals within the sampling frame. For example, if the sampling frame is a list of
names, you can obtain a sample that lacks names that appear infrequently, say
names beginning with the letter X. Another example: suppose we have a list of
married couples arranged so that every husband's name is followed by his wife's
name. If the sampling fraction is any even number, the sample will contain only
husbands or only wives, depending on the starting point.
Systematic sampling should not be used if repetition is a natural component of
the sampling frame. You should reorder the list or adjust the sampling intervals in
order to use systematic sampling.
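The periodicity problem with the married-couples list can be checked with a small sketch. The list of 100 couples and the helper `every_kth` are hypothetical and serve only to illustrate the point.

```python
def every_kth(frame, k, start):
    """Take every k-th unit, beginning at the 1-based position `start`."""
    return frame[start - 1::k]


# Hypothetical frame: each husband is immediately followed by his wife.
couples = [f"{role}_{i}" for i in range(1, 101) for role in ("husband", "wife")]

# Even interval: the sample is made up entirely of one half of each couple.
even_start_2 = every_kth(couples, k=4, start=2)  # positions 2, 6, 10, ...
even_start_1 = every_kth(couples, k=4, start=1)  # positions 1, 5, 9, ...

# Odd interval: husbands and wives alternate in the sample.
odd_interval = every_kth(couples, k=5, start=1)  # positions 1, 6, 11, ...

print(even_start_2[:3])
print(even_start_1[:3])
print(odd_interval[:4])
```

With an interval of 4 the sample contains only wives when the starting position is even and only husbands when it is odd, which is why the list should be reordered or the interval changed.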
3. Stratified sampling
Stratified sampling is a modification of simple random sampling designed to
produce more representative samples. For a sample to be representative the
proportions of various groups in the sample should be the same as in the population.
For example, if in a study the ethnic background of respondents is expected to affect
participants' responses, then we need to ensure that each ethnic group is
represented in the sample in its correct proportion.
Steps in selecting a stratified sample:
Select the stratifying variable (e.g., ethnic background);
Divide the sampling frame into separate lists (strata) one for each category of
the stratifying variable;
Select a simple random or systematic sample from each stratum.
Advantage of stratified sampling:
It ensures representation from each stratum. Hence a more representative
sample is obtained.
Disadvantages:
Similar to problems associated with simple random sampling.
Additional problems:
More complicated than simple random and systematic sampling;
Strata must be identified and justified.
There are two types of stratified sampling: proportionate stratified sampling and
disproportionate stratified sampling. In proportionate stratified sampling, the number
of units allocated to the various strata is proportional to the representation of the
strata in the population. This type of sampling is used when you need to estimate a
population parameter.
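Proportionate stratified sampling can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name, the three strata labelled A, B and C, and their sizes are hypothetical. Because each stratum's allocation is rounded, the total can differ slightly from the requested sample size when the proportions do not divide evenly.

```python
import random
from collections import Counter, defaultdict


def proportionate_stratified_sample(frame, stratum_of, sample_size, seed=None):
    """Draw a simple random sample from each stratum, sized in proportion
    to that stratum's share of the sampling frame."""
    rng = random.Random(seed)

    # Divide the sampling frame into separate lists, one per stratum.
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Allocation proportional to the stratum's share of the frame.
        k = round(sample_size * len(members) / len(frame))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical frame of 1,000 units: 600 in stratum A, 300 in B, 100 in C.
frame = (
    [("A", i) for i in range(600)]
    + [("B", i) for i in range(300)]
    + [("C", i) for i in range(100)]
)

sample = proportionate_stratified_sample(frame, lambda unit: unit[0], 50, seed=1)
counts = Counter(unit[0] for unit in sample)
print(dict(counts))
```

The sample of 50 contains 30, 15 and 5 units from the three strata, matching their 60%, 30% and 10% shares of the frame.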
4. Cluster sampling
A cluster is a naturally occurring unit, such as a school, a university, a hospital, a
city, or a state. Cluster sampling is usually performed when a proper sampling frame
is not available. For example, you cannot obtain a list of all patients in city hospitals
or all members of sporting clubs, but you can obtain lists of hospitals and sporting
clubs.
Steps in selecting a cluster sample:
Clusters are randomly selected;
All members of the selected clusters are included in the sample.
Multistage cluster sampling (an extension of cluster sampling):
Clusters are randomly selected;
A sample is drawn from the cluster members by simple random or systematic
sampling.
Example: Steps in sampling the population of a city for which there was no
sampling frame of residents:
Divide the city into clusters (e.g., electorates);
Select a random sample of these clusters;
Obtain a list of smaller areas of selected electorates (e.g., suburbs);
Obtain a random sample of suburbs within each of the selected electorates;
For each selected suburb obtain a list of addresses of households;
Select a random sample of addresses within the selected suburbs.
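The city example can be sketched as a multistage selection over nested lists. Everything concrete here is an assumption made for illustration: the nested dictionary standing in for the city, the 10 electorates with 6 suburbs of 40 addresses each, and the numbers selected at each stage.

```python
import random


def multistage_sample(city, n_electorates, n_suburbs, n_addresses, seed=None):
    """Randomly select electorates, then suburbs within them, then addresses.

    `city` maps electorate -> suburb -> list of household addresses.
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: random sample of clusters (electorates).
    for electorate in rng.sample(sorted(city), n_electorates):
        suburbs = city[electorate]
        # Stage 2: random sample of suburbs within each selected electorate.
        for suburb in rng.sample(sorted(suburbs), n_suburbs):
            # Stage 3: random sample of addresses within each selected suburb.
            selected.extend(rng.sample(suburbs[suburb], n_addresses))
    return selected


# Hypothetical city: 10 electorates, each with 6 suburbs of 40 addresses.
city = {
    f"E{e}": {
        f"S{s}": [f"E{e}-S{s}-A{a}" for a in range(1, 41)]
        for s in range(1, 7)
    }
    for e in range(1, 11)
}

sample = multistage_sample(city, n_electorates=3, n_suburbs=2, n_addresses=10, seed=7)
print(len(sample), sample[:3])
```

Only the lists for the selected electorates and suburbs are ever needed, which is the practical appeal of cluster sampling when no frame of residents exists.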
Non-Probability Sampling
Types of Non-Probability Sampling:
1. Quota sampling;
2. Purposive or judgment sampling;
3. Snowball sampling;
4. Availability or convenience sampling.
1. Quota Sampling
Quota sampling aims to produce representative samples without random selection of
cases. The population is first divided into subgroups (e.g., males and females,
younger and older). The proportion of people who fall into each subgroup (e.g.,
younger and older males and younger and older females) is then estimated. Quotas
are then assigned for the interviewer to complete the required number of interviews
within each group. For example, in a telephone survey, the interviewer will be
required to work through the list until the required number is achieved. In quota
sampling the sample is arranged so that it mirrors the population with respect to the
defined groups.
Quota sampling is non-random because interviewers can select any cases that fit
specific criteria. Unlike stratified random samples, quota samples are not selected
with a known probability; therefore the sampling error cannot be determined.
3. Snowball sampling
Respondents are referred to the researcher via word-of-mouth in situations where it
is difficult to locate the members of the population. After interviewing, those
respondents are asked to identify other members of the population. As newly
identified members name others, the sample snowballs. The process is repeated
until the required sample size is achieved.
This technique is used when a population listing is not available and cannot be
compiled. For example, surveys of teenage gang members, illegal immigrants and
marijuana users might use snowball sampling.
Example
Bryman (2012), Chapter 8: Sampling
Kinsey et al. (1948): Sexual Behavior in the Human Male
Alfred Kinsey was a sexologist who suspected that there was a greater diversity of sexual
behaviour in the USA than had so far been acknowledged. He set out to investigate this by
collecting the personal narratives of 18,000 men (and later a sample of women), inviting
them to write about their sexual life histories. There were two main stages of recruitment in
this study. At first, Kinsey was content to use a non-probability, snowball sample of
volunteers, including students, prison inmates, colleagues and members of gay clubs, who
would put him in touch with other people they knew. He justified this on the grounds that
sexuality was a private and sensitive issue for which random, probability sampling would be
inappropriate: he could not really approach strangers on the street and ask them to provide
an honest and detailed account of their sexual experiences! Later, Kinsey was criticised for
using unscientific methods, for it was argued that his sample was biased towards those who
were forthcoming enough to volunteer their personal stories: indeed, these people did report
a higher level of sexual activity than those in his second sample. The latter was what Kinsey
called a multistage cluster sample of 100% samples, although O'Connell Davidson &
Layder (1994) argue that this technique was not rigorous enough to be a probability sample.
Kinsey had identified geographical clusters across the USA and divided these into further
sub-clusters; he then broke these down into cells representing different social groups,
stratified by age, residence, religious affiliation and so on. He aimed to collect data from
every male adult in each cell, but the cells themselves had not been randomly selected;
Kinsey chose those that were most convenient and accessible to him. This meant that
although the sample was very large, it was not drawn randomly and systematically from a
sampling frame and so could not be deemed representative of the American male
population. It is also difficult to know whether the high response rate of those who
volunteered for the study concealed another segment of the population who had not
responded and who might have had quite different stories to tell.
Sources:
Kinsey, A., Pomeroy, W. & Martin, C. (1948). Sexual behavior in the human male.
Philadelphia: W.B Saunders.
O'Connell Davidson, J. & Layder, D. (1994). Methods, sex and madness. London: Routledge.
Exercise 1
(Fink, V.7, p.63)
Name the sampling method used in each of these four scenarios.
a) Two of four software companies are randomly chosen to participate in a new
work-at-home program. Thirty employees are selected at random from each of
the two companies and asked to complete a self-administered questionnaire by
electronic mail.
b) Each of the rangers surveyed at five national parks is asked to recommend two
other rangers to participate.
first question, "Thank you for your time. That's all I needed to know." She
continues to recruit organic buyers until the required number of 100 is met.
f)
Exercise 2
(de Vaus, p. 92)
Think of a research topic in which you would need to use non-probability sampling
techniques and explain why a probability sample would not be feasible.
(Use Discussion Board/ Blackboard to discuss this exercise)
Exercise 3
Think of a research topic in which you would use stratified sampling techniques
instead of simple random or systematic sampling. Explain why.
(Use Discussion Board/ Blackboard to discuss this exercise)
Sample Size
Sample size depends on:
The degree of accuracy we require for the sample.
Defined by:
- Sampling error: the amount of error we are willing to accept in our estimated
value (e.g., to within ±2% of the estimated value);
Table 1
Sample Sizes Required for Various Sampling Errors at 95% Confidence Level

Margin of Error (%)    Sample size (for 50/50 split on the variable)
 1.0                   9604
 1.5                   4268
 2.0                   2401
 2.5                   1537
 3.0                   1067
 3.5                    784
 4.0                    600
 4.5                    474
 5.0                    384
 5.5                    317
 6.0                    267
 6.5                    227
 7.0                    196
 7.5                    171
 8.0                    150
 8.5                    133
 9.0                    119
 9.5                    106
10.0                     96
For example, if in a sample of 2401 respondents it was found that 50% intended to
vote for the Labour Party, we can be 95% confident that 50% ± 2% (i.e. between
48% and 52%) of the population intends to vote Labour.
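The figures in Table 1 can be reproduced with a short calculation. The sketch below assumes the usual sample size formula for a proportion, n = 1.96^2 p(1 - p) / e^2, with p = 0.5 for a 50/50 split. The function name is hypothetical, and the result is rounded to the nearest whole number because that matches the table (many texts round up instead).

```python
def sample_size_for_proportion(margin, p=0.5, z=1.96):
    """Sample size needed to estimate a proportion to within `margin`
    (expressed as a fraction, e.g. 0.02 for 2%) at 95% confidence."""
    return round(z ** 2 * p * (1 - p) / margin ** 2)


# Reproduce a few rows of Table 1.
for margin_percent in (1.0, 2.0, 4.0, 5.0, 10.0):
    n = sample_size_for_proportion(margin_percent / 100)
    print(f"{margin_percent:>5.1f}%  {n}")
```

Because the margin of error is squared in the denominator, halving it (for example from 4% to 2%) multiplies the required sample size by about four.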
You can use the following website to calculate the required sample size for your
study:
http://www.surveysystem.com/sscalc.htm
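The sample size required to estimate a proportion p to within a margin of error e at
the 95% confidence level is
$$n = \frac{1.96^2 \, p(1-p)}{e^2}$$
For example, if the amount of error we are willing to accept is 2% (0.02), then for a
50/50 split (p = 0.5) the required sample size would be
$$n = \frac{1.96^2 \times 0.5 \times 0.5}{0.02^2} = 2401$$
When the quantity of interest is a mean, the corresponding formula is
$$n = \left(\frac{1.96 \times sd}{e}\right)^2$$
This equation requires a value of sd, the estimated standard deviation of the variable we
are going to measure in the survey. As we don't know the value of sd before
conducting the study, we have to resort to a reasonable guess, either from a
previous study or a pilot study. Or we can rely on the fact that approximately 95% of
the normal distribution lies within two standard deviations of the mean. So we can
use the formula sd ≈ range/4 to obtain a rough value of sd for the sample
size formula.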
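A rough sketch of the sample size calculation for a mean follows. It assumes the formula n = (1.96 sd / e)^2 and the rule of thumb that sd is roughly the range divided by four. The function names and the example (self-study hours between 0 and 8, margin of error of 0.25 hours) are hypothetical. Here the result is rounded up so that the achieved precision is at least as good as requested.

```python
import math


def sd_from_range(lowest, highest):
    """Rough sd estimate: about 95% of a normal distribution lies within
    two standard deviations of the mean, so the range spans about four sd."""
    return (highest - lowest) / 4


def sample_size_for_mean(sd, margin, z=1.96):
    """Sample size needed to estimate a mean to within `margin`."""
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical example: daily self-study is expected to fall between 0 and 8 hours.
sd_guess = sd_from_range(0, 8)                      # 2.0 hours
n_required = sample_size_for_mean(sd_guess, 0.25)   # within a quarter of an hour
print(f"sd guess: {sd_guess}, required sample size: {n_required}")
```

A pilot study or an earlier survey would normally give a better value of sd than the range-based guess.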
Points to consider:
As can be seen from Table 1, for small samples a small increase in sample size
will lead to a substantial increase in accuracy. For example, increasing the
sample from 96 to 150 respondents reduces sampling error by 2 percentage points (from 10% to
8%). To reduce error from 4% to 2% you would require 2401 respondents
instead of 600. The rule is: to halve the sampling error you need to
quadruple the sample size.
Considerations of sample size are likely to be affected by matters of time and
cost because very big samples are decreasingly cost efficient.
The size of the population is irrelevant for the accuracy of the sample. In
other words, the size of the population from which a sample of a particular size is
drawn has no impact on how well the sample is likely to describe the population.
A sample of 200 people will describe a population of 10,000 and 10 million with
the same degree of accuracy.
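The notion that "the size of the population is irrelevant for the accuracy of the
sample" relates to very large populations. From the formula for calculating
sampling error
$$SE = \frac{sd}{\sqrt{n}} \sqrt{1 - \frac{n}{N}}$$
it can be seen that when N is very large relative to n, the sampling fraction
becomes approximately
$$\frac{n}{N} \approx 0$$
so the correction term is approximately 1 and the population size has almost no
effect on the sampling error.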
If the population variance increases, the sample variance would increase too,
and the sampling error would become bigger.
Or, in case of calculating proportions:
$$SE = \sqrt{\frac{p(1-p)}{n}}$$
Imagine, for example, you are studying voting intentions and all members of the
population intend to vote for candidate X. There is no variability in the population
and the sampling error is zero in this case.
$$SE = \sqrt{\frac{1 \times 0}{n}} = 0$$
If there is some variability, say 90% intend to vote for candidate X and 10%
intend to vote for candidate Y, then
$$SE = \sqrt{\frac{0.9 \times 0.1}{n}}$$
If variability is greater, say 50% would vote for candidate X and 50% for
candidate Y, the sampling error would be bigger:
$$SE = \sqrt{\frac{0.5 \times 0.5}{n}}$$
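The effect of variability on the sampling error of a proportion can be checked numerically. This sketch assumes the standard error formula sqrt(p(1 - p) / n) without the finite population correction. The sample size of 100 and the function name are hypothetical.

```python
import math


def se_of_proportion(p, n):
    """Standard error of a sample proportion (fpc ignored)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical sample of 100 voters under three levels of variability.
n = 100
results = {p: se_of_proportion(p, n) for p in (1.0, 0.9, 0.5)}
for p, se in results.items():
    print(f"{p:.0%} for candidate X -> SE = {se:.3f}")
```

The standard error is zero when everyone votes the same way and is largest for a 50/50 split, which is why Table 1 uses the 50/50 case as the most cautious assumption.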
Researchers rarely base a sample size decision on the need for
precision of a single variable. If you need to break the sample into subgroups
(e.g., males and females), the degree of accuracy and variation within each
group should determine the sample size required for each group;
Table 1 and the equations on which figures are based apply to simple random
samples. Many studies use other types of sampling. More often than not, tables
will underestimate the sampling error. Systematic sampling should produce
sampling errors similar to simple random samples. Stratified samples can
produce smaller sampling errors. Cluster sampling tends to produce higher
sampling errors.
There will be errors from sources other than sampling. Therefore, the calculation
of precision based on sampling error alone can be an unrealistic oversimplification.
Exercise 4
Access Bryman's Social Research Methods (4th ed.) online resources:
http://www.oup.com/uk/orc/bin/9780199588053/
Click on Multiple choice questions
Go to Chapter 8
Answer the questions and get your score.
Bibliography
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Fink, A. (2003). The survey kit, Volume 7: How to sample in surveys. (2nd ed.)
London: Sage Publications.
Fowler, F.J. (2009). Survey research methods (4th ed.). London: Sage Publications.
Sarantakos, S. (2013). Social research (4th ed.). Basingstoke: Palgrave Macmillan.
Contents
Learning Objectives
Optional Reading
Learning Objectives:
On completion of this topic you will be able to:
1. Describe the most commonly used methods for collecting survey data;
2. Understand the advantages and disadvantages of each of these methods of data
collection;
3. Explain the relevance of the research questions, target population, expected
response rates, resources and time constraints in determining the most
appropriate method of data collection;
4. Evaluate which data collection method might be the most appropriate in a given
situation;
5. Appreciate the value of secondary data;
6. Identify sources of secondary data;
7. Be familiar with Australian Bureau of Statistics online resources.
Optional Reading
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press
Chapters 9, 10, & 12
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin
Chapter 8
In this topic, the main methods of data collection will be discussed. The three
main methods used to collect data are direct measurement, questionnaires and
observation.
Direct measurement
Direct measurement involves testing subjects or otherwise directly counting or
measuring data. Examples:
-
[Figure: diagram of data collection methods. Labels: Using Secondary Data; Direct Measurement; Observation (Overt, Covert); Self-Completion Questionnaires; Paper Surveys (At Central Location, Interviewer Delivery/Pick up); Internet Surveys (Web-based, E-mail based: Embedded, Attached); With Interviewer; Face-to-Face Interviews (At Central Location, At Respondent's Location); Telephone Interviews; Individual Interviews; Group Interviews/Focus Groups]
Clarify misunderstandings;
Can use body language, visual and auditory cues to encourage participation;
At a central location;
At respondent's location
Group Interviews
Central location
2. Telephone methods
In telephone interviews, the level of interaction between the interviewer and the
respondent is restricted to speaking and listening; it is not possible to use visual
cues.
Telephone Interview Advantages over Face-to-Face Interviewing:
Low cost per interview;
No travel costs;
Less interviewer bias;
Short turn-around time;
Instant data entry with CATI (Computer-Assisted Telephone Interview). The CATI
surveys are conducted from a central interviewing centre. This centralised
location allows greater supervision and quality control;
Can be more appropriate than face-to-face methods for interviewing people about
sensitive issues.
Telephone Interview Limitations:
Possible bias as those without phones are excluded;
Respondent resistance (invasion of privacy);
Easy for respondent to terminate interview
Methods of Obtaining a List of Telephone Numbers:
Obtaining a systematic sample from telephone directories
-
The sample will be biased: people with unlisted numbers will be excluded.
Disadvantages:
-
If mobile phones are included in the frame, duplication of units then becomes
an issue.
Read the following article to learn about using mobile phones for survey research:
Vicente, P., Reis, E., & Santos, M. (2009). Using mobile phones for survey research:
A comparison with fixed phones. International Journal of Market Research, 51, 613-633.
http://web.a.ebscohost.com.ezproxy.lib.swin.edu.au/ehost/pdfviewer/pdfviewer?sid=d9b2bc5
f-726f-4ca0-bc0f-39a51303230a%40sessionmgr4004&vid=1&hid=4207
Introductory scripts are very important in telephone surveys. Read the following
publication to learn how to write a script for a telephone interview.
https://math.uwaterloo.ca/survey-research-centre/sites/ca.survey-researchcentre/files/uploads/files/SRCIntroductoryScripts.pdf
3. Postal Surveys
Postal Surveys Advantages
Relatively inexpensive method of collecting data;
It is possible to distribute large numbers of questionnaires in a very short time;
The ability to cover a wide geographic area;
Respondent can complete the survey in their own time;
No interviewer bias;
May get better answers on sensitive questions
Postal Surveys Disadvantages
The need for questionnaires to be kept simple and straightforward to avoid
confusion or errors;
The time taken to answer correspondence or resolve queries by mail;
Usually lower response rates than other methods;
Longer response time;
Incomplete data on questionnaires;
Coding responses of PAPI surveys can be expensive;
Cannot be certain that the right person has completed the questionnaire
Incentives
Incentives can help increase response rates, but can add significantly to the cost
of the research;
Use of incentives is a contentious issue. While they may encourage response, we
cannot be sure whether they are encouraging considered responses, and
therefore valid useful information, or whether some people give any answer for
the chance to receive an incentive.
Advantages of hand-delivering questionnaires over posting them
Personal contact between the respondent and the interviewer or data collector
may improve response rates;
If the questionnaire is complicated, then complexities can be directly explained.
4. On-Line Methods
Email surveys:
-
Attached to email;
Web-based surveys
Email Surveys
Email surveys are appropriate where the population can be clearly defined, and the
email addresses of population members are known.
Email surveys Advantages
Inexpensive data collection method;
Respondents can complete the survey in their own time;
Easy to monitor response/follow up non-response;
Short turn-around period for data collection;
No double handling of data: respondents enter their own data;
Can incorporate visual aids
Web Surveys
Web Surveys Advantages
Short turn-around period for data collection;
No double handling of data: respondents enter their own data into a database;
Can incorporate visual aids;
Easier to reach widespread population;
Simpler to implement complex branching;
Response errors are easily detected and the respondent can be prompted to give
a valid response;
Can enforce question answering;
Can randomize question order;
Limited results can be automatically available.
Graphically sophisticated surveys are mostly used in market research. See for
example:
http://ecustomeropinions.com/survey/survey.php?sid=715363891
http://net05.mwm2.nl/go.aspx?vp=B62AA890-9312-4D87-AEC5-5C74597A84DC
Exercise 1
Develop the following questionnaire using Google Forms:
Parent Opinion Questionnaire
(Note: the questionnaire is intentionally simplified)
Female
Q.2 Your age: 20-25 years old 26-30 31-35 36-40 41-45 50+
Q.3 The academic standards at this school provide adequate challenge for my child
Strongly disagree 1 2 3 4 5 Strongly agree
Q.4 In this question, please indicate which elements of a parent-teacher interview
are IMPORTANT to you (choose as many as you wish):
Information on my child's participation in the classroom
Information on my child's academic achievement in relation to the rest of the
class
Suggestions on how my child could improve
Information on my child's relationships with other children
Receiving clear answers to my questions
Please use the space below if there are any comments you would like to make about
this school.
Click on Forms
Alternatively, you can click on Drive, then click New in the top left corner,
hover over More, and select Google Forms
Fill in the title of your questionnaire, "Parent Opinion Survey", by replacing the
text in the "Untitled Form" box.
Choose a Theme for your questionnaire
Click OK
Click on Done
Use Paragraph text to create an open response question, Please use the
space below if there are any comments you would like to make about this
school.
To see what your questionnaire looks like, click on View live form
To learn more about creating surveys using Google Forms, use the following link:
https://support.google.com/docs/answer/2839737
5. Cost
-
6. Time
-
In paper surveys, the time required depends on how long respondents take to
complete and return their questionnaires; and the researcher has limited
control over this.
Multi-Mode Methods
Sometimes researchers use more than one data collection method. This allows
the advantages of one method to counteract the disadvantages of another.
For example, to reduce the cost of face-to-face interviews, telephone or postal
surveys can be used for respondents who live far away. Or if you conduct a web-based survey, those without internet access can be interviewed using face-to-face, postal or telephone methods.
Although multi-mode methods may allow more representative samples to be obtained,
the mode of administration can affect the way people respond (so-called mode
effects).
5. Observation Methods
Observation methods can be a valuable tool for learning about behaviour, and
they may provide a more valid measurement of some behaviour than interviewing
techniques.
However, there may be some ethical issues associated with covertly surveying
people.
Observation Methods Advantages
No interviewer effects;
Good for studies about relationships/ interactions;
Suitable for studies about behaviour (e.g., children)
Observation Methods Limitations
May be time consuming;
May be difficult to objectively collect data.
Read "Structured Observation" (Bryman) to learn more about this method of
collecting data:
http://ezproxy.lib.swin.edu.au/login?url=http://onlineres.swin.edu.au/993314065.pdf
Exercise 2
Go to http://www.abs.gov.au/
Click on Census Data
Click on GO
The following information will be displayed:
You also can compare 2011 Census data with 2006 Census data or 2001 Census
data:
4. Has the number of families who live in Hawthorn increased or decreased over
the last 10 years?
6. Has the percentage of Hawthorn residents who speak Greek at home changed?
Exercise 3
Case study
Bryman (2012), Chapter 10: Self-completion questionnaires
OPCS (1995) Living in Britain: Results from the 1994 General Household Survey.
London: HMSO.
Pilgrim, D. & Rogers, A. (1993) A sociology of mental health and illness.
Buckingham: Open University Press.
Question:
1. If women consult their local doctor more often than men, does this indicate that
women are ill more often than men? Explain.
(Discuss the question on the Blackboard/ Discussion Board)
Bibliography
Boyce, J. (2003). Market research in practice. Sydney: McGraw Hill.
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Sarantakos, S. (2013). Social research (4th ed.). Basingstoke: Palgrave Macmillan.
Contents
Learning Objectives
Optional Reading
What can we measure in a questionnaire?
Types of questions
Behavioural Questions
Belief Questions
Knowledge Questions
Attitude Questions
Attribute Questions
Exercise 1
Principles of Question design
Relevance
Reliability
Validity
Discrimination
Question Wording
Exercise 2
Exercise 3
Closed and Open-Ended Questions
Principles of Developing Question Response Formats
Exhaustiveness (or Inclusiveness)
Exclusiveness
Balancing categories
Types of Question Response Formats
Simple Itemised Rating Scales
Likert Scale
Horizontal Rating Scales
Semantic Differential Scales
Ranking Scales
Checklists
Dichotomous questions
Paired comparisons
Exercise 4
Exercise 5
Response Sets
Order of Questions in the Questionnaire
Pilot Testing or Pretesting Questions
Examining Existing Questions
Bibliography
Answers to Selected Exercises
Learning Objectives:
On completion of this topic you will be able to:
1. Recognise and construct different types of survey questions, including:
- Behavioural questions;
- Knowledge questions;
- Attitudinal questions;
- Attribute questions.
Optional Reading
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press
Chapter 11
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin
Chapter 7
Types of questions
There are various ways of classifying questions, but here are the main types of
questions that are used in surveys:
Behaviour: questions about what people do;
Beliefs: questions about what people think is true or false;
Knowledge: questions about the accuracy of beliefs;
Attitudes: questions about what people think is desirable;
Attributes: questions about respondents' characteristics.
Behavioural questions
Behavioural questions are questions that measure what people do, or what they
have done. When we measure behaviour, we are usually trying to establish:
-
Can provide information on which types of mothers work and which types do not;
Belief questions
Belief questions measure what people believe is true or false.
Topic: Workforce participation of mothers of preschool age children
Belief question: Ask respondents about what they believe to be the effects of day
care centres on the emotional development of preschool-age children
Reasons for asking belief questions: To establish what people think is true, rather than
the accuracy of their beliefs
Answers to questions about beliefs and attitudes are not necessarily an indicator of
behaviour. Likewise, what a person does may have no bearing on their beliefs.
People's behaviour and beliefs are often inconsistent or irrational, and people
do not necessarily behave as they would like to behave. For example, if a person
drives through a red traffic light, we cannot infer that he/she believes that driving
through a red light is the right thing to do, and we cannot necessarily say that he/she
has a careless attitude to road safety.
Knowledge questions
Knowledge questions assess respondents' knowledge of particular facts.
Topic: Workforce participation of mothers of preschool age children
Knowledge question: Ask respondents what they know about government programs
that aim to help parents of preschool age children to work part-time
Reasons for asking knowledge questions: To establish the accuracy of respondents'
beliefs
In the consumer market, knowledge questions may relate to product awareness,
awareness of product attributes and the price of the product.
Attitude Questions
Attitude questions try to determine what people like and what they think is desirable.
Topic: Workforce participation of mothers of preschool age children
Attitude question: Ask respondents about their attitudes regarding whether or not
mothers with pre-schoolers ought to participate in the workforce.
Exercise 1
Identify whether the following questions measure behaviour, beliefs, knowledge,
attitude or attributes.
What is your highest level of education?
Did you take any natural herbs to improve your athletic or sporting performance?
There are plenty of viable alternatives to the use of animals in biomedical and
behavioral research.
Relevance
Always keep your research questions in mind when designing a questionnaire. Ask
questions that relate to your research objectives. For each question, ask
yourself whether it is really necessary. Avoid asking questions that are not
related to your research objectives: extra questions unnecessarily
lengthen your questionnaire, and it is not fair to waste respondents' time.
Reliability
The same respondent should answer the question in the same way on different
occasions (assuming that the respondent has not changed in the meantime). For
example, ambiguous questions may produce unreliable responses because
respondents may read the question differently on different occasions.
Validity
The question should measure what it is supposed to measure. For example, if we
use self-rated health (i.e. "How healthy are you?") as a measure of health, we should
be confident that it measures health rather than something else (such as optimism
or pessimism). Decide exactly what it is you want to measure.
Example (Bryman, p. 254):
Consider the following question: Do you have a car?
What is this question designed to measure? If it is car ownership, the question is
ambiguous because it can be interpreted as personally owning a car, having
access to a car in a household, or having a company car. Therefore, an answer of
"yes" may or may not indicate car ownership. It would be better to ask: Do
you own a car?
Discrimination
There should be variation in the sample on the key variables (e.g. if we want to study
whether there is a link between gender and income we need to have a sample in
which there is a good variety of income levels). Low variance may be a result of poor
question design. For instance, a limited range of response alternatives can produce
low variance. If you ask about income and offer only two alternatives of less than
$100,000 a year and more than $100,000 a year, you would not identify much
variation in the sample.
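The effect of too few response alternatives can be sketched in a few lines of Python. This is an illustration only: the ten incomes and the five-band cut points are invented, and `categorise` is a hypothetical helper, not part of any survey package.

```python
from collections import Counter


def categorise(income, cut_points):
    """Return the index of the band that `income` falls in.

    `cut_points` holds the ascending lower bounds of every band after the first.
    """
    return sum(income >= cut for cut in cut_points)


# Invented sample of annual incomes (illustration only).
incomes = [18000, 32000, 41000, 47000, 55000, 63000, 72000, 88000, 95000, 120000]

# Two alternatives: less than $100,000 / $100,000 or more.
two_bands = Counter(categorise(x, [100000]) for x in incomes)

# Five alternatives (assumed cut points): far more variation is visible.
five_bands = Counter(categorise(x, [25000, 50000, 75000, 100000]) for x in incomes)

print("Two bands: ", sorted(two_bands.items()))
print("Five bands:", sorted(five_bands.items()))
```

With two alternatives nearly every respondent lands in the same category, so the question cannot discriminate between respondents; the five-band version spreads the same sample across all of its categories.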
Question Wording
Survey questions must be formulated so that respondents can answer them easily
and accurately.
Strongly agree
Agree
Undecided
Disagree
Strongly disagree
10. Is the question too general? Is the frame of reference for the
question sufficiently clear?
How satisfied are you with your job?
This question lacks specificity. Do you mean pay, conditions, relationship with
colleagues?
How often do you see your mother?
Not clear within what time frame:
-
Better:
Within the last year how often would you have seen your mother on average? and
provide alternatives such as 'daily' through to 'never' to help further specify the
meaning of the question.
14. Is the question wording unnecessarily detailed or objectionable?
Questions about precise age or income can create problems (e.g., low response
rate). If we do not need precise data on these variables it is better to ask
respondents to put themselves in categories such as age or income groups.
Exercise 2
Provide an example of an ambiguous question and explain why this question is
ambiguous.
(Use Discussion Board/ Blackboard to discuss this exercise)
Exercise 3
Comment on any potential problems with each of the following questions.
5. Don't you agree that social workers should earn more money than they currently
earn?
Open-Ended Questions
Open-ended questions are useful in the following situations:
To collect attribute information where the number of response options is too large
to precode: (e.g., Where were you born?)
To collect information where the response options are unknown, or feedback is
required (e.g., What aspects of this subject interest you the most?)
To get at general feelings;
To find out respondents' reasons for their opinions.
Advantages
For respondents:
Many possible answers are allowed.
For researchers:
The researcher does not have to advance-guess the possible responses;
Unusual responses may be derived;
Useful for generating fixed-choice format answers;
A clearer insight into the respondents' logic and way of thinking;
Data can be analysed qualitatively and quantitatively;
Useful for exploring new areas or areas in which the researcher has limited
knowledge.
Disadvantages
For respondents:
More demanding to answer.
For researchers:
More demanding (time-consuming) to process and code;
Responses may not be relevant (e.g., the respondent may have misinterpreted a
question);
Researchers can misinterpret the answers and thus misclassify responses.
The responses to open questions are often difficult to compare and interpret.
Example (Fink, v.2, p.36):
An Open-Ended Question and Three Answers
Question: How often during the past month did you find yourself having difficulty
trying to calm down?
Answer 1: Not often
Answer 2: About 10% of the time
Answer 3: Much less often than the month before
It is not very easy to compare the three answers. Does 10% of the time (Answer 2)
mean not often? How does Answer 3 compare to the other two?
Closed Questions
Example (Fink, v.2, p.36):
A Closed Question
How often during the past month did you find yourself having difficulty trying to
calm down?
[Circle one number]
Always
Very often
Fairly often
Sometimes
Almost never
Never
For researchers:
Quicker to process;
Easier to analyse;
Cheaper to process and analyse;
Enhance the comparability of answers.
Disadvantages
For respondents:
The response options may be too narrow for the respondent.
For researchers:
The response options must be exhaustive, so the researcher has to advance-guess the possible responses.
Better:
Your relationship status:
- Single
- Married
- De facto
- Separated
- Divorced
- Widow
1. Exclusiveness
The response choices should be mutually exclusive.
Example:
What was your personal income in 2009?
- $20,000 or less
- $20,000 to 50,000
- $50,000 to 75,000
- $75,000 or more
Better:
What was your personal income in 2009?
-
$20,000 to 49,999
$50,000 to 74,999
$75,000 or more
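Whether a set of numeric response categories is mutually exclusive (and leaves no gaps) can be checked mechanically. The following is a minimal sketch in Python, not a standard tool: `find_band_problems` is a hypothetical helper, incomes are treated as whole dollars, and the first band of the revised version is assumed to be "$19,999 or less", which the revised list above does not show.

```python
def find_band_problems(bands):
    """Return a list of problems found in integer (low, high) bands.

    `high` may be None to mean "or more" (an open-ended top band).
    """
    problems = []
    ordered = sorted(bands, key=lambda band: band[0])
    for (low_a, high_a), (low_b, _high_b) in zip(ordered, ordered[1:]):
        if high_a is None or high_a >= low_b:
            # A respondent earning exactly `low_b` could tick two boxes.
            problems.append(f"overlap at {low_b}")
        elif high_a + 1 < low_b:
            # A respondent between the two bands has no box to tick.
            problems.append(f"gap between {high_a} and {low_b}")
    return problems


# Original wording: $20,000, $50,000 and $75,000 each belong to two bands.
overlapping = [(0, 20000), (20000, 50000), (50000, 75000), (75000, None)]

# Revised wording (first band is an assumption, see above).
exclusive = [(0, 19999), (20000, 49999), (50000, 74999), (75000, None)]

print(find_band_problems(overlapping))
print(find_band_problems(exclusive))
```

The same check also flags gaps, which relate to the exhaustiveness principle: every respondent should find exactly one category that applies.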
2. Balancing categories
Where response categories can be ordered from high to low there should be the
same number of response alternatives either side of the neutral position. The two
endpoints should mean the opposite of each other.
For example:
- Strongly approve
- Approve
- Neither approve nor disapprove
- Disapprove
- Strongly disapprove
Did you encounter any problems using the questions response scale?
Likert scale;
Ranking Scales
Checklists
Binary Choice Formats
-
Dichotomous questions;
Paired comparisons
Very true
Definitely yes
Fairly good
Somewhat true
Probably yes
Probably no
Definitely no
Very important
Very different
Very interested
Fairly important
Somewhat different
Somewhat interested
Neutral
Slightly different
Not so important
Always, Very often, Fairly often, Sometimes, Almost never, Never
Completely satisfied, Very satisfied, Somewhat satisfied, Somewhat dissatisfied, Very dissatisfied, Completely dissatisfied
Likert Scale
The scale was developed in 1932 by the American psychologist Rensis Likert. The Likert
scale is usually used for measuring attitudes. Respondents are asked to indicate
their level of agreement or disagreement with a statement.
Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree
Families
should be fully
should be fully
responsible for
responsible for
elder care
elder care
________________________________________________________
1
Don't know
_______
disorganised
____________________________________________
1
A good
employer
Poor
employer
____________________________________________
1
Ranking Scales
Ranking scales require respondents to indicate relative importance of items.
Example (de Vaus, p.104):
Listed below is a set of issues that can influence the way in which people decide to
vote in general elections. Please rank each of these issues to indicate how important
they are to you when you decide to vote. Place 1 in the box next to the most
important issue, 2 next to the second most important issue and so on. Do not place
the same number in more than one box.
Policies to reduce unemployment
Improving the environment
Spending more money on education
Getting tough on crime
Reducing taxation
Improving social welfare support
Improving health services
Reducing immigration
Checklists
Respondents are provided with a list of items and asked to select those that apply.
Example:
What subjects did you do at school? Please choose all that apply.
Biology
Chemistry
English
Geography
History
Information Technology
Legal Studies
Mathematics
Psychology
Dichotomous questions
Respondents are asked to choose one of two alternatives. For example,
Do you smoke cigarettes?
- Yes
- No
Paired comparisons
Respondents are given a set of pairs of items and asked to select one response from
each pair.
Example (de Vaus, p. 105)
Governments have to make choices between the areas to which they give priority
when allocating government expenditure. For each pair of expenditure areas tick the
one you think ought to be given priority.
Education
Education
Health
Social welfare
Health
Social welfare
Defence
Defence
Environment
Health
Industry support
Health
Environment
Family support
Recreation
Defence
Exercise 4
Comment on any potential problems with each of the following questions.
1. Which of the following best describes where you were when you first started
smoking?
(A) Alone
(B) With members of your family
(C) With friends
Exercise 5
A researcher is conducting a survey of anxiety and depression in the workplace. He
would like to ask, "In the past month, how often has feeling depressed interfered with
doing your job?" What response choices can the researcher use for this question?
Response Sets
A response set is a tendency to respond to questions in some characteristic
manner regardless of their content.
Social desirability - the tendency to provide the respectable rather than the true
response. As a result, socially 'desirable' behaviours (e.g. amount of physical
exercise) are over-reported while socially 'undesirable' behaviours (e.g. alcohol
consumption, sexist and racist attitudes) are under-reported.
Acquiescence - the tendency to agree with a statement regardless of its content;
Nonacquiescence - the tendency to disagree with a statement regardless of its
content.
factual questions;
Bibliography
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Fink, A. (2003). The survey kit, Volume 2: How to ask survey questions (2nd ed.)
London: Sage Publications.
the following questions measure behaviour, beliefs,
Often
Sometimes
Never
Always
Very often
Fairly often
Sometimes
Almost never
Never
All of the time
Contents
Learning Objectives
Topic Introduction
Examples of Scales
Exercise 1
Step Four: Review of Items, Pilot Testing and Developmental Testing of the Scale
Test-Retest Reliability
Split-Half Reliability
Internal Consistency
Exercise 2
Exercise 3
Content Validity
Face Validity
Criterion Validity
Predictive Validity
Construct Validity
Exercise 4
Bibliography
Learning Objectives
By the end of this topic you should:
Have an understanding of the issues associated with measurement in the social
sciences
Be familiar with the notion of reliability
Understand the use of Cronbach's alpha and its interpretation
Understand the various types of validity relevant to scale evaluation
Have the necessary SPSS computing skills to test the reliability of a scale
Topic Introduction
In research we often need to measure complex constructs or concepts. This is
commonly done using scales. A scale is a composite measure of a concept.
The development of good scales is a very complex issue involving a variety of tools.
This topic provides an introduction to this important area of survey research.
Suppose we want to develop a scale for measuring environmental footprint. It is
important to have a sound understanding of the literature in this area before you
start. A conceptual model of the basic construct needs to be developed, perhaps
something like the following. To fully represent this construct, items representing
each of the components are required.
Example
The NEO Personality Inventory (Costa & McCrae, 1992) measures major domains of
personality. It contains five scales: Neuroticism, Extraversion, Openness,
Agreeableness and Conscientiousness.
Facets of the Extraversion scale: Warmth, Gregariousness, Assertiveness, Activity,
Excitement-Seeking, Positive Emotions.
To fully represent this construct, items representing each of the facets are required.
Each facet in this scale is represented by 8 items. For example, an item representing
the Warmth facet is "I'm known as a warm and friendly person."
Gregariousness:
I really feel the need for other people if I am by myself for long.
Assertiveness:
I have often been a leader of groups I have belonged to.
Activity:
I often feel as if I'm bursting with energy.
Excitement-Seeking:
I like being part of the crowd at sporting events.
Positive Emotions:
I am a cheerful, high-spirited person.
All facets of the Extraversion scale are strongly correlated with each other.
Any scale you might want to develop involves the assumptions of additivity and
interval scaling.
Additivity means that respondents are asked to answer several items that constitute
a scale and then all the answers are added up to obtain an overall score. For
example, instead of measuring depression by asking respondents how much they
feel depressed, we would ask about a range of behaviours which tap depression. We
then add up the answers, and obtain an overall measure of depression.
An analogy for a scale is a student's marks in a subject. The student usually
completes a number of pieces of work (an essay, a report, an examination) and
receives a final mark. The final mark is meant to reflect the student's knowledge of
the subject. This is measured by summing the scores for each piece of work into the
overall score.
Interval Scaling: Items of a questionnaire are measured on an interval scale. Most
often researchers use a Likert summative scale, which asks people to say how much
they agree or disagree with the scale items.
Reasons for measuring a concept by using multiple indicators rather than one:
1. It helps reflect the complexity of the concept.
2. It leads to developing more valid measures. It can help to avoid some of the
distortions and misclassification which can occur by using one-item measures of
complex concepts.
3. Multiple indicators increase reliability. For example, question wording can affect
the way respondents answer. Respondents' answers could be largely a function
of the wording of the question. Using a number of questions should minimise the
effect of one poorly worded question.
4. Multiple indicators allow greater precision. For example, using suburb of
residence as a measure of a person's social status may lead to a very crude, and
even wrong, classification. It is much better to take into account other indicators, such
as education, occupation and income.
Examples of Scales
Questionnaire 1
Instructions: For each of the following statements, circle the number on the 5-point
scale that best describes how that statement applies to you.
Strongly Disagree / Disagree / Neither disagree nor agree / Agree / Strongly Agree
1. I chose my present courses largely with a view to the job situation when
I graduate rather than out of their intrinsic interest to me.
7.
9. Whether I like it or not, I can see that further education is for me a good
way to get a well-paid or secure job.
10. I feel that virtually any topic can be highly interesting once I get into it.
11. I tend to choose subjects with a lot of factual content rather than
theoretical kinds of subjects.
12. I find that I have to do enough work on a topic so that I can form my own
point of view before I am satisfied.
13. Even when I have studied hard for a test, I worry that I may not be able
to do well in it.
I learn some things by rote, going over and over them until I know them
by heart.
16. I try to relate what I have learned in one subject to that in another.
19. I learn best from lecturers who work from carefully prepared notes and
outline major points neatly on the blackboard.
20. I find most new topics interesting and often spend extra time trying to
obtain more information about them.
21. I almost resent having to spend a further three or four years studying
after leaving school, but feel that the end results will make it worthwhile.
24. I spend a lot of my free time finding out more about interesting topics
which have been discussed in different classes.
27. I am very aware that lecturers know a lot more than I do and so I
concentrate on what they say is important rather than rely on my own
judgment.
28. I try to relate new material, as I am reading it, to what I already know on
that topic.
SA: Surface
Motive: Surface Motive (SM) is instrumental: main purpose is to meet requirements minimally: a balance between working too hard and failing.
Strategy: Surface Strategy (SS) is reproductive: limit target to bare essentials and reproduce through rote learning.
DA: Deep
To calculate the Surface Approach (SA) score, sum up the scores for the following
questions:
SA = SM + SS
SA = q1 + q5 + q9 + q13 + q17 + q21 + q25 + q3 + q7 + q11 + q15 + q19 + q23 + q27
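The scoring rule can be sketched in Python as follows. This is an illustration under stated assumptions: answers are coded 1 (Strongly Disagree) to 5 (Strongly Agree), the first seven items in the sum are taken to be the Surface Motive items and the last seven the Surface Strategy items (the notes give only the combined sum), and the respondent's answers are invented.

```python
# Item numbers come from the SA sum; the SM/SS split is an assumption.
SM_ITEMS = [1, 5, 9, 13, 17, 21, 25]
SS_ITEMS = [3, 7, 11, 15, 19, 23, 27]


def subscale_score(answers, items):
    """Sum the answers (coded 1 to 5) for the listed item numbers."""
    return sum(answers[item] for item in items)


def surface_approach(answers):
    """SA = SM + SS, following the additive scoring rule."""
    return subscale_score(answers, SM_ITEMS) + subscale_score(answers, SS_ITEMS)


# Invented respondent: answers 3 to every item, except 5 on items 1 and 3.
answers = {item: 3 for item in range(1, 29)}
answers[1] = 5
answers[3] = 5

print("SM =", subscale_score(answers, SM_ITEMS))
print("SS =", subscale_score(answers, SS_ITEMS))
print("SA =", surface_approach(answers))
```

Keeping the item lists separate from the summing function makes it easy to score other subscales the same way once their item numbers are known.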
Questionnaire 2
Instructions: For each of the following statements, circle the number on the 5-point
scale that best describes how that statement applies to you and your mother.
Strongly Disagree / Disagree / Neither disagree nor agree / Agree / Strongly Agree
2. Even if her children didn't agree with her, my mother felt that it was for
our own good if we were forced to conform to what she thought was
right.
6. My mother has always felt that what children need is to be free to make
up their own minds and to do what they want to do, even if this does not
agree with what their parents might want.
9. My mother has always felt that more force should be used by parents
in order to get their children to behave the way they are supposed to.
10. As I was growing up my mother did not feel that I needed to obey rules
and regulations of behaviour simply because someone in authority had
established them.
12. My mother felt that wise parents should teach their children early just
who is boss in the family.
14. Most of the time as I was growing up my mother did what the children in
the family wanted when making family decisions.
18. As I was growing up, my mother let me know what behaviours she
expected of me, and if I didn't meet those expectations, she punished
me.
21. My mother did not view herself as responsible for directing and
guiding my behaviour as I was growing up.
22. My mother had clear standards of behaviour for the children in our
home as I was growing up, but she was willing to adjust those
standards to the needs of each of the individual children in the family.
25. My mother has always felt that most problems in society would be
solved if we could get parents to strictly and forcibly deal with their
children when they don't do what they are supposed to as they are
growing up.
26. As I was growing up my mother often told me exactly what she wanted
me to do and how she expected me to do it.
28. As I was growing up my mother did not direct the behaviours, activities
and desires of the children in the family.
30. As I was growing up, if my mother made a decision in the family that
hurt me, she was willing to discuss that decision with me and to admit it
if she had made a mistake.
Questionnaire 3
Please rate the following statements on the 4-point scale
Strongly Disagree / Disagree Somewhat / Agree Somewhat / Strongly Agree
2. I was always able to gain access to my online course(s) and the applicable
network resources (library, e-mail, etc) when needed.
3. I was given multiple ways to interact with the teacher and other
students (e.g., e-mail, discussion) in all online course(s).
5. Before starting my online course(s), I was well advised about the self-motivation and commitment needed to succeed at distance learning.
2.
3.
4. Review of the items, pilot testing and developmental testing of the scale;
5.
reliability and
validity.
When it comes to the evaluation of scales we will only be touching the surface.
There are many evaluation techniques such as exploratory factor analysis,
confirmatory factor analysis and Rasch analysis which will not be covered in this unit.
want to develop a measure of all the dimensions of the concept or focus on just one
or two.
You need to think about whether you are assessing a single psychological factor, or
more than one. Often what seems like a single mood, set of beliefs or other
psychological factor, may prove to be complicated. For example, with self-rated
religiosity, you would need to decide whether to distinguish between internal
religious feelings and experiences, and outward observances, such as affiliation and
practice.
Another example: If you wish to study musical preference, you would need to decide
whether to look at liking music in general or to consider different uses of music (e.g.,
listening, performance, therapy, dance, and so forth). You also need to decide
whether to consider different types of music or just concentrate on one type of music.
Exercise 1
(Litwin, 2003, p.77, exe.7)
The housing office of a large university wants to measure student satisfaction with
various aspects of the campus dormitories. After researching the relevant published
literature, the housing director cannot find a survey instrument that she thinks is
appropriate, so she decides to develop her own. How would you advise her to begin
her project?
The first version of your scale should be longer than the final scale. Develop more
items than you really want to include in the final version. Some items may have to be
discarded because they do not meet all the necessary criteria.
So how many items should the scale contain? There are two points to consider:
Too few items may not produce a reliable measure.
Reliability coefficient alpha tends to be too low when there are few items on a
scale.
Too many items may put too much burden on the participants and there may be
repetitiousness in the items.
Something between 6 and 15 items should be enough for assessing a single factor.
To ensure this, you should start with between 10 and 30 items. If you have several
subscales, you need to keep each subscale as short as possible. If you have a very
long scale, no matter how interesting the topic is, most people will lose interest
before they have finished.
necessary for testing the clarity of the instructions, the clarity of the items and how
much time is needed in order to complete the scale.
Developmental testing of the scale is used to assess the scale's reliability and
validity, so it is important that the sample you are using is representative of the
population for which the scale has been developed.
Split-Half Reliability
Conceptually, the split-half procedure is somewhat similar to the alternate-forms
procedure. The scale is administered to a group of people. The scale items are then
divided into two half-length tests. Each respondent thus receives two scores, one for
each half-test. These two sets of scores are then correlated.
The correlation coefficient indicates only the reliability of the half-length test. Since
correlation is directly associated with variance, and variance is directly associated
with test length, a full-length test would be expected to have somewhat higher
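reliability than a half-length test. Therefore, the correlation coefficient is adjusted
using the Spearman-Brown formula:
$$ r_{\text{full}} = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}} $$
where $r_{\text{half}}$ is the correlation between the two half-length tests and $r_{\text{full}}$ is the estimated reliability of the full-length test.
If, for example, the half-length tests intercorrelate .70, the reliability of the whole test
equals .82:
(2 x 0.70)/(1 + 0.70) = 1.4/1.7 = 0.82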
How do we split the test into halves? There are, of course, many possible ways. One
of the most commonly used procedures is the odd-even procedure: scoring odd-numbered items in one half and even-numbered items in the other.
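The odd-even procedure and the Spearman-Brown adjustment, full-test reliability = 2r/(1 + r) where r is the correlation between the two halves, can be sketched in Python. This is a minimal illustration, not the output of any statistics package: the function names are hypothetical and the four respondents' scores are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(rows):
    """rows: one list of item scores per respondent, in questionnaire order."""
    odd_totals = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd_totals, even_totals))


# Worked example from the text: halves correlating .70 give about .82.
print(round(spearman_brown(0.70), 2))

# Invented scores for four respondents on a six-item scale.
rows = [
    [1, 2, 1, 2, 1, 1],
    [2, 2, 3, 2, 2, 3],
    [4, 3, 4, 4, 3, 4],
    [5, 4, 5, 5, 4, 5],
]
print(round(split_half_reliability(rows), 2))
```

A different split (for example, first half against second half) will generally give a different estimate, which is one reason internal consistency measures are often preferred.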
The split-half procedure requires only a single administration, in comparison with the
test-retest and the alternate forms which require two test administrations.
Internal Consistency
Internal consistency differs somewhat from other reliability testing procedures. A
procedural difference is that the correlation statistic is not used directly. A conceptual
difference is that an internal consistency coefficient tells us about similarity in
measurement across items rather than stability over time or across forms. Split-half
procedure is an index of consistency between halves, and the internal consistency
procedure is an index of inter-item consistency.
Internal consistency reliability is concerned with the homogeneity of the items within
a scale. A scale is internally consistent to the extent that its items are highly
intercorrelated. Ideally, scale items should show relatively high variance, with mean
scores falling close to the centre of the range of possible scores. Items with low
variance do not discriminate among individuals with different levels of the construct
of interest, and therefore do not contribute to the scale as a whole.
The reliability of the scale is determined by the intercorrelation among each of its
items (Item-item correlations). Items with very low or negative correlations with other
items in the scale should be identified and marked for deletion from the final version
of the scale.
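Cronbach's (1951) alpha is the most commonly used method of testing the reliability
of a scale. The formula for coefficient alpha is
$$ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right) $$
where $k$ is the number of items in the scale, $\sigma_i^2$ is the variance of item $i$, and $\sigma_T^2$ is the variance of the total scale score.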
Values of alpha depend on both the average correlation among the items and the
number of items included in the scale.
In general, a minimum level of 0.7 is recommended for Cronbach's alpha. If alpha is
above 0.9, this usually means that some items in the scale are very
similar to each other. In this case, shortening the scale by discarding some of those
items is recommended.
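For readers who want to see the calculation outside SPSS, coefficient alpha can be computed directly from its usual definition: alpha = (k/(k - 1)) x (1 - sum of item variances / variance of total scores), where k is the number of items. The following Python sketch is an illustration only: `cronbach_alpha` is a hypothetical function name and the five respondents' answers are invented.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented answers from five respondents to a four-item scale.
rows = [
    [1, 1, 2, 5],
    [2, 2, 2, 1],
    [3, 3, 4, 4],
    [4, 4, 4, 2],
    [5, 5, 5, 3],
]

alpha = cronbach_alpha(rows)
print(round(alpha, 2))
print("meets 0.7 guideline" if alpha >= 0.7 else "below 0.7 guideline")
```

In this invented data set the fourth item does not follow the pattern of the other three, which pulls alpha just below the recommended minimum of 0.7.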
The size of alpha is affected by the reliability of individual items. To increase the
alpha of the scale, and thus the scale's reliability, we need to delete all unreliable
items. To identify which items are unreliable we need to examine various statistical
properties of the items. The most common and useful measures are item-total
correlations and alpha if item deleted.
Item-total correlation (or item-rest of test correlation) is the correlation between the
item and the total score of the scale, calculated without including the item being
investigated. Good item-total correlations are higher than 0.5. Some authors
suggest that item-total correlations should be at least higher than 0.3.
The "alpha if item deleted" statistic involves calculating what alpha would be if a
particular item were dropped. If alpha becomes higher when an item is deleted, then
that item is unreliable and should be discarded.
If you use SPSS, the procedure to obtain alpha, item-total correlations and alpha if
item deleted is as follows:
1. Select ANALYZE, SCALE, RELIABILITY ANALYSIS;
2. Select the variables which constitute your scale, for example item1, item2, etc., and
arrow them across;
3. Select model as ALPHA;
4. Click on STATISTICS;
5. Under DESCRIPTIVES select ITEM, SCALE, SCALE IF ITEM DELETED;
6. Under CORRELATIONS select CORRELATIONS;
7. Click CONTINUE and then OK to run.
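For readers without SPSS, the same three statistics can be sketched in Python with NumPy. This is an illustrative sketch, not the SPSS algorithm itself: the function names and the small data set are hypothetical, and it assumes complete data (no missing responses) and a scale of at least three items.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array with no missing values."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def item_analysis(items):
    """For each item: (corrected item-total correlation, alpha if item deleted)."""
    items = np.asarray(items, dtype=float)
    results = []
    for j in range(items.shape[1]):
        rest = np.delete(items, j, axis=1)      # the scale without item j
        r = np.corrcoef(items[:, j], rest.sum(axis=1))[0, 1]
        results.append((r, cronbach_alpha(rest)))
    return results

# Hypothetical responses: 6 respondents (rows) by 4 items (columns).
scores = np.array([
    [1, 2, 1, 1],
    [2, 2, 2, 1],
    [3, 3, 3, 2],
    [4, 4, 3, 4],
    [2, 3, 2, 4],
    [1, 1, 2, 3],
])

print("alpha =", round(cronbach_alpha(scores), 3))
for j, (r, a) in enumerate(item_analysis(scores), start=1):
    print(f"item{j}: item-total r = {r:.3f}, alpha if deleted = {a:.3f}")
```

An item whose "alpha if deleted" exceeds the overall alpha is a candidate for removal, following the rule described above.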
Exercise 2
Internal Consistency
One hundred and fifty Swinburne students completed Spielberger's (1983) State-Trait Anger Inventory. This widely used test is designed to assess a person's level
of state anger and their level of trait anger.
State anger is the level of anger a person is experiencing at the time of the test,
that is, how angry a person is in a particular situation at a particular point in time.
Individuals who score highly on state anger are assumed to be experiencing high
levels of anger at the time the test was taken.
Trait anger is a measure of a person's general predisposition towards anger. People
who score highly on trait anger tend to be more vulnerable to anger across a number
of situations and across time.
While a person high on trait anger is expected to also score highly on state anger,
state anger is hypothesized to change across time and across situations while trait
anger is expected to remain fairly stable.
The items are:
State Anger Scale items: rated according to how you feel right now (1=not at all,
2=somewhat, 3=moderately, 4=very much)
q1    I am furious
q2    I feel irritated
q3    I feel angry
q4
q5
q6    I am mad
q7
q8
q9    I am burned up
q10
Trait Anger Scale items: rated according to how you generally feel
q11   I am quick tempered
q12
q13   I am a hotheaded person
q14
q15   I feel annoyed when I am not given recognition for doing good work
q16
q17
q18
q19
q20
In the file State Trait Anger.sav (available on the Blackboard) you have the following
variables: sex, q1 to q10 (10 state anger items taken at Time 1), q11 to q20 (trait
anger items taken at Time 1), state1 (total score of the state items at Time 1), trait1
(total score of the trait items at Time 1), state2 (total score of the state items at Time
2), trait2 (total score of the trait items at Time 2).
Alpha for the State Anger Scale, item-total correlations and alpha if item is deleted
for each item were obtained.
Reliability Statistics
Cronbach's Alpha    Cronbach's Alpha Based on Standardized Items    N of Items
.911                .921                                            10

Item Statistics
        Mean      Std. Deviation    N
q1      1.3172    .64230            145
q2      1.9172    .90141            145
q3      1.3931    .71973            145
q4      1.2621    .62384            145
q5      1.1172    .39971            145
q6      1.3241    .69605            145
q7      1.1793    .53579            145
q8      1.1310    .47515            145
q9      1.4138    .75080            145
q10     1.5655    .87253            145
Inter-Item Correlation Matrix
        q1      q2      q3      q4      q5      q6      q7      q8      q9      q10
q1      1.000   .477    .735    .484    .422    .716    .540    .500    .460    .533
q2      .477    1.000   .596    .533    .374    .552    .390    .350    .349    .572
q3      .735    .596    1.000   .651    .515    .825    .554    .478    .391    .606
q4      .484    .533    .651    1.000   .572    .683    .586    .563    .360    .504
q5      .422    .374    .515    .572    1.000   .586    .712    .650    .439    .486
q6      .716    .552    .825    .683    .586    1.000   .662    .585    .526    .657
q7      .540    .390    .554    .586    .712    .662    1.000   .725    .436    .599
q8      .500    .350    .478    .563    .650    .585    .725    1.000   .431    .406
q9      .460    .349    .391    .360    .439    .526    .436    .431    1.000   .499
q10     .533    .572    .606    .504    .486    .657    .599    .406    .499    1.000
Case Processing Summary
                    N       %
Cases   Valid       145     96.7
        Excluded    5       3.3
        Total       150     100.0
Item-Total Statistics
        Scale Mean if    Scale Variance     Corrected Item-      Squared Multiple    Cronbach's Alpha
        Item Deleted     if Item Deleted    Total Correlation    Correlation         if Item Deleted
q1      12.3034          21.005             .713                 .617                .900
q2      11.7034          19.863             .614                 .457                .910
q3      12.2276          19.996             .792                 .765                .895
q4      12.3586          21.162             .707                 .576                .900
q5      12.5034          22.905             .666                 .585                .906
q6      12.2966          19.821             .856                 .789                .891
q7      12.4414          21.679             .731                 .702                .900
q8      12.4897          22.474             .646                 .610                .905
q9      12.2069          21.249             .549                 .380                .911
q10     12.0552          19.358             .715                 .577                .901
Scale Statistics
Mean        Variance    Std. Deviation    N of Items
13.6207     25.612      5.06084           10
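As a cross-check on the State Anger output above, the "Cronbach's Alpha Based on Standardized Items" figure can be recomputed from the inter-item correlations alone. The sketch below assumes the usual standardized-alpha formula, k * r / (1 + (k - 1) * r), where r is the mean of the 45 distinct inter-item correlations; the variable names are illustrative.

```python
# Upper triangle of the State Anger inter-item correlation matrix (q1-q10),
# row by row, copied from the SPSS output.
upper = [
    [.477, .735, .484, .422, .716, .540, .500, .460, .533],  # q1 with q2..q10
    [.596, .533, .374, .552, .390, .350, .349, .572],        # q2 with q3..q10
    [.651, .515, .825, .554, .478, .391, .606],              # q3 with q4..q10
    [.572, .683, .586, .563, .360, .504],                    # q4 with q5..q10
    [.586, .712, .650, .439, .486],                          # q5 with q6..q10
    [.662, .585, .526, .657],                                # q6 with q7..q10
    [.725, .436, .599],                                      # q7 with q8..q10
    [.431, .406],                                            # q8 with q9, q10
    [.499],                                                  # q9 with q10
]

k = 10
pairs = [r for row in upper for r in row]
mean_r = sum(pairs) / len(pairs)
alpha_std = k * mean_r / (1 + (k - 1) * mean_r)
print(round(mean_r, 3), round(alpha_std, 3))  # prints 0.539 0.921
```

The result agrees with the .921 reported in the Reliability Statistics table.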
2. What does it tell us about the scale's overall internal consistency? (Remember:
when Alpha is: <.60 reliability is unacceptable, .61-.80 low to moderate, .81-.90
moderate to high, >.90 very high).
3. Which items do you think are contributing to the scale's overall reliability and
why? (Those items with high item-total correlations (higher than .50) and Alpha if
item is deleted values which are lower than the overall Alpha.)
4. Which items would you suggest have poor reliability and why?
Also, alpha for the Trait Anger Scale and item-total correlations and alpha if item is
deleted for each item were obtained using SPSS.
SPSS output is as follows:
Case Processing Summary
                    N       %
Cases   Valid       149     99.3
        Excluded    1       .7
        Total       150     100.0
Reliability Statistics
Cronbach's Alpha    Cronbach's Alpha Based on Standardized Items    N of Items
.838                .843                                            10
Item Statistics
        Mean      Std. Deviation    N
q11     1.8725    .73786            149
q12     1.8121    .81679            149
q13     1.6376    .72796            149
q14     2.2148    .83475            149
q15     2.4698    .85864            149
q16     1.6376    .65980            149
q17     2.0604    .79889            149
q18     2.5168    .92710            149
q19     1.5638    .76514            149
q20     2.5302    .90462            149
Inter-Item Correlation Matrix
        q11     q12     q13     q14     q15     q16     q17     q18     q19     q20
q11     1.000   .621    .593    .549    .394    .501    .380    .255    .176    .325
q12     .621    1.000   .714    .426    .252    .562    .463    .299    .333    .227
q13     .593    .714    1.000   .507    .328    .555    .398    .329    .394    .191
q14     .549    .426    .507    1.000   .386    .216    .284    .283    .158    .439
q15     .394    .252    .328    .386    1.000   .207    .234    .440    .098    .469
q16     .501    .562    .555    .216    .207    1.000   .439    .231    .367    .064
q17     .380    .463    .398    .284    .234    .439    1.000   .359    .364    .329
q18     .255    .299    .329    .283    .440    .231    .359    1.000   .129    .404
q19     .176    .333    .394    .158    .098    .367    .364    .129    1.000   .053
q20     .325    .227    .191    .439    .469    .064    .329    .404    .053    1.000
Item-Total Statistics
        Scale Mean if    Scale Variance     Corrected Item-      Squared Multiple    Cronbach's Alpha
        Item Deleted     if Item Deleted    Total Correlation    Correlation         if Item Deleted
q11     18.4430          21.492             .655                 .561                .813
q12     18.5034          20.900             .662                 .610                .810
q13     18.6779          21.341             .690                 .632                .810
q14     18.1007          21.470             .562                 .448                .820
q15     17.8456          21.848             .490                 .361                .828
q16     18.6779          22.787             .523                 .464                .825
q17     18.2550          21.745             .554                 .375                .821
q18     17.7987          21.594             .471                 .314                .831
q19     18.7517          23.472             .331                 .246                .841
q20     17.7852          21.981             .438                 .391                .834
Scale Statistics
Mean        Variance    Std. Deviation    N of Items
20.3154     26.515      5.14924           10
7. Which items do you think are contributing to the scale's overall reliability and
why?
8. Which items would you suggest have poor reliability and why? (Look at the
content of the items for clues!)
Exercise 3
Test-retest reliability of the state and trait anger scales was assessed by calculating two
Pearson correlation coefficients (State Trait Anger.sav file).
1. Select ANALYZE, CORRELATE, BIVARIATE;
2. Select STATE1, STATE2, TRAIT1, TRAIT2;
3. Click OK to run the analysis.
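The same test-retest check can be sketched outside SPSS. The following Python example is a minimal illustration with hypothetical scores, not the course data; the function name is an assumption, and missing Time 2 scores are handled by pairwise deletion, which is also why N differs between cells in the SPSS output.

```python
import numpy as np

def retest_correlation(time1, time2):
    """Pearson r between two administrations, using only respondents
    who have a score at both time points (pairwise deletion)."""
    t1 = np.asarray(time1, dtype=float)
    t2 = np.asarray(time2, dtype=float)
    both = ~np.isnan(t1) & ~np.isnan(t2)
    r = np.corrcoef(t1[both], t2[both])[0, 1]
    return r, int(both.sum())

# Hypothetical total scores; np.nan marks a respondent missing at Time 2.
state1 = [12, 15, 10, 22, 18, 11]
state2 = [14, 13, 10, 19, np.nan, 16]

r, n = retest_correlation(state1, state2)
print(f"test-retest r = {r:.2f} based on N = {n}")
```

A higher coefficient indicates scores that are more stable between the two administrations.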
SPSS output:
Correlations
                                           state anger    state anger    trait anger    trait anger
                                           time 1         time 2         time 1         time 2
state anger time 1   Pearson Correlation   1              .454**         .270**         .363**
                     Sig. (2-tailed)                      .000           .001           .000
                     N                     150            131            150            132
state anger time 2   Pearson Correlation   .454**         1              .243**         .254**
                     Sig. (2-tailed)       .000                          .005           .003
                     N                     131            131            131            131
trait anger time 1   Pearson Correlation   .270**         .243**         1              .796**
                     Sig. (2-tailed)       .001           .005                          .000
                     N                     150            131            150            132
trait anger time 2   Pearson Correlation   .363**         .254**         .796**         1
                     Sig. (2-tailed)       .000           .003           .000
                     N                     132            131            132            132
** Correlation is significant at the 0.01 level (2-tailed).
3. Which is the most reliable and is this what you would expect?
Face validity
Face validity is often confused with content validity. It is the least scientific of all the
validity measures. Face validity refers to how respondents perceive the
appropriateness of the test. Content validity is evident when the items are about
what you are measuring, and face validity is present when the items appear to be
about what you are measuring. Establishing face validity involves casual assessment
of item appropriateness. It might involve showing your test to a few untrained
individuals to see whether they think the items look right to them. Therefore, face
validity is not validity in the technical sense. However, face validity itself is a
desirable feature of tests.
Why is it important? If respondents consider the test to have face validity, they may
offer a more conscientious effort to complete the test. If a test does not have face
validity the respondents might rush through the test and take it less seriously.
Therefore the lack of a test's face validity can affect test-takers' cooperation or
motivation to do the test.
However, there are times when it is necessary that the construct being measured is
not evident to participants. For example, this is done when you need to avoid the
possibility of respondents "faking good" (appearing better than they are). Therefore,
depending on circumstances, there may be advantages or disadvantages to a test's
purpose being evident from its appearance.
Criterion validity
Criterion-related validity concerns the relationship that exists between scale scores
and some specified, measurable criterion. There are two types of criterion validity:
concurrent validity and predictive validity.
Concurrent validity
Concurrent validity is shown when a new measure relates concurrently to some
other measures of the same concept. Using this approach we compare a new test of
a concept with existing, well-accepted measures of the concept. The statistic
calculated is a correlation coefficient. If scores on both the new and the
established measure are highly correlated this is taken to mean that the new
measure is valid.
One of the major problems with this type of validation is the choice of an appropriate
criterion. We must assume the validity of the established measure against which we
assess our new measure. A low correlation between the new and existing measure
is usually taken to mean that the new measure is invalid. However, it may be the
old measure that is invalid.
You need to justify why you want to develop a new scale. If there is a good measure
of a construct which you use as a criterion, then you might be asked why your test is
necessary at all. Therefore you need a rationale for creating a new test. For
example, your new measure is simpler, more user-friendly, more useful or cost-effective than the measure against which you have validated your test.
Another problem with concurrent validity is that for some concepts there are no
appropriate, well-established measures against which to check a new scale.
A different approach is to give the new measure to criterion groups. For example, a
new measure of political conservatism might be given to members of conservative
and radical political groups. If the members of the conservative group come out as
conservative on the test and the radical group members emerge radical, this
provides good evidence for the test's validity.
Predictive validity
A scale's predictive validity is its usefulness in predicting future events, behaviours,
attitudes, or outcomes. Predictive validity may be used, for example, to predict
election winners, success of an intervention, or other objective criteria.
Like concurrent validity, predictive validity is calculated as a correlation coefficient
between the initial test and the secondary outcome. The following example
demonstrates that the Pain Tolerance index that the researcher tested for concurrent
validity in the previous example may also be tested for predictive validity.
Construct validity
Construct validity involves testing the scale's performance in terms of theoretically
derived hypotheses concerning the nature of the underlying variable or construct.
According to Kline (1993) construct validity is the most important approach to validity
testing. Consideration of construct validity is particularly important when a single
criterion is not available to test criterion-related validity.
Support for a scale's construct validity can be sought by exploring its relationship
with other constructs, both related (convergent validity) and unrelated
(discriminant validity).
degree to which scores on the scale under investigation are influenced by subjects'
motivation to present themselves in a positive light.
Exercise 4
Evaluating the Psychometric Properties of the New Well-Being
Measures
(the Flourishing Scale and the Scale of Positive and Negative
Experience)
Read the article by Diener et al. and write a critical review of the scales, New Well-Being Measures.
Diener, E., Wirtz, D., Tov, W., Kim-Prieto, C., Choi, D.-W., Oishi, S., & Biswas-Diener, R.
(2010). New Well-Being Measures: Short scales to assess flourishing and positive
and negative feelings. Social Indicators Research, 97, 143-156. (Available on
Blackboard)
(http://web.ebscohost.com.ezproxy.lib.swin.edu.au/ehost/pdfviewer/pdfviewer?sid=97134d7
1-9537-473e-b00e-b0fed8a476cc%40sessionmgr113&vid=2&hid=106)
Bibliography
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
DeVellis, R.F. (2003). Scale development: Theory and applications (2nd ed.).
Thousand Oaks, CA: Sage.
Kline, P. (1986). A handbook of test construction: Introduction to psychometric
design. London: Methuen.
Kline, P. (2000). A psychometrics primer. London: Free Association Books.
Kline, T.J.B. (2005). Psychological testing: A practical approach to design and
evaluation. Thousand Oaks, CA: Sage.
Litwin, M.S. (2003). Survey kit, Vol. 8: How to assess and interpret survey
psychometrics. Thousand Oaks, CA: Sage.
Mueller, D.J. (1986). Measuring social attitudes: A handbook for researchers and
practitioners. New York: Teachers College Press.
Netemeyer, R.G., Bearden, W.O., & Sharma, S. (2003). Scaling procedures: Issues
and applications. Thousand Oaks, CA: Sage.
Answers to Exercises
Exercise 1
If she wants to ensure content validity, she must tailor her survey instrument to the
needs of the students themselves. The best way to start would be to put together a
focus group of students currently living in the campus dorms. During this exploratory
session, she could get an idea of what issues are important to the students. She
might then put together a first draft of her questionnaire and show it to these
students for their comments. This would provide initial testing of content validity.
Exercise 2
1. Alpha = .91
2. High internal consistency
3. & 4. All items are good
5. Alpha = .84
6. High internal consistency
7. All except q19
8. q19
Exercise 3
1. Test-retest for state anger is .45; not reliable over time.
2. Test-retest for trait anger is .80; reliable.
3. Trait anger is more reliable as evidenced by the larger correlation coefficient. We
would expect a person's general disposition towards anger to remain fairly stable
over time if the scale is reliable, whereas we would expect state anger to be a
result of the situation and therefore be more susceptible to change.
Exercise 4
The Flourishing Scale
Statement of What the Scale Measures
The scale assesses major aspects of social-psychological functioning
(social-psychological prosperity); specifically, it assesses the respondent's self-perceived
success in important areas such as relationships, self-esteem, purpose and
optimism. It is an 8-item summary measure which provides a single psychological
well-being score.
All items in the scale are phrased in a positive direction. Each item is answered on a
1-7 point scale that ranges from strong disagreement to strong agreement. A high
score represents a person with many psychological resources and strengths. High
scores signify that respondents view themselves in positive terms in important areas
of functioning.
Justification for the Scale (Advantages over the Existing Measures)
It is a brief scale which measures overall psychological well-being. The scale
does not assess the individual components of social-psychological well-being.
However, if an overall psychological well-being score is needed, and a brief scale is
desirable, the FS appears to be useful.
Reliability of the scale
To test the reliability and validity of the scales, convenience samples of university students
were used. The total sample comprised 689 participants (468 females and 175
males); 181 participants were from Singapore Management University, and the remaining
respondents were from five American universities.
Internal consistency was found to be high (Cronbach's alpha = .87). Test-retest
reliability, assessed one month apart (N=257), was moderate (r = .71).
The Satisfaction with Life Scale (Diener et al., 1985) - traditional subjective well-being measure;
Cantril's Ladder.
The Flourishing Scale correlated at substantial levels with the other well-being
measures (r ranged from .54 to .73), except at a medium level with Ryff's
autonomy scale (r = .43) and at a low level with the Loneliness scale (r = -.28).
Men and women did not score significantly differently on the scale.
Respondents in Singapore scored lower than American students. It is not clear
whether the difference was statistically significant.
The Scale of Positive and Negative Experience (SPANE)
Statement of What the Scale Measures
The scale assesses subjective feelings of well-being and ill-being. Assessment is
based on the amount of time the feelings were experienced during the past 4 weeks.
Six items of the scale assess positive feelings and six items assess negative
feelings. For both the positive and negative items, three of the items are general
(e.g., positive, negative) and three are more specific (e.g., joyful, sad). The summed
positive/negative score (SPANE-P/SPANE-N) can range from 6 to 30. The positive
and negative scales are scored separately because of the partial independence or
separability of the two types of feelings. However, the two scores can be combined
by subtracting the negative score from the positive score, and the resulting SPANE-B
score (an overall affect balance score) can range from -24 (unhappiest possible) to
24 (happiest possible).
PANAS (Watson et al., 1988) - the most widespread measure of positive and
negative feelings;
The Satisfaction with Life Scale (Diener et al., 1985) - traditional subjective well-being measure;
Cantril's Ladder.
The scales correlated at substantial levels with the other measures, except at a low
level with the Loneliness scale (r = -.32 (SPANE-P), r = -.29 (SPANE-N), r = -.34
(SPANE-B)).
Men and women did not score significantly differently on the scale.
Respondents in Singapore were reported to score lower than American students, but
it is not clear whether the difference was statistically significant.
Overall conclusion
The Flourishing Scale performed well, with high internal consistency, modest test-retest reliability and high convergence with similar scales. Although it does not
assess the individual components of social-psychological well-being, the FS seems
to be a good assessment of overall self-reported psychological well-being. If an
overall psychological well-being score is needed, and a brief scale is desirable, the
FS appears to be adequate.
The Scale of Positive and Negative Experience performed well in terms of internal
consistency and convergent validity with other measures of emotion, well-being,
happiness, and life satisfaction. Temporal stability was found to be low. The authors
claim that the scale has advantages over other existing measures of feelings. The
scales assess all positive and negative feelings, not just specific feelings. The
SPANE is an improvement on existing scales by succinctly measuring a broad range
of feelings based on the recent experience and duration of those feelings. It is also
purportedly less culturally specific which may increase its utility.
Authors' suggestions for future research
The samples only included students. Broader samples should be a high priority for
future studies.
Establishing stability of the scales over longer time periods beyond 1 month is
required.
Validity studies should determine the associations of the scales with non-self-reported
assessments of the same concepts (e.g., from informants).
The scales should be tested for predicting non-self-reported behaviors.
The degree to which the new scales and existent scales differ and converge across
cultures and groups should be analysed.
Additional Suggestions for Future Research
The scales should be tested on a random sample drawn from a wider population.
"Angry" and "Afraid" were not as well correlated with other items in the SPANE-N
scale, and may warrant further analysis to see if other words give more consistent
correlation with other items in the SPANE-N scale.
The test could be compared across groups known to have higher levels of the
concept against groups known to have lower levels of the concept.
The researchers did not assess the scale against unrelated constructs to examine
discriminant validity.
As responses to some FS items could be influenced by social desirability it is
recommended that a social desirability measure is included to assess its impact on
the scores.
Contents
Learning Objectives
Optional Reading
Exercise 1
Exercise 2
Exercise 3
Exercise 4
Changing Categories
Standardising Variables
Bibliography
Learning Objectives:
On completion of this topic you will:
1. Understand the purpose of coding;
2. Be familiar with standard code-frames, such as those developed by the
Australian Bureau of Statistics;
3. Be able to create code-frames for open-response questions;
4. Be able to prepare variables for analysis;
5. Know how to change, collapse and reorder the categories of variables;
6. Know how to create new variables from existing ones;
7. Know how to deal with missing data.
Optional Reading
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Chapters 9 and 10.
Exercise 1
The following question was asked of respondents in a survey about mobile phones:
Should children be allowed to bring mobile phones to school? Please give reasons.
A list of 10 people's answers to this question is presented below. Create a code-list
for this question.
ID
Yes, of course. Mobile phones keep children safer. They can call their
parents in case of an emergency.
No, children shouldn't be allowed to use mobile phones at all. There are
possible health risks from using mobile phones. Some research suggests
that the radio waves from mobile phones may harm people's brains.
No, kids will be texting, playing games etc. instead of doing class work.
No, mobile phones are too expensive for children. Even if some models
are cheap to buy, calls are expensive. Many kids run up big bills their
parents have to pay.
10
Yes, why not. Mobile phones are now a normal part of modern life.
Code(s)
Code list:
1
2
..
Exercise 2
The following question was part of a survey for students who have used the
Blackboard Learning Management System during their studies. This survey
investigated students' access to the flexible provision of learning and their use of
Blackboard. This study was done several years ago when Blackboard was
introduced.
Question: My Blackboard subject pages assist my learning
Rating scale:
1. not at all
2. to some extent
3. significantly
Any comments?
Task: Code the following responses to the open-ended question "Any
comments?" and write a brief summary
As long as the information is provided in a way that allows private study to attain
the required standard. Unfortunately this is not consistent across subjects
Some pages have assisted more than others
The Discussion Boards were very helpful.
Lecture notes.
Only with discussion boards to ask questions.
Quicker, more flexible and more convenient access to information relating to
subjects!
My assessment and schedule is available there, but my learning is mostly
facilitated by newsgroup interaction, textbook reading and CD (offline) content.
Depend on subjects and lecturers. Some subjects with discussion board would
be better, but the lecturers should visit the boards to discussion with students.
External links for reading etc. are great.
You can study anywhere if you have a hardcopy of the notes. reading off the
screen is detrimental to their health, it is also very annoying.
Course material is good to have on the web, but without explanation, it's
useless.
It actually takes quite a lot of time each week to download lecture notes and print
them out. I'd much rather have the lecture slides in a hardcopy book from the
bookshop, despite the cost.
Very helpful... keeps me up to date, helps with revision, keeps me on task more
than tutorials etc
Lecture notes etc helpful, however that is about it.
Maybe all online material should be made available on a CD-ROM, as well as
printed in the bookshop and sold for a small price to cover costs.
Depending on the lecturer and the availability of material on line
Availability of taped lectures from home or work would be really helpful.
Nothing is as good as learning at school in the class room
I have found the information related to the subjects that I have been studying
very, very helpful.
Using blackboard can be significantly slower and more frustrating than being
provided with printed copies of course notes. Printing your own notes costs more
than buying a copy from the bookshop, they doesn't last as long (not bound), get
lost more easily
Good additional resource to lecture notes etc
It depends on the subject and how well the material is presented.
Online access to subject pages is a definite bonus.
Amazing can be accessed anywhere
Need to train staff and lecturers or provide the resources to get content on site.
Need to train students in the use of it.
It saves time that I would otherwise have to spend traveling to Swinburne to
access information.
Allows me to get online information without going to library.
Exercise 3
Thematic Coding
Based on the research of Marwell and Schmitt (1967, 1990)
Marwell, G. & Schmitt, D.R. (1967). Dimensions of compliance-gaining behavior: An
empirical analysis. Sociometry, 30, 350-364.
Marwell, G. & Schmitt, D.R. (1990). An introduction. In J.P. Dillard (Ed.), Seeking
compliance: The production of interpersonal influence messages. Scottsdale, AZ:
Gorsuch Scarisbrick, pp. 3-5.
Hypothetical situation: Imagine that your teenage son, Nick, who is a high school
student, has been getting poor grades. You want him to increase the amount of time
he spends studying from 6 to 12 hours a week.
Task: Try to describe and classify the following compliance-gaining strategies:
Example    Description of Strategy    Strategy
Exercise 4
Task: Using your classification or Marwell and Schmitt's classification (see
Solutions to Exercises), code the following examples (the topic now is
Divorce)
Example of Compliance-Gaining Strategy    Strategy
You'll see. You'll be a lot better off without me; you'll feel a lot better after
the divorce.
Only a cruel and selfish neurotic could stand in the way of another's
happiness.
If you don't give me a divorce, you'll never see the kids again.
Any intelligent person would grant their partner a divorce when the
relationship had died.
Logical Checks
Certain sets of responses will be illogical (e.g., if someone's age is coded as 16, it
seems illogical for that person's highest level of education to be coded as PhD).
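A logical check of this kind can be automated. The sketch below is a minimal Python illustration; the field names, the example records and the age threshold are all hypothetical assumptions and would need to be set for the actual study.

```python
# Hypothetical records; field names and education codes are illustrative only.
records = [
    {"id": 1, "age": 16, "education": "PhD"},
    {"id": 2, "age": 45, "education": "PhD"},
    {"id": 3, "age": 17, "education": "Secondary"},
]

MIN_AGE_FOR_PHD = 21  # assumed threshold for flagging, not a rule from the text

# Flag cases whose combination of answers is implausible and needs checking
# against the original questionnaire.
flagged = [r["id"] for r in records
           if r["education"] == "PhD" and r["age"] < MIN_AGE_FOR_PHD]
print(flagged)  # prints [1]
```

Flagged cases are not automatically wrong; they are candidates for checking against the source questionnaire.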
Collapsing Categories
Collapsing categories is used when the initial coding of a variable resulted in more
categories than we require. The advantage of the initial detail, though, is that it
provides the flexibility to enable us to collapse the categories in a variety of different
ways.
Reasons for collapsing categories of variables:
The detailed coding may not reflect the form of the variable which is relevant to
the research problem (e.g., we might recode detailed occupational codes into
blue-collar and white-collar categories).
If there are very few people in a category it is often better to combine the
category with another suitable category because very low frequencies can
produce misleading tables and statistics.
Collapsing categories can highlight patterns in the data.
There should always be a sound justification for collapsing categories. Care should
be taken not to combine the categories in such a way as to mask a relationship as is
shown in the Table:
Original version
                     Male    Female
Strongly agree       50%     15%
Agree                10%     45%
Disagree             30%     5%
Strongly disagree    10%     35%
N                    500     500

Recoded version
                     Male    Female
Agree                60%     60%
Disagree             40%     40%
N                    500     500
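The masking effect in the table above can be reproduced with a short Python sketch. The percentages are those given in the table; the dictionary layout and variable names are illustrative assumptions.

```python
# Percentages from the table (each column sums to 100).
original = {
    "Male":   {"Strongly agree": 50, "Agree": 10, "Disagree": 30, "Strongly disagree": 10},
    "Female": {"Strongly agree": 15, "Agree": 45, "Disagree": 5,  "Strongly disagree": 35},
}

# Recoding rule: collapse the four categories into two.
recode = {
    "Strongly agree": "Agree", "Agree": "Agree",
    "Disagree": "Disagree", "Strongly disagree": "Disagree",
}

collapsed = {}
for group, dist in original.items():
    collapsed[group] = {}
    for category, pct in dist.items():
        new = recode[category]
        collapsed[group][new] = collapsed[group].get(new, 0) + pct

print(collapsed)
# prints {'Male': {'Agree': 60, 'Disagree': 40}, 'Female': {'Agree': 60, 'Disagree': 40}}
```

Men and women differ markedly in the strength of their views, yet after collapsing the two groups look identical.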
The distributional approach involves dividing the sample up into roughly equal sized
groups of cases. The substantive approach involves dividing the categories of the
variable into equal lots.
Rearranging categories
Rearranging categories involves arranging them in a more logical order.
Reasons for rearranging categories:
Creating an order more appropriate to the focus of the analysis;
Making tables easier to read;
Changing the level of measurement of a variable and thus affecting the methods
of analysis that can be applied to the variable.
Example (de Vaus, p. 167)
Imagine we have a variable indicating the industry in which a person works. Table 2
shows the initial order of industry categories. Suppose we want to perform analysis
that is focusing on unionization in the workplace and its impact on job satisfaction.
For this analysis it might be better to organize the industry categories according to
the level of unionization of the industry. This would provide a logical order to the
categories and make it easier to read tables later on. The table shows the revised
version in which the categories of the variable are rearranged in order to reflect the
unionization of the industry.
Table 2: Rearranging categories into a logical order appropriate to project
a) Original version (columns: Code, Industry, % in unions)
b) Revised version (columns: New code, Industry, % in unions)
Industry                              % in unions
Agriculture, forestry and fishing     15
Mining                                54
Manufacturing                         40
Construction                          37
Reverse coding
Reverse coding is mostly used when constructing scales. A scale is a composite
measure of a concept that is created by asking respondents a set of questions and
then combining answers to those questions into a single composite measure of the
underlying concept.
Each of the variables that constitute the composite measure should be scored in the
same direction. However, when constructing items for a scale it is normal to mix up
the direction of the statements to which people respond: some will be positive and
some will be negative. If we want to combine variables that are coded in different
directions we need to reverse code some variables so that they are all coded in the
same direction.
Suppose a person was asked to complete the following questionnaire (Vulnerability
Facet of Neuroticism Scale, Costa & McCrae, 1992):
Question    Strongly disagree    Disagree    Neutral    Agree    Strongly agree
To calculate the person's Vulnerability score we need to add up all the item scores:
Vulnerability Score = Q1 score + Q2 score + Q3 score + Q4 score + Q5 score + Q6 score + Q7 score + Q8 score.
Before doing this we would need to reverse code some items (Q2, Q4, Q6, Q7, and Q8).
Vulnerability Score = Q1 + Q2 + Q3 + Q4 + Q5 + Q6 + Q7 + Q8 = 2 + 1 + 1 + 2 + 3 + 2 + 2 + 2 = 15
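Reverse coding can be sketched in Python as follows. On a scale running from a lowest to a highest score, the reversed score equals (lowest + highest) minus the original score. The 1 to 5 scoring and the raw responses below are assumptions for illustration; only the list of items to reverse (Q2, Q4, Q6, Q7, Q8) comes from the example above.

```python
# Assumed scoring: 1 = strongly disagree ... 5 = strongly agree.
SCALE_MIN, SCALE_MAX = 1, 5

# Items the text says must be reverse coded.
REVERSED_ITEMS = {"Q2", "Q4", "Q6", "Q7", "Q8"}


def reverse_code(score, low=SCALE_MIN, high=SCALE_MAX):
    """Flip a score so the top of the scale becomes the bottom and vice versa."""
    return low + high - score


def scale_score(responses):
    """Sum the item scores after reverse coding the items listed for reversal."""
    total = 0
    for item, score in responses.items():
        total += reverse_code(score) if item in REVERSED_ITEMS else score
    return total


# Hypothetical raw responses from one person.
raw = {"Q1": 2, "Q2": 5, "Q3": 1, "Q4": 4, "Q5": 3, "Q6": 4, "Q7": 4, "Q8": 4}
vulnerability = scale_score(raw)
print(vulnerability)
```

Under these assumed raw responses the total is 15, the same as the worked sum above. With a different scoring (for example 0 to 4), only SCALE_MIN and SCALE_MAX would need to change.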
Conditional transformations
Conditional transformation involves specifying a new variable and its categories and
then specifying the conditions a person must meet to be placed in a given category.
Example (de Vaus, p.169)
Suppose that in a study of marriages we want to create a variable that reflects the
marital history of both husband and wife. We would create three categories: 1) first-timer marriage; 2) mixture; 3) both previously married.
Conditional transformations are performed in most computer packages by using IF
statements.
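The marital-history example can be written as a pair of IF statements. The sketch below uses Python rather than a statistics package; the function name, the True/False coding of the two input variables, and the sample couples are assumptions for illustration.

```python
def marital_history(husband_previously_married, wife_previously_married):
    """Return 1 (first-timer marriage), 2 (mixture) or 3 (both previously married)."""
    if not husband_previously_married and not wife_previously_married:
        return 1  # neither partner has been married before
    if husband_previously_married and wife_previously_married:
        return 3  # both partners have been married before
    return 2  # exactly one partner has been married before


# Hypothetical couples: (husband previously married, wife previously married).
couples = [(False, False), (True, False), (False, True), (True, True)]
history = [marital_history(h, w) for h, w in couples]
print(history)
```

Each condition corresponds to one category of the new variable, so every couple falls into exactly one category.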
Arithmetic transformations
Arithmetic transformations are used for interval level variables. New variables can be
created by various arithmetic computations.
Example (de Vaus, p. 171)
Suppose we want to study whether the age difference between a husband and a wife
affects the degree of equality in their marriage. We can construct a new variable by
subtracting the wife's age from the husband's age to indicate the age difference.
Suppose we obtained information about respondents' annual income but for our
study we need to know their fortnightly income. We can create a new variable
indicating fortnightly income by dividing annual income by 26 (the number of
fortnights in a year).
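Both arithmetic transformations above are one-line computations. A minimal Python sketch, with hypothetical variable names and data:

```python
FORTNIGHTS_PER_YEAR = 26


def age_difference(husband_age, wife_age):
    """Husband's age minus wife's age; negative when the wife is older."""
    return husband_age - wife_age


def fortnightly_income(annual_income):
    """Convert an annual income to a fortnightly income."""
    return annual_income / FORTNIGHTS_PER_YEAR


# Hypothetical records for two couples.
couples = [
    {"husband_age": 40, "wife_age": 37, "annual_income": 52000},
    {"husband_age": 29, "wife_age": 33, "annual_income": 78000},
]

for couple in couples:
    couple["age_diff"] = age_difference(couple["husband_age"], couple["wife_age"])
    couple["fortnightly_income"] = fortnightly_income(couple["annual_income"])
    print(couple["age_diff"], couple["fortnightly_income"])
```

The sign of the age difference carries information (who is older), so it is kept rather than converted to an absolute value.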
Standardizing Variables
Sometimes we are interested not in the exact scores people have on a variable but
their scores relative to other people in the sample. In this case we need to
standardize variables.
Some situations when standardization may be required:
1. Comparing and combining scores on variables with very different distributions;
2. Comparative studies where units of measurement (e.g., income) are incomparable;
3. Change over time where the value of units changes over time (e.g., income
changes with inflation) so adjustments need to be made to express income in
some common unit that removes the effect of inflation. (de Vaus, p.171)
For interval-level variables raw scores are usually converted into z-scores. For
ordinal-level variables scores are usually converted into percentiles.
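As a rough illustration, a z-score is the raw score minus the sample mean, divided by the standard deviation, and a percentile rank is the percentage of cases scoring at or below a given score. The Python sketch below makes two assumptions that the text does not specify: it uses the sample standard deviation (n - 1 in the denominator), and it uses the "at or below" convention for percentiles. The data are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Convert raw scores to z-scores using the sample mean and sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1), an assumption
    return [(v - m) / s for v in values]


def percentile_ranks(values):
    """Percentage of cases with a score at or below each case's score."""
    n = len(values)
    return [100 * sum(other <= v for other in values) / n for v in values]


# Hypothetical interval-level variable (income in thousands).
incomes = [20, 30, 40, 50, 60]
z = z_scores(incomes)

# Hypothetical ordinal-level variable (a 1 to 4 rating).
ratings = [1, 2, 2, 4]
ranks = percentile_ranks(ratings)

print(z)
print(ranks)
```

After standardization the z-scores have a mean of 0 and a standard deviation of 1, which is what makes variables with very different distributions comparable.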
2. Statistical imputation:
Substituting the missing values with a new, best-guess value:
Sample mean approach;
Group means approach;
Random assignment within groups;
Regression analysis.
Calculate the mean for the missing data variable within each category of the
background variable;
Find the value on the same variable of the nearest preceding case with a valid
code;
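The two steps above can be sketched in Python. The first function fills each missing value with the mean of the valid values in the same category of a background variable; the second fills it with the value of the nearest preceding case in the same category that has a valid code. The function names, the use of None for a missing value, and the data are assumptions for illustration.

```python
from statistics import mean


def impute_group_mean(values, groups):
    """Replace None with the mean of the valid values in the same group.

    Assumes every group has at least one valid value.
    """
    valid_by_group = {}
    for v, g in zip(values, groups):
        if v is not None:
            valid_by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in valid_by_group.items()}
    return [group_means[g] if v is None else v for v, g in zip(values, groups)]


def impute_preceding_case(values, groups):
    """Replace None with the nearest preceding valid value in the same group."""
    last_valid = {}
    result = []
    for v, g in zip(values, groups):
        if v is None:
            v = last_valid.get(g)  # stays None if no earlier valid case exists
        else:
            last_valid[g] = v
        result.append(v)
    return result


# Hypothetical data: income with two missing values, grouped by a background variable.
income = [30, None, 50, 20, None, 40]
group = ["F", "F", "F", "M", "M", "M"]

by_group_mean = impute_group_mean(income, group)
by_preceding = impute_preceding_case(income, group)
print(by_group_mean)
print(by_preceding)
```

Group means keep the group averages unchanged but shrink the variability within each group; borrowing the value of a preceding case preserves more of that variability.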
Bibliography
Costa, P.T., Jr., & McCrae, R.R. (1992). NEO-PI-R professional manual. Odessa, FL:
Psychological Assessment Resources.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Marwell, G., & Schmitt, D.R. (1967). Dimensions of compliance-gaining behavior: An
empirical analysis. Sociometry, 30, 350-364.
Marwell, G., & Schmitt, D.R. (1990). An introduction. In J.P. Dillard (Ed.), Seeking
compliance: The production of interpersonal influence messages (pp. 3-5). Scottsdale, AZ:
Gorsuch Scarisbrick.
ID    Code(s)
10
Code list:
1. Yes, keep children safe
2. Yes, normal part of life
3. No, distracting in class
4. No, health risks
5. No, cheating in class
6. No, too expensive
Exercise 3
Try to describe the following compliance-gaining strategies.
Marwell and Schmitt classified the strategies as follows (you may have a different
classification system):
No.   Strategy                  Description of Strategy   Example of Compliance-Gaining Strategy
1     Promise/Reward
2     Threat
3     Expertise (Positive)
4     Expertise (Negative)
5     Liking
6     Pre-Giving
7     Aversive Stimulation
8     Debt
9     Moral Appeal
10    Self-Feeling (Positive)
11    Self-Feeling (Negative)
12    Altercasting (Positive)
13    Altercasting (Negative)
14    Altruism
15    Esteem (Positive)
16    Esteem (Negative)