Research Design Notes Weeks 1 To 6

STA60004 Research Design
Module 1
Topic 1: Introduction to Survey Research
Contents
Learning Objectives
Optional Reading
Research Designs
Classical Experimental Design
Cross-Sectional Survey Research Design
Longitudinal Designs
Case Study Design
Exercise 1
Exercise 2
Research Methods (Methods of Data Collection)
Steps in Survey Research
Choosing Research Topic
Exercise 3
Setting Measurable Objectives
Defining Terms
Formulating Research Questions and Hypotheses
Descriptive and Explanatory Research
Units of Measurement/Units of Analysis in Survey Research
Exercise 4
Ethics in Research
Bibliography
Answers to Selected Exercises
Additional Resources
2015 Semester 2
Learning Objectives:
On completion of this topic you will be able to:
1. Explain the characteristics, benefits, and concerns of basic research designs;
2. Explain the difference between different research designs;
3. Understand basic steps involved in developing a survey project;
4. Formulate and clarify research objectives;
5. Outline the difference between explanatory and descriptive research;
6. Explain the ethical principles that should guide survey research design;
7. Describe the main components of an informed consent form.
Optional Reading
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press
(Chapters 1 and 2).
(The book is available at Swinburne Library and Swinburne Bookshop.)
Chapter 1: The nature and process of social research.
http://onlineres.swin.edu.au.ezproxy.lib.swin.edu.au/1134862.pdf
Chapter 2: Social research strategies

De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin
(Chapters 1, 3 and 5).
(The book is available at Swinburne Bookshop, Swinburne Library and as online
resource (E-book*) through Swinburne Library.)
*E-book web-link:
http://www.swin.eblib.com.au.ezproxy.lib.swin.edu.au/patron/FullRecord.aspx?p=1111585&echo=1&userid=Q5n5hJLVgt5fL7c6yQENgA%3d%3d&tstamp=1394251339&id=A0028CD2E072B4F2CD91B85563B3B88863439F44
(Electronic edition: de Vaus, D.A. (2013). Surveys in social research (5th ed.). Taylor
& Francis.)
In this topic, four major types of research design will be discussed:
experimental design, cross-sectional survey design, longitudinal design and case
study design. The characteristics, benefits, and concerns of basic research designs
will be discussed. The difference between research design and research method will
be outlined. Then we will consider the basic steps in survey research and discuss
how to formulate and clarify objectives and research questions. Finally the ethical
principles that should guide survey research design will be explained.
The research process consists of several phases:
Choosing research topic;
Formulating research objectives;
Choosing research design;
Choosing methods of data collection;
Recruiting research participants;
Collecting data;
Analysing data;
Interpreting data;
Disseminating findings to others.
After you have chosen a topic for your study, you will need to formulate research objectives.
A research objective is a goal statement defining the specific information needed to
provide insight into the research problem. Developing clear, concise and
meaningful research objectives is therefore vital.
For example, consider the following topic:
Sunscreen knowledge survey of the Royal Botanic Gardens staff.
The objectives for this study were:
To assess general sunscreen knowledge of the employees;
To determine factors associated with their sunscreen purchasing decisions.
Another example:
Topic: Satisfaction levels of customers of a coffee shop.
Objectives for this study were set as follows:
To determine the characteristics and preferences of the shop's customers;
To measure the impact of the advertising campaign;
To identify any customer needs that are not being met.
Your research objectives will dictate what type of design you can choose for your
study.
Research Designs
A research design provides a framework for the collection and analysis of data
(Bryman, 2012). There are four major types of research design:
Experimental Design;
Cross-Sectional Design or Survey Research;
Longitudinal Design;
Case Study Design
Classical Experimental Design

In the classical experimental design (also called a true experiment, a pretest-posttest group design with random assignment, or a randomised controlled
trial), there are two groups: experimental and control. Participants are randomly
allocated to the experimental and control (or comparison) groups. Data are collected
at two or more points in time (before and after). Between Time 1 (before) and Time
2 (after) the experimental group is given a new program, intervention, or
treatment. The control group is given an alternative (e.g., the traditional program or
no program at all). At both Time 1 and Time 2 both groups are measured in relation
to the key dependent variable. If the experimental group changed significantly more
than the control group, we would conclude that this is because of the experimental
intervention.
Example (de Vaus, p.33)
Research question: Does the QUIT program help smokers stop smoking?
A sample of people who smoked was obtained and participants were randomly
assigned to either an experimental group or control group. People in the
experimental group participated in the QUIT smoking program designed to help
people stop smoking. Participants in the control group did not do the QUIT program.
The level of smoking in each group was measured before the QUIT program began
and six months after. It was found that in the experimental group ten per cent fewer
smoked by Time 2 and in the control group three per cent fewer smoked by Time 2.
A reduction of three per cent among the control group was likely to be due to factors
not related to the program. The effect of the QUIT program was measured by the
difference in the amount of change between the experimental and the control group.
E1, E2, C1, C2: the measure of the dependent variable at Time 1 and Time 2 (E = experimental group, C = control group).
Echange = E2 - E1 = 10%
Cchange = C2 - C1 = 3%
Effect = Echange - Cchange = 7%
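The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the 40% baseline smoking rate at Time 1 are assumptions, not figures from de Vaus. Change is expressed as a reduction (Time 1 minus Time 2) and the "ten per cent fewer" and "three per cent fewer" are read as percentage-point drops, so that the values are positive as in the example.

```python
def reduction(before: float, after: float) -> float:
    """Percentage-point drop in smoking between Time 1 and Time 2."""
    return before - after


def program_effect(e1: float, e2: float, c1: float, c2: float) -> float:
    """Change in the experimental group minus change in the control group."""
    return reduction(e1, e2) - reduction(c1, c2)


# Hypothetical baseline: 40% of each group smoke at Time 1 (an assumption).
e1, e2 = 40.0, 30.0  # experimental group: 10 points fewer smoke at Time 2
c1, c2 = 40.0, 37.0  # control group: 3 points fewer smoke at Time 2

print(program_effect(e1, e2, c1, c2))  # 7.0
```

Subtracting the control group's change removes the part of the reduction that would have happened without the program.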
Typically, in the experimental design the researcher manipulates a causal variable
and sees whether the group receiving the treatment then differs from the control
group. In other words, in a true experiment it is necessary to manipulate the
independent variable in order to see whether it has an influence on the dependent
variable.
Cross-Sectional Survey Research Design

A cross-sectional design involves the collection of data on more than one case
(usually quite a lot more than one) and at a single point in time in order to collect a
body of quantitative or quantifiable data in connection with two or more variables
(usually many more than two), which are then examined to detect patterns of
association (Bryman, p.44).
A poll to determine voting intentions is an example of a cross-sectional survey.
Respondents in such a poll are typically asked: "If the election were held today, who
would you vote for?" The results then are reported as follows: "If the election were
held today, Candidate X would win."
In a cross-sectional survey, data are collected at one point in time from a sample
selected to describe some larger population at that time.
For the study of smoking behavior, discussed in the Experimental Design section, we
would ask a sample of people about their level of smoking some time after an anti-smoking campaign.
Effect = E2 - ?
In this case we will not be able to tell anything about the effectiveness of the
campaign. Suppose, for example, that 30% of the sample report smoking. We need an empirical frame of reference against which to
compare the 30% figure; otherwise we cannot say anything about the causal
process. This type of survey research is sometimes referred to as a one-group post-test
only design.
A better option is to collect measures from two groups of people at one point in time
and to compare the extent to which groups differ on the level of smoking. For
example, we would obtain a random sample of people after the anti-smoking
campaign and ask those people about their current level of smoking and how aware
they were of the anti-smoking campaign. We could then divide people into groups
according to how well aware they had been of the campaign. If the campaign was
successful we would expect that those with the greatest awareness would also have
the lowest level of smoking. We may then conclude that the campaign was effective.
Effect = E2 - C2 = 7%
However, the high and low awareness groups might differ in other ways, not just in their
awareness of the campaign (e.g. high awareness participants might be older or be in
poorer health).
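A minimal sketch of this comparison, assuming a small set of invented respondents (none of the data come from the source, and the field names are hypothetical). It reports the smoking rate in each awareness group and also the mean age of each group, since a difference in age would be one of the "other ways" in which the groups may differ.

```python
# Hypothetical respondents (invented for illustration, not from the source).
respondents = [
    {"awareness": "high", "smokes": False, "age": 52},
    {"awareness": "high", "smokes": False, "age": 61},
    {"awareness": "high", "smokes": True, "age": 47},
    {"awareness": "high", "smokes": False, "age": 58},
    {"awareness": "low", "smokes": True, "age": 23},
    {"awareness": "low", "smokes": True, "age": 31},
    {"awareness": "low", "smokes": False, "age": 27},
    {"awareness": "low", "smokes": True, "age": 35},
]


def group_summary(rows, awareness):
    """Return (smoking rate in %, mean age) for one awareness group."""
    group = [r for r in rows if r["awareness"] == awareness]
    smoking_rate = 100 * sum(r["smokes"] for r in group) / len(group)
    mean_age = sum(r["age"] for r in group) / len(group)
    return smoking_rate, mean_age


for level in ("high", "low"):
    rate, age = group_summary(respondents, level)
    print(f"{level} awareness: {rate}% smoke, mean age {age}")
```

In this invented sample the high awareness group smokes less but is also much older, so the difference in smoking cannot be attributed to the campaign alone.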
The experimental design is different to the survey design in that the variation
between the attributes of people is created by intervention from the researcher
wanting to see if the intervention generates a difference. A survey approach would
not create the variation but would find naturally occurring variation. The problem
with survey research is that we cannot be sure that the two groups are similar in
other respects, whereas the experimental researcher begins with two similar groups
and the only difference is that only one group receives the treatment. Therefore any
difference in the dependent variable must be due to the treatment.
In many cases we do not need to test causal propositions. For example, if we want
to determine voting intentions in the upcoming elections, a cross-sectional survey will
be the best option.
The ultimate goal of survey design is to allow researchers to generalize about a large
population by studying only a small portion of that population. Therefore, if you need
personal, self-reported information that is not available elsewhere and if
generalization of research findings to a larger population is desired, survey research
is the most appropriate method.
Survey research is usually used when there is a need for information about a group
of people or organisations, which is not available from any other source. In some
cases, this is because we want to know facts that are difficult to observe
systematically. For example, some crimes are reported to police, but many are not. A
way to estimate the rate at which people are victims of crime is to ask a sample of
people about their victimisation experience. Sometimes we are interested in
measuring phenomena that only individuals themselves can perceive: what people
think or know, or what they feel. Very often the best way to find out what people like
and believe is to ask them. You cannot assume that people think in certain ways
without asking them what they think.
Surveys are used for collecting information from or about people to describe,
compare, predict, or explain people's behaviour, attitudes, opinions and values. In
other words, surveys are used to answer four broad classes of questions:
1. The prevalence of attitudes, beliefs, and behaviour;
2. Changes in attitudes, beliefs, and behaviour over time;
3. Differences between groups of people in their attitudes, beliefs, and behaviour;
and
4. Causal propositions about these attitudes, beliefs, and behaviour.
1. Prevalence of Attitudes, Beliefs, and Behaviour
Surveys are most often used to measure the frequency of certain attitudes, beliefs,
and behaviour. For example, surveys can be used to see what proportion of the
public approves of the Prime Minister's performance (an attitude), what proportion of
the public believes that taxing emissions of carbon dioxide is imperative to reduce
global warming (a belief), and what proportion of the population has been
unemployed and looked for a job during the previous month (a behaviour).
2. Changes of Attitudes, Beliefs, and Behaviour Over Time
Measuring the prevalence of attitudes, beliefs, or behaviour is generally only of
limited interest. The proportions often mean little by themselves. For example, if 20%
of high school students have used drugs in the past month, it is important to know
whether that is higher or lower than in previous years, whether there is an increase
in drug use or whether it is diminishing. Longitudinal surveys are usually used for
measuring change.
3. Differences Between Groups

In addition to describing the total sample (and inferring to the total population), you
can describe subsamples and compare them. For example, if you are studying
voting intentions you can compare voting intentions of men and women, voters of
different ages, and so forth.
4. Causal Propositions
Surveys are also used to test causal propositions. In studying voter preference, for
example, you might want to explain why some voters prefer one candidate while
other voters prefer another. An explanatory objective requires the simultaneous
examination of two or more variables. Preferences for different political candidates
might be explained in terms of variables such as party affiliation, sex, education,
religion etc. By examining the relationships between candidate preferences and the
several explanatory variables, you may attempt to explain why voters prefer a
particular candidate. You have to include in the survey a variety of questions that tap
alternative causal logics and then analyse the data to decide which causal
explanation fits better.
You should be very careful, however, to avoid mistaken attribution of causal links.
An association between variables does not, by itself, prove a causal link. Causal
propositions can be tested more definitively in experiments than in surveys.
It is not always possible, however, to obtain a control group. Sometimes ethical
considerations make it impossible to introduce experimental interventions. Suppose
the researcher is interested in the effect of marital breakdown on the social
adjustment of young children. We cannot assign people randomly to two groups and
then somehow cause marital breakdowns in one group.
The logic of the experimental design can provide a useful guide to the logic of survey
analysis. Where the experimenter isolates the experimental variable through the use
of the experimental and control groups, the survey researcher seeks to accomplish
the same task by controlling for variables after the fact. For example, the
experimenter may ensure that both the experimental and the control groups have the
same sex distribution in order to avoid the possible influence of that variable on the
experiment. The survey researcher achieves this either by ensuring that subgroups
in the sample have the same sex distribution or by testing the observed relationship
separately among men and women. The logical goal of isolating relevant variables
by ruling out the influence of extraneous variables is considered to be the same for
both methods.
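Testing a relationship separately among men and women can be sketched as follows. The records and the function name are hypothetical and serve only to illustrate the idea of controlling for a variable after the fact: the smoking rate is compared between aware and unaware respondents within each sex, rather than in the sample as a whole.

```python
from collections import defaultdict

# Hypothetical survey records (invented): (sex, aware_of_campaign, smokes)
records = [
    ("male", True, False), ("male", True, True),
    ("male", False, True), ("male", False, True),
    ("female", True, False), ("female", True, False),
    ("female", False, True), ("female", False, False),
]


def smoking_rates_by_stratum(rows):
    """Return {sex: {aware: smoking rate in %}} so that the relationship
    between awareness and smoking can be inspected within each sex."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [smokers, total]
    for sex, aware, smokes in rows:
        cell = counts[sex][aware]
        cell[0] += int(smokes)
        cell[1] += 1
    return {
        sex: {aware: 100 * s / n for aware, (s, n) in by_aware.items()}
        for sex, by_aware in counts.items()
    }


print(smoking_rates_by_stratum(records))
```

If the relationship between awareness and smoking appears among both men and women, it is less likely to be produced by the sex composition of the groups.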
Longitudinal Designs
With a longitudinal design a sample is surveyed and then is surveyed again on at
least one further occasion. A longitudinal design allows the analysis of process and
change over time, which is not easily possible in a cross-sectional survey. Therefore
longitudinal designs may provide a stronger basis for causal inferences.
The primary longitudinal designs are panel studies, trend studies and cohort
studies.
Panel Studies
Panel studies involve the collection of data over time from the same sample of
respondents. The sample for such a study is called the panel. For example, the
same people could be interviewed at successive elections to assess changes in
attitude and vote.
For the study of smoking behavior, we would measure the smoking behavior of a
representative sample of people before the anti-smoking campaign. After they have
participated in the QUIT program, the participants' level of smoking would be reassessed.
The difference in smoking behavior between Time 1 and Time 2 would provide a
measure of change over the period.
Effect = E2 - E1
The problem with the panel design is that, in comparison with the experimental
design, we do not know the extent to which comparable smokers who did not
participate in the QUIT program stopped smoking. We cannot conclude with
confidence that the change was due to the intervention (other factors could have
caused the change: an increase in the price of cigarettes, increased unemployment etc.).
Another problem with this design is panel attrition: some people who participated in the
first study might be unwilling or unable to participate at Time 2.
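A rough sketch of how change and attrition might be summarised in a two-wave panel, assuming each respondent has an ID that links the two waves. The IDs, the cigarettes-per-day measure and the function name are all hypothetical; only respondents present at both times can contribute to the measure of change.

```python
# Hypothetical panel data (invented): respondent ID -> cigarettes per day.
wave1 = {101: 20, 102: 15, 103: 0, 104: 10, 105: 25}
wave2 = {101: 15, 102: 15, 104: 5}  # respondents 103 and 105 dropped out


def panel_summary(time1, time2):
    """Return (attrition rate in %, per-person change, mean change)."""
    retained = sorted(time1.keys() & time2.keys())
    attrition_rate = 100 * (len(time1) - len(retained)) / len(time1)
    changes = {pid: time2[pid] - time1[pid] for pid in retained}
    mean_change = sum(changes.values()) / len(changes)
    return attrition_rate, changes, mean_change


attrition, changes, mean_change = panel_summary(wave1, wave2)
print(attrition)  # 40.0
print(changes)    # {101: -5, 102: 0, 104: -5}
```

If the people who drop out differ systematically from those who stay (for example, heavier smokers), the mean change among the retained respondents may not reflect the change in the original sample.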
An example of a panel study:
The Household, Income and Labour Dynamics in Australia (HILDA) Survey
http://melbourneinstitute.com/hilda/
Trend Studies
In trend studies a particular population is sampled and studied at different points in
time. While samples are of the same population, they are not composed of the same
people.
The voting polls conducted over the course of a political campaign are an example of
a trend study. At several times during the course of the campaign, samples of voters
are selected and asked for whom they will vote. By comparing the results of these
several polls, researchers might determine shifts in voting intentions.
For the study of smoking behavior, for example, we would measure smoking
behavior of a representative sample of people before implementing the anti-smoking
campaign. After the campaign we would measure the smoking behavior of another
representative sample.
Effect = E2 - E1
The problem with this design is that we cannot fully match the samples. Therefore
the effect observed between Time 1 and Time 2 might be due to sampling error
(differences between the samples).
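To illustrate how large sampling error can be relative to an observed change, the sketch below computes the difference between two independent sample proportions together with an approximate 95% margin of error, using the usual normal approximation. The sample sizes and smoking rates are hypothetical, as is the function name.

```python
import math


def difference_with_margin(p1, n1, p2, n2, z=1.96):
    """Difference between two independent sample proportions and an
    approximate 95% margin of error (normal approximation)."""
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, z * se


# Hypothetical figures (invented): 30% of 400 people smoke at Time 1 and
# 27% of a fresh sample of 400 people smoke at Time 2.
diff, margin = difference_with_margin(0.30, 400, 0.27, 400)
print(round(diff, 3), round(margin, 3))
```

With these invented figures the margin of error is larger than the observed three-point drop, so the apparent change could be due to differences between the two samples alone.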
An example of a trend study:

Abigail, W.F., Power, C., & Belan, I. (2010). Termination of pregnancy and the over
30s: What are trends in contraception use 1996-2006? Australian Journal of Primary
Health, 16, 141-146.
http://web.b.ebscohost.com.ezproxy.lib.swin.edu.au/ehost/pdfviewer/pdfviewer?sid=ba28f030-6b3c4d8b-9532-c861a231e4d7%40sessionmgr113&vid=1&hid=128
(The article is available on the Blackboard)

Cohort Studies
Similarly to trend studies, in cohort studies a particular population is sampled and
studied at different points in time. While samples are of the same population, they
are not composed of the same people.
In cohort studies the focus is on people who have similar characteristics. For
example, in the study of smoking behavior the researcher may be interested in the
behavior of a particular age group, say 18-year-olds. The researcher would
measure smoking behavior of a representative sample of 18-year-olds before
implementing the anti-smoking campaign. After the campaign (say one year later) we
would measure the smoking behavior of another representative sample of the same
cohort of people. In this case, because we do the measurement one year later, the
sample will consist of 19-year-olds. This would constitute a cohort study of a given
age group.
An example of a cohort study:
Degenhardt L., Gisev, N., Trevena, J., Larney, S., Kimber, J., Burns, L., Shanahan,
M., & Weatherburn, D. (2013). Engagement with the criminal justice system among
opioid-dependent people: A retrospective cohort study. Addiction, 108, 2152-2165.
http://onlinelibrary.wiley.com.ezproxy.lib.swin.edu.au/doi/10.1111/add.12324/pdf
(The article is available on the Blackboard)

If you would like to learn more about longitudinal designs, you may find the following
publication useful:
Research design and methods of analysis for change over time
http://www.ssric.org/trd/modules/cowi/chapter6
Approximating Longitudinal Designs

If the researcher wishes to answer questions that involve some notion of change
over time, a number of devices can be employed in a cross-sectional survey in order
to approximate the study of process or change.
For example, the respondents might be able to provide data relevant to questions
that involve process. Participants might be asked to report their family incomes or
other attributes or behavior both for the current year and for the previous year. These
data might then be used as though they had been collected in a panel study with two
stages of interviewing conducted a year apart.
In the study of smoking behavior we might ask a sample of people about their
current level of smoking and about their level of smoking before the anti-smoking
campaign was launched.
Effect = E2 - E1
This type of design is referred to as a retrospective panel design. A problem with this
design is that respondents might not be able to report information accurately. The
farther back they are forced to reach into their memories, the less accurate the
information they provide is likely to be.
Retrospective experimental design (or quasi-experimental design):

Retrospective experimental design attempts to deal with the control group problem.
Using this design we would ask a sample of people about their current level of
smoking, their awareness of the recent anti-smoking campaign and their level
of smoking before the anti-smoking campaign was launched. We would then divide
our sample into two groups: the group with high awareness of the campaign and the group
with low awareness. We would also try to ensure that both groups are similar with
regard to other variables (age, sex etc.). If the level of smoking in the high
awareness group dropped more than in the low awareness group, we might attribute
this difference to the effect of the campaign.
Echange = E2 - E1 = 10%
Cchange = C2 - C1 = 3%
Effect = Echange - Cchange = 7%
Case Study Design

The case study design focuses on particular cases and tries to develop a full and
rounded understanding of the cases. In a case study a particular individual, program,
or event is studied in depth for a certain period of time (e.g., a study of the nature,
course, and treatment of a rare illness for a particular patient). The case study aims
to understand particular attributes of a person (or an organisation or whatever the
case is) within the context of the case's other characteristics and history.
Sometimes researchers study two or more different cases in order to make
comparisons or develop a theory. For instance, Sigmund Freud developed case
studies of several individuals as the basis for the theory of psychoanalysis and Jean
Piaget did case studies of children to study developmental phases.
In any particular study different research designs can be used. Causes of industrial
disputes, for example, can be studied with the help of a survey of attitudes of
management and workers, a case study of a particular strike or a particular factory
and an experiment where groups of workers work under different conditions to find
out if this affects the frequency of disputes (de Vaus, p.5).
In many studies it may be appropriate to use a range of research designs. The best
studies are often those that combine more than one design, since each design
provides a different perspective on the subject under study.
Exercise 1
(de Vaus, Chapter 1, Exercise 4, p.8)
Imagine that you believe being unemployed leads to a loss of self-esteem. Briefly
contrast how the case study, the experiment and the survey research would differ in
their basic procedure for testing this proposition.
(Use Discussion Board/ Blackboard to discuss this exercise)
Exercise 2
(de Vaus, Chapter 3, Exercise 5, pp. 39-40)
For each of the following statements of research findings indicate the type of
research design that appears to have been employed and explain what is wrong with
the conclusions that are drawn. Concentrate on problems that arise from research
design problems.
a. Sixty-eight per cent of married people scored high on our index of conservatism
while only 38 per cent of single people scored high. Marriage makes people more
conservative.
b. After observing a sample of childless married couples over a ten-year period we
observed that the level of marital happiness declined over this period.
Childlessness works against people being happily married.
c. In the early 1970s, before the end of the Vietnam War, surveys showed that
tertiary students had strong anti-American attitudes. Recent surveys have shown
that these feelings are no longer evident among students. Ending the Vietnam
War certainly improved the attitudes of students to the United States.
d. Old people attend church more often than young people. For example, 58 per
cent of those over 60 attend church regularly while only 22 per cent of those
under 25 do so. From this we can conclude that as people get older they become
more religious.
e. The average number of children per family now is 1.8; families are obviously
getting smaller these days.
f. To test the idea that having children makes people happier, a group of parents
were asked how happy they felt now compared with before they had children.
Eighty-seven per cent said they were happier now than before they had children.
From this we can conclude that having children improves people's happiness.
g. A HEADSTART program (a preschool educational program to help
disadvantaged children have a head start by the time they commence school)
was used to test the effectiveness of HEADSTART. A group of four-year-olds
from disadvantaged backgrounds were chosen to enter the program. IQ tests
were given at the beginning of the program and again at the end. There was an
average gain of ten IQ points over the period of the program. HEADSTART
increases children's IQ.
Research Methods (Methods of Data Collection)

The three main methods used to collect data are direct measurement,
questionnaires and observation.
Direct measurement
Direct measurement involves testing subjects or otherwise directly counting or
measuring data. Examples:
Testing cholesterol levels;
Counting ballots in a local election.

Questionnaires and Interviews
This technique involves soliciting self-reported verbal information from people
about themselves.
Observation
Observation involves the direct study of behaviour by watching the subjects of the
study without intruding on them and recording certain critical natural responses to
their environment.
Another method of data collection is secondary research.
Secondary research consists of compiling and analysing data that have already
been collected by other researchers. Certain data may already exist that can
serve to satisfy the research requirements of a particular study. Researchers
should always investigate existing sources of information as a first step in the
research process to take advantage of information that has already been
collected.
The terms 'questionnaire' and 'survey' are very often used interchangeably. It
is therefore crucial to understand the difference between research design and research
method.
[Diagram: Research Question, Research Design, and Research Methods (Methods of Data Collection)]
Research Design: Research Methods (Methods of Data Collection)
Experiment: Direct Measurement; Questionnaire; Interview; Observation; Etc.
Cross-Sectional Survey: Questionnaire; Interview; Etc.
Longitudinal Survey: Questionnaire; Interview; Etc.
Case Study: Direct Measurement; Questionnaire; Interview; Observation; Etc.
From the diagram above you can see that questionnaires, as a method of data
collection, can be used not only in survey research.
Example: Use of Questionnaires in Experimental Studies
In experimental studies, questionnaires can be used before, during, and after a
program or intervention. Data collected before the intervention may be used for:
Selecting groups to participate in a program;
Checking the support for a program;
Ensuring the comparability of groups;
Providing a basis for monitoring change.
Example (Fink, V.6, pp.24-25)

Questionnaires as Premeasures
To Select Participants
A self-administered questionnaire is given to all parents of children attending a
particular school. They are asked to specify the number of years of formal
education they have completed in this or any other country. They are also asked
to rate their willingness to participate in one of two experimental programs to
improve literacy. All of the parents who state that they have completed fewer
than 10 years of schooling and who indicate that they are definitely willing to
participate are considered eligible to participate in a study concerning the two
experimental programs.
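The eligibility rule in this example can be written as a simple filter. The response records and field names below are hypothetical; the two criteria (fewer than 10 years of schooling, and "definitely willing" to participate) are those stated in the example.

```python
# Hypothetical questionnaire responses (invented for illustration).
parents = [
    {"id": 1, "years_schooling": 8, "willingness": "definitely willing"},
    {"id": 2, "years_schooling": 12, "willingness": "definitely willing"},
    {"id": 3, "years_schooling": 9, "willingness": "probably willing"},
    {"id": 4, "years_schooling": 6, "willingness": "definitely willing"},
]


def eligible(parent):
    """Fewer than 10 years of schooling and definitely willing to participate."""
    return (
        parent["years_schooling"] < 10
        and parent["willingness"] == "definitely willing"
    )


eligible_ids = [p["id"] for p in parents if eligible(p)]
print(eligible_ids)  # [1, 4]
```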
To Check the Support for a Program
A questionnaire is mailed to all residents of a given town to find out if they are
willing to participate in a program aimed at teaching home-based injury
prevention. The questionnaire asks the residents if they are willing to be in a
control group, if randomly selected.
To Ensure Comparability of Groups
Students in a particular school are assigned to experimental and control groups
for a study of a new reading program. Before the start of the experiment, the
research team surveys the student participants to gather data that will enable
comparison of the ages and reading levels of the members of the two groups, to
check that the distribution of participants is similar with respect to these two
important variables.
To Provide a Basis for Monitoring Change
Prisoners who have been selected to participate in a study of the results of a new
art therapy program are assigned to either the experimental group or the control
group. Before the experiment begins, the researchers interview all participants,
using a standardized instrument designed to measure rage. A similar survey will
be given after the experimental group completes 6 months of the art therapy
program.
Questionnaires can also be used during an intervention to measure change, and
after the intervention is completed, to measure outcomes and impacts.

Questionnaires as Interim and Postmeasures
To Measure Change
People over 65 years of age who have been to a particular hospital's emergency
room because of a fall are interviewed within 2 weeks of their ER visits and then
again 3, 6, and 12 months later. Those in the experimental group received
geriatric assessments at the time of their hospital visits; those in the control
group did not. The survey team uses the follow-up interviews to compare the two
groups with respect to their social, psychological, and physical functioning.
To Measure Outcomes
Prisoners in an art therapy program are interviewed by two psychiatrists within 3
months of completing their course of study. The results are compared with those
obtained from interviews with the control group.
To Measure Impact
Two groups of elderly people, those who received special geriatric assessments
after they had gone to a hospital because of a fall and those who did not, are
surveyed 1, 3, and 5 years later. The purpose of the surveys at 3 and 5 years is
to assess and compare the impacts of such assessments over time.
Steps in Survey Research

In Module 1 the basic steps in conducting survey research will be examined.
Survey research comprises the following activities:
(Figure 8.1 from Bryman (2012), reproduced with permission from Oxford University Press)
Choosing Research Topic

In general, when choosing a research topic for your study, the following principles
should be followed:
- The research topic should be one in which you are interested;
- It should be broad enough, yet specific enough, to make the research scope
reasonable;
- It should be practical and feasible;
- It should be important (worth studying);
- It should be ethical (the study will not cause any harm).
Exercise 3
Read the following article:
Lim, M.S.C., Hellard, M.E., & Aitken, C.K. (2005). The case of the disappearing
teaspoons: Longitudinal cohort study of the displacement of teaspoons in an
Australian research institute. British Medical Journal, 331, 498-500
http://www.biostat.jhsph.edu/courses/bio622/misc/Disappearing_teaspons.pdf
and comment on the importance of the research topic of the study.
Setting Measurable Objectives

What information should we collect in a survey? You must know the survey's
objectives in order to answer this question. The statement of objectives will guide the
selection of respondents and the writing of survey questions.
Example (Fink, V.1, p.7)
Consider the following objectives for a survey of educational needs. As you can see,
these objectives suggest specific questions.
Illustrative Objectives for a Survey of Educational Needs
1. Identify the most common needs for educational services.
2. Compare needs of men and women.
3. Determine the characteristics of people who benefit most from services.
Objective 1: Educational needs
Sample survey question: Which of the following skills would you like to have?
Objective 2: Compare needs of men and women.
Sample survey question: Are you male or female?
Objective 3, first part: Characteristics of survey participants
Sample survey questions: What is your occupation? What was your household
income last year? How much television do you watch? How many books do
you read in an average month?
Objective 3, second part: Benefits
Sample survey question: To what extent has this program helped you improve
your job skills? (In this example, you can infer that one benefit is
improvement in job skills.)
Defining Terms
When developing a survey, you need to define all abstract, imprecise or ambiguous
terms in the survey objectives. In the previous example the imprecise terms are
"needs", "educational services", "characteristics", and "benefit". These terms are
ambiguous because no standard definition exists for any of them.
Formulating Research Questions and Hypotheses

Survey objectives can be converted into questions and hypotheses.
Example (Fink, V.1, p.9)
Survey Objective 4
To compare younger and older parents in their needs to learn how to manage a
household and care for a child
Survey Research Question: How do younger and older parents compare in their
needs to learn how to manage a household and care for a child?
Null Hypothesis: No differences exist between younger and older parents in
their needs to learn how to manage a household and care for a child.
Research Hypothesis: Differences exist between younger and older parents in
their needs to learn how to manage a household and care for a child, with
younger parents having greater needs.
The difference between stating a survey's purpose as an objective and as a question
is usually a minor change in sentence structure from statement to question.
Where Do Survey Objectives Originate?

The objectives of the survey can originate from:
- a defined need
Example (Fink, V.1, p.10):
Suppose a school district is concerned with finding out the causes of a
measurable increase in smoking among students between 12 and 16 years of
age. The district calls on the Survey Research Department to design and
implement a survey of students. The objective of the survey (to find out why
students smoke) is defined for the surveyors and is based on the school
district's needs.
- reviews of the literature and other surveys
A systematic review of the literature will tell you what is currently known about a
topic. The review will point out the gaps, limitations and other shortcomings.
- experts
Experts are those who are knowledgeable about the topic or those who may be
influential in implementing research findings.
There are two types of meetings that researchers can use to help to identify survey
objectives, research questions and hypotheses:
- Focus groups
- Consensus panels
In a focus group, a trained leader conducts a carefully planned discussion to obtain
participants' opinions on a defined area of interest.
A 10-member group (5 students, 2 parents, and 3 teachers) was asked to
help identify the objectives and content of a survey on teenage smoking. The
group met for 2 hours in a classroom at the Middle School. The group was told
that the overall goal of the survey was to provide the school district with
information on why children start smoking. What information should the survey
collect? What types of questions should be on the survey to encourage children
to provide honest, and not just acceptable, answers? The focus group
recommended a survey that would have at least two major objectives:
1. Determine the effects of cigarette advertising on smoking behaviour
2. Compare smoking behaviour among children with family members who
smoke/don't smoke.
Consensus panels are conducted by a skilled leader in a highly structured
environment. For example, consensus panel participants may be asked to read
documents and rate or rank preferences.
Descriptive and Explanatory Research

Once the objectives are determined, the design of the survey must be chosen. It is
essential to keep the study objectives in mind so that the data will address those
objectives. It is also important to anticipate the data analysis because a desired
analysis can be performed only if appropriate design decisions are made.
Survey research can be grouped into two general types:
- Descriptive (or observational)
- Explanatory
Descriptive Research
Descriptive studies deal with research questions of what things are like. Public
opinion polling, voter intention studies, unemployment rate surveys and the census
are examples of descriptive surveys.
Descriptive surveys deal with either describing or comparing people's attitudes,
opinions, values and behaviour.
Examples (Fink, V.6, pp.14-15)
Describing
Objective: To describe the quality of life of men over and under 65 years of age
with different health characteristics (e.g., the presence or absence of
common conditions such as hypertension and diabetes) and social
characteristics (e.g., living alone or living with someone; employed or not), all
of whom have had surgery within the past 2 years for prostate cancer.
Target: Men of differing ages, health, and social characteristics who have had
surgery for prostate cancer within the past two years.
Number of times surveyed: Once, within 2 years of surgery
Comparing
Objective: To compare, before and after participation in a safety course, parents
of children under 5 years of age, between 6 and 12, and 13 and over in terms
of their opinions of their ability to cope with potential accidents and injuries in
the home.
Target: Parents who participate in a safety course.
Number of times surveyed: Twice, before and after participation.
When doing descriptive research, consider the following:
- The time frame of your interest
For example, if you study divorce, decide whether you want to know
about divorce now or in the past, or whether you want to look at the trends over,
say, the last 50 years.
- The geographical location of your interest
For example, decide whether you want to know about divorce rates for the whole
nation, or part of the country, or for other countries.
Descriptive studies can provide a stimulus for explanatory research.
Explanatory Research
The first step in explanatory research is to decide whether you are looking for
causes or consequences. For instance, if you are studying a recent increase in the
divorce rate, you may be interested in why this happens or you may be interested in
the consequences of the increased divorce rate. In the first case, the increase in the
divorce rate will be a dependent variable:
? (Independent Variables) -> Increase in divorce rate (Dependent Variable)
In the second case, increase in divorce rate will be an independent variable:
Increase in divorce rate (Independent Variable) -> ? (Dependent Variables)
You also can consider intervening variables. For example, you may research how
education affects income level via its effect on job:
Education (Independent Variable) -> Job (Intervening Variable) -> Income (Dependent Variable)
You should also be aware of extraneous variables. An extraneous variable is any
variable other than the independent variable that could cause a change in the
dependent variable. For example, students in religious schools may be more
religious because they have religious parents rather than because they attend a
religious school.
Type of school (religious/non-religious) (Independent variable) -> Child's religiousness (Dependent variable)
Parental religiousness (Extraneous variable) -> Child's religiousness (Dependent variable)
Units of Measurement/ Units of Analysis in Survey Research

Data collected from surveys are arranged in a variable by case data grid.
- Cases are units of measurement (usually a person, a household or an
organisation) that provide information;
- Variables are pieces of information collected about each case.
Example:
The following information was collected from several people: sex, age, marital status
and work status. This information was compiled to form a variable by case data grid.
In Table 1 each row represents a case (person) and each column represents a
variable.
Table 1: A Variable by Case Data Grid

                                  Variables
Cases      Sex      Age (years)   Marital Status   Work Status
Person 1   Male     21            Single           Part time
Person 2   Female   28            Married          Unemployed
Person 3   Female   46            Divorced         Full time
Person 4   Male     34            Single           Full time
Person 5   Female   39            Married          Part time
Person 6   Male     26            Married          Full time
Person 7   Male     52            Separated        Full time
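As a rough illustration, the same grid can be held in a program as a list of records, one record per case and one field per variable. The field names below (`case`, `sex`, `age`, `marital_status`, `work_status`) are naming assumptions made for this sketch; the values are those of Table 1.

```python
# Each dictionary is one case (a row); each key is one variable (a column).
data_grid = [
    {"case": "Person 1", "sex": "Male", "age": 21, "marital_status": "Single", "work_status": "Part time"},
    {"case": "Person 2", "sex": "Female", "age": 28, "marital_status": "Married", "work_status": "Unemployed"},
    {"case": "Person 3", "sex": "Female", "age": 46, "marital_status": "Divorced", "work_status": "Full time"},
    {"case": "Person 4", "sex": "Male", "age": 34, "marital_status": "Single", "work_status": "Full time"},
    {"case": "Person 5", "sex": "Female", "age": 39, "marital_status": "Married", "work_status": "Part time"},
    {"case": "Person 6", "sex": "Male", "age": 26, "marital_status": "Married", "work_status": "Full time"},
    {"case": "Person 7", "sex": "Male", "age": 52, "marital_status": "Separated", "work_status": "Full time"},
]

# The variables are the columns of the grid (everything except the case label).
variables = [name for name in data_grid[0] if name != "case"]

# Reading down a column gives one variable across all cases.
ages = [row["age"] for row in data_grid]

# Reading across a row gives all variables for one case.
person_3 = data_grid[2]

print(len(data_grid), "cases and", len(variables), "variables")
print("Mean age:", sum(ages) / len(ages))
```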
The case in the data grid refers to a unit of measurement or unit of observation.
The unit of measurement is the unit on which the researcher collects data. The cases in
the data grid are not necessarily people. The unit of measurement can be a country,
a year or an organisation.
The unit of analysis is the major entity that is being analysed in the study. The unit
of analysis should not be confused with the unit of observation. For different
analyses in the same study you may have different units of analysis.
Consider the following example. Imagine a researcher collects data on students from
different schools. There are three possible levels of generalization: the student, the
classroom, and the school. If the researcher wants to draw conclusions about
students, the student should be the unit of analysis. If the researcher wants to make
generalizations about schools, then the school should be the unit of analysis, and so on.
For example, if the researcher is comparing children on achievement test scores,
the unit of analysis is the individual student because a score is considered for each
student (see Table 2). If the researcher decides to compare average classroom
performance, then the unit of analysis is the classroom. In this case, since the data
that go into the analysis are the averages themselves, and not the individuals' scores,
the unit of analysis is the group. Even though the data were collected at the student
level, average scores are used in the analysis (see Table 3). Therefore it is the
statistical analysis you do in your study that determines what the unit of analysis is.
Table 2
Cases       Variable: Achievement Test Score
Student 1
Student 2
Student 3
Student 4
...

Table 3
Cases       Variable: Average Achievement Test Score
Class 1
Class 2
Class 3
Class 4
...
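The move from Table 2 to Table 3 amounts to aggregating student-level records into one average per classroom. A minimal sketch follows; the scores and the allocation of students to classes are invented for illustration and do not come from the notes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student-level data: (student, classroom, achievement test score).
student_scores = [
    ("Student 1", "Class 1", 72),
    ("Student 2", "Class 1", 85),
    ("Student 3", "Class 2", 64),
    ("Student 4", "Class 2", 90),
    ("Student 5", "Class 2", 77),
]

# Unit of analysis = student: one score per case (the layout of Table 2).
student_level = {student: score for student, _, score in student_scores}

# Unit of analysis = classroom: one average per case (the layout of Table 3).
scores_by_class = defaultdict(list)
for _, classroom, score in student_scores:
    scores_by_class[classroom].append(score)
class_level = {classroom: mean(scores) for classroom, scores in scores_by_class.items()}

print(len(student_level), "cases when the unit of analysis is the student")
print(len(class_level), "cases when the unit of analysis is the classroom")
print(class_level)
```

The data collected are the same in both cases; only the analysis changes, and with it the number of cases.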
Exercise 4
(de Vaus, Chapter 3, Exercise 1, p. 39)
For each of the following statements say what unit of analysis is being used.
a. In the UK, for every 1000 women aged 20-24 there were 30.4 who had an
abortion in the year 1998.
b. In 1998 in the United States the average family in poverty would require an
additional US$6620 per year to get on or above the poverty line.
c. Australia has one of the lowest rates of expenditure on research amongst
developed countries.
d. Within any one year 18 per cent of Australians move.
e. In the UK the official abortion rate per 1000 women aged 20-24 has changed as
follows:
1968 = 3.4     1985 = 20.4
1970 = 10.5    1990 = 28.1
1975 = 15.1    1995 = 25.5
1980 = 18.7    1998 = 30.4
Ethics in Research
Three important considerations need to be taken into account when developing
surveys:
- Technical considerations (sample design, questionnaire construction, etc.);
- Practical considerations (budget, deadlines, purpose of the research);
- Ethical considerations.
In other words, a survey should be technically correct, practically efficient and
ethically sound (de Vaus, p. 58).
Ethical principles
1. Voluntary participation
2. Informed consent
3. No harm
4. Anonymity
5. Confidentiality
6. Privacy
1. Voluntary Participation
People should not be required to participate in a survey. This should be stated
explicitly. For example:
"Although your participation in this survey will be greatly valued, you are not required
to participate. You can stop at any point or choose not to answer any particular
question." (de Vaus, p.60)
Although participation in surveys is voluntary, consider the following practices:
- Governments can require by law that citizens participate in census collections
and certain surveys.
- Some institutions can require people to complete forms (e.g. universities,
hospitals etc.). Reason: information can be useful for monitoring, planning,
reporting etc.
- Researchers cannot always guarantee that participation by all people is
voluntary. Some research involves indirect participation. For example, in a
questionnaire you are sometimes asked about the education, occupation and
income of your parents or partner.
2. Informed consent
Informed consent is closely related to voluntary participation. Participants must give
their informed consent before taking part in a study. This means they can formally
agree to participate only after they have been informed about a range of matters
relating to the survey.
An informed consent form usually includes the following information:
- The purpose of the research;
- A description of the likely benefits of the study;
- A statement of how the respondent was selected;
- A statement that participation is voluntary and that the respondent is free to
withdraw at any time;
- A statement that participation is anonymous or confidential;
- Any foreseeable risks or discomfort;
- Some information about how the data and results will be used;
- The identity of the researcher and the sponsor.
You will need to decide how much information to provide to participants. Sometimes
too much detailed, technical information may confuse respondents and discourage
participation. The best solution to this problem is to provide general information and
to offer to answer further questions.
The consent form is designed to protect all parties: the participants, the researcher,
and the institution. Therefore, it is important that information in the consent form is
presented in an organised and easily understood format.
Sometimes in research, especially in some psychological studies, participants are
deliberately deceived. This is done because accurate knowledge may invalidate the
study. In this case participants should be fully debriefed after the study.
Informed consent can be demonstrated by asking the participant to sign a written
informed consent form. However, in most cases this is not necessary. Usually
completing and returning the questionnaire or continuing with the telephone interview
demonstrates consent.
In research with young children, people with an intellectual disability, or others who
may not be in a position to understand the implications of participating in research,
consent should be obtained from the participant AND other people (e.g. parents,
guardians, school authorities etc.). In this case it is advisable to ask those people to
sign a written informed consent form. Participation still ought to be voluntary.
3. No Harm
In surveys some questions can distress or embarrass participants (questions about
family relationships, sexual behaviour, unpopular attitudes etc.). Therefore, in the
informed consent form or the questionnaire introduction, participants should be told
how to deal with distress. You can write something like the following, for example: "If
you at any time feel upset by any of the questions in this survey, you can contact
Lifeline (tel.: 13 11 14, 24 hours) and discuss your problems and concerns."
4. Anonymity
Any survey is either anonymous or confidential.
Anonymity means that the researcher will not and cannot identify the participant.
The participant should be assured that their participation is anonymous. For
example, a telephone survey may or may not be anonymous. It depends on how you
obtain telephone numbers. If you contact a person using random digit dialling then
the survey is anonymous. Postal surveys with identification numbers are not
anonymous.
5. Confidentiality
Confidentiality means that the researcher can match participants' names with their
responses but assures the participant that nobody else will have access to their
responses.
Sometimes surveys are administered through third parties (e.g., to students through
their teacher; to employees through their supervisor etc.). In those cases the third
parties must not see the responses before the surveys are returned to the
researcher.
Once data are collected, make sure that confidentiality is maintained. Any identifying
information (e.g., name and address) should be separated from respondents'
answers. This is done by providing cases with ID numbers and having a separate file
in which these ID numbers are linked with the participants' names. This is usually
done if follow-up is required. Access to the file with respondents' names and
corresponding ID numbers should be restricted. If follow-up is not required then you
don't need to keep any record of participants' names.
The survey data must be confidentialised before publishing results. This can be done
by:
- Removing information that could lead to identification;
- Collapsing highly specific categories of variables so that individuals are placed into
broad groups.
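The two practices above, separating identifying details into a restricted linking file and collapsing specific values into broad groups, can be sketched as follows. This is a minimal illustration: the respondent records, the field names and the age bands chosen in `age_band` are all hypothetical.

```python
# Hypothetical raw responses that still contain identifying information.
raw_responses = [
    {"name": "A. Smith", "address": "1 High St", "age": 23, "satisfaction": 4},
    {"name": "B. Jones", "address": "2 Low Rd", "age": 47, "satisfaction": 2},
    {"name": "C. Brown", "address": "3 Mid Ave", "age": 68, "satisfaction": 5},
]


def age_band(age):
    """Collapse an exact age into a broad group (band boundaries are illustrative)."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60 and over"


linking_file = {}   # ID number -> identifying details; kept separately with restricted access
analysis_data = []  # de-identified records used for analysis and reporting

for id_number, response in enumerate(raw_responses, start=1):
    # Identifying details go only into the linking file.
    linking_file[id_number] = {"name": response["name"], "address": response["address"]}
    # The analysis file keeps the ID number, the answers and the collapsed age group.
    analysis_data.append({
        "id": id_number,
        "age_band": age_band(response["age"]),
        "satisfaction": response["satisfaction"],
    })

print(analysis_data)
```

If no follow-up is required, the linking file would simply not be created or kept.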
Informed Consent Example:
Parent-Teacher Interview Satisfaction Survey
March 24, 2013
Dear Parent/Carer,
You have been randomly selected to participate in a survey that aims to
investigate parents' satisfaction level at parent-teacher interviews at XYZ
Secondary School. This study is being conducted by ABC Research
Consultancy, on behalf of XYZ Secondary School. This survey aims to provide
important information about your parent-teacher interview experience and help
guide the future of parent-teacher communication at our school.
You will be asked to complete a 10-minute anonymous questionnaire containing
35 questions about your experience at the parent-teacher interview you have just
attended. Although your participation is greatly valued, you do not have to
complete this survey. You can stop at any time or choose not to answer any
questions. Most questions will ask you to tick the appropriate box while others will
require you to circle a number best relating to your feelings about the
parent-teacher interview.
Participation in this research is not anticipated to pose any risks or discomforts.
Some of the questions are personal and if you do not feel comfortable you are
not obliged to answer. You can also contact the School Counselling Service at
(03) 8999 9999 or Lifeline (telephone: 13 11 14, 24hrs) to discuss this.
Only the researcher will see individual survey responses. No identifying
information will be reported in the results of this study and only group data will be
presented in the report. The results of the study will benefit everyone within the
school by providing valuable information about the ways in which the
communication during the parent-teacher interview can be improved. A summary
of the results will be made available to the school community at the end of Term
3, 2013.
For further information about the questionnaire or the study, you may contact the
research representative, Anne White, at ABC Research Consultancy on
(03) 9888 8888.
Yours sincerely,
Robert Ford
Principal
XYZ Secondary School
6. Privacy
Privacy means that people can expect to be free from intrusion.
Ethical Responsibilities to Colleagues, Sponsors and the Public

- Acknowledge the contributions of colleagues;
- Make the sponsor aware of the limitations of the study;
- Provide readers with methodological details about data collection, sampling, data
analysis etc.;
- Make any sponsorship arrangements clear to the public (e.g. if you are doing
research on the effects of smoking on health, the reader would want to know who
sponsored the research: a university, health authority or the tobacco industry).
Ethical issues are dealt with in the codes of ethics of professional organisations. For
example,
National Statement on Ethical Conduct in Human Research:
http://www.nhmrc.gov.au/_files_nhmrc/publications/attachments/e72_national_statement_march_2014_140331.pdf
Swinburne Research/ Research Integrity & Ethics:

http://www.research.swinburne.edu.au/ethics/
Australian Psychological Society/ Code of Ethics:

http://www.psychology.org.au/Assets/Files/APS-Code-of-Ethics.pdf
Australian Market and Social Research Society/ Code of Professional Behaviour:

http://www.amsrs.com.au/documents/item/194
The codes of ethics of the professional organisations are based on the
Belmont Report (Ethical Principles and Guidelines for the Protection of
Human Subjects of Research).
The Belmont Report was published in 1978 by the National Commission for the
Protection of Human Subjects of Biomedical and Behavioral Research. It was named
the Belmont Report after the Belmont Conference Center (Elkridge, Maryland, United
States), where the National Commission for the Protection of Human Subjects of
Biomedical and Behavioral Research met when first drafting the report. The report
was created in reaction to previous human subject violations (e.g., Tuskegee syphilis
experiment and other unethical human experimentation research).
Ten famous psychological experiments that could never happen today:
http://mentalfloss.com/article/52787/10-famous-psychological-experiments-could-neverhappen-today
The Belmont Report identifies three fundamental ethical principles for all human
participant research: respect for persons, beneficence, and justice.
The principle of respect for persons includes two moral requirements: the
requirement to acknowledge autonomy and the requirement to protect those with
diminished autonomy. This means that individuals have a right to decide for
themselves whether to participate in research. The researchers should not use
information about participants without first getting their informed consent.
Beneficence involves two principles: (1) do no harm and (2) maximise possible
benefits and minimise possible harms.
Justice requires that participants are selected fairly and that the risks and benefits of
research are distributed equitably. For example, if research supported by the
government leads to the development of therapeutic devices and procedures, justice
demands that these developments will be available not just to those who can afford
them. Such research also should not unduly involve people from groups unlikely to
be among the beneficiaries of subsequent applications of the research.
Bibliography
Babbie, E.R. (2010). The practice of social research (12th ed.). Belmont: Wadsworth
Cengage.
Bryman, A. (2012). Social research methods (4th ed.). Oxford University Press.
De Vaus, D.A. (2002). Surveys in social research (5th ed.). Sydney: Allen & Unwin.
Fink, A. (2003). The survey kit, Volume 1: The survey handbook (2nd ed.). London:
Sage Publications.
Fink, A. (2003). The survey kit, Volume 6: How to design survey studies. (2nd ed.).
London: Sage Publications.
Rea, L.M. & Parker R.A. (2005). Designing and conducting survey research: A
comprehensive guide (3rd ed.). San Francisco: Jossey-Bass.
Trochim, W. (2000). The research methods knowledge base (2nd ed.). Cincinnati:
Atomic Dog Publishing.
Weisberg, H.F., Krosnick, J.A., & Bowen, B.D. (1996). An introduction to survey
research, polling, and data analysis (3rd ed.). Thousand Oaks: Sage Publications.
Answers to Selected Exercises

Exercise 2 (p. 17)
a) Cross-sectional design
b) Panel design
c) Trend
d) Cross-sectional
e) Cross-sectional (one-group post-test only)
f) Retrospective panel design
g) Panel design
Exercise 4 (p. 33)
a) Women aged 20-24
b) Family in poverty
c) Developed countries
d) Australians
e) Year
Additional Resources
Blackboard/ Learning Material/ Weekly Activities and Notes/ Week 1: Module1
Topic 1/ Additional Resources
Visit Bryman's Social Research Methods (4th ed.) online resources:
http://www.oup.com/uk/orc/bin/9780199588053/

Module 1
Topic 2: The Basics of Survey Sampling
STA60004
Contents
Learning Objectives
Optional Reading
Steps in Sampling Process
Sampling Concepts
1. Population (Target Population)
2. Census
3. Sample
4. Sampling Frame
5. Survey Population
6. Probability Sampling
7. Non-Probability Sampling
8. Sampling Error
9. Sampling Bias
10. Error in Survey Research
Probability Sampling
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
Non-Probability Sampling
1. Quota Sampling
2. Purposive or Judgment Sampling
3. Snowball Sampling
4. Availability or Convenience Sampling
Example
Exercise 1
Exercise 2
Exercise 3
Sample Size
Exercise 4
Bibliography
Answers to Selected Exercises
Learning Objectives
1. Understand the differences between a sample and a population;
2. Understand the processes involved in selecting a sample;
3. Understand the differences between probability and non-probability sampling;
4. Understand the difference between simple random sampling, systematic
sampling, stratified sampling, and cluster sampling;
5. Understand the difference between quota sampling, purposive sampling,
snowball sampling and convenience sampling;
6. Explain the concepts of sampling error and non-response.
Optional Reading
Chapter 8
(The book is available at Swinburne Library and Swinburne Bookshop.)
Chapter 6
Steps in the Sampling Process

The following steps are performed in the sampling process:
Determine the relevant population
Select the appropriate sampling frame
Choose the sampling method (probability or non-probability)
Determine the sample size
Sampling Concepts
1. Population (Target Population)
2. Census
3. Sample
4. Sampling frame
5. Survey Population
6. Probability Sampling
7. Non-Probability Sampling
8. Sampling Bias
9. Sampling Error
10. Error in Survey Research
1. Population (Target Population)

A population is the universe to be sampled (the complete group of individuals who
are the subject of the study). For example:
- All Australians;
- All people over 18 years of age;
- All employed people.
A population is not always defined in terms of individual people. It can be, for
example, households, schools, hospitals or any other entity.
2. Census
A census is the enumeration of an entire population. A census is obtained by
collecting information about every member of a population.
3. Sample
A sample is a segment or subset of a population. The best sample is representative
of the population. A representative sample is a sample that reflects the population
accurately. A sample is representative if important characteristics (e.g., age, gender,
etc.) of those within the sample are distributed similarly to the way they are
distributed in the population (the profile of the sample is the same as that of the
population).
If a sample is representative of the population, then we can make inferences
(generalizations) about the population based on the known characteristics of the
sample. Therefore, a sample enables us to learn about the characteristics of the
population without surveying every single member of the population.
Selected and Achieved Sample
The selected sample is the subset of the target population that has been chosen to
participate in a survey.
The achieved sample consists of those members of the selected sample who have
completed the questionnaire.
Samples vs. Census
Why should we use samples instead of a census?
- Samples are less expensive to obtain;
- Samples can be studied more quickly than entire target populations.
4. Sampling Frame
A sampling frame is a list of all members of the population. A sample is selected
from the sampling frame. The sampling frame should not contain:
- duplicate records (no member should appear more than once on the list);
- redundant records (i.e., former members of the population should be removed
from the list); in other words, the sampling frame should be current.
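Cleaning a list before using it as a sampling frame can be sketched as below. The membership records, the `is_current` flag and the `build_sampling_frame` helper are hypothetical; in practice, deciding that two records are duplicates or that a member is no longer current depends on the list being used.

```python
# Hypothetical membership list: (member ID, name, still a current member?).
raw_list = [
    (101, "Alice", True),
    (102, "Bob", True),
    (101, "Alice", True),   # duplicate record
    (103, "Carol", False),  # former member: a redundant record
    (104, "Dan", True),
]


def build_sampling_frame(records):
    """Keep each current member exactly once, preserving the original order."""
    seen = set()
    frame = []
    for member_id, name, is_current in records:
        if not is_current:
            continue  # drop redundant (out-of-date) records
        if member_id in seen:
            continue  # drop duplicate records
        seen.add(member_id)
        frame.append((member_id, name))
    return frame


sampling_frame = build_sampling_frame(raw_list)
print(sampling_frame)
```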
Very often the following lists serve as sampling frames:
- White Pages Telephone Directory;
- Electoral roll;
- Rates registers;
- Motor vehicle registration lists;
- Customer lists, etc.
Some of the aforementioned sampling frames cannot be considered complete
sampling frames for some studies. For example, the electoral roll excludes those
ineligible to vote; the White Pages are limited to the household members who are
nominated to appear in the directory, etc.
The electoral roll has been used in the past as it could be purchased from the
Australian Electoral Commission, but this is no longer the case. Since 2004 the
Electoral Roll has only been available for use by members of Parliament, political
parties and medical researchers. Because of these restrictions, many surveys
involving the general population are subcontracted to the Australian Bureau of
Statistics (ABS). The ABS uses the list of all dwellings in Australia as the sampling
frame.
Many telephone surveys, including opinion polls, rely on the random digit dialling
method. Random digit dialling has the advantage that it includes numbers not listed
in the Australian White Pages. A disadvantage of using this method is the necessity
of screening out disconnected and business numbers. Also, with the proliferation of
mobile phones, some people have chosen not to have a fixed line. If mobile phones
are included in the frame, duplication of units then becomes an issue.
5. Survey Population
Theoretically a sample should be drawn from the target population. However, it is not
always possible to know how, or where, to contact each member of the target
population. In this case we use the survey population: a population which includes those
elements in the target population that can be reached for inclusion in the sample.
6. Probability Sampling
Probability sampling implies the use of random selection. Probability sampling
requires a sampling frame of members of the target population so that members of
the sample can be selected with an equal (or at least known) chance of selection.
Probability sampling allows researchers to utilise tests of statistical significance that
permit inferences to be made about the population from which the sample was
selected.
7. Non-Probability Sampling
This type of sampling is used when sampling frames are not available. Samples are
chosen based on judgment regarding the characteristics of the target population and
the needs of the survey. With non-probability sampling, some units of the target
population have a chance of being selected and others do not. As a result, the
survey's results may not be applicable to the target population at all. Non-probability
methods are usually used for situations in which precise representativeness is not
necessary. Non-probability sampling can be used when you study a relationship or
when developing a theory.
8. Sampling Error
Sampling error results naturally from selecting a sample, rather than measuring the
entire population. As the sample size increases, the sampling error decreases, until
the sample becomes a census in which there is no sampling error. It is important to
understand that probability sampling does not and cannot eliminate sampling error.
Consider Figures 1-5 to appreciate the significance of sampling error.
Example (Bryman (2012), pp. 188-190; Figures 1-5 are reproductions of Figures 8.3-8.7 with permission from Oxford University Press):
Imagine we have a population of 200 people and we want to obtain a sample of 50
people. One of the variables of interest is whether people watch soap operas or not.
Imagine the population is equally divided between those who watch and those who
do not watch soaps (see Figure 1). If the sample is representative of the population
we would expect our sample of 50 to be equally split in terms of the variable of
interest (Figure 2).
Figure 1: Watching soap operas in population of 200
Figure 2. A sample with no sampling error.
If there is a small amount of sampling error, the sample will look like Figure 3. The
sample in Figure 4 has a higher degree of over-representation of those who do not
watch soap operas.
Figure 3: A sample with very little sampling error.
Figure 4. A sample with some sampling error.
In Figure 5 you can see a very serious over-representation of people who do not
watch soaps (15 people watch soap operas and 35 people do not).
Figure 5: A sample with a lot of sampling error.
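The soap opera example can also be explored by simulation. The following is a minimal sketch, not part of Bryman's example: it builds the population of 200 described above (100 watchers, 100 non-watchers), draws ten simple random samples of 50, and prints the split in each. The seed value and the function name count_watchers are arbitrary choices made for illustration.

```python
import random

# Population of 200 people: 100 watch soap operas (1) and 100 do not (0).
population = [1] * 100 + [0] * 100


def count_watchers(sample_size, rng):
    """Draw a simple random sample without replacement and count the watchers."""
    sample = rng.sample(population, sample_size)
    return sum(sample)


rng = random.Random(2015)  # fixed seed so the run is repeatable (arbitrary value)
counts = [count_watchers(50, rng) for _ in range(10)]

for watchers in counts:
    print(f"watch: {watchers:2d}   do not watch: {50 - watchers:2d}")
```

Most runs give splits close to 25/25 but rarely exactly equal, which is the sampling error that Figures 3 to 5 illustrate.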
Conceptually, sampling error is the degree to which the sample is not representative
of the population, and it arises naturally as a function of selecting a sample. The
main idea of getting a sample is that you can calculate a statistic and estimate the
corresponding population parameter. The standard error reflects the difference between
an estimate derived from a sample and the true value for the whole population. In
sampling contexts, the standard error is called the sampling error.
For example, you want to know how much time, on average, university students
spend on self-study per day. Imagine you obtain a random sample of university
students and calculate the sample mean. Say, the mean is 2.5 hours. How confident
are you that the mean of 2.5 hours is likely to be found in the population?
If you take an infinite number of samples from the population, the sample estimate of
the mean of the variable of interest (in our case self-study hours) will vary in relation
to the population mean. This variation will take the form of a normal distribution and is
called the sampling distribution (in our case it is the sampling distribution of means).
The standard deviation of the sampling distribution is referred to as the standard error. So,
in a sample, the standard deviation is the spread of the scores around the sample
mean. In a sampling distribution, the standard error is the spread of the sample
means around the population mean (the mean of the sample means).
The distribution of sample means
Note: 95 percent of sample means will lie within the shaded area. SE = standard error of the mean.
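The idea of a sampling distribution can be checked with a small simulation. The sketch below is illustrative only: the population of 10,000 self-study times is invented (roughly normal around 2.5 hours), and the sample size, number of repetitions and seed are arbitrary assumptions. It compares the spread of the simulated sample means with sd divided by the square root of n.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary seed for repeatability

# Hypothetical population: daily self-study hours for 10,000 students.
population = [max(0.0, rng.gauss(2.5, 1.0)) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 100             # size of each sample
num_samples = 2000  # number of repeated samples (stands in for "infinite")

# Mean of each repeated simple random sample.
sample_means = [statistics.fmean(rng.sample(population, n))
                for _ in range(num_samples)]

observed_se = statistics.stdev(sample_means)   # spread of the sample means
theoretical_se = pop_sd / n ** 0.5             # sd / sqrt(n); n/N is only 1%

print(f"population mean      : {pop_mean:.3f}")
print(f"mean of sample means : {statistics.fmean(sample_means):.3f}")
print(f"observed SE          : {observed_se:.4f}")
print(f"sd / sqrt(n)         : {theoretical_se:.4f}")
```

The two standard error figures should agree closely, and the mean of the sample means should sit very near the population mean.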
To calculate sampling error we use the standard deviation (sd) of our sample:
$$SE = \frac{sd}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
where n = sample size, N = population size.
From this formula you can see that if the sample size is equal to the population size
(n=N), SE is zero:
$$SE = \frac{sd}{\sqrt{N}} \sqrt{\frac{N-N}{N-1}} = 0$$
You can also see that the bigger the sample size, the smaller the sampling error.
A 95% confidence interval (CI) for the population mean is calculated as follows:
$$[\bar{x} - 1.96 \times SE,\ \bar{x} + 1.96 \times SE]$$
In the case of proportions, we use the observed proportion in the sample (p):
$$SE = \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}}$$
A 95% confidence interval for the population proportion:
$$[p - 1.96 \times SE,\ p + 1.96 \times SE]$$
The term $\sqrt{\frac{N-n}{N-1}}$ refers to the finite population correction (fpc) factor. If the
sample size n is a reasonably large fraction of the population size N, then fpc needs
to be included in the calculation of SEs and CIs. It is recommended that fpc should
be included if the sampling fraction $\frac{n}{N}$ is 10% or more. It can be safely ignored if
$\frac{n}{N}$ is less than 10%. If the fpc is ignored, the formula for the standard error of the
mean becomes $SE = \frac{sd}{\sqrt{n}}$ and the formula for the standard error of the proportion takes
the form $SE = \sqrt{\frac{p(1-p)}{n}}$.
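The standard error and confidence interval calculations can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: it assumes the commonly used form of the finite population correction, sqrt((N - n) / (N - 1)), and a multiplier of 1.96 for 95% confidence. The function names and the example figures (sd of 1.2 hours, n = 100, N = 500) are invented for the example.

```python
import math


def fpc(n, N):
    """Finite population correction, assumed here to be sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))


def se_mean(sd, n, N=None):
    """Standard error of the mean; the fpc is applied only if N is supplied."""
    se = sd / math.sqrt(n)
    return se * fpc(n, N) if N is not None else se


def se_proportion(p, n, N=None):
    """Standard error of a proportion; the fpc is applied only if N is supplied."""
    se = math.sqrt(p * (1 - p) / n)
    return se * fpc(n, N) if N is not None else se


def ci95(estimate, se):
    """95% confidence interval: estimate plus or minus 1.96 standard errors."""
    return (estimate - 1.96 * se, estimate + 1.96 * se)


# Hypothetical survey: mean self-study time 2.5 hours, sd 1.2, n = 100, N = 500.
# The sampling fraction is 100/500 = 20%, so the fpc should be included.
se_without = se_mean(1.2, 100)
se_with = se_mean(1.2, 100, N=500)
low, high = ci95(2.5, se_with)
print(f"SE without fpc: {se_without:.4f}")
print(f"SE with fpc   : {se_with:.4f}")
print(f"95% CI        : [{low:.2f}, {high:.2f}]")
```

Passing N is optional so that the same functions cover both the case where the sampling fraction is below 10% (fpc ignored) and the case where it is not.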
9. Sampling Bias (Sampling-Related Error)

Sampling bias is a distortion in the representativeness of the sample which happens
when some members of the population have little or no chance of being selected for
inclusion in the sample.
Sources of sampling bias:
Using non-probability sampling: If a non-probability sampling method is used,
there is a possibility that human judgment will affect the selection process. As a
result, some members of the population will be more likely to be selected than
others.
Inadequate sampling frame: If the sampling frame is not accurate or complete,
the sample that is obtained cannot truly represent the population, even if a
probability sampling method is used.
Non-response: Non-response occurs when some respondents refuse to
cooperate, or cannot be contacted, or cannot participate (e.g., because of
mental incapacity). Non-response results in the reduction of the sample size.
To minimize the reduction of sample size the following techniques can be used:
- Draw an initial sample that is bigger than needed (be careful to pay attention
  to the costs);
- Use trained interviewers;
- Send reminders to recipients of mailed and internet surveys and make repeat
  phone calls to potential phone survey respondents;
- Use graphically sophisticated surveys;
- Provide incentives.
Selecting an initial sample that is larger than needed may not solve the
problem of bias, however. Those who agree to participate may differ in various
ways from non-respondents. Non-response may introduce bias into a survey's
results because of the differences between respondents and non-respondents in
attitudes, patterns of behaviour and other potentially important factors.
Response rate refers to the percentage of a sample that agrees to participate.
However, the calculation of the response rate is a bit more complicated:
$$\text{Response rate} = \frac{\text{Number of usable questionnaires}}{\text{Total sample} - \text{unsuitable or uncontactable members of the sample}} \times 100$$
where
Usable questionnaires = Returned questionnaires - Unusable questionnaires

Unusable questionnaires are those in which a large number of questions
are not answered or where it is clear that a respondent has not taken the
questionnaire seriously.
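As a worked illustration of the response rate calculation, the sketch below applies the definition above to invented figures: a selected sample of 500, of whom 20 were unsuitable or uncontactable, with 330 questionnaires returned and 18 of those unusable. The function name and all numbers are assumptions made for the example.

```python
def response_rate(returned, unusable, total_sample, unsuitable_or_uncontactable):
    """Response rate as a percentage, following the definition in the text."""
    usable = returned - unusable
    eligible = total_sample - unsuitable_or_uncontactable
    if eligible <= 0:
        raise ValueError("no eligible sample members remain")
    return 100 * usable / eligible


# Hypothetical survey: (330 - 18) usable out of (500 - 20) eligible members.
rate = response_rate(returned=330, unusable=18,
                     total_sample=500, unsuitable_or_uncontactable=20)
print(f"Response rate: {rate:.1f}%")
```

Note that simply dividing returned questionnaires by the selected sample (330/500) would give a different figure, because it ignores both unusable returns and members who could never have responded.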
Error in Survey Research

In general, error in survey research arises from sampling error and non-sampling error:
Sampling Error;
Sampling Related Error (Sampling Bias);
Data Collection Error (e.g., poor question wording, poor interviewing
techniques, poor administration of questionnaires etc.)
Data Processing Error (e.g., faulty management of data, coding errors etc.)
Probability Sampling
Types of probability sampling:
1. Simple random sampling;
2. Systematic sampling;
3. Stratified sampling;
4. Cluster sampling
1. Simple Random Sampling

In simple random sampling, members of the sample are selected from the population
at random such that each element has an equal and known chance of selection.
Simple random sampling approximates drawing a sample out of a hat. Members of a
population are selected one at a time and independently of one another. To choose
the members for the sample, a table of random numbers or a computer-generated
list of random numbers is usually used.
Steps in selecting a simple random sample using a table of random numbers:
Define the population;
Select or devise a complete sampling frame;
Give each case a unique number starting at one;
Decide on the required sample size;
Select numbers for the sample size from a table of random numbers;
Select the cases that correspond to the randomly chosen numbers.
See de Vaus (2002), pp.72-73, to learn how to use a table of random numbers.
Steps in selecting a simple random sample using SPSS:
Imagine our population of interest contains 100 units, and we need to obtain a
sample of 20 units.
Type 1 to 100 in the first column;
Choose Data > Select Cases;
Click on Random sample of cases;
Click on Sample;
Click on Exactly;
Enter 20 and 100 in the boxes;
Click on Continue;
Click on Copy selected cases to a new dataset;

Enter the data set name;
Click OK
The new data set will contain the sample.
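For readers working outside SPSS, the same selection of 20 units from a population of 100 can be sketched in Python with the standard library. This is an illustrative alternative, not the method prescribed in these notes; the seed is an arbitrary assumption used only so the draw can be repeated.

```python
import random

# Sampling frame: each case has a unique number from 1 to 100.
frame = list(range(1, 101))

rng = random.Random(42)                  # arbitrary seed for a repeatable draw
sample = sorted(rng.sample(frame, 20))   # 20 distinct cases, equal chance each

print(sample)
```

random.sample draws without replacement, so no case can be selected twice.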
Problems with simple random sampling:
It requires a good sampling frame. Sampling frames are usually available for
organisations such as schools, universities, unions, etc. For larger population
surveys of a city, a region or a country adequate sampling frames may not be
available.
If the population is spread over a large area and the data collection technique involves
travelling, the cost may be prohibitive.
2. Systematic sampling
Systematic sampling is similar to simple random sampling.
The advantage of systematic sampling is that it is mechanically easier to create. With
this type of sampling, you select units directly from the sampling frame, without using
a table of random numbers.
Steps in selecting a systematic sample:
Obtain a sampling frame;
Determine the population size (e.g., 200);
Decide on the sample size (e.g., 50);
Calculate the sampling interval (called the sampling fraction by some authors): divide
the population size by the sample size (200/50=4);
Select a starting point by choosing a random number that falls within the sampling
interval (a number between 1 and 4, e.g., select number 3);
Use the sampling interval to select every nth case. (In our example, start at case 3,
select every 4th case and obtain 50 cases. So the sequence will be: number 3, number 7, 11,
15, ...).
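The steps above can be sketched as a short function. This is a minimal illustration assuming the frame is an ordered Python list; the function name systematic_sample and its parameters are invented for the example. Case positions are counted from 1, as in the text.

```python
import random


def systematic_sample(frame, sample_size, start=None, rng=None):
    """Select every k-th case from the frame, where k = len(frame) // sample_size."""
    interval = len(frame) // sample_size
    if interval < 1:
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    if start is None:
        # Random starting point between 1 and the interval (inclusive).
        start = (rng or random.Random()).randint(1, interval)
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    positions = range(start, start + interval * sample_size, interval)
    return [frame[pos - 1] for pos in positions]  # positions are 1-based


# Example from the text: population of 200, sample of 50, starting at case 3.
frame = list(range(1, 201))
sample = systematic_sample(frame, 50, start=3)
print(sample[:4])   # [3, 7, 11, 15]
print(len(sample))  # 50
```

In practice the starting point would be left to chance by omitting the start argument; it is fixed here only to reproduce the sequence in the text.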
Problems with systematic sampling:
Similar to problems associated with simple random sampling.
Additional problem:
Periodicity of sampling frames: a certain type of person may reoccur at regular
intervals within the sampling frame. For example, if the sampling frame is a list of
names, you can obtain a sample that lacks names that appear infrequently, say
names beginning with the letter X. Another example: suppose we have a list of
married couples arranged so that every husband's name is followed by his wife's
name. If the sampling interval is an even number, the sample will contain only
husbands or only wives, depending on the starting point.
Systematic sampling should not be used if repetition is a natural component of
the sampling frame. You should reorder the list or adjust the sampling intervals in
order to use systematic sampling.
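The married couples example can be demonstrated directly. The sketch below builds a hypothetical frame of 100 couples listed husband first, then selects every 4th and every 5th entry. The frame contents and the helper name every_kth are assumptions made for illustration.

```python
# Hypothetical frame: 100 couples, each husband immediately followed by his wife.
frame = []
for couple in range(1, 101):
    frame.append(("husband", couple))
    frame.append(("wife", couple))


def every_kth(frame, interval, start):
    """Systematic selection: 1-based starting position, then every k-th entry."""
    return frame[start - 1::interval]


even_start_1 = every_kth(frame, interval=4, start=1)  # positions 1, 5, 9, ...
even_start_2 = every_kth(frame, interval=4, start=2)  # positions 2, 6, 10, ...
odd_interval = every_kth(frame, interval=5, start=1)  # positions 1, 6, 11, ...

print({role for role, _ in even_start_1})  # only husbands
print({role for role, _ in even_start_2})  # only wives
print({role for role, _ in odd_interval})  # both husbands and wives
```

An even interval locks the selection onto one position within each couple, which is why reordering the list or changing the interval removes the problem.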
3. Stratified sampling
Stratified sampling is a modification of simple random sampling designed to
produce more representative samples. For a sample to be representative the
proportions of various groups in the sample should be the same as in the population.
For example, if in a study the ethnic background of respondents is expected to affect
participants' responses, then we need to ensure that each ethnic group is
represented in the sample in its correct proportion.
Steps in selecting a stratified sample:
Select the stratifying variable (e.g., ethnic background);
Divide the sampling frame into separate lists (strata), one for each category of
the stratifying variable;
Select a simple random or systematic sample from each stratum.
Advantage of stratified sampling:
It ensures representation from each stratum. Hence a more representative
sample is obtained.
Disadvantages:
Similar to problems associated with simple random sampling.
Additional problems:
More complicated than simple random and systematic sampling;
Strata must be identified and justified.
There are two types of stratified sampling: proportionate stratified sampling and
disproportionate stratified sampling. In proportionate stratified sampling, the number
of units allocated to the various strata is proportional to the representation of the
strata in the population. This type of sampling is used when you need to estimate a
population parameter.
If you need to perform a detailed analysis within a relatively small stratum or
compare strata to each other, proportionate stratified sampling may not yield
sufficient numbers of cases in some of the strata for such analyses. In this situation,
disproportionate stratified sampling is more appropriate. For example, you can
allocate the same sample size for each stratum and then compare those strata.
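The difference between proportionate and disproportionate allocation can be sketched as follows. The example is hypothetical: it assumes an organisation with four departments of 400, 300, 200 and 100 employees and a total sample of 100. The function names and the seed are invented for illustration, and rounding the allocation may leave the total one or two cases off the target when stratum sizes do not divide evenly.

```python
import random


def proportionate_allocation(stratum_sizes, total_sample):
    """Allocate the sample to strata in proportion to their population sizes."""
    population_size = sum(stratum_sizes.values())
    return {name: round(total_sample * size / population_size)
            for name, size in stratum_sizes.items()}


def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample of the allocated size within each stratum."""
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}


rng = random.Random(3)  # arbitrary seed

# Hypothetical sampling frame, already divided into strata by department.
strata = {
    "sales": [f"sales_{i}" for i in range(400)],
    "marketing": [f"marketing_{i}" for i in range(300)],
    "research": [f"research_{i}" for i in range(200)],
    "advertising": [f"advertising_{i}" for i in range(100)],
}
sizes = {name: len(members) for name, members in strata.items()}

proportionate = proportionate_allocation(sizes, 100)  # mirrors the population
equal = {name: 25 for name in strata}                 # disproportionate: same n per stratum

sample_prop = stratified_sample(strata, proportionate, rng)
sample_equal = stratified_sample(strata, equal, rng)

print(proportionate)
print({name: len(cases) for name, cases in sample_equal.items()})
```

With proportionate allocation the smallest department contributes only 10 cases, which may be too few for a comparison between departments; the equal allocation gives every stratum 25.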
4. Cluster sampling
A cluster is a naturally occurring unit, such as a school, a university, a hospital, a
city, or a state. Cluster sampling is usually performed when a proper sampling frame
is not available. For example, you cannot obtain a list of all patients in city hospitals
or all members of sporting clubs, but you can obtain lists of hospitals and sporting
clubs.
Steps in selecting a cluster sample:
Clusters are randomly selected;
All members of the selected clusters are included in the sample.
Multistage cluster sampling (an extension of cluster sampling):
Clusters are randomly selected;
A sample is drawn from the cluster members by simple random or systematic
sampling.
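Both procedures can be sketched with a hypothetical chain of 10 hotels, each with 30 employees. The hotel and employee identifiers, the numbers chosen at each stage and the seed are all assumptions made for the example.

```python
import random

rng = random.Random(7)  # arbitrary seed

# Hypothetical clusters: 10 hotels, each with a list of 30 employee IDs.
hotels = {f"hotel_{h}": [f"h{h}_emp{e}" for e in range(1, 31)]
          for h in range(1, 11)}

# Stage 1 (both designs): randomly select 5 of the 10 clusters.
chosen_hotels = rng.sample(sorted(hotels), 5)

# Cluster sampling: every member of each selected cluster is included.
cluster_sample = [emp for hotel in chosen_hotels for emp in hotels[hotel]]

# Multistage cluster sampling: a random sample of 10 within each selected cluster.
multistage_sample = [emp for hotel in chosen_hotels
                     for emp in rng.sample(hotels[hotel], 10)]

print(chosen_hotels)
print(len(cluster_sample), len(multistage_sample))
```

Only a list of clusters is needed at the first stage; lists of individual members are required only for the clusters that were actually selected.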
Stratified vs. Cluster Sampling

With cluster sampling, you start with naturally occurring units.
With stratified sampling, you create the groups.
In cluster sampling only some clusters are selected.
In stratified sampling, all strata are represented in the sample.
Example: Stratified Sampling
The employees of an organisation were grouped according to their departments
(sales, marketing, research, and advertising);
Ten employees were selected at random from each department.
Example: Cluster Sampling
Five of the ABC Hotel chain's 10 hotels were chosen at random;
All employees in the chosen hotels were surveyed (or random samples of
employees in the chosen hotels were surveyed, which is multistage cluster sampling).
Example: Steps in sampling the population of a city for which there was no
sampling frame of residents:
Divide the city into clusters (e.g., electorates);
Select a random sample of these clusters;
Obtain a list of smaller areas of selected electorates (e.g., suburbs);
Obtain a random sample of suburbs within each of the selected electorates;
For each selected suburb obtain a list of addresses of households;
Select a random sample of addresses within the selected suburbs.
Non-Probability Sampling
Types of Non-Probability Sampling:
1. Quota sampling;
2. Purposive or judgment sampling;
3. Snowball sampling;
4. Availability or convenience sampling
1. Quota Sampling
Quota sampling aims to produce representative samples without random selection of
cases. The population is first divided into subgroups (e.g., males and females,
younger and older). The proportion of people who fall into each subgroup (e.g.,
younger and older males and younger and older females) is then estimated. Quotas
are then assigned for the interviewer to complete the required number of interviews
within each group. For example, in a telephone survey, the interviewer will be
required to work through the list until the required number is achieved. In quota
sampling the sample is arranged so that it mirrors the population with respect to the
defined groups.
Quota sampling is non-random because interviewers can select any cases that fit
specific criteria. Unlike stratified random samples, quota samples are not selected
with a known probability; therefore the sampling error cannot be determined.
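The mechanics of filling quotas can be sketched as below. The example is hypothetical: the stream of contacts, the quota sizes and the function name fill_quotas are invented for illustration. Note that nothing in the procedure is random; cases are accepted in whatever order they happen to be reached, which is why no selection probability can be attached to them.

```python
def fill_quotas(respondents, quotas, group_of):
    """Accept respondents in the order encountered until every quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:   # this group's quota is still open
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):   # all quotas have been met
            break
    return accepted, remaining


# Hypothetical contact list: (id, sex, age band), in the order the interviewer reaches people.
contacts = [(i,
             "female" if i % 2 else "male",
             "older" if i % 3 == 0 else "younger")
            for i in range(1, 61)]

# Hypothetical quotas for the four sex-by-age subgroups.
quotas = {("male", "younger"): 3, ("male", "older"): 2,
          ("female", "younger"): 3, ("female", "older"): 2}

accepted, remaining = fill_quotas(contacts, quotas,
                                  group_of=lambda person: (person[1], person[2]))
print(len(accepted), remaining)
```

Contacts belonging to a group whose quota is already full are simply turned away, as in the interviewer's procedure described above.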
2. Purposive or Judgment Sampling

Members of the sample are chosen on the basis that the researcher believes they
will be representative of the target population. For example, in a study of leaders of
the conservation movement, the researcher may choose some typical leaders from a
number of typical conservation groups for the sample.
3. Snowball sampling
Respondents are referred to the researcher via word-of-mouth in situations where it
is difficult to locate the members of the population. After interviewing, those
respondents are asked to identify other members of the population. As newly
identified members name others, the sample snowballs. The process is repeated
until the required sample size is achieved.
This technique is used when a population listing is not available and cannot be
compiled. For example, surveys of teenage gang members, illegal immigrants and
marijuana users might use snowball sampling.
4. Availability or Convenience Sampling

A convenience sample is a group of individuals who are available and willing to
participate. Respondents are selected on the basis of ease of access to the
researcher (e.g., interviewers may position themselves in a location that is convenient
for respondents, such as a shopping centre).
Availability samples are least likely of any technique to produce representative
samples. People who voluntarily participate in the survey may be different in
important ways from those who do not participate.
Availability sampling can be used for pilot testing questionnaires or exploratory
research.
Example
Bryman (2012), Chapter 8: Sampling
Kinsey et al. (1948): Sexual Behavior in the Human Male
Alfred Kinsey was a sexologist who suspected that there was a greater diversity of sexual
behaviour in the USA than had so far been acknowledged. He set out to investigate this by
collecting the personal narratives of 18,000 men (and later a sample of women), inviting
them to write about their sexual life histories. There were two main stages of recruitment in
this study. At first, Kinsey was content to use a non-probability, snowball sample of
volunteers, including students, prison inmates, colleagues and members of gay clubs, who
would put him in touch with other people they knew. He justified this on the grounds that
sexuality was a private and sensitive issue for which random, probability sampling would be
inappropriate: he could not really approach strangers on the street and ask them to provide
an honest and detailed account of their sexual experiences! Later, Kinsey was criticised for
using unscientific methods, for it was argued that his sample was biased towards those who
were forthcoming enough to volunteer their personal stories: indeed, these people did report
a higher level of sexual activity than those in his second sample. The latter was what Kinsey
called a multistage cluster sample of "100% samples", although O'Connell Davidson &
Layder (1994) argue that this technique was not rigorous enough to be a probability sample.
Kinsey had identified geographical clusters across the USA and divided these into further
sub-clusters; he then broke these down into cells representing different social groups,
stratified by age, residence, religious affiliation and so on. He aimed to collect data from
every male adult in each cell, but the cells themselves had not been randomly selected;
Kinsey chose those that were most convenient and accessible to him. This meant that
although the sample was very large, it was not drawn randomly and systematically from a
sampling frame and so could not be deemed representative of the American male
population. It is also difficult to know whether the high response rate of those who
volunteered for the study concealed another segment of the population who had not
responded and who might have had quite different stories to tell.
Sources:
Kinsey, A., Pomeroy, W. & Martin, C. (1948). Sexual behavior in the human male.
Philadelphia: W.B Saunders.
O'Connell Davidson, J. & Layder, D. (1994). Methods, sex and madness. London: Routledge.
Authored by Tom Owens.

Oxford University Press, 2012. All rights reserved.
Exercise 1
(Fink, V.7, p.63)
Name the sampling method used in each of these four scenarios.
a) Two of four software companies are randomly chosen to participate in a new
work-at-home program. Thirty employees are selected at random from each of
the two companies and asked to complete a self-administered questionnaire by
electronic mail.
b) Each of the rangers surveyed at five national parks is asked to recommend two
other rangers to participate.
c) To be eligible to participate in a particular survey, students must attend a local
high school and speak English. Students with poor attendance records will be
excluded. All remaining students will be surveyed.
d) A self-administered survey to evaluate the quality of medical care is completed by
the first 100 patients who seek preventive care. The objective is to find out
whether the patients are satisfied with the advice and education given to them.
----------------------------------------------------------------------------------------------------------------
e) A researcher wants to know whether people who buy organic food differ from
those who do not in regard to gender, age, educational level and income. The
researcher wants to be able to compare the two groups statistically. She decides
to administer a self-completion questionnaire to shoppers outside two shopping
centres. The researcher expects that non-organic buyers will outnumber organic
buyers. She decides to include 100 participants in each group. The researcher
will stand outside the stores and ask shoppers if they are willing to answer some
questions. The first question will be: "Do you regularly purchase organic food?"
At first, it does not matter if an individual says "yes" or "no" because the
researcher wants 100 in each group. After a few days, however, the researcher
has the 100 non-organic buyers. She then tells people who answer "no" to the
first question, "Thank you for your time. That's all I needed to know." She
continues to recruit organic buyers until the required number of 100 is met.
f) Researchers conducted a survey in a high school to estimate the time that
students spend doing homework per week. The time is likely to vary more
between year levels than within year levels. Therefore, the researchers randomly
selected 50 students from each year level.
Exercise 2
(de Vaus, p. 92)
Think of a research topic in which you would need to use non-probability sampling
techniques and explain why a probability sample would not be feasible.
Exercise 3
Think of a research topic in which you would use stratified sampling techniques
instead of simple random or systematic sampling. Explain why.
Sample Size
Sample size depends on:
The degree of accuracy we require for the sample, defined by:
- Sampling error: the amount of error we are willing to accept in our estimated
  value (e.g., to within ±2% of the estimated value);
- Confidence level: the level of confidence we can have in our generalisations
  (e.g., 95% confidence level).
How the variable is spread in the population:
If we don't know how the variable is spread in the population, we choose the
conservative value, a 50/50 split on the variable (say, we assume that our
population consists of 50% of people who would answer "yes" to a question and
50% of people who would answer "no").
In Table 1 the sample sizes required for various sampling errors at 95% confidence
level are presented. The figures in the Table are calculated for SIMPLE RANDOM
SAMPLING.
Table 1
Sample Sizes Required for Various Sampling Errors at 95% Confidence Level

Margin of error (%)    Sample size (for 50/50 split on the variable)
 1.0                   9604
 1.5                   4268
 2.0                   2401
 2.5                   1537
 3.0                   1067
 3.5                    784
 4.0                    600
 4.5                    474
 5.0                    384
 5.5                    317
 6.0                    267
 6.5                    227
 7.0                    196
 7.5                    171
 8.0                    150
 8.5                    133
 9.0                    119
 9.5                    106
10.0                     96
For example, if in a sample of 2401 respondents it was found that 50% intended to
vote for the Labour Party, we can be 95% confident that 50% ± 2% (i.e. between
48% and 52%) of the population intends to vote Labour.
You can use the following website to calculate the required sample size for your
study:
http://www.surveysystem.com/sscalc.htm
Sample size calculation:

The half-width (E) of a 95% confidence interval for a proportion is approximately
$$E = 1.96 \sqrt{\frac{p(1-p)}{n}}$$
Making n the subject of the equation gives
$$n = \frac{1.96^2 \, p(1-p)}{E^2}$$
For example, if the amount of error we are willing to accept is 2% (0.02) and we
assume a 50/50 split (p = 0.5), then the required sample size would be
$$n = \frac{1.96^2 \times 0.5 \times 0.5}{0.02^2} = 2401$$
The half-width of a 95% confidence interval for a mean is approximately
$$E = 1.96 \frac{sd}{\sqrt{n}}$$
Making n the subject of the equation gives
$$n = \left(\frac{1.96 \times sd}{E}\right)^2$$
This equation requires a value of sd, the estimated standard deviation of the variable we
are going to measure in the survey. As we don't know the value of sd before
conducting the study, we have to resort to a reasonable guess, either from a
previous study or a pilot study. Or we can rely on the fact that approximately 95% of
the normal distribution lies within two standard deviations of the mean. So we can
use the formula range ≈ 4sd (that is, sd ≈ range/4) to get a value of sd for substituting
into the sample size formula.
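The sample size calculations can be sketched in Python as a check on the arithmetic. The sketch assumes a multiplier of 1.96 for 95% confidence and simple random sampling with no finite population correction; the function names are invented, and the self-study example (a plausible range of 0 to 8 hours, so sd of about 2, with an acceptable error of 0.25 hours) is hypothetical.

```python
import math

Z95 = 1.96  # multiplier for a 95% confidence level


def sample_size_proportion(margin, p=0.5):
    """Unrounded n for a proportion: 1.96^2 * p * (1 - p) / margin^2."""
    return Z95 ** 2 * p * (1 - p) / margin ** 2


def sample_size_mean(margin, sd):
    """Unrounded n for a mean: (1.96 * sd / margin)^2."""
    return (Z95 * sd / margin) ** 2


# Reproduce a few rows of Table 1 (50/50 split, rounded to the nearest whole number).
for margin_percent in (1.0, 2.0, 5.0, 10.0):
    n = sample_size_proportion(margin_percent / 100)
    print(f"{margin_percent:4.1f}%  ->  {round(n)}")

# Hypothetical mean example: range of about 8 hours, so sd is roughly 8 / 4 = 2.
sd_guess = 8 / 4
n_mean = math.ceil(sample_size_mean(0.25, sd_guess))
print(n_mean)
```

The functions return unrounded values so that the caller can decide how to round; rounding up is the cautious choice when planning a study.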
Points to consider:
As can be seen from Table 1, for small samples a small increase in sample size
will lead to a substantial increase in accuracy. For example, increasing the
sample from 96 to 150 respondents reduces sampling error by 2 percentage points
(from 10% to 8%). To reduce error from 4% to 2% you would require 2401 respondents
instead of 600. The rule is: to halve the sampling error you need to
quadruple the sample size.
Considerations of sample size are likely to be affected by matters of time and
cost because very big samples are decreasingly cost efficient.
The size of the population is irrelevant for the accuracy of the sample. In
other words, the size of the population from which a sample of a particular size is
drawn has no impact on how well the sample is likely to describe the population.
A sample of 200 people will describe a population of 10,000 and a population of
10 million with practically the same degree of accuracy.
The notion that the size of the population is irrelevant for the accuracy of the
sample relates to very large populations. From the formula for calculating
sampling error
$$SE = \frac{sd}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
where n = sample size, N = population size,
you can see that if population size (N) is large, $\sqrt{\frac{N-n}{N-1}}$ becomes approximately
equal to 1. Then SE becomes approximately $\frac{sd}{\sqrt{n}}$, and the size of the population
becomes irrelevant.
The choice of sample size also depends on the heterogeneity of the
population: the greater the heterogeneity of the population, the bigger the
sample will need to be and vice versa. For example, for a population in which
most people will answer a question in the same way, a smaller sample will be
required.
If there is no variability in the population the sampling error would be zero. This
is reflected in the formula for calculating sampling error:
$$SE = \frac{sd}{\sqrt{n}} = \frac{0}{\sqrt{n}} = 0$$
If the population variance increases, the sample variance would increase too,
and the sampling error would become bigger.
Or, in case of calculating proportions:
Imagine, for example, you are studying voting intentions and all members of the
population intend to vote for candidate X. There is no variability in the population
and the sampling error is zero in this case:
$$SE = \sqrt{\frac{1 \times 0}{n}} = 0$$
If there is some variability, say 90% intend to vote for candidate X and 10%
intend to vote for candidate Y, then
$$SE = \sqrt{\frac{0.9 \times 0.1}{n}} = \frac{0.3}{\sqrt{n}}$$
If variability is greater, say 50% would vote for candidate X and 50% for
candidate Y, the sampling error would be bigger:
$$SE = \sqrt{\frac{0.5 \times 0.5}{n}} = \frac{0.5}{\sqrt{n}}$$
Researchers seldom base a sample size decision on the precision needed for a
single variable. If you need to break the sample into subgroups
(e.g., males and females), the degree of accuracy and variation within each
group should determine the sample size required for each group;
Table 1 and the equations on which figures are based apply to simple random
samples. Many studies use other types of sampling. More often than not, tables
will underestimate the sampling error. Systematic sampling should produce
sampling errors similar to simple random samples. Stratified samples can
produce smaller sampling errors. Cluster sampling tends to produce higher
sampling errors.
There will be errors from sources other than sampling. Therefore, the calculation
of precision based on sampling error alone can be an unrealistic oversimplification.
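Several of the points above can be checked numerically. The sketch below is illustrative only: it assumes simple random sampling, a 95% multiplier of 1.96 and the common form of the finite population correction, sqrt((N - n) / (N - 1)). The function names are invented for the example.

```python
import math


def se_proportion(p, n, N=None):
    """Standard error of a proportion; the assumed fpc is applied only if N is given."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se


def margin95(p, n, N=None):
    """Half-width of the 95% confidence interval."""
    return 1.96 * se_proportion(p, n, N)


# 1. Quadrupling the sample size (roughly 600 to 2401) halves the margin of error.
print(f"n = 600 : {margin95(0.5, 600):.4f}")
print(f"n = 2401: {margin95(0.5, 2401):.4f}")

# 2. Population size barely matters once it is large: same n = 200, very different N.
for N in (10_000, 10_000_000):
    print(f"N = {N:>10,}: {margin95(0.5, 200, N):.4f}")

# 3. Heterogeneity: the more even the split, the larger the standard error (n = 100).
for p in (1.0, 0.9, 0.5):
    print(f"p = {p}: SE = {se_proportion(p, 100):.3f}")
```

These figures describe sampling error only; as noted above, they say nothing about non-response, frame problems or measurement error.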
Exercise 4
Access Bryman's Social Research Methods (4th ed.) online resources:
http://www.oup.com/uk/orc/bin/9780199588053/
Click on "Multiple choice questions"
Go to Chapter 8
Answer the questions and get your score.
Bibliography
Fink, A. (2003). The survey kit, Volume 7: How to sample in surveys. (2nd ed.)
Fowler, F.J. (2009). Survey research methods (4th ed.). London: Sage Publications.
Sarantakos, S. (2013). Social research (4th ed.). Basingstoke: Palgrave Macmillan.

Exercise 1 (p. 17)
a) Multistage cluster sampling
b) Snowball sampling
c) No sampling: all eligible students (the entire population) will be surveyed
d) Convenience sampling
e) Quota sampling
f) Stratified sampling
Module 1
Topic 3: Methods of Data Collection
STA60004
Contents
Learning Objectives
Optional Reading
Data Collection Methods
1. Face to Face Interviews
2. Telephone Methods
3. Postal Surveys
4. Online Methods
Exercise 1
Choosing the Right Method
Multi-Mode Methods
5. Observation Methods
6. Using Secondary Data
Exercise 2
Exercise 3
Bibliography
Answers to Selected Exercises
Learning Objectives
1. Describe the most commonly used methods for collecting survey data;
2. Understand the advantages and disadvantages of each of these methods of data
collection;
3. Explain the relevance of the research questions, target population, expected
response rates, resources and time constraints in determining the most
appropriate method of data collection;
4. Evaluate which data collection method might be the most appropriate in a given
situation;
5. Appreciate the value of secondary data;
6. Identify sources of secondary data;
7. Be familiar with Australian Bureau of Statistics online resources.
Optional Reading
Chapters 9, 10, & 12
Chapter 8
Data Collection Methods
In this topic, the main methods of data collection are discussed. The three
main methods used to collect data are direct measurement, questionnaires and
observation.
Direct measurement
Direct measurement involves testing subjects or otherwise directly counting or
measuring data. Examples:
- Testing cholesterol levels;
- Counting ballots in a local election.
Interviews and self-completion questionnaires
This technique involves soliciting self-reported verbal information from people
about themselves.
Observation
Observation involves the direct study of behaviour by watching the subjects of the
study without intruding on them and recording certain critical natural responses to
their environment.
Another method of data collection is secondary research.
Secondary research consists of compiling and analysing data that have already
been collected by other researchers. Certain data may already exist that can
serve to satisfy the research requirements of a particular study. Researchers
should always investigate existing sources of information as a first step in the
research process to take advantage of information that has already been
collected.
The following outline provides an overview of data collection methods.
Data Collection Methods
- Using Secondary Data
- Primary Data Collection
  - Direct Measurement
  - Interviews and self-completion questionnaires
    - Self-Completion Questionnaires
      - Paper Surveys
        - Mail out/Mail back
        - Interviewer Delivery/Pick up
      - Internet Surveys
        - Web-based
        - E-mail based (Embedded or Attached)
    - With Interviewer
      - Face-to-Face Interviews
        - Individual Interviews (At Central Location or At Respondent's Location)
        - Group Interviews/Focus Groups (At Central Location)
      - Telephone Interviews
  - Observation
    - Overt
    - Covert
1. Face to Face Interviews

In face to face interviews a trained interviewer administers a questionnaire
personally to a respondent and records the respondent's answers (electronically or
using a hard copy).
Computer Assisted Personal Interviews (CAPI);
Paper and Pencil Interviews (PAPI).
Face to Face Interview Advantages
Allows a significant level of interaction between the respondent and the
interviewer:
- Interviewers can answer respondents' questions;
- Interviewers can clarify misunderstandings;
- Interviewers can probe answers to open-ended questions;
Can use body language, visual and auditory cues to encourage participation;
Questionnaires can be reasonably complex;
Allows accurate recording of responses.
Face to Face Interview Disadvantages
Possible interviewer bias;
Respondent may not like, or may not react well to, the interviewer;
Tiredness or impatience can affect the quality of responses;
Responses to personal, sensitive or controversial questions can be affected by
social desirability considerations;
Some interviewers can contaminate results (e.g., they may impose their own
interpretation on questions);
Costly, especially if the sample is spread over a wide area;
Time consuming (especially if paper copy questionnaires are used rather than
CAPI)
Types of Face to Face Interviews
Individual Interviews
- At a central location;
- At the respondent's location.
Group Interviews
- At a central location.
Individual Interviews at Central Location

Involves interviewing respondents at a central location;
Often undertaken for convenience - members of the target population are located
in a single, easy-to-capture location (e.g., supermarkets, hospitals etc.).
Examples:
Interviewing members of the general public in city streets;
Interviewing cinema goers as they exit a movie;
Interviewing school students at schools;
Interviewing patients at hospitals;
Interviewing consumers at supermarkets.
Sampling:
For many types of central location surveys, it is difficult to produce a list from
which to select respondents. Therefore, probability sampling methods may not be
possible.
In some central locations, such as a school or a hospital, this is not an issue if a
class list or patient register is available from which names can be randomly
selected.
Face to Face Interviews - Residential surveys
Residential surveys involve interviewing respondents in their own homes.
Advantages over Central Location Surveys:
Probability sampling is possible;
Respondents are more likely to be relaxed in their own environment;
There is more privacy for respondents;
It is easier to administer long interviews at home.
Disadvantages:
Residential interviews are more expensive than central location interviews due to
the travelling time and costs;
If the incidence of the target population is low, it may be difficult to locate
respondents;
Many people are reluctant to answer the door to a stranger;
Safety may be a concern for interviewers in some locations, or at some times of
the day.
Group Surveys/Focus Groups
A group of individuals is recruited to discuss a particular issue;
Held at a central location;
Aims to obtain the opinions of more than one person on a single occasion;
A group discussion is a less structured form of data collection than a structured
survey, and is guided by the person moderating the group.
Basic assumptions that underlie this method:
A group environment, through mutual stimulation, encourages discussion;
Enables the moderator to lead the discussion towards focal points and topical
issues through encouragement or discouragement or manipulation of the
environment;
Increases the motivation to address social and especially critical issues.
Group Discussions Advantages:
Useful for helping define the survey problem;
Useful for collecting qualitative information;
Interaction between respondents can increase the richness of information.
Group Discussions Limitations:
Need a suitable location;
Need to co-ordinate a group of people to be all available at the same time;
Need an experienced moderator;
Respondents can deviate from the topic.
2. Telephone Methods
In telephone interviews, the level of interaction between the interviewer and the
respondent is restricted to speaking and listening; it is not possible to use visual
cues.
Telephone Interview Advantages over Face-to-Face Interviewing:
Low cost per interview;
No travel costs;
Less interviewer bias;
Short turn-around time;
Instant data entry with CATI (Computer-Assisted Telephone Interview). The CATI
surveys are conducted from a central interviewing centre. This centralised
location allows greater supervision and quality control;
Can be more appropriate than face-to-face methods for interviewing people about
sensitive issues.
Telephone Interview Limitations:
Possible bias as those without phones are excluded;
Respondent resistance (invasion of privacy);
Easy for respondent to terminate interview
Methods of Obtaining a List of Telephone Numbers:
Obtaining a systematic sample from telephone directories
- The sample will be biased: people with unlisted numbers will be excluded.
Random digit dialling
Advantages:
- Avoids the need to obtain directories;
- Allows contact with unlisted numbers.
Disadvantages:
- Does not allow the researcher to distinguish between business and residential
  numbers;
- A lot of time may be spent on calling non-existent numbers;
- Counting non-existent numbers may lead to wrongly calculated response rates;
- If mobile phones are included in the frame, duplication of units then becomes
  an issue.
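The random digit dialling idea, and the effect of non-existent numbers on the response rate, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any survey organisation works: the prefix "0355", the function names and all call counts are made up.

```python
import random


def random_digit_numbers(prefix, n_suffix_digits, count, seed=None):
    """Generate `count` distinct phone numbers: a fixed prefix plus random digits.

    Using a set removes duplicates, so no number is dialled twice.
    """
    if count > 10 ** n_suffix_digits:
        raise ValueError("count exceeds the number of possible suffixes")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < count:
        suffix = "".join(rng.choice("0123456789") for _ in range(n_suffix_digits))
        numbers.add(prefix + suffix)
    return sorted(numbers)


def response_rate(completed, dialled, ineligible):
    """Completed interviews divided by eligible numbers only.

    `ineligible` covers non-existent and business numbers, which should not
    count against the response rate.
    """
    return completed / (dialled - ineligible)


sample = random_digit_numbers("0355", 6, 5, seed=42)  # hypothetical prefix

# Made-up counts: 1000 numbers dialled, 400 non-existent or business, 300 interviews.
naive = response_rate(300, 1000, 0)       # 0.30: ineligible numbers wrongly included
adjusted = response_rate(300, 1000, 400)  # 0.50: ineligible numbers excluded
```

Leaving the ineligible numbers in the denominator understates the response rate, which is the miscalculation the last disadvantages warn about.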
Read the following article to learn about using mobile phones for survey research:
Vicente, P., Reis, E., & Santos, M. (2009). Using mobile phones for survey research:
A comparison with fixed phones. International Journal of Market Research, 51, 613-633.
http://web.a.ebscohost.com.ezproxy.lib.swin.edu.au/ehost/pdfviewer/pdfviewer?sid=d9b2bc5f-726f-4ca0-bc0f-39a51303230a%40sessionmgr4004&vid=1&hid=4207
Introductory scripts are very important in telephone surveys. Read the following
publication to learn how to write a script for a telephone interview.
https://math.uwaterloo.ca/survey-research-centre/sites/ca.survey-research-centre/files/uploads/files/SRCIntroductoryScripts.pdf
3. Postal Surveys
Postal Surveys Advantages
Relatively inexpensive method of collecting data;
It is possible to distribute large numbers of questionnaires in a very short time;
The ability to cover a wide geographic area;
Respondent can complete the survey in their own time;
No interviewer bias;
May get better answers on sensitive questions
Postal Surveys Disadvantages
The need for questionnaires to be kept simple and straightforward to avoid
confusion or errors;
The time taken to answer correspondence or resolve queries by mail;
Usually lower response rates than other methods;
Longer response time;
Incomplete data on questionnaires;
Coding responses of PAPI surveys can be expensive;
Cannot be certain that the right person has completed the questionnaire
Incentives
Incentives can help increase response rates, but can add significantly to the cost
of the research;
Use of incentives is a contentious issue. While they may encourage response, we
cannot be sure whether they are encouraging considered responses, and
therefore valid useful information, or whether some people give any answer for
the chance to receive an incentive.
Advantages of hand-delivering questionnaires over posting them
Personal contact between the respondent and the interviewer or data collector
may improve response rates;
If the questionnaire is complicated, then complexities can be directly explained.
4. Online Methods
Email surveys:
- Included in email text;
- Attached to email;
- Email contains a link to a web survey.
Web-based surveys
Email Surveys
Email surveys are appropriate where the population can be clearly defined, and the
email addresses of population members are known.
Email surveys Advantages
Inexpensive data collection method;
Respondents can complete the survey in their own time;
Easy to monitor response/follow up non-response;
Short turn-around period for data collection;
No double handling of data: respondents enter their own data;
Can incorporate visual aids
Email surveys Limitations

Requires a sampling frame of email addresses corresponding to the population of
interest;
Sampling frames may be unavailable or incomplete for many populations;
Requires respondent to regularly check email.
Web Surveys
Web Surveys Advantages
Short turn-around period for data collection;
No double handling of data: respondents enter their own data into a database;
Can incorporate visual aids;
Easier to reach widespread population;
Simpler to implement complex branching;
Response errors are easily detected and the respondent can be prompted to give
a valid response;
Can enforce question answering;
Can randomize question order;
Limited results can be automatically available.
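Three of the web survey features listed above (response validation, branching and randomised question order) can be sketched in Python. This is an illustrative sketch only, not the behaviour of any particular survey package: the function names, the question identifiers and the skip rule are hypothetical.

```python
import random


def validate_scale(answer, low=1, high=5):
    """Return True only if the answer is a whole number on the rating scale."""
    is_whole_number = isinstance(answer, int) and not isinstance(answer, bool)
    return is_whole_number and low <= answer <= high


def next_question(current, answer):
    """Hypothetical skip rule: respondents without children skip the school rating."""
    if current == "has_children":
        return "school_rating" if answer == "yes" else "comments"
    if current == "school_rating":
        return "comments"
    return None  # end of questionnaire


def randomised_order(items, seed=None):
    """Return the items in random order without changing the original list."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled


# An invalid response can be detected and the respondent prompted again.
print(validate_scale(7))                     # False
print(next_question("has_children", "no"))   # comments
```

In a paper questionnaire the same branching would need written "go to" instructions, and an out-of-range answer could only be found after the form was returned.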
Graphically sophisticated surveys are mostly used in market research. See for
example:
http://ecustomeropinions.com/survey/survey.php?sid=715363891
http://net05.mwm2.nl/go.aspx?vp=B62AA890-9312-4D87-AEC5-5C74597A84DC
Sometimes researchers ask participants to respond to a video or audio clip. For
example, to measure participant perceptions of a political candidate's positions on
foreign policy the researcher could include a video clip from a recent speech.
Multimedia can also be used when testing children. Children like a colourful and
animated interface. Also, video and audio messages can guide participants through
an online survey.
The main benefits of using graphics:

Graphics generate interest and attract attention;
Graphics can be powerful when integrated with text;
Graphics can make concepts easier to understand;
The correct use of colour can reinforce the product brand;
Colour, fonts, and graphics can help the reader comprehend an idea.
Web Surveys Limitations

Some respondents may participate more than once;
Probability sampling is not possible if the sampling frame of email addresses
corresponding to the population of interest does not exist.
A number of low cost online software packages provide easy interfaces for building
surveys and viewing reports online. For example:
Opinio (www.objectplanet.com/opinio/)
This product is available to Swinburne University staff and postgraduate research
students (www.swinburne.edu.au/lt/opinio.html)
SurveyMonkey (www.surveymonkey.com)
Works well for basic surveys;
Offers a free version that might be useful for very small surveys. This version
allows very little customisation of the look of the survey, is limited to 10
questions per survey, and can collect only 100 responses per survey;
The Pro versions (from $19/month to $65/month) offer some advanced features
and allow you to export results to other programs including SPSS.
Creative Research Systems (www.surveysystem.com)
SurveyGizmo (www.surveygizmo.com)
You can also create questionnaires with the use of Google Forms.
Exercise 1
Develop the following questionnaire using Google Forms:
Parent Opinion Questionnaire
(Note: the questionnaire is intentionally simplified)
Q.1 Your gender: Male Female
Q.2 Your age: 20-25 years old 26-30 31-35 36-40 41-45 50+
Q.3 The academic standards at this school provide adequate challenge for my child
Strongly disagree 1 2 3 4 5 Strongly agree
Q.4 In this question, please indicate which elements of a parent-teacher interview
are IMPORTANT to you (choose as many as you wish):
Information on my child's participation in the classroom
Information on my child's academic achievement in relation to the rest of the
class
Suggestions on how my child could improve
Information on my child's relationships with other children
Receiving clear answers to my questions
Please use the space below if there are any comments you would like to make about
this school.
To use Google Forms you need to open a Google Account at

https://accounts.google.com/
After signing in to your account, click on Apps
The following Applications appear:

Click on More
Click on Forms
Alternatively, you can click on Drive, then click New in the top left corner,
hover over More, and select Google Forms
The following web page appears:
Fill in the title of your questionnaire, Parent Opinion Survey, by replacing the
text in the Untitled Form box.
Choose a Theme for your questionnaire
Click OK
The following dialog box will appear:

Enter Question Title: Your gender:

Use Multiple Choice as Question Type;
Type Male in Option 1;
Type Female in Option 2;
Click on Done
To enter the second question, click on Add item
Use Choose from a list as Question Type;

Type Your age: in Question Title
Type in Response categories: 20-25, 26-30, etc.;
Click on Done
Click on Add item

For Question 3, "The academic standards at this school provide adequate
challenge for my child", use Scale as Question Type;
Type Strongly disagree in Option 1;
Type Strongly agree in Option 2;
Click on Done
Click on Add item
Q.4 In this question, please indicate which elements of a parent-teacher interview
are IMPORTANT to you (choose as many as you wish):
Information on my child's participation in the classroom
Information on my child's academic achievement in relation to the rest of the class
Suggestions on how my child could improve
Information on my child's relationships with other children
Receiving clear answers to my questions
Use Checkboxes to create Q.4
Click on Done
Click on Add item

Use Paragraph text to create an open response question, Please use the
space below if there are any comments you would like to make about this
school.
Click on Send form

The link to your questionnaire will appear in Link to share
Use this link to send your questionnaire to your participants.
To view how your questionnaire looks, click on View live form
This is how your questionnaire will look:
To view the summary of the participants' responses to your questionnaire, click on
Summary of responses:
To view individual responses, click on View responses:
The questionnaire you created will be stored in My Drive
To learn more about creating surveys using Google Forms, use the following link:
https://support.google.com/docs/answer/2839737
Choosing the Right Method

Choice of the survey data collection method will be dependent on a number of
factors:
1. Purpose of Research
For example, if the purpose of the research is to test consumers' reactions to a
new product, then some methods of data collection that do not allow consumers
to view or test samples may not be appropriate.
2. Proposed Survey Questions
Some questions are better suited to particular methods of data collection than
others:
- Complex questionnaires are best handled by interviewers who can be trained
  to lead respondents through the questions;
- Open-response questions: cues from an interviewer in face-to-face
  interviews, such as "Why else?", can encourage more detailed responses than
  might otherwise be obtained in a postal survey;
- Sensitive questions may be better handled by a self-completion survey where
  the respondent can answer privately, or a telephone survey where the
  interviewer cannot be seen. People may feel quite uncomfortable about
  revealing private information to a stranger;
- Where the sequencing of questions is complex or different groups of
  respondents are required to answer different sections of a questionnaire,
  computer-assisted methods are most appropriate.
3. Who Will Provide the Information
- If the population is widely dispersed, it may be cheaper to conduct telephone
  interviews or online surveys;
- Certain target populations may not be able to complete a written survey (e.g.,
  illiterate people, children), so it would be better to use an interview.
4. The Incidence in the General Population

In some studies the incidence of your target population in the general population
will be very low, which can make some methods of data collection very expensive.
5. Cost
- Online surveys are often the least expensive;
- Interview surveys are the most labour intensive:
  - They require people to be trained to an appropriate standard;
  - Transport for interviewers is essential for residential surveys. Transport costs
    can be a significant expense where the selected sample is widespread.
6. Time
- Online data collection methods and computer-based telephone surveys are best
  suited to rapid information collection:
  - No travelling is involved;
  - Responses are entered directly into the computer (CATI methods) or into a
    database (online surveys);
- In paper surveys, the time required depends on how long respondents take to
  complete and return their questionnaires, and the researcher has limited
  control over this.
Multi-Mode Methods
Sometimes researchers use more than one data collection method. This allows
the advantages of one method to counteract the disadvantages of another.
For example, to reduce the cost of face-to-face interviews, telephone or postal
surveys can be used for respondents who live far away. Or, if you conduct a
web-based survey, those without internet access can be interviewed using
face-to-face, postal or telephone methods.
Although multi-mode methods may produce more representative samples,
the mode of administration can affect the way people respond (so-called mode
effects).
5. Observation Methods
Observation methods can be a valuable tool for learning about behaviour, and
they may provide a more valid measurement of some behaviour than interviewing
techniques.
However, there may be some ethical issues associated with covertly surveying
people.
Observation Methods Advantages
No interviewer effects;
Good for studies about relationships/ interactions;
Suitable for studies about behaviour (e.g., children)
Observation Methods Limitations
May be time consuming;
May be difficult to objectively collect data.
Read "Structured Observation" (Bryman) to learn more about this method of
collecting data:
http://ezproxy.lib.swin.edu.au/login?url=http://onlineres.swin.edu.au/993314065.pdf
6. Using Secondary Data

What is Secondary Data?
Secondary data is data that has been collected by someone else. It includes census
data, data from other surveys presented electronically or in reports, books, journals,
newspapers, magazines etc.
Why Do We Use Secondary Data?
Designing and conducting your own survey can be an expensive and time
consuming process. For example, you may not have the resources to collect your
own data, or it may be difficult to obtain a good sample. You may require data from
different countries in order to perform a comparative analysis.
Types of Secondary Data
Quantitative (Census of Population and Housing, Official Government surveys,
research conducted by large research organizations etc.)
Qualitative (historical records, interview transcripts, newspapers, letters, diaries,
biographies etc.)
It is becoming conventional for data obtained by large research organisations to be
placed and stored in publicly accessible data archives. Available sets of data can
allow you to perform analysis of change over time where the same questions are
asked in repeated surveys or to conduct a comparative analysis where the same
questions are asked in surveys in different countries.
Usually the data are provided in aggregate form for a state or country and are not
suitable for analyses in which the individual is the unit of analysis. But you can use a
country, year or region as the unit of analysis and perform relevant research.
In Australia one of the main sources of secondary data is the Australian Bureau of
Statistics (ABS) which is responsible for collecting a range of demographic,
economic and social data, including the Census of Population and Housing which is
conducted every five years.
Making Use of the Australian Bureau of Statistics Data
The ABS provides statistics free of charge via the ABS website www.abs.gov.au.
The website is updated daily to enable access to a wide range of new product
releases.
Statistics are available on a range of topics:

Economy: Statistics are available on major economic fields that can be used to
understand trends in the economy, identify drivers of economic growth, evaluate
economic performance and for the formulation and assessment of economic policy.
Environment and Energy: Information is available which allows you to investigate the
relationship between social, environmental and economic statistics.
Industry: A broad cross-section of Australian industry data is collected on topics
including agriculture, construction, transport, tourism and the services industry.
People: Data is available on Australia's population, including education, health and
housing. This helps with monitoring the progress of society.
Regional: Statistics on regional and rural areas are available for many standard data
sets.
ABS census data
The Census of Population and Housing is held every five years. The aim of the census
is to:
measure the population;
provide certain key characteristics of everyone in Australia on census night;
better understand the dwellings in which Australian people live;
provide timely, high quality and relevant information for small geographic areas
and small population groups;
complement the information provided by other ABS surveys
Census data is provided via a range of different tools available on the ABS website.
These include:
Analytical articles;
QuickStats;
Community Profiles;
Table Builder;
DataPacks;
etc (see www.abs.gov.au)
Exercise 2
Go to http://www.abs.gov.au/
Click on Census Data
The following Online Tools will be displayed:
Let us look at the QuickStats tool, for example.

Enter the name of a location, e.g., Hawthorn
Select Hawthorn (Vic.), Vic, State Suburb (SSC).
Click on GO
The following information will be displayed:
You can also compare 2011 Census data with 2006 Census data or 2001 Census
data:
Answer the following questions from the QuickStats tables:

1. What is the median age of Hawthorn residents? Is it more or less than the
median age of Australia overall?
2. What percentage of Hawthorn residents speak Greek at home?
3. How many families live in Hawthorn?
4. Has the number of families who live in Hawthorn increased or decreased over
the last 10 years?
5. Has the number of private dwellings in Hawthorn increased or decreased over
the last 5 years?
6. Has the percentage of Hawthorn residents who speak Greek at home changed?
Advantages of Using Secondary Data

An inexpensive method of obtaining data: there is no need to collect your own data,
so more time and resources can be put into the analysis;
If the secondary data is being analysed by a number of researchers, they can
share their findings and compare results;
Secondary data can be used to make comparisons with your own data collection,
e.g. by comparing the results of a survey of the general population with Census
data you can make some assessment of the quality of the survey sample, and
possible under/over-representation of certain groups in the sample.
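The comparison of a survey sample with Census data described in the last point can be sketched as follows. This is a minimal illustration with made-up figures: the age groups, the sample counts and the "census" shares are all hypothetical and are not ABS data, and the function name is an assumption.

```python
def representation_gaps(sample_counts, census_shares):
    """Sample share minus census share for each group.

    A positive gap suggests over-representation in the sample,
    a negative gap suggests under-representation.
    """
    total = sum(sample_counts.values())
    gaps = {}
    for group, census_share in census_shares.items():
        sample_share = sample_counts.get(group, 0) / total
        gaps[group] = round(sample_share - census_share, 3)
    return gaps


# Hypothetical survey of 500 respondents and hypothetical population shares.
sample_counts = {"18-34": 120, "35-54": 200, "55+": 180}
census_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

gaps = representation_gaps(sample_counts, census_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.3f}")
```

In this made-up example the youngest group makes up 24% of the sample but 30% of the population, so it is under-represented by six percentage points.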
Issues with Secondary Data

Uncertainty about the quality of the data (e.g., issues with the survey design or
issues with the survey response/non-response bias);
Problems of relevance and comparability in relation to your research topic (e.g.,
different definitions of concepts, or the data might not match your research
needs);
The age of the data;
The format and structure of the data can be different.
Exercise 3
Case study
Bryman (2012), Chapter 10: Self-completion questionnaires
Ellen Annandale (1998): Gender and self-reported illness

Research into gendered patterns of morbidity (the incidence of illness in a
population) has provided some ambiguous results. It seems that women suffer more
ill-health than men, but opinion is divided about whether this is due to natural
biological differences between the sexes, the social roles and lifestyle practices that
women engage in, or the methodological techniques used to construct health-related
statistics. In terms of the latter, Annandale (1998) presents some contrasting findings
from different studies. On the one hand, official statistics indicate that women consult
their GPs more than men and are more likely to be hospitalised for mental disorders
(Pilgrim & Rogers, 1993). Similarly, the results of the 1994 General Household
Survey, which is based on structured interviews, showed that women were much
more likely to report suffering from long-standing illness or restricted activity in the
two weeks prior to the interview (OPCS, 1995).
However, when similar questions were asked in a self-report questionnaire (the 1994
Health Survey for England [Department of Health, 1996]), these gender differences
appeared to be much smaller. Although men were slightly more likely than women to
report "very good" health, women were more likely than men to report "good" health,
and the figures for "fair", "bad" and "very bad" health were almost identical. Annandale
therefore argues that we should be wary of overestimating the female excess of
ill-health in the UK population because these gender differences may simply be an
artefact of the measurement process. For example, the similarity of male and female
responses to the Health Survey for England could mean that men were more likely to
admit to feeling ill in a self-report questionnaire than in a structured interview,
because health is a personal, sensitive issue and the self-completion questionnaire
allows for more privacy in responding.
On the other hand, it could mean that women were less likely to report ill-health in a
self-completion questionnaire because there is no interviewer present to build
rapport and to probe them for answers. It may also be that the response rate to the
Health Survey for England was somewhat lower than that of the General Household
Survey and resulted in a smaller, less representative and biased sample of the
population. This reminds us of the need to consider the reliability, validity and
generalizability of different research strategies when interpreting the results of social
surveys.
Source:
Annandale, E. (1998). The sociology of health and medicine: A critical introduction.
Cambridge: Polity Press.
Other references:
Department of Health (1996). Health survey for England, 1994, Vol.1. London:
HMSO.

OPCS (1995) Living in Britain: Results from the 1994 General Household Survey.
London: HMSO.
Pilgrim, D. & Rogers, A. (1993) A sociology of mental health and illness.
Buckingham: Open University Press.
Question:
1. If women consult their local doctor more often than men, does this indicate that
women are ill more often than men? Explain.
(Discuss the question on the Blackboard/ Discussion Board)

Bibliography
Boyce, J. (2003). Market research in practice. Sydney: McGraw Hill.
Sarantakos, S. (2013). Social research (4th ed.). Basingstoke: Palgrave Macmillan.

Answers to Selected Exercises
Exercise 2 (p. 33)
1. Median age of Hawthorn residents in 2011 = 31 years.
Median age of Australians in 2011 = 37 years.
The median age in Hawthorn is therefore six years lower than the median age for
Australia overall.
2. The percentage of Hawthorn residents who speak Greek at home = 1.8%.
3. 4767 families live in Hawthorn.
4. In 2011, 4767 families lived in Hawthorn.
In 2001, 4034 families lived in Hawthorn.
The number of families has increased by 733.
5. In 2011, there were 10333 private dwellings in Hawthorn.
In 2006, there were 9575 private dwellings in Hawthorn.
The number of dwellings has increased by 758.
6. The percentage of Hawthorn residents who speak Greek at home in 2011 =
1.8%.
The percentage of Hawthorn residents who speak Greek at home in 2006 =
1.9%.
The percentage of Hawthorn residents who speak Greek at home in 2001 =
2.2%.
There has been a decrease in the percentage of Hawthorn residents who speak
Greek at home.

Module 1
Topic 4: Developing a Questionnaire
STA60004
Contents
Learning Objectives
Optional Reading
What can we measure in a questionnaire?
Types of questions
Behavioural Questions
Belief Questions
Knowledge Questions
Attitude Questions
Attribute Questions
Exercise 1
Principles of Question design
Relevance
Reliability
Validity
Discrimination
Question Wording
Exercise 2
Exercise 3
Closed and Open-Ended Questions
Principles of Developing Question Response Formats
Exhaustiveness (or Inclusiveness)
Exclusiveness
Balancing categories
Types of Question Response Formats
Simple Itemised Rating Scales
Likert Scale
Horizontal Rating Scales
Semantic Differential Scales
Ranking Scales
Checklists
Dichotomous questions
Paired comparisons
Exercise 4
Exercise 5
Response Sets
Order of Questions in the Questionnaire
Pilot Testing or Pretesting Questions
Examining Existing Questions
Bibliography
Learning Objectives
1. Recognise and construct different types of survey questions, including:
- Behavioural questions;
- Questions about beliefs;
- Knowledge questions;
- Attitudinal questions;
- Attribute questions;
2. Use a variety of question formats;
3. Understand the difference between open and closed response formats;
4. Identify problems with question wording, and be able to correct the problems;
5. Produce a meaningful questionnaire that takes into account the principles of good
questionnaire design.
Optional Reading
Chapter 11
Chapter 7
What can we measure in a questionnaire?

Questions and answers are part of everyday conversation. They are an integral part
of our social life. However, the focus of survey research is how to turn an everyday
process into rigorous measurement.
The ways in which the survey questions are asked can prescribe the answers.
Consider the following example:
Example (Fink, v.2, p.2)
The Relationships among Questions, People, and Information
Three survey experts were invited to present the results of their survey
"American Views on Taxation". Expert A's presentation was titled "Most
Americans Support Increased Taxes for Worthy Purposes". Expert B's speech
was called "Some Americans Support Increased Taxes for Worthy Purposes".
Expert C's talk was named "Few Americans Support Increased Taxes for
Worthy Purposes". A review of the experts' talks and original surveys revealed
three questions:
Expert A's: Would you support increased taxes to pay for education programs
for very poor children?
Expert B's: Would you support an increase in your taxes to pay for education
programs for very poor children?
Expert C's: Would you support a 10% increase in your taxes to pay for
education programs for very poor children?
Types of questions
There are various ways of classifying questions, but here are the main types of
questions that are used in surveys:
- Behaviour: questions about what people do;
- Beliefs: questions about what people think is true or false;
- Knowledge: questions about the accuracy of beliefs;
- Attitudes: questions about what people think is desirable;
- Attributes: questions about respondents' characteristics.
Behavioural questions
Behavioural questions are questions that measure what people do, or what they
have done. When we measure behaviour, we are usually trying to establish:
- Whether or not the respondent exhibited certain behaviour;
- The frequency of occurrence of the behaviour.
Behavioural questions may have to rely on respondents' memories.

Examples of behavioural questions:
- How often do you go to church?
- When did you last eat out in a restaurant?
Example (de Vaus, p.95):
Topic: Workforce participation of mothers of preschool age children
Sample: Mothers, some with young children, others with older children
Behavioural question: Ask whether the respondent is working or did work with a
preschool age child
Reasons for asking the question:
- Can provide information on which types of mothers work and which types do not;
- May help locate factors which facilitate or hinder workforce participation.
Belief questions
Belief questions measure what people believe is true or false.
Belief question: Ask respondents about what they believe to be the effects of day
care centres on the emotional development of preschool-age children.
Reason for asking belief questions: To establish what people think is true, rather than
the accuracy of their beliefs.
Answers to questions about beliefs and attitudes are not necessarily an indicator of
behaviour. Likewise, what a person does may have no bearing on their beliefs.
People's behaviour and beliefs are often inconsistent or irrational, and therefore they
do not necessarily behave as they would like to behave. For example, if a person
drives through a red traffic light, we cannot infer that he/she believes that to drive
through a red light is the right thing to do; and we cannot necessarily say that he/she
has a careless attitude to road safety.
Knowledge questions
Knowledge questions assess respondents' knowledge of particular facts.
Knowledge question: Ask respondents what they know about government programs
aimed at helping parents with preschool age children to work part-time.
Reason for asking knowledge questions: To establish the accuracy of respondents'
beliefs.
In the consumer market, knowledge questions may relate to product awareness,
awareness of product attributes and the price of the product.
Beliefs vs. Knowledge Questions

Do not confuse questions about beliefs with questions about knowledge (where the
true answer is known and can be verified). For example:
"All fitness clubs in Melbourne have yoga classes" (yes, no, don't know) is a
knowledge question.
"Pilates exercises are more beneficial for your health" (strongly disagree, disagree,
not sure, agree, strongly agree) is a belief question.
Attitude Questions
Attitude questions try to determine what people like and what they think is desirable.
Topic: Workforce participation of mothers of preschool age children
Attitude question: Ask respondents about their attitudes regarding whether or not
mothers with pre-schoolers ought to participate in the workforce.
Beliefs vs. Attitude Questions

Beliefs are what people believe is true or false. Beliefs determine our attitudes. A
person can have many beliefs about a phenomenon. An attitude toward that
phenomenon will be based on the overall evaluation of those beliefs. For example,
"Smoking is bad for your health": a belief
"Smoking causes a lot of problems not only for the smoker, but for the people
around": a belief
"Smoking in public places should be prohibited": an attitude

In expressing your attitude, you are making an evaluative judgment about something
based on your internal beliefs.
Attribute Questions (Personal Factual Questions/Demographic Questions)
Attribute questions are questions about respondents' characteristics:
Age;
Education;
Occupation;
Gender;
Ethnicity;
Marital status;
etc.
Topic: Workforce participation of mothers of preschool age children;
Attribute questions: Questions about the number of children, the age of the child,
income, etc.
Exercise 1
Identify whether the following questions measure behaviour, beliefs, knowledge,
attitude or attributes.
What is your highest level of education?
Are you aware of our animal training programs?
My child gets on well with their peers at school.
Did you take any natural herbs to improve your athletic or sporting performance?
Animal research cannot be justified and should be stopped.
New surgical procedures and experimental drugs should be tested on animals
before they are used on people.
There are plenty of viable alternatives to the use of animals in biomedical and
behavioral research.
How satisfied have you been with the customer service?
All career counselors at this university are professionally qualified.
How did you book your parent-teacher interview?
Principles of Question Design

The main principles of question design are:
Relevance
Reliability
Validity
Discrimination
Relevance
Always keep in mind your research questions when designing a questionnaire. Ask
questions that relate to your research objectives. For each question, you should ask
yourself whether it really is necessary. Also, you should avoid asking questions not
related to your research objectives because extra questions will unnecessarily
lengthen your questionnaire and it is not fair to waste respondents' time.
Reliability
The same respondent should answer the question in the same way on different
occasions (assuming that the respondent has not changed in the meantime). For
example, ambiguous questions may produce unreliable responses because
respondents may read the question differently on different occasions.
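One simple way to get a feel for reliability is to ask the same question of the same respondents on two occasions and count how often the answers match. The sketch below does this with percentage agreement; the function name and the answers are made up for illustration, and percentage agreement is only one rough indicator, assumed here for simplicity.

```python
def percent_agreement(first_occasion, second_occasion):
    """Percentage of respondents who gave the same answer on both occasions."""
    if not first_occasion or len(first_occasion) != len(second_occasion):
        raise ValueError("need two non-empty answer lists of equal length")
    same = sum(a == b for a, b in zip(first_occasion, second_occasion))
    return 100 * same / len(first_occasion)


# Hypothetical answers from four respondents (illustrative only).
time_1 = ["Yes", "No", "Yes", "Yes"]
time_2 = ["Yes", "No", "No", "Yes"]
print(percent_agreement(time_1, time_2))
```

A low figure for a question suggests that respondents may be reading it differently on different occasions, which is worth checking during pretesting.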
Validity
The question should measure what it is supposed to measure. For example, if we
use self-rated health (i.e. "How healthy are you?") as a measure of health, we should
be confident that it measures health rather than something else (such as optimism
or pessimism). Decide exactly what it is you want to measure.
Example (Bryman, p. 254):
Consider the following question: "Do you have a car?"
What is this question designed to measure? If it is car ownership, the question is
ambiguous because it can be interpreted as personally owning a car, having
access to a car in a household, or having a company car. Therefore, an answer of
"yes" may or may not be indicative of car ownership. It would be better to ask: "Do
you own a car?"
Discrimination
There should be variation in the sample on the key variables (e.g. if we want to study
whether there is a link between gender and income we need to have a sample in
which there is a good variety of income levels). Low variance may be a result of poor
question design. For instance, a limited range of response alternatives can produce
low variance. If you ask about income and offer only two alternatives, "less than
$100,000 a year" and "more than $100,000 a year", you would not identify much
variation in the sample.
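The effect of the number of response alternatives on variation can be seen in a small sketch. The incomes and the five-category brackets below are made-up assumptions for illustration, not survey data; the same ten respondents are coded first into two categories and then into five.

```python
from collections import Counter

# Hypothetical annual incomes in dollars (illustrative only, not real survey data).
incomes = [28_000, 41_000, 47_500, 52_000, 63_000,
           68_000, 74_000, 88_000, 95_000, 130_000]


def code_income(income, upper_bounds, labels):
    """Return the label of the first category whose upper bound exceeds the income."""
    for bound, label in zip(upper_bounds, labels):
        if income < bound:
            return label
    return labels[-1]  # open-ended top category


two_way = Counter(
    code_income(x, [100_000], ["Less than $100,000", "$100,000 or more"])
    for x in incomes
)
five_way = Counter(
    code_income(
        x,
        [40_000, 60_000, 80_000, 100_000],
        ["Less than $40,000", "$40,000 to $59,999", "$60,000 to $79,999",
         "$80,000 to $99,999", "$100,000 or more"],
    )
    for x in incomes
)
print(two_way)
print(five_way)
```

With two categories, nine of the ten hypothetical respondents fall into the same group; with five categories, the same respondents are spread across all groups, so the question discriminates better.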
Question Wording
Survey questions must be formulated so that respondents can answer them easily
and accurately.
1. Use simple language when designing a question; avoid technical terms
Use simple words, avoid jargon and technical terms.
Example (de Vaus, p. 97): A question such as "Is your household run on matriarchal
or patriarchal lines?" is not acceptable.
2. Avoid long questions

Shorter questions are usually less confusing and ambiguous. Respondents tend to
skip long questions or skim them, thus not giving them enough attention.
3. Avoid double-barrelled questions

Double-barrelled questions are questions which ask more than one question.
How often do you visit your parents? (Separate questions about a person's mother
and father should be asked.)
How satisfied are you with pay and conditions in your job? (The respondents may
be satisfied with one but not with the other.) Therefore, it is unclear how to answer
this question. Any answer that is given by a respondent is unlikely to be a good
reflection of the level of satisfaction with pay and conditions.
Also, avoid asking questions that imply two questions. For example,
Which candidate did you vote for at the last election?
What if the respondent did not participate in the last election? It is better to ask two
separate questions:
Did you vote at the last election? (Yes, No)
If YES, which candidate did you vote for?
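In an online or computer-assisted questionnaire, this kind of routing can be written as a simple rule. The sketch below is only an illustration of the idea; the constant and function names are hypothetical and are not taken from any survey package.

```python
FILTER_QUESTION = "Did you vote at the last election? (Yes, No)"
FOLLOW_UP = "Which candidate did you vote for?"


def questions_for(voted_answer):
    """Return the questions a respondent should see, given their filter answer."""
    asked = [FILTER_QUESTION]
    # Only respondents who voted are routed to the follow-up question.
    if voted_answer.strip().lower() == "yes":
        asked.append(FOLLOW_UP)
    return asked


print(questions_for("Yes"))
print(questions_for("No"))
```

Respondents who answer "No" are never shown the follow-up question, so they are not forced to answer something that does not apply to them.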
4. Avoid leading questions

A leading question is one where either the question structure or the wording leads
respondents to provide an answer that they would not have given had the question
been asked in a more neutral way.
Do you agree that...?
Does this seem like a good idea to you?
Would you agree to cutting taxes further even though welfare provision for the most
needy sections of the population might be reduced? (Bryman, p. 257)
Do you oppose or favour cutting defence expenditure even if it endangers our
national security? (de Vaus, p.99)
Do you favour or oppose increasing the number of university places for students
even if it leads to a decline in standards? (de Vaus, p.99)
5. Avoid questions that include double negatives

Questions which use 'not' can be difficult to interpret, especially when asking
respondents to indicate whether they agree or disagree.
Marijuana should not be decriminalised.
- Strongly agree
- Agree
- Undecided
- Disagree
- Strongly disagree
Better wording: Marijuana use should remain illegal.
6. Is the respondent likely to have the necessary knowledge?

Make sure that your respondents have sufficient knowledge to answer the question
you ask. Otherwise the respondents may guess or refuse to answer that question.
Example:
Do you agree or disagree with the government's policy on legalising drug injecting
rooms?
First you need to ask a filter question to find out if people are aware of the
government's policy on drug injecting rooms. Alternatively, you should offer the
respondent the opportunity to say that they are not sure what the government's
policy is.
7. Will the words have the same meaning for everyone?

If respondents interpret the questions in different ways they are, in fact, answering
different questions. For example:
Have you been a victim of a crime in the last five years?
The answer will depend on what the respondent includes in their definition of crime
(e.g., some people may exclude domestic violence from their definitions of crime
etc.). So try to make your meaning clear.
8. Is there a prestige bias?

When an opinion is attached to the name of a prestigious person and the
respondents are then asked to give their own opinion on the same matter, the
question can suffer from prestige bias. In other words, the prestige of the person who
holds the view may influence the respondent's answer to the question. For example,
What is your view about the Pope's policy on birth control?
9. Is the question ambiguous?

The following factors can cause ambiguity:
poor sentence structure;
using words with several different meanings;
using double negatives;
using double-barrelled questions
10. Is the question too general? Is the frame of reference for the
question sufficiently clear?
How satisfied are you with your job?
This question lacks specificity. Do you mean pay, conditions, relationship with
colleagues?
How often do you see your mother?
Not clear within what time frame:
- within the last year?
- the last month?
Better:
"Within the last year how often would you have seen your mother on average?" and
provide alternatives such as 'daily' through to 'never' to help further specify the
meaning of the question.
11. Is the question too precise?

Avoid requiring answers that need more precision than people are likely to be able to
provide reliably.
How many times in the last year did any member of your household visit a doctor?
It is highly unlikely that most people will recall events accurately over such a long
period of time.
12. Does the question artificially create opinions?

On certain issues people will have no opinion. In this case it is advisable to provide
the option of responding 'don't know', or 'no opinion'. However, some researchers
argue that including a 'don't know' option may lead some respondents to stop thinking
about the issue and choose 'don't know' for most questions. To overcome this
problem, it is suggested to first use a filter question to screen out those who do not
hold an opinion on the topic, and then ask the second question only of those
respondents who do hold an opinion.
13. Is personal or impersonal wording preferable?

Personal wording asks respondents to indicate how they feel about something.
Impersonal wording asks respondents to indicate how people feel about something.
The impersonal approach does not provide a measure of respondent's attitudes but
rather the respondent's perception of other people's attitudes.
14. Is the question wording unnecessarily detailed or objectionable?
Questions about precise age or income can create problems (e.g., low response
rate). If we do not need precise data on these variables it is better to ask
respondents to put themselves in categories such as age or income groups.
15. Does the question have dangling alternatives?

Would you say that it is frequently, sometimes, rarely or never that ...?
The subject matter of a question should come before the alternative answers are listed.
This is especially important in telephone interviews.
16. Is the question a 'dead giveaway'?

Avoid absolute, all-inclusive or exclusive words, such as all, always, each, every,
everybody, never, nobody, none, nothing. For example:
I am always willing to admit it when I make a mistake.
'Dead give-away' words allow no exceptions, and few people will agree with the
statement that includes them. This can lead to low variance and poor question
discrimination.
Exercise 2
Provide an example of an ambiguous question and explain why this question is
ambiguous.
Exercise 3
Comment on any potential problems with each of the following questions.
1. Would your spouse be happy for you to work full time?
2. How would you describe your health?
3. How often do you exercise?
4. Travel to other countries has become increasingly popular among Australians.
Have you ever travelled to another country? If yes, you might have travelled to
other countries to enjoy their scenery. How important was the scenery in your
decision to take a trip?
5. Don't you agree that social workers should earn more money than they currently
earn?
6. Mothers with children should not work.
7. Do you want to be rich and famous?
Closed and Open-Ended Questions

A closed (closed-ended or forced-choice) question is a question in which a
number of alternative answers are provided, and respondents are asked to
choose one or more of the answers.
An open (open-ended) question is a question for which respondents formulate
their own answers.
Open-Ended Questions
Open-ended questions are useful in the following situations:
To collect attribute information where the number of response options is too large
to precode: (e.g., Where were you born?)
To collect information where the response options are unknown, or feedback is
required (e.g., What aspects of this subject interest you the most?)
To get at general feelings;
To find out respondents' reasons for their opinions.
Advantages
For respondents:
Many possible answers are allowed.
For researchers:
The researcher does not have to guess the possible responses in advance;
Unusual responses may be obtained;
Useful for generating fixed-choice format answers;
A clearer insight into the respondents' logic and way of thinking;
Data can be analysed qualitatively and quantitatively;
Useful for exploring new areas or areas in which the researcher has limited
knowledge.
Disadvantages
For respondents:
More demanding to answer.
For researchers:
More demanding (time-consuming) to process and code;
Responses may not be relevant (e.g., the respondent may have misinterpreted a
question);
Researchers can misinterpret the answers and thus misclassify responses.
The responses to open questions are often difficult to compare and interpret.
Example (Fink, v.2, p.36):
An Open-Ended Question and Three Answers
Question: How often during the past month did you find yourself having difficulty
trying to calm down?
Answer 1: Not often
Answer 2: About 10% of the time
Answer 3: Much less often than the month before
It is not very easy to compare the three answers. Does 10% of the time (Answer 2)
mean not often? How does Answer 3 compare to the other two?
Closed Questions
Example (Fink, v.2, p.36):
A closed Question
How often during the past month did you find yourself having difficulty trying to
calm down?
[Circle one number]
Always
Very often
Fairly often
Sometimes
Almost never
Never
Advantages of closed questions:

For respondents:
Easy to complete;
Response options can help guide the respondent, eliminating irrelevant answers.
For researchers:
Quicker to process;
Easier to analyse;
Cheaper to process and analyse;
Enhance the comparability of answers.
Disadvantages
For respondents:
The response options may be too narrow for the respondent
For researchers:
The response options must be exhaustive, so the researcher has to guess the possible responses in advance.
Principles of Developing Question Response Formats

1. Exhaustiveness (or inclusiveness);
2. Exclusiveness;
3. Balancing categories
1. Exhaustiveness (or Inclusiveness)
Range of responses should cover all respondents.
Example:
Your relationship status
- Single
- Married
Better:
Your relationship status:
- Single
- Married
- De facto
- Separated
- Divorced
- Widowed
Attitude questions should generally include a "don't know" or "no opinion" option.
For some questions it is advisable to add an "other (please specify)" option.
2. Exclusiveness
The response choices should be mutually exclusive.
Example:
What was your personal income in 2009?
- $20,000 or less
- $20,000 to $50,000
- $50,000 to $75,000
- $75,000 or more
Better:
What was your personal income in 2009?
- Less than $20,000
- $20,000 to $49,999
- $50,000 to $74,999
- $75,000 or more
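Overlapping or missing categories can be checked mechanically before a questionnaire goes out. The sketch below is one possible check, written under some assumptions: incomes are whole dollars, bounds are inclusive, the lowest category starts at 0, and the function name is hypothetical.

```python
def bracket_problems(brackets):
    """Report overlaps and gaps between consecutive (low, high) whole-dollar brackets.

    Bounds are inclusive; high=None marks an open-ended top category.
    """
    problems = []
    ordered = sorted(brackets, key=lambda b: b[0])
    for (low_a, high_a), (low_b, _high_b) in zip(ordered, ordered[1:]):
        if high_a is None or low_b <= high_a:
            problems.append(f"overlap: {low_a}-{high_a} and bracket starting at {low_b}")
        elif low_b > high_a + 1:
            problems.append(f"gap between {high_a} and {low_b}")
    return problems


# The two sets of income categories from the example, as inclusive bounds.
original = [(0, 20_000), (20_000, 50_000), (50_000, 75_000), (75_000, None)]
revised = [(0, 19_999), (20_000, 49_999), (50_000, 74_999), (75_000, None)]

print(bracket_problems(original))
print(bracket_problems(revised))
```

The first set is flagged three times because $20,000, $50,000 and $75,000 each belong to two categories; the revised set produces no problems.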
3. Balancing categories
Where response categories can be ordered from high to low there should be the
same number of response alternatives either side of the neutral position. The two
endpoints should mean the opposite of each other.
For example:
- Strongly approve
- Approve
- Neither approve nor disapprove
- Disapprove
- Strongly disapprove
Inclusion of the middle alternative

There is some disagreement about whether to include the middle alternative or not.
Some researchers believe that neutral choices provide respondents with an excuse
for not answering questions. It is argued that the middle alternative should not be
included because omitting it makes respondents indicate the direction of their
opinion. Other researchers argue that including the middle position avoids artificially
creating a directional opinion.
It is recommended to try out all questions before you use them. Pretest your
questions with and without the neutral choices and compare the results. Estimate
how many responses cluster around the middle point. Do some respondents resent
not having a neutral choice? Ask your respondents about the response format. You
may ask your respondents the following questions, for example:
- Did you encounter any problems using the question's response scale?
- Would another set of responses be more appropriate?
Types of Question Response Formats

Numerical Rating Scales (involve a set of responses where the alternative answers
are ordered from low to high)
- Simple itemised rating scales;
- Likert scale;
- Horizontal rating scales;
- Semantic differential scales;
Ranking Scales
Checklists
Binary Choice Formats
- Dichotomous questions;
- Paired comparisons
Simple Itemised Rating Scales

Commonly used simple itemised rating scales:
- Very good / Fairly good / Neither good nor bad / Not very good
- Very true / Somewhat true / Not very true / Not at all true
- Definitely yes / Probably yes / Probably no / Definitely no
- Very important / Fairly important / Neutral / Not so important / Not at all important
- Very different / Somewhat different / Slightly different / Not at all different
- Very interested / Somewhat interested / Not very interested
- Always / Very often / Fairly often / Sometimes / Almost never / Never
- Completely satisfied / Very satisfied / Somewhat satisfied / Somewhat dissatisfied / Very dissatisfied / Completely dissatisfied
Likert Scale
The scale was developed in 1932 by American psychologist Rensis Likert. The Likert
scale is usually used for measuring attitudes. Respondents are asked to indicate
their level of agreement or disagreement with a statement.
Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree
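For analysis, Likert responses are usually turned into numbers. The sketch below assumes the common 1 to 5 coding and assumes that negatively worded statements are reverse-coded so that a higher score always points in the same direction; both conventions, and the names used, are assumptions for illustration rather than part of these notes.

```python
# Assumed numeric codes for the five response options.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def score_item(response, negatively_worded=False):
    """Convert one Likert response to a score, reversing negatively worded items."""
    code = LIKERT_CODES[response]
    return 6 - code if negatively_worded else code


print(score_item("Agree"))
print(score_item("Agree", negatively_worded=True))
```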
Horizontal Rating Scales

Respondents are provided with opposite attitude positions and asked to indicate with
a number where, between the positions, their own view falls.
Example (de Vaus, p. 102):
Government should be fully          Families should be fully
responsible for elder care          responsible for elder care
________________________________________________________
1
Don't know _______
Semantic Differential Scales

Respondents are provided with opposite adjectives to describe someone or
something.
Examples (de Vaus, p. 103):
Well organised ____________________________________________ Disorganised
1
A good employer ____________________________________________ Poor employer
1
Ranking Scales
Ranking scales require respondents to indicate relative importance of items.
Example (de Vaus, p.104):
Listed below is a set of issues that can influence the way in which people decide to
vote in general elections. Please rank each of these issues to indicate how important
they are to you when you decide to vote. Place 1 in the box next to the most
important issue, 2 next to the second most important issue and so on. Do not place
the same number in more than one box.
Policies to reduce unemployment
Improving the environment
Spending more money on education
Getting tough on crime
Reducing taxation
Improving social welfare support
Improving health services
Reducing immigration
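Ranking questions are easy to answer incorrectly, for example by leaving a box empty or using the same number twice. The sketch below shows one way such answers might be checked at data entry; the function name and the way a respondent's answer is stored (a dictionary mapping each issue to a whole number) are assumptions for illustration.

```python
ISSUES = [
    "Policies to reduce unemployment",
    "Improving the environment",
    "Spending more money on education",
    "Getting tough on crime",
    "Reducing taxation",
    "Improving social welfare support",
    "Improving health services",
    "Reducing immigration",
]


def ranking_errors(ranks):
    """Return a list of problems with one respondent's ranks (issue -> whole number)."""
    errors = []
    unranked = [issue for issue in ISSUES if issue not in ranks]
    if unranked:
        errors.append(f"{len(unranked)} issue(s) left unranked")
    expected = list(range(1, len(ISSUES) + 1))
    if not unranked and sorted(ranks.values()) != expected:
        errors.append(f"each number from 1 to {len(ISSUES)} must be used exactly once")
    return errors


# A complete ranking, and one where the number 1 has been used twice.
complete = {issue: position for position, issue in enumerate(ISSUES, start=1)}
tied = {**complete, "Reducing immigration": 1}

print(ranking_errors(complete))
print(ranking_errors(tied))
```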
Checklists
Respondents are provided with a list of items and asked to select those that apply.
Example:
What subjects did you do at school? Please choose all that apply.
Biology
Chemistry
English
Geography
History
Information Technology
Legal Studies
Mathematics
Psychology
Dichotomous questions
Respondents are asked to choose one of two alternatives. For example,
Do you smoke cigarettes?
- Yes
- No
Paired comparisons
Respondents are given a set of pairs of items and asked to select one response from
each pair.
Example (de Vaus, p. 105)
Governments have to make choices between the areas to which they give priority
when allocating government expenditure. For each pair of expenditure areas tick the
one you think ought to be given priority.
Education
Education
Health
Social welfare
Health
Social welfare
Defence
Defence
Environment
Health
Industry support
Health
Environment
Family support
Law and order
Recreation
Law and order
Defence
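When every item is to be compared with every other item, the number of pairs grows quickly: n items give n(n - 1)/2 pairs. The sketch below lists all pairs for four of the expenditure areas; choosing these four areas, and comparing every area with every other, are assumptions made for illustration.

```python
from itertools import combinations

# Four expenditure areas chosen for illustration.
areas = ["Education", "Health", "Social welfare", "Defence"]

# Every unordered pair of distinct areas.
pairs = list(combinations(areas, 2))
for first, second in pairs:
    print(f"{first}  or  {second}")
print(len(pairs))
```

Four areas already produce six pairs, which is one reason paired comparisons are usually limited to a small number of items.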
Exercise 4
1. Which of the following best describes where you were when you first started
smoking?
(A) Alone
(B) With members of your family
(C) With friends
2. How many do you smoke?

(A) Less than half a pack
(B) About one pack
(C) More than one pack
3. If you won a lottery, would you . . .

(A) Buy an expensive car
(B) Travel overseas
(C) Buy a house
(D) Save your money
4. Staff members handle inquiries efficiently.

Yes/No
Exercise 5
A researcher is conducting a survey of anxiety and depression in the workplace. He
would like to ask, "In the past month, how often has feeling depressed interfered with
doing your job?" What response choices can the researcher use for this question?
Response Sets
A response set is a tendency to respond to questions in some characteristic
manner regardless of the content of the question.
Social desirability - the tendency to provide the respectable rather than the true
response. As a result, socially 'desirable' behaviours (e.g. amount of physical
exercise) are over-reported while socially 'undesirable' behaviours (e.g. alcohol
consumption, sexist and racist attitudes) are under-reported.
Acquiescence - the tendency to agree with a statement regardless of its content;
Nonacquiescence - the tendency to disagree with a statement regardless of its
content.
Reducing Social Desirability Response Sets:

Mention that everybody does it
Even the calmest of parents get angry at their children some of the time. Did
your children do anything in the last seven days to make you feel angry? (de
Vaus, p. 108)
Use an authority
Many doctors now think that drinking wine reduces heart attacks and aids
digestion. Have you drunk any wine in the last seven days? (de Vaus, p. 108)
Build in an excuse
We know that people are often very busy and can find it difficult to find time to
engage in regular exercise. How often have you engaged in exercise designed
to improve your fitness in the last seven days? (de Vaus, p. 108)
Ask a less specific question
Have you ever, even once, hit your partner in anger? (de Vaus, p. 108)
Order of Questions in the Questionnaire

Some Recommendations:
1. Start with questions that respondents will enjoy answering:
- easily answered questions;
- factual questions;
- questions that are obviously relevant to the objectives of the survey;
2. It is not recommended to start with demographic questions;
3. Go from easy to more difficult questions;
4. Go from concrete to abstract questions;
5. Place open-ended questions towards the end of the questionnaire;
6. Group questions into sections;
7. Make use of filter or contingency questions (questions that direct respondents to
questions that are applicable to them depending on their previous responses);
8. Mix up positive and negative questions to avoid acquiescent response set;
9. Consider randomising questions for each respondent to help minimise the
question order effect;
10. Use a variety of question formats to make the questionnaire look interesting.
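Randomising the question order (recommendation 9) is straightforward in an electronic questionnaire. The sketch below is a minimal illustration; the function name, the question identifiers and the idea of seeding the shuffle with a respondent number (so that each respondent's order can be reproduced later) are assumptions, not a prescribed procedure.

```python
import random


def question_order(question_ids, respondent_id):
    """Return a shuffled copy of the question list for one respondent."""
    rng = random.Random(respondent_id)  # seeded per respondent, so the order is reproducible
    order = list(question_ids)          # copy, so the master list is left unchanged
    rng.shuffle(order)
    return order


# Hypothetical question identifiers within one section of a questionnaire.
section = ["Q1", "Q2", "Q3", "Q4", "Q5"]
print(question_order(section, respondent_id=17))
print(question_order(section, respondent_id=18))
```

In practice, randomisation would normally be applied within a section, so that filter questions still come before the questions that depend on them.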
Pilot Testing or Pretesting Questions

Once a questionnaire has been developed, each question and the questionnaire as
a whole should be evaluated thoroughly before final administration. Evaluating a
questionnaire is called pilot testing or pretesting. Pilot testing should be conducted
with people who resemble those to whom the survey will finally be given.
Somewhere between 75 and 100 respondents provides a useful pilot test.
Before pretesting your questionnaire try to put yourself in the position of the
respondent. Imagine how you would answer your survey questions. This may help
you to see problems in question wording, structure, etc.
What should you ask respondents during the pilot testing?
Ask them how they interpret each question's meaning;
Ask whether they would rephrase the question;
Check whether the range of response alternatives is sufficient;
Ask whether the question is necessary or redundant;
Ask whether there are any problems with the questionnaire flow;
Is the estimated time for completing the questionnaire calculated right?
Are clear instructions provided throughout the questionnaire?
Are any skips clear and simple to follow?
Is there sufficient space?
Examining Existing Questions

Consider using questions that have been developed by other researchers (you will
need to get the authors' permission before using some existing questionnaires).
Reasons for using existing questions and questionnaires:
Measurement qualities of the existing questions and questionnaires (reliability,
validity) may be available;
Using existing questions may allow you to draw comparisons with other research;
Existing questions might give you some ideas about how best to approach
creating your own questions.
There are online question banks where you can access previously developed
questions and questionnaires. For example, the UK Data Archive (UKDA) has a
good question bank providing access to many surveys and associated commentary
to assist survey design. The link to UKDA:
http://surveynet.ac.uk/sqb
Bibliography
Fink, A. (2003). The survey kit, Volume 2: How to ask survey questions (2nd ed.)

Exercise 1 (p. 8)
Identify whether the following questions measure behaviour, beliefs, knowledge,
attitude or attributes.

What is your highest level of education?
(attribute)
Are you aware of our animal training programs?
(knowledge)
My child gets on well with their peers at school.
(belief)
Did you take any natural herbs to improve your athletic or sporting performance?
(behaviour)
Animal research cannot be justified and should be stopped.
(attitude)
New surgical procedures and experimental drugs should be tested on animals
before they are used on people.
(attitude)
There are plenty of viable alternatives to the use of animals in biomedical and
behavioral research.
(belief)
How satisfied have you been with the customer service?
(attitude)
All career counselors at this university are professionally qualified.
(knowledge)
How did you book your parent-teacher interview?
(behaviour)
Exercise 2 (p. 16)

1. Would your spouse be happy for you to work full time?
What if the respondent's spouse already works full-time? Needs a filter
question first;
Unreasonable to expect respondents to give an accurate account of
their spouse's opinion;
Happy in what sense? "Happy" is an ambiguous term.
2. How would you describe your health?
Make the question more concrete, add time period. For example: In the
past three months, how ?
3. How often do you exercise?
Define exercise;
Make the question more concrete, add time period. For example: in
a typical week?
4. Travel to other countries has become increasingly popular among Australians.
Have you ever travelled to another country? If yes, you might have travelled to
other countries to enjoy their scenery. How important was the scenery in your
decision to take a trip?
Long question. Change to, for example: Have you ever travelled to
another country? If yes, how important was the scenery in your decision
to take a trip?
5. Don't you agree that social workers should earn more money than they currently
earn?
A leading question. Use more neutral wording. For example: Do you
believe social worker salaries are a little lower than they should be, a little
higher than they should be, or about right?
6. Mothers with children should not work.
A leading question;
Too many interpretations: work where (in paid employment, at home)?
Work full-time? Age of children?
7. Do you want to be rich and famous?
A double-barrelled question
Exercise 4 (p. 27)

1. Which of the following best describes where you were when you first started
smoking?
(A) Alone
(B) With members of your family
(C) With friends
The list is not exhaustive: the respondent may not even be a smoker;
The list is not mutually exclusive;
Are we just asking about cigarettes or illegal substances etc.?
2. How many do you smoke?
(A) Less than half a pack
(B) About one pack
(C) More than one pack
Are we just asking about cigarettes or pipes, cigars, illegal substances
etc.?
The list is not exhaustive (or does "none" belong to code (A)?). What if
the respondent smokes between half and a full pack?
Time frame has not been specified (per day, per week);
Size of a pack (20s or 25s or ..?)
Roll your own?
3. If you won a lottery, would you . . .
(A) Buy an expensive car
(B) Travel overseas
(C) Buy a house
(D) Save your money
List is not exhaustive (respondent may not do any of those options);
needs an "other" option;
Are multiple responses allowed?
4. Staff members handle inquiries efficiently.
Yes/No
Needs more elaborate scale;
Not clear what "efficiently" means
Exercise 5 (p. 28)

Possible response options:
Often
Sometimes
Never
Always
Very often
Fairly often
Nearly all the time

Some of the time
A little of the time
Sometimes
Almost none of the time
Almost never
Never
All of the time
100% of the time
Most of the time
Between 50% and 100% of the time
A good bit of the time
Less than 50% of the time
Some of the time

Little of the time
None of the time

Module 1
Topic 5: Introduction to Scale Development
STA60004
Contents
Learning Objectives
Topic Introduction
Examples of Scales
Steps in the Scale Development Process
Step One: Definition of the Construct
Step Two: Generation of the Item Pool
Exercise 1
Step Three: Choice of Response Format
Step Four: Review of Items, Pilot Testing and Developmental Testing of the Scale
Step Five: Evaluation of the Scale
Testing the Reliability of a Scale
Test-Retest Reliability
Alternate Forms Reliability
Split-Half Reliability
Internal Consistency
Exercise 2
Exercise 3
Testing the Validity of a Scale
Content Validity
Face Validity
Criterion Validity
Predictive Validity
Construct Validity
Presenting a Newly Developed Scale
Exercise 4
Bibliography
Learning Objectives
By the end of this topic you should:
Have an understanding of the issues associated with measurement in the social
sciences
Be familiar with the notion of reliability
Understand the use of Cronbach's alpha and its interpretation
Understand the various types of validity relevant to scale evaluation
Have the necessary SPSS computing skills to test the reliability of a scale
Topic Introduction
In research we often need to measure complex constructs or concepts. This is
commonly done using scales. A scale is a composite measure of a concept.
The development of good scales is a very complex issue involving a variety of tools.
This topic provides an introduction to this important area of survey research.
Suppose we want to develop a scale for measuring environmental footprint. It is
important to have a sound understanding of the literature in this area before you
start. A conceptual model of the basic construct needs to be developed, perhaps
something like the following. To fully represent this construct, items representing
each of the components are required.
Scales are supposed to be unidimensional. For example, it could be argued that it is
not possible to develop a single scale to describe environmental footprints. As
indicated in our conceptual model for this construct there are at least five dimensions
that underlie this construct. Only if these dimensions are strongly correlated with
each other does it make sense to construct a single scale to measure environmental
footprint.
Example
The NEO Personality Inventory (Costa & McCrae, 1992) measures major domains of
personality. It contains five scales: Neuroticism, Extraversion, Openness,
Agreeableness and Conscientiousness.

The Extraversion scale of the NEO Personality Inventory, for example, is represented
by the following components or facets:
Warmth
Gregariousness
Assertiveness
Activity
Excitement-Seeking
Positive Emotions
To fully represent this construct, items representing each of the facets are required.
Each facet in this scale is represented by 8 items. For example, an item representing
the Warmth facet is "I'm known as a warm and friendly person."
Gregariousness:
I really feel the need for other people if I am by myself for long.
Assertiveness:
I have often been a leader of groups I have belonged to.
Activity:
I often feel as if I'm bursting with energy.
Excitement-Seeking:
I like being part of the crowd at sporting events.
Positive Emotions:
I am a cheerful, high-spirited person.
All facets of the Extraversion scale are strongly correlated with each other.
Any scale you might want to develop involves the assumptions of additivity and
interval scaling.
Additivity means that respondents are asked to answer several items that constitute
a scale and then all the answers are added up to obtain an overall score. For
example, instead of measuring depression by asking respondents how much they
feel depressed, we would ask about a range of behaviours which tap depression. We
then add up the answers, and obtain an overall measure of depression.
An analogy for a scale is a student's marks in a subject. The student usually
completes a number of pieces of work (an essay, a report, an examination) and
receives a final mark. The final mark is meant to reflect the student's knowledge of
the subject. This is measured by summing the scores for each piece of work into the
overall score.
Interval Scaling: Items of a questionnaire are measured on an interval scale. Most
often researchers use a Likert summative scale, which asks people to say how much
they agree or disagree with the scale items.
Reasons for measuring a concept by using multiple indicators rather than one:
1. It helps reflect the complexity of the concept.
2. It leads to developing more valid measures. It can help to avoid some of the
distortions and misclassification which can occur by using one-item measures of
complex concepts.
3. Multiple indicators increase reliability. For example, question wording can affect
the way respondents answer it. Respondents' answers could be largely a function
of the wording of the question. Using a number of questions should minimize the
effect of one question which is poorly worded.
4. Multiple indicators allow greater precision. For example, using suburb of
residence as a measure of a person's social status may lead to a very crude, and
even wrong, classification. Much better to take into account other indicators, such
as education, occupation, income etc.
5. Data analysis is simplified: instead of analysing each question separately we can
analyse one variable.
Examples of Scales
Questionnaire 1
Instructions: For each of the following statements, circle the number on the 5-point
scale that best describes how that statement applies to you.
Strongly Disagree | Disagree | Neither disagree nor agree | Agree | Strongly Agree
1. I chose my present courses largely with a view to the job situation when
I graduate rather than out of their intrinsic interest to me.
2. I find that at times studying gives me a feeling of deep personal
satisfaction.
3. I think browsing around is a waste of time, so I only study seriously
what's given out in class or in the course outlines.
4. While I am studying, I often think of real life situations to which the
material that I am learning would be useful.
5. I am discouraged by a poor mark on a test and worry about how I will do
on the next test.
6. While I realize that truth is forever changing as knowledge is increasing,
I feel compelled to discover what appears to me to be the truth at this
time.
7. I learn some things by rote, going over and over them until I know them
by heart.
8. In reading new material I often find that I'm continually reminded of
material I already know and see the latter in a new light.
9. Whether I like it or not, I can see that further education is for me a good
way to get a well-paid or secure job.
10. I feel that virtually any topic can be highly interesting once I get into it.
11. I tend to choose subjects with a lot of factual content rather than
theoretical kinds of subjects.
12. I find that I have to do enough work on a topic so that I can form my own
point of view before I am satisfied.
13. Even when I have studied hard for a test, I worry that I may not be able
to do well in it.
14. I find that studying academic topics can at times be as exciting as a
good novel or movie.
15. I generally restrict my study to what is specifically set as I think it is
unnecessary to do anything extra.
16. I try to relate what I have learned in one subject to that in another.
17. Lecturers shouldn't expect students to spend significant amounts of time
studying material everyone knows won't be examined.
18. I usually become increasingly absorbed in my work the more I do.
19. I learn best from lecturers who work from carefully prepared notes and
outline major points neatly on the blackboard.
20. I find most new topics interesting and often spend extra time trying to
obtain more information about them.
21. I almost resent having to spend a further three or four years studying
after leaving school, but feel that the end results will make it worthwhile.
22. I believe strongly that my main aim in life is to discover my own
philosophy and belief system and to act strictly in accordance with it.
23. I find it best to accept the statements and ideas presented by my
lecturers and question them only under special circumstances.
24. I spend a lot of my free time finding out more about interesting topics
which have been discussed in different classes.
25. I am at college/university mainly because I feel that I will be able to
obtain a better job if I have a tertiary qualification.
26. My studies have changed my views about such things as politics, my
religion, and my philosophy of life.
27. I am very aware that lecturers know a lot more than I do and so I
concentrate on what they say is important rather than rely on my own
judgment.
28. I try to relate new material, as I am reading it, to what I already know on
that topic.
Questionnaire 1 is the Study Process Questionnaire (Biggs, 1987). It was designed to
measure student approaches to learning and studying. The questionnaire contains
two scales: Surface Approach (SA) and Deep Approach (DA). Surface Approach, in
turn, contains two facets: Surface Motive (SM) and Surface Strategy (SS). Deep
Approach consists of Deep Motive (DM) and Deep Strategy (DS) facets.
Approach: SA (Surface)
Motive: Surface Motive (SM) is instrumental: main purpose is to meet requirements
minimally: a balance between working too hard and failing.
Strategy: Surface Strategy (SS) is reproductive: limit target to bare essentials and
reproduce through rote learning.
Approach: DA (Deep)
Motive: Deep Motive (DM) is intrinsic: study to actualize interest and competence in
particular academic subjects.
Strategy: Deep Strategy (DS) is meaningful: read widely, interrelate with previous
relevant knowledge.
To calculate Surface Approach (SA) score, sum up the scores for the following
questions:
SA = SM + SS = q1+q5+q9+q13+q17+q21+q25+q3+q7+q11+q15+q19+q23+q27
(SM = q1+q5+q9+q13+q17+q21+q25; SS = q3+q7+q11+q15+q19+q23+q27)

To calculate Deep Approach (DA) score, sum up the scores for the following
questions:
DA = DM + DS = q2+q6+q10+q14+q18+q22+q26+q4+q8+q12+q16+q20+q24+q28
(DM = q2+q6+q10+q14+q18+q22+q26; DS = q4+q8+q12+q16+q20+q24+q28)
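The scoring key above can be sketched in a few lines of Python. This is only an illustration of the additivity idea: the item numbers come from the key, but the function name `score_spq`, the 1-5 coding of the response options and the example respondent are assumptions made for this sketch, not part of the published questionnaire.

```python
# Item numbers for each facet, copied from the scoring key above.
FACETS = {
    "SM": [1, 5, 9, 13, 17, 21, 25],
    "SS": [3, 7, 11, 15, 19, 23, 27],
    "DM": [2, 6, 10, 14, 18, 22, 26],
    "DS": [4, 8, 12, 16, 20, 24, 28],
}


def score_spq(responses):
    """Return facet and approach scores for one respondent.

    `responses` maps item number (1-28) to a rating; a 1-5 coding
    (1 = Strongly Disagree, 5 = Strongly Agree) is assumed here.
    """
    scores = {name: sum(responses[q] for q in items)
              for name, items in FACETS.items()}
    scores["SA"] = scores["SM"] + scores["SS"]  # Surface Approach
    scores["DA"] = scores["DM"] + scores["DS"]  # Deep Approach
    return scores


# Hypothetical respondent: rates every odd-numbered (surface) item 2
# and every even-numbered (deep) item 4.
example = {q: (2 if q % 2 == 1 else 4) for q in range(1, 29)}
result = score_spq(example)
print(result)
```

The same pattern (a dictionary of item numbers per subscale, then a sum) would apply to any additive scale with a published scoring key.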
Questionnaire 2
Instructions: For each of the following statements, circle the number on the 5-point
scale that best describes how that statement applies to you and your mother.
Strongly Disagree | Disagree | Neither disagree nor agree | Agree | Strongly Agree
1. While I was growing up my mother felt that in a well-run home the
children should have their way in the family as often as the parents do.
2. Even if her children didn't agree with her, my mother felt that it was for
our own good if we were forced to conform to what she thought was
right.
3. Whenever my mother told me to do something as I was growing up, she
expected me to do it immediately without asking any questions.
4. As I was growing up, once family policy had been established, my
mother discussed the reasoning behind the policy with the children in
the family.
5. My mother has always encouraged verbal give-and-take whenever I
have felt that family rules and restrictions were unreasonable.
6. My mother has always felt that what children need is to be free to make
up their own minds and to do what they want to do, even if this does not
agree with what their parents might want.
7. As I was growing up my mother did not allow me to question any
decision that she had made.
8. As I was growing up my mother directed the activities and decisions of
the children in the family through reasoning and discipline.
9. My mother has always felt that more force should be used by parents
in order to get their children to behave the way they are supposed to.
10. As I was growing up my mother did not feel that I needed to obey rules
and regulations of behaviour simply because someone in authority had
established them.
11. As I was growing up I knew what my mother expected of me in my
family but I also felt free to discuss those expectations with my mother
when I felt that they were unreasonable.
12. My mother felt that wise parents should teach their children early just
who is boss in the family.
13. As I was growing up, my mother seldom gave me expectations and
guidelines for my behaviour.
14. Most of the time as I was growing up my mother did what the children in
the family wanted when making family decisions.
15. As the children in my family were growing up, my mother consistently
gave us direction and guidance in rational and objective ways.
16. As I was growing up my mother would get very upset if I tried to
disagree with her.
17. My mother feels that most problems in society would be solved if
parents would not restrict their children's activities, decisions, and
desires as they are growing up.
18. As I was growing up, my mother let me know what behaviours she
expected of me, and if I didn't meet those expectations, she punished
me.
19. As I was growing up my mother allowed me to decide most things for
myself without a lot of direction from her.
20. As I was growing up my mother took the children's opinions into
consideration when making family decisions, but she would not decide
to do something simply because the children wanted it.
21. My mother did not view herself as responsible for directing and
guiding my behaviour as I was growing up.
22. My mother had clear standards of behaviour for the children in our
home as I was growing up, but she was willing to adjust those
standards to the needs of each of the individual children in the family.
23. My mother gave me direction for my behaviour and activities as I was
growing up and she expected me to follow her direction, but she was
always willing to listen to my concerns and to discuss that direction with
me.
24. As I was growing up my mother allowed me to form my own point of
view on family matters and she generally allowed me to decide for
myself what I was going to do.
25. My mother has always felt that most problems in society would be
solved if we could get parents to strictly and forcibly deal with their
children when they don't do what they are supposed to as they are
growing up.
26. As I was growing up my mother often told me exactly what she wanted
me to do and how she expected me to do it.
27. As I was growing up my mother gave me clear direction for my best
behaviours and activities, but she was also understanding when I
disagreed with her.
28. As I was growing up my mother did not direct the behaviours, activities
and desires of the children in the family.
29. As I was growing up I knew what my mother expected of me in the
family and she insisted that I conform to those expectations simply out
of respect for her authority.
30. As I was growing up, if my mother made a decision in the family that
hurt me, she was willing to discuss that decision with me and to admit it
if she had made a mistake.
Questionnaire 2 is the Parental Authority Questionnaire (Buri, 1991). It was designed
to measure parental prototypes. The questionnaire contains three scales: Permissive
Style, Authoritarian Style and Authoritative Style of parenting.
Permissive parents are relatively noncontrolling and tend to use a minimum of
punishment with their children.
Authoritarian parents tend to be highly directive with their children and value
unquestioning obedience in their exercise of authority over their children.
Authoritative parents tend to fall somewhere between these extremes. They are
characterized as providing clear and firm direction for their children, but disciplinary
clarity is moderated by warmth, reason, flexibility, and verbal give-and-take.
Permissive Style = q1+q6+q10+q13+q14+q17+q19+q21+q24+q28
Authoritarian Style = q2+q3+q7+q9+q12+q16+q18+q25+q26+q29
Authoritative Style = q4+q5+q8+q11+q15+q20+q22+q23+q27+q30
Questionnaire 3
Please rate the following statements on the 4-point scale
Strongly Disagree | Disagree Somewhat | Agree Somewhat | Strongly Agree
1. I found it easy to get the information I needed about my online

course(s) and complete the enrollment & registration process
2. I was always able to gain access to my online course(s) and the applicable
network resources (library, e-mail, etc) when needed.
3. I was given multiple ways to interact with the teacher and other
students (e.g., e-mail, discussion) in all online course(s)
4. In my online course(s), I always received constructive and timely

feedback on my assignments and questions
5. Before starting my online course(s), I was well advised about the self-motivation and commitment needed to succeed at distance learning
6. Before starting my online course(s), I was well advised about the

technology and skills I would need to fulfil my course requirements
7. My online instructor(s) always provided a clearly written, straightforward

statement of course objectives and learning outcomes or expectations
8. I had sufficient access to the online library resources I needed to fulfil

my course objectives and complete all my assignments
9. Before starting my online course(s), I received sufficient information

about admission requirements or prerequisites, tuition and fees, books
and materials, test proctoring or phone conferencing requirements, and
student support services
10. My course(s) provided me with the skills I needed to secure outside

course materials through electronic databases, interlibrary loans,
government archives, news services, and other sources
11. Prior to beginning my online course(s), I was orientated to

Blackboard and had the opportunity to practice using it
12. I had convenient access to technical assistance/support whenever

needed
13. My technical support questions or problems were answered accurately

or solved quickly
14. There is a structured system in place to address student complaints

about online learning
Questionnaire 3 is the Student Scale for Assessing the Quality of Internet-Based
Distance Learning (Scanlan, 2003). It consists of two facets: Teaching and
Learning Process and Administrative Support.
To calculate the total score, sum up the following items:
q3+q4+q5+q6+q7+q10+q14+q1+q2+q8+q9+q11+q12+q13
(Teaching and Learning Process = q3+q4+q5+q6+q7+q10+q14;
Administrative Support = q1+q2+q8+q9+q11+q12+q13)
Steps in the Scale Development Process

The following steps are usually required in the scale development process:
1. Definition of the construct;
2. Generation of the item pool;
3. Choice of response format;
4. Review of the items, pilot testing and developmental testing of the scale;
5. Evaluation of the scale, including its reliability and validity.
When it comes to the evaluation of scales we will only be touching the surface.
There are many evaluation techniques such as exploratory factor analysis,
confirmatory factor analysis and Rasch analysis which will not be covered in this unit.
Step One: Definition of the Construct

In order to clarify a construct or a concept you need to obtain a range of definitions of
that construct. Do this by searching the literature: textbooks, journal articles,
dictionaries and encyclopaedias. Once a number of definitions have been found you
might identify the common elements of these descriptions and develop a definition
based on these. Where a concept takes on a number of widely held but different
meanings, you will need to decide on and justify one, depending on your research.
Then you need to delineate the dimensions of the construct and decide whether you
want to develop a measure of all the dimensions of the concept or focus on just one
or two.
You need to think about whether you are assessing a single psychological factor, or
more than one. Often what seems like a single mood, set of beliefs or other
psychological factor, may prove to be complicated. For example, with self-rated
religiosity, you would need to decide whether to distinguish between internal
religious feelings and experiences, and outward observances, such as affiliation and
practice.
Another example: If you wish to study musical preference, you would need to decide
whether to look at liking music in general or to consider different uses of music (e.g.,
listening, performance, therapy, dance, and so forth). You also need to decide
whether to consider different types of music or just concentrate on one type of music.
Step Two: Generation of the Item Pool

After you have clarified the concept you will need to develop a set of questions which
seem to measure that concept. A good idea is to get advice from informants from the
group to be surveyed. Such people can provide useful clues about meaningful
questions.
Exercise 1
(Litwin, 2003, p.77, exe.7)
The housing office of a large university wants to measure student satisfaction with
various aspects of the campus dormitories. After researching the relevant published
literature, the housing director cannot find a survey instrument that she thinks is
appropriate, so she decides to develop her own. How would you advise her to begin
her project?
The first version of your scale should be longer than the final scale. Develop more
items than you really want to include in the final version. Some items may have to be
discarded because they do not meet all the necessary criteria.
So how many items should the scale contain? There are two points to consider:
Too few items may not produce a reliable measure.
Reliability coefficient alpha tends to be too low when there are few items on a
scale.
Too many items may put too much burden on the participants and there may be
repetitiousness in the items.
Something between 6 and 15 items should be enough for assessing a single factor.
To ensure this, you should start with between 10 and 30 items. If you have several
subscales, you need to keep each subscale as short as possible. If you have a very
long scale, no matter how interesting the topic is, most people will lose interest
before they have finished.
Step Three: Choice of Response Format

The number of response categories can vary from a simple dichotomy (yes/no,
true/false), to a continuum, allowing a wide range of responses. The aim is to elicit
responses from respondents which cover the entire range, thereby increasing the
variability in scores.
Mildly worded statements or very extreme statements are not appropriate as they
tend to result in either too much agreement or disagreement, thereby reducing
variability.
Step Four: Review of Items, Pilot Testing and Developmental Testing of the Scale
The point of the review is to check that all the components of the construct have
been well addressed in the scale and to check that all the items are appropriate. This
should be done by expert reviewers with a fresh set of eyes. The final set of items
can then be presented to a small set of subjects for pilot testing. Pilot testing is
necessary for testing the clarity of the instructions, the clarity of the items and how
much time is needed in order to complete the scale.
Developmental testing of the scale is used to assess the scale's reliability and
validity, so it is important that the sample you are using is representative of the
population for which the scale has been developed.
Step Five: Evaluation of the Scale

Reliability
Validity
Testing the Reliability of a Scale

Reliability refers to the consistency or precision of a measurement procedure. If a
measurement instrument is reliable, its measurements are consistent and
reproducible, rather than random.
In social research, the elements of measurement are usually abstract constructs
(concepts) such as personality traits, attitudes and values. Unlike the measurement
of physical attributes or conditions such as height, weight, and temperature,
psychological traits cannot be seen or felt. Nor can they be measured directly. They
must be inferred from people's beliefs and behaviours. This measurement process is
extremely susceptible to error. That is why you must take particular care to maximise
the quality (namely, reliability and validity) of measurement instruments and
procedures.
Test scores must be reliable before they can have any validity. Therefore we discuss
reliability first as a necessary condition for validity to exist.
There are four procedures for calculating the reliability of a scale:
Test-retest;
Alternate forms;
Split-half and
Internal consistency
While each procedure is distinct, there are some conceptual similarities across these
procedures.
Test-Retest Reliability (Temporal Stability)

In the test-retest procedure a scale is administered twice to the same group of
people. A reliable scale is one on which respondents obtain the same scale score on
two different occasions. The rationale underlying test-retest reliability is that if a
measure reflects some meaningful construct, it should assess that construct
comparably on separate occasions. To measure test-retest reliability, scores from the
first testing are correlated with scores from the second testing, and the resulting
correlation coefficient is called a reliability coefficient. If the correlation is high (0.8 or
above), then the scale is considered to be reliable. In other words, if high scorers on
the first testing score high on the retest, average scorers score average, and low
scorers score low, the scale is considered to be reliable. In some cases the intraclass
correlation coefficient (ICC) is used to assess test-retest reliability.
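As a rough illustration of the test-retest calculation, the sketch below correlates scale scores from two testing occasions and compares the result with the 0.8 guideline mentioned above. The scores are invented for the example, and the helper `pearson_r` is written out by hand only so that the sketch is self-contained; in this unit the same coefficient would normally be obtained in SPSS.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical total scale scores for the same eight respondents,
# measured on two occasions a few weeks apart.
time1 = [12, 18, 25, 31, 22, 15, 28, 20]
time2 = [14, 17, 27, 30, 21, 16, 26, 22]

r = pearson_r(time1, time2)
is_reliable = r >= 0.8  # guideline from the text, not a strict rule
print(f"test-retest r = {r:.2f}; meets the 0.8 guideline: {is_reliable}")
```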
Low temporal stability needs to be interpreted with caution as it can mean that either
a measure is not reliable or a real change in the attitude or achievement or whatever
trait is being measured has occurred between testings. Therefore, test-retest
reliability assessment is only suitable for stable traits that would not be expected to
change much between testing occasions.
Problems with test-retest reliability studies:
1. It is often difficult to give the same test to the same people twice.
2. Memory: people may remember their answers from the first occasion and answer
the same way the second time to be consistent. This can artificially inflate the
apparent reliability of the test.
3. Reactivity: answering questionnaire items might have led people to reassess
and change their behaviour.
4. Measuring attitudes: people's attitudes may change.
To overcome these problems it is desirable that the time between testings be long
enough for respondents to forget their initial answers but short enough so that little or
no real change in attitude occurs between the testings. Generally, 2-4 weeks are
considered optimal for test-retest studies.
Alternate Forms Reliability

An alternate form of a scale is an equivalent scale that measures the same content
as the original form but with different items. Both forms are administered to the same
group of people. The scores from the two forms are correlated to obtain a reliability
coefficient.
The alternate-forms procedure has an advantage over the test-retest procedure in
that respondents' recall of items and responses from the first testing has no influence
on scores in the second testing because the items are different. The second form
may be administered immediately after the first form, thus also overcoming the
problem of real change between testings; or it can be administered after a time
interval.
A disadvantage of the alternate-forms procedure is that the different content in the items
of alternate forms unavoidably causes the two sets of scores to be somewhat
different. In fact, the alternate-forms reliability is usually the most conservative (the
lowest) reliability estimate of the four procedures.
Split-Half Reliability
Conceptually, the split-half procedure is somewhat similar to the alternate-forms
procedure. The scale is administered to a group of people. The scale items are then
divided into two half-length tests. Each respondent thus receives two scores, one for
each half-test. These two sets of scores are then correlated.
The correlation coefficient indicates only the reliability of the half-length test. Since
correlation is directly associated with variance, and variance is directly associated
with test length, a full-length test would be expected to have somewhat higher
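reliability than a half-length test. Therefore, the correlation coefficient is adjusted
using the Spearman-Brown formula:
$$r_{\text{full}} = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}}$$
where $r_{\text{half}}$ is the correlation between the two half-length tests and
$r_{\text{full}}$ is the estimated reliability of the full-length test.
If, for example, the half-length tests intercorrelate .70, the reliability of the whole test
equals .82:
(2 x 0.70)/(1 + 0.70) = 1.4/1.7 = 0.82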
How do we split the test into halves? There are, of course, many possible ways. One
of the most commonly used procedures is the odd-even procedure: scoring odd-numbered
items in one half and even-numbered items in the other.
The split-half procedure requires only a single administration, in comparison with the
test-retest and the alternate forms which require two test administrations.
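The odd-even split and the Spearman-Brown adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response data are hypothetical, and the function names (`pearson_r`, `spearman_brown`, `split_half_reliability`) are made up for this sketch rather than taken from any package.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-length correlation up to a full-length reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Odd-even split-half reliability.

    responses: one list of item scores per respondent.
    Returns (correlation between halves, Spearman-Brown adjusted reliability).
    """
    # Items 1, 3, 5, ... sit at list positions 0, 2, 4, ...
    odd_scores = [sum(row[0::2]) for row in responses]
    even_scores = [sum(row[1::2]) for row in responses]
    r_half = pearson_r(odd_scores, even_scores)
    return r_half, spearman_brown(r_half)


# Hypothetical data: 6 respondents answering 4 items rated 1-4.
responses = [
    [1, 2, 1, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [2, 1, 2, 2],
    [3, 4, 3, 3],
]
r_half, r_full = split_half_reliability(responses)
print(round(spearman_brown(0.70), 2))  # 0.82, the worked example in the notes
```

Because the adjustment only ever increases a positive half-test correlation, `r_full` is larger than `r_half` here, in line with the argument that a full-length test is more reliable than either half.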
Internal consistency differs somewhat from other reliability testing procedures. A
procedural difference is that the correlation statistic is not used directly. A conceptual
difference is that an internal consistency coefficient tells us about similarity in
measurement across items rather than stability over time or across forms. Split-half
procedure is an index of consistency between halves, and the internal consistency
procedure is an index of inter-item consistency.
Internal consistency reliability is concerned with the homogeneity of the items within
a scale. A scale is internally consistent to the extent that its items are highly
intercorrelated. Ideally, scale items should show relatively high variance, with mean
scores falling close to the centre of the range of possible scores. Items with low
variance do not discriminate among individuals with different levels of the construct
of interest, and therefore do not contribute to the scale as a whole.
The reliability of the scale is determined by the intercorrelation among each of its
items (Item-item correlations). Items with very low or negative correlations with other
items in the scale should be identified and marked for deletion from the final version
of the scale.
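Cronbach's (1951) alpha is the most commonly used method of testing the reliability
of a scale. The formula for coefficient alpha is
$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_T^2}\right)$$
where $k$ is the number of items, $\sigma_i^2$ is the variance of one item, and
$\sigma_T^2$ is the variance of the total test scores.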

Cronbach's alpha normally ranges from 0 to 1, with higher values indicating higher levels
of internal consistency reliability. This is so because conceptually alpha is calculated
to help answer questions about how similar items of the scale are. Alpha coefficients
are dependent on both the average correlation among the items and also the
number of items included in the scale.
In general, a minimum level of 0.7 is recommended for Cronbach's alpha. If alpha is
above 0.9, this usually means that there are some items in the scale which are very
similar to each other. In this case, shortening of the scale is recommended through
discarding some of those items.
The size of alpha is affected by the reliability of individual items. To increase the
alpha of the scale, and thus the scale's reliability, we need to delete all unreliable
items. To identify which items are unreliable we need to examine various statistical
properties of the items. The most common and useful measures are item-total
correlations and alpha if item deleted.
Item-total correlation (or item-rest of test correlation) is the correlation between the
item and the total score of the scale, calculated without including the item being
investigated. Good item-total correlations are higher than 0.5. Some authors
suggest that item-total correlations should, at a minimum, be higher than 0.3.
The alpha-if-item-deleted statistic involves calculating what the alpha would be if a
particular item was dropped. If alpha becomes higher when an item is deleted, then
that item is unreliable and should be discarded.
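The three statistics just described (alpha, the corrected item-total correlation, and alpha if item deleted) can be computed directly from a table of responses. The sketch below is a minimal illustration, not a replacement for the SPSS procedure: the data are hypothetical, the function names are invented for this example, and complete data (no missing answers) are assumed.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(responses):
    """Coefficient alpha; responses holds one list of item scores per respondent."""
    k = len(responses[0])
    item_vars = [variance(col) for col in zip(*responses)]  # variance of each item
    total_var = variance([sum(row) for row in responses])   # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_analysis(responses):
    """Per item: (corrected item-total correlation, alpha if item deleted)."""
    results = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [row[:i] + row[i + 1:] for row in responses]  # scale without item i
        rest_total = [sum(r) for r in rest]
        results.append((pearson_r(item, rest_total), cronbach_alpha(rest)))
    return results


# Hypothetical data: 6 respondents, 4 items; the fourth item is deliberately noisy.
responses = [
    [1, 1, 2, 3],
    [2, 2, 2, 1],
    [3, 3, 4, 2],
    [4, 4, 4, 4],
    [2, 3, 2, 4],
    [3, 4, 3, 1],
]
alpha = cronbach_alpha(responses)
for number, (r_item_rest, alpha_deleted) in enumerate(item_analysis(responses), start=1):
    print(f"item{number}: item-total r = {r_item_rest:.2f}, "
          f"alpha if deleted = {alpha_deleted:.2f}")
```

In this made-up data set the fourth item has a very low corrected item-total correlation and alpha rises when it is dropped, which is the pattern the notes describe for an unreliable item.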
If you use SPSS, the procedure to obtain alpha, item-total correlations and alpha if
item deleted is as follows:
1. Select ANALYZE, SCALE, RELIABILITY ANALYSIS;
2. Select variables which constitute your scale, for example item1, item2, etc. and
arrow them across;
3. Select model as ALPHA;
4. Click on STATISTICS;
5. Under DESCRIPTIVES select ITEM, SCALE, SCALE IF ITEM DELETED;
6. Under CORRELATIONS select CORRELATIONS;
7. Click CONTINUE and then OK to run.
Exercise 2
One hundred and fifty Swinburne students completed Spielberger's (1983) State-Trait
Anger Inventory. This widely used test is designed to assess a person's level
of state anger and their level of trait anger.
State anger is the level of anger a person is experiencing at the time of the test.
That is, how angry a person is in a particular situation at a particular point in time.
Individuals who score highly on state anger are assumed to be experiencing high
levels of anger at the time the test was taken.
Trait anger is a measure of a person's general predisposition towards anger. People
who score highly on trait anger tend to be more vulnerable to anger across a number
of situations and across time.
While a person high on trait anger is expected to also score highly on state anger,
state anger is hypothesized to change across time and across situations while trait
anger is expected to remain fairly stable.
The items are:
State Anger Scale items: rated according to how you feel right now (1=not at all,
2=somewhat, 3=moderately, 4=very much)
q1    I am furious
q2    I feel irritated
q3    I feel angry
q4    I feel like yelling at somebody
q5    I feel like breaking things
q6    I am mad
q7    I feel like banging on the table
q8    I feel like hitting someone
q9    I am burned up
q10   I feel like swearing
Trait Anger Scale items: rated according to how you generally feel
q11   I am quick tempered
q12   I have a fiery temper
q13   I am a hotheaded person
q14   I get angry when I am slowed down by others' mistakes
q15   I feel annoyed when I am not given recognition for doing good work
q16   I fly off the handle
q17   When I get mad I say nasty things
q18   It makes me furious when I am criticized in front of others
q19   When I get frustrated I feel like hitting someone
q20   I feel infuriated when I do a good job and get a poor evaluation.
In the file State Trait Anger.sav (available on the Blackboard) you have the following
variables: sex, q1 to q10 (10 state anger items taken at Time 1), q11 to q20 (trait
anger items taken at Time 1), state1 (total score of the state items at Time 1), trait1
(total score of the trait items at Time 1), state2 (total score of the state items at Time
2), trait2 (total score of the trait items at Time 2).
Alpha for the State Anger Scale, item-total correlations and alpha if item is deleted
for each item were obtained.
(Items q1 to q10 were selected)

Click on Statistics
Click Continue and then OK to run.

You will get SPSS output as follows:
Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.911               .921                                           10

Item Statistics

       Mean     Std. Deviation   N
q1     1.3172   .64230           145
q2     1.9172   .90141           145
q3     1.3931   .71973           145
q4     1.2621   .62384           145
q5     1.1172   .39971           145
q6     1.3241   .69605           145
q7     1.1793   .53579           145
q8     1.1310   .47515           145
q9     1.4138   .75080           145
q10    1.5655   .87253           145

Inter-Item Correlation Matrix

       q1      q2      q3      q4      q5      q6      q7      q8      q9      q10
q1     1.000   .477    .735    .484    .422    .716    .540    .500    .460    .533
q2     .477    1.000   .596    .533    .374    .552    .390    .350    .349    .572
q3     .735    .596    1.000   .651    .515    .825    .554    .478    .391    .606
q4     .484    .533    .651    1.000   .572    .683    .586    .563    .360    .504
q5     .422    .374    .515    .572    1.000   .586    .712    .650    .439    .486
q6     .716    .552    .825    .683    .586    1.000   .662    .585    .526    .657
q7     .540    .390    .554    .586    .712    .662    1.000   .725    .436    .599
q8     .500    .350    .478    .563    .650    .585    .725    1.000   .431    .406
q9     .460    .349    .391    .360    .439    .526    .436    .431    1.000   .499
q10    .533    .572    .606    .504    .486    .657    .599    .406    .499    1.000

Case Processing Summary

                      N     %
Cases   Valid         145   96.7
        Excluded(a)   5     3.3
        Total         150   100.0

a. Listwise deletion based on all variables in the procedure.

Item-Total Statistics

       Scale Mean if   Scale Variance    Corrected Item-     Squared Multiple   Cronbach's Alpha
       Item Deleted    if Item Deleted   Total Correlation   Correlation        if Item Deleted
q1     12.3034         21.005            .713                .617               .900
q2     11.7034         19.863            .614                .457               .910
q3     12.2276         19.996            .792                .765               .895
q4     12.3586         21.162            .707                .576               .900
q5     12.5034         22.905            .666                .585               .906
q6     12.2966         19.821            .856                .789               .891
q7     12.4414         21.679            .731                .702               .900
q8     12.4897         22.474            .646                .610               .905
q9     12.2069         21.249            .549                .380               .911
q10    12.0552         19.358            .715                .577               .901

Scale Statistics

Mean      Variance   Std. Deviation   N of Items
13.6207   25.612     5.06084          10

Examine the output and answer the following questions:

1. What is Alpha for the state anger scale?
2. What does it tell us about the scale's overall internal consistency? (Remember:
when Alpha is: <.60 reliability is unacceptable, .61-.80 low to moderate, .81-.90
moderate to high, >.90 very high).
3. Which items do you think are contributing to the scale's overall reliability and
why? (Those items with high item-total correlations (higher than .50) and Alpha if
item is deleted values which are lower than the overall Alpha.)
4. Which items would you suggest have poor reliability and why?
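
As a cross-check on the State Anger output, the coefficient alpha formula can be applied by hand to the figures SPSS prints: the item standard deviations from the Item Statistics table and the total-score variance from the Scale Statistics table. The short Python sketch below only re-does that arithmetic; the variable names are chosen for this illustration.

```python
# Standard deviations of q1 to q10, copied from the Item Statistics table.
item_sds = [0.64230, 0.90141, 0.71973, 0.62384, 0.39971,
            0.69605, 0.53579, 0.47515, 0.75080, 0.87253]
scale_variance = 25.612  # variance of the total score, from Scale Statistics

k = len(item_sds)
sum_item_variances = sum(sd ** 2 for sd in item_sds)

# alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
alpha = k / (k - 1) * (1 - sum_item_variances / scale_variance)
print(round(alpha, 3))  # 0.911
```

The result agrees with the Cronbach's Alpha value in the Reliability Statistics table, which shows that the SPSS figure is simply the formula applied to the item and total-score variances.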
Also, alpha for the Trait Anger Scale and item-total correlations and alpha if item is
deleted for each item were obtained using SPSS.
SPSS output is as follows:
Case Processing Summary

                      N     %
Cases   Valid         149   99.3
        Excluded(a)   1     .7
        Total         150   100.0

a. Listwise deletion based on all variables in the procedure.

Reliability Statistics

Cronbach's Alpha   Cronbach's Alpha Based on Standardized Items   N of Items
.838               .843                                           10

Item Statistics

       Mean     Std. Deviation   N
q11    1.8725   .73786           149
q12    1.8121   .81679           149
q13    1.6376   .72796           149
q14    2.2148   .83475           149
q15    2.4698   .85864           149
q16    1.6376   .65980           149
q17    2.0604   .79889           149
q18    2.5168   .92710           149
q19    1.5638   .76514           149
q20    2.5302   .90462           149

Inter-Item Correlation Matrix

       q11     q12     q13     q14     q15     q16     q17     q18     q19     q20
q11    1.000   .621    .593    .549    .394    .501    .380    .255    .176    .325
q12    .621    1.000   .714    .426    .252    .562    .463    .299    .333    .227
q13    .593    .714    1.000   .507    .328    .555    .398    .329    .394    .191
q14    .549    .426    .507    1.000   .386    .216    .284    .283    .158    .439
q15    .394    .252    .328    .386    1.000   .207    .234    .440    .098    .469
q16    .501    .562    .555    .216    .207    1.000   .439    .231    .367    .064
q17    .380    .463    .398    .284    .234    .439    1.000   .359    .364    .329
q18    .255    .299    .329    .283    .440    .231    .359    1.000   .129    .404
q19    .176    .333    .394    .158    .098    .367    .364    .129    1.000   .053
q20    .325    .227    .191    .439    .469    .064    .329    .404    .053    1.000

Item-Total Statistics

       Scale Mean if   Scale Variance    Corrected Item-     Squared Multiple   Cronbach's Alpha
       Item Deleted    if Item Deleted   Total Correlation   Correlation        if Item Deleted
q11    18.4430         21.492            .655                .561               .813
q12    18.5034         20.900            .662                .610               .810
q13    18.6779         21.341            .690                .632               .810
q14    18.1007         21.470            .562                .448               .820
q15    17.8456         21.848            .490                .361               .828
q16    18.6779         22.787            .523                .464               .825
q17    18.2550         21.745            .554                .375               .821
q18    17.7987         21.594            .471                .314               .831
q19    18.7517         23.472            .331                .246               .841
q20    17.7852         21.981            .438                .391               .834

Scale Statistics

Mean      Variance   Std. Deviation   N of Items
20.3154   26.515     5.14924          10

Examine the output and answer the following questions:

5. What is Alpha for the trait anger scale?
6. What does it tell us about the scale's overall internal consistency?
7. Which items do you think are contributing to the scale's overall reliability and
why?
8. Which items would you suggest have poor reliability and why? (Look at the
content of the items for clues!)
Exercise 3
Test-retest reliability of the state and trait anger scales was assessed by calculating two
Pearson correlation coefficients (State Trait Anger.sav file).
Select ANALYZE, CORRELATE, BIVARIATE
Select STATE1, STATE2, TRAIT1, TRAIT2.
Click OK to run the analysis.
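
The same test-retest calculation can be sketched outside SPSS. The Python example below correlates Time 1 and Time 2 total scores and, like the SPSS bivariate procedure, uses only respondents who have both scores. The scores are hypothetical (they are not taken from State Trait Anger.sav) and the function names are invented for this illustration.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def retest_reliability(time1, time2):
    """Correlate two administrations, skipping anyone missing either score.

    Returns (correlation, number of respondents with both scores).
    """
    pairs = [(a, b) for a, b in zip(time1, time2)
             if a is not None and b is not None]
    x, y = zip(*pairs)
    return pearson_r(x, y), len(pairs)


# Hypothetical total scores; None marks a respondent who missed Time 2.
trait_time1 = [18, 22, 25, 30, 15, 27, 20]
trait_time2 = [19, 21, 27, 29, None, 26, 22]

r, n = retest_reliability(trait_time1, trait_time2)
print(f"test-retest r = {r:.2f} (N = {n})")
```

Dropping incomplete pairs is why the N reported for a test-retest correlation can be smaller than the number of people tested at Time 1.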
SPSS output:
Correlations

                                           state anger   state anger   trait anger   trait anger
                                           time 1        time 2        time 1        time 2
state anger time 1   Pearson Correlation   1             .454**        .270**        .363**
                     Sig. (2-tailed)                     .000          .001          .000
                     N                     150           131           150           132
state anger time 2   Pearson Correlation   .454**        1             .243**        .254**
                     Sig. (2-tailed)       .000                        .005          .003
                     N                     131           131           131           131
trait anger time 1   Pearson Correlation   .270**        .243**        1             .796**
                     Sig. (2-tailed)       .001          .005                        .000
                     N                     150           131           150           132
trait anger time 2   Pearson Correlation   .363**        .254**        .796**        1
                     Sig. (2-tailed)       .000          .003          .000
                     N                     132           131           132           132

**. Correlation is significant at the 0.01 level (2-tailed).
Answer the following questions:

1. How reliable over time is state anger?
2. How reliable over time is trait anger?
3. Which is the most reliable and is this what you would expect?
Testing the Validity of a Scale

Validity assesses how well a scale measures what it intends to measure. For
example, a scale that is designed to measure emotional quality of life should not
measure depression, which is a related but different construct. The validity of a scale
depends on how we have defined the concept it is designed to measure. We must
document validity when evaluating new scales or when applying established scales
to new populations.
Content validity
Content validity refers to the adequacy with which a measure or a scale has sampled
from the intended universe of content (Gable & Wolf, 1993). It assesses the extent to
which the indicators (items of a scale) measure the different aspects of the concept.
The behavior domain to be tested must be systematically analysed to make sure that
all major aspects are covered by the test items, and in the correct proportions. For
example, a test of arithmetic skills that deals only with subtraction and does not
measure ability at multiplication, addition or division lacks content validity. Whether
we agree that a scale has content validity depends on how the developer defines
and operationalises the concept it is designed to measure.
Content validity is not quantified with statistics. It is judged on qualitative, rather than
quantitative grounds. One way in which evidence concerning content validity can be
gathered is to ask a panel of expert judges to examine the scale and to assess the
degree to which relevant topic areas have been addressed. Items that judges are
doubtful about should be discarded.
Example (Litwin, 2003, p.34):

Content Validity: Marital Interaction
A researcher designs a new scale to collect data on marital interaction as a dimension
of health-related quality of life. The researcher develops a series of items about
spousal communication, interpersonal confidence, and discussions within the
marriage. She plans to use her new scale to assess the impact of social support on a
large population of married cancer patients who are undergoing a difficult and stressful
chemotherapy protocol.
Before administering her new scale, the researcher asks several oncologists,
psychologists, social workers, oncology nurses, cancer patients, and spouses of
cancer patients to review each of the items. The researcher asks these reviewers to
rate each item and the scale as a whole for appropriateness and relevance to the
issue of marital interaction. She also asks each reviewer to list any areas that are
pertinent to marital interaction but not covered in the items of the scale. Once all the
reviews are complete, the researcher studies them to determine whether her new
survey instrument has content validity.
---------------------------------------------------------------------------------------------------------------
Face validity
Face validity is often confused with content validity. It is the least scientific of all the
validity measures. Face validity refers to how respondents perceive the
appropriateness of the test. Content validity is evident when the items are about
what you are measuring, and face validity is present when the items appear to be
about what you are measuring. Establishing face validity involves casual assessment
of item appropriateness. It might involve showing your test to a few untrained
individuals to see whether they think the items look right to them. Therefore, face
validity is not validity in the technical sense. However, face validity itself is a
desirable feature of tests.
Why is it important? If respondents consider the test to have face validity, they may
offer a more conscientious effort to complete the test. If a test does not have face
validity the respondents might rush through the test and take it less seriously.
Therefore a test's lack of face validity can affect test-takers' cooperation or
motivation to do the test.
However, there are times when it is necessary that the construct being measured is
not evident to participants. For example, this is done when you need to avoid the
possibility of respondents "faking good" (appearing better than they are). Therefore,
depending on circumstances, there may be advantages or disadvantages to a test's
purpose being evident from its appearance.
Criterion validity
Criterion-related validity concerns the relationship that exists between scale scores
and some specified, measurable criterion. There are two types of criterion validity:
concurrent validity and predictive validity.
Concurrent validity
Concurrent validity is shown when a new measure relates concurrently to some
other measures of the same concept. Using this approach we compare a new test of
a concept with existing, well-accepted measures of the concept. The statistic
calculated is a correlation coefficient. If scores on both the new and the
established measure are highly correlated this is taken to mean that the new
measure is valid.
One of the major problems with this type of validation is the choice of an appropriate
criterion. We must assume the validity of the established measure against which we
assess our new measure. A low correlation between the new and existing measures
is usually taken to mean that the new measure is invalid. However, it could be the
established measure that is invalid.
You need to justify why you want to develop a new scale. If there is a good measure
of a construct which you use as a criterion, then you might be asked why your test is
necessary at all. Therefore you need a rationale for creating a new test. For
example, your new measure may be simpler, more user-friendly, more useful or more
cost-effective than the measure against which you have validated your test.
Example (Litwin, 2003, p.35):

Concurrent Validity: Pain Tolerance
A researcher develops a new 4-item index to assess pain tolerance in a group of
patients scheduled for surgery. The items draw information from patients' memories of
their past experiences with pain. The researcher sums the results from the four items to
form a Pain Tolerance Index score. The higher the score, the greater the tolerance for
pain. The index is self-administered and takes about a minute for a patient to
complete. To assess concurrent validity, the researcher administers her 4 items
together with a published pain-tolerance survey instrument that has been in use for
more than a decade in anaesthesiology research and is generally accepted as the gold
standard in the field. It contains 45 items, requires an interviewer, and takes an
average of an hour to complete. It is also scored as a sum of item responses.
The researcher is able to gather data with both survey instruments in a sample of
patients. She calculates the correlation coefficient to be 0.92 between her new test of
pain tolerance and the gold-standard test of pain tolerance. She concludes that her
index has high concurrent validity with the gold standard. Moreover, her instrument is
much shorter and easier to administer.
---------------------------------------------------------------------------------------------------------------
Another problem with concurrent validity is that for some concepts there are no
appropriate, well-established measures against which to check a new scale.
A different approach is to give the new measure to criterion groups. For example, a
new measure of political conservatism might be given to members of conservative
and radical political groups. If the members of the conservative group come out as
conservative on the test and the radical group members emerge radical, this
provides good evidence for the test's validity.
Predictive validity
A scale's predictive validity is its usefulness in predicting future events, behaviours,
attitudes, or outcomes. Predictive validity may be used, for example, to predict
election winners, success of an intervention, or other objective criteria.
Like concurrent validity, predictive validity is calculated as a correlation coefficient
between the initial test and the secondary outcome. The following example
demonstrates that the Pain Tolerance index that the researcher tested for concurrent
validity in the previous example may also be tested for predictive validity.
Example (Litwin, 2003, p.38):

Predictive Validity: Pain Tolerance
The researcher from the previous example decides to use her Pain Tolerance Index to
predict narcotic requirements in patients undergoing an operation. Having tested her
index for reliability and concurrent validity, she now wants to test it for predictive
validity. She administers her index to 100 of her preoperative patients and calculates
an index score for each individual. (Recall that a high score reflects a high tolerance
for pain.) Once all the surgeries have been completed, the researcher reviews the
medical records. She notes the total number of doses of narcotic that were
administered for postoperative pain in each patient. She then calculates a correlation
coefficient between the two data elements: index score and number of narcotic doses.
She finds that the statistic is -0.84. As expected, there is a strong inverse correlation
between the Pain Tolerance Index and the amount of narcotic required after surgery.
The researcher is pleased to find that her index has high predictive validity in clinical
practice.
---------------------------------------------------------------------------------------------------------------
Construct validity
Construct validity involves testing the scale's performance in terms of theoretically
derived hypotheses concerning the nature of the underlying variable or construct.
According to Kline (1993) construct validity is the most important approach to validity
testing. Consideration of construct validity is particularly important when a single
criterion is not available to test criterion-related validity.
Support for a scale's construct validity can be sought by exploring its relationship
with other constructs, both related (convergent validity) and unrelated
(discriminant validity). This involves the inspection of the pattern of correlations
between the new scale and other existing measures, both related and unrelated. Of
importance here are the direction and magnitude of the relationships in light of
theoretical predictions.
If we predict some variable, based on theory, to be positively related to constructs A
and B, negatively related to C and D, and unrelated to X and Y, then a scale that
purports to measure that construct should bear a similar relationship to measures of
those constructs. In other words, that scale should be positively correlated with
measures of constructs A and B (convergent validity), negatively correlated with
measures of C and D (convergent validity), and uncorrelated with measures of X and
Y (discriminant validity). The extent to which empirical correlations match the
predicted pattern provides some evidence of construct validity.
Different authors suggest different interpretations of the strength of the correlation
coefficient. When it comes to construct validity, most researchers use Cohen's
guidelines:
r = .10 to .29 - small
r = .30 to .49 - medium
r = .50 to 1.0 - large.
Cohen, J.W. (2013). Statistical power analysis for the behavioral sciences (2nd ed.).
Hoboken : Taylor and Francis (eBook) available at Swinburne library
http://www.swin.eblib.com.au.ezproxy.lib.swin.edu.au/patron/FullRecord.aspx?p=1192162
(see pages 78-81)
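
Cohen's guidelines above can be applied mechanically when inspecting a pattern of convergent and discriminant correlations. The Python sketch below is one way to do that. The function name and the correlations are hypothetical, and the "negligible" label for values below .10 is an assumption of this sketch, since the guidelines start at .10.

```python
def cohen_label(r):
    """Label the strength of a correlation using Cohen's guidelines.

    The sign is ignored: the guidelines describe strength, not direction.
    """
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # below .10: label assumed here, not part of the guidelines


# Hypothetical correlations between a new scale and other measures.
correlations = {
    "related construct A": 0.62,
    "related construct B": 0.43,
    "opposing construct C": -0.28,
    "unrelated construct X": 0.05,
}
for measure, r in correlations.items():
    print(f"{measure}: r = {r:+.2f} ({cohen_label(r)})")
```

The labels describe strength only. Whether a given correlation supports construct validity still depends on whether its direction and size match what theory predicts for that construct.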

---------------------------------------------------------------------------------------------------------------
Example (Mueller, 1986):
To test the construct validity of a scale measuring attitude toward social welfare the
following hypotheses were proposed:
1. The scale could be correlated with scales measuring equality value, altruism
value and attitude toward the poor. Since these are similar constructs, moderate
to high correlations would be required to support convergent validity.
2. The scale would be expected to correlate negatively with measures of dissimilar
constructs, for example with the scale measuring independence value and scale
measuring competition value. Negative correlations would be expected to support
convergent validity.
3. The scale would be expected to correlate zero or nearly zero with measures of
unrelated constructs (e.g., extroversion and intelligence).
---------------------------------------------------------------------------------------------------------------
The construct validity of a scale can also be explored by comparing scale scores for
groups of people who are known to differ in terms of the trait or characteristic under
investigation (known-groups validity). T-tests are then used to compare scale scores
for the two groups. The finding of significant differences would provide support
concerning the construct validity of the scale.
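
A known-groups comparison can be sketched as follows. The example computes the equal-variance (pooled) independent-samples t statistic by hand. The scores and group labels are hypothetical, and the function name is invented for this illustration. Obtaining the p-value from t and its degrees of freedom is left to statistical software or tables.

```python
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t statistic assuming equal variances.

    Returns (t, degrees of freedom).
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical conservatism scale scores for two groups known to differ.
conservative_group = [32, 35, 30, 38, 34, 36]
radical_group = [21, 25, 19, 24, 22, 27]

t, df = pooled_t(conservative_group, radical_group)
print(f"t({df}) = {t:.2f}")
```

A large t in the predicted direction (here, higher scores for the group expected to be more conservative) is the kind of evidence the known-groups approach looks for.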
It is also recommended to include a measure of social desirability when
administering a new scale to a development sample. This is designed to assess the
degree to which scores on the scale under investigation are influenced by subjects'
motivation to present themselves in a positive light.
Presenting a Newly Developed Scale

The points to be covered when presenting a newly developed scale are as follows:
Statement of what the scale measures;
Justification for the scale (uses, advantages over existing measures);
How the pool of items was drawn up (details of sources, any special steps
regarding content or face validity);
Description of the sample used for testing;
Descriptive statistics: means, standard deviations, ranges;
Reliability statistics;
Validity statistics;
The scale (instructions, questions).
Exercise 4
Evaluating the Psychometric Properties of the New Well-Being
Measures
(the Flourishing Scale and the Scale of Positive and Negative
Experience)
Read the article by Diener et al. and write a critical review of the scale, New Well-Being
Measures.
Diener, E., Wirtz, D., Tov, W., Kim-Prieto, C., Choi, D.-W., Oishi, S., & Biswas-Diener, R.
(2010). New Well-Being Measures: Short scales to assess flourishing and positive
and negative feelings. Social Indicators Research, 97, 143-156. (Available on
Blackboard)
(http://web.ebscohost.com.ezproxy.lib.swin.edu.au/ehost/pdfviewer/pdfviewer?sid=97134d71-9537-473e-b00e-b0fed8a476cc%40sessionmgr113&vid=2&hid=106)
Your review should contain the following:

Statement of what the scale measures;
Justification for the scale (advantages over the existing measures)
Reliability of the scale (comment on internal consistency and test-retest
reliability statistics)
Validity of the scale. Is there evidence to verify the scale measures what it
purports to measure? (e.g., correlations with similar tests etc.)
Overall conclusion and your suggestions for future research regarding the
validation of the scale.
Bibliography
DeVellis, R.F. (2003). Scale development: Theory and applications (2nd ed.).
Thousand Oaks, CA: Sage.
Kline, P. (1986). A handbook of test construction: Introduction to psychometric
design. London: Methuen.
Kline, P. (2000). A psychometrics primer. London: Free Association Books.
Kline, J.B. (2005). Psychological testing: A practical approach to design and
evaluation. Thousand Oaks, CA: Sage.
Litwin, M.S. (2003). Survey kit, Vol. 8: How to assess and interpret survey
psychometrics. Thousand Oaks, CA: Sage.
Mueller, D.J. (1986). Measuring social attitudes: A handbook for researchers and
practitioners. New York: Teachers College Press.
Netemeyer, R.G., Bearden, W.O., & Sharma, S. (2003). Scaling procedures: Issues
and applications. Thousand Oaks, CA: Sage.
Answers to Exercises
Exercise 1
If she wants to ensure content validity, she must tailor her survey instrument to the
needs of the students themselves. The best way to start would be to put together a
focus group of students currently living in the campus dorms. During this exploratory
session, she could get an idea of what issues are important to the students. She
might then put together a first draft of her questionnaire and show it to these
students for their comments. This would provide initial testing of content validity.
Exercise 2
1. Alpha = .91
2. High internal consistency
3. & 4. All items are good
5. Alpha = .84
6. High internal consistency
7. All except q19
8. q19
Exercise 3
1. Test-retest for state anger is .45; not reliable over time.
2. Test-retest for trait anger is .80; reliable.
3. Trait anger is more reliable as evidenced by the larger correlation coefficient. We
would expect a person's general disposition towards anger to remain fairly stable
over time if the scale is reliable, whereas we would expect state anger to be a
result of the situation and therefore be more susceptible to change.
Exercise 4
The Flourishing Scale
Statement of What the Scale Measures
The scale assesses major aspects of social-psychological functioning
(social-psychological prosperity); specifically, it assesses the respondent's self-perceived
success in important areas such as relationships, self-esteem, purpose and
optimism. It is an 8-item summary measure which provides a single psychological
well-being score.
All items in the scale are phrased in a positive direction. Each item is answered on a
1-7 scale that ranges from strong disagreement to strong agreement. A high
score represents a person with many psychological resources and strengths. High
scores signify that respondents view themselves in positive terms in important areas
of functioning.
Justification for the Scale (Advantages over the Existing Measures)
It is a brief scale which measures an overall psychological well-being. The scale
does not assess the individual components of social-psychological well-being.
However, if an overall psychological well-being score is needed, and a brief scale is
desirable, the FS appears to be useful.
Reliability of the scale
To test the reliability and validity of the scales, convenience samples of university
students were used. The total sample comprised 689 participants (468 females and 175
males); 181 participants were from Singapore Management University, and the remaining
respondents were from five American universities.
Internal consistency was found to be high (Cronbach's alpha = .87). Test-retest
reliability, assessed one month apart (N=257), was moderate (r = .71).
Validity of the scale

Scales used for testing convergent validity:
Scales of Psychological Well-Being (Ryff, 2008);
Basic Need Satisfaction Scales (Ryan & Deci, 2000);
The Satisfaction with Life Scale (Diener et al., 1985) - traditional subjective well-being measure;
LOT-R (Scheier, Carver, & Bridges, 1994) - assesses optimism;
The UCLA Loneliness Scale (Russell, 1996) - a measure of poor social relationships;
Cantril's Ladder.
The Flourishing Scale correlated at substantial levels with the other well-being
measures (r ranged from .54 to .73), except for a medium correlation with Ryff's
autonomy scale (r = .43) and a low correlation with the Loneliness scale (r = -.28).
Men and women did not score significantly different on the scale.
Respondents in Singapore scored lower than American students. It is not clear
whether the difference was statistically significant.
The Scale of Positive and Negative Experience (SPANE)
Statement of What the Scale Measures
The scale assesses subjective feelings of well-being and ill-being. Assessment is
based on the amount of time the feelings were experienced during the past 4 weeks.
Six items of the scale assess positive feelings and six items assess negative
feelings. For both the positive and negative items, three of the items are general
(e.g., positive, negative) and three are more specific (e.g., joyful, sad). The summed
positive/negative score (SPANE-P/SPANE-N) can range from 6 to 30. The positive
and negative scales are scored separately because of the partial independence or
separability of the two types of feelings. However, the two scores can be combined
by subtracting the negative score from the positive score, and the resulting SPANE-B
score (an overall affect balance score) can range from -24 (unhappiest possible) to
24 (happiest possible).
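The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: it assumes each item is answered on a 1 to 5 scale (which is what a six-item range of 6 to 30 implies), and the function name and the example answers are hypothetical.

```python
def spane_scores(positive_items, negative_items):
    """Return (SPANE-P, SPANE-N, SPANE-B) from two lists of six item scores.

    Assumes each item is scored 1-5, so each subscale ranges from 6 to 30.
    """
    for items in (positive_items, negative_items):
        if len(items) != 6 or any(not 1 <= x <= 5 for x in items):
            raise ValueError("expected six item scores between 1 and 5")
    spane_p = sum(positive_items)
    spane_n = sum(negative_items)
    spane_b = spane_p - spane_n  # affect balance, from -24 to 24
    return spane_p, spane_n, spane_b


# Hypothetical respondent
example = spane_scores([4, 4, 5, 3, 4, 4], [2, 1, 2, 2, 1, 2])
print(example)  # (24, 10, 14)
```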
Justification for the scale

The scale assesses with a few items a broad range of negative and positive
experiences and feelings, not just those of a certain type. This allows the scale to
reflect the full range of emotions and feelings that a respondent might feel, both bad
and good, without creating a list of hundreds of items to fully represent the diversity
of positive and negative feelings.
The authors suggest that current scales, in giving equal weighting to all items, can
obscure the fact that a person might feel quite positive or negative but not feel many
of the specific emotions listed on the scale (p.145). They also claim that an issue
with the most popular current scale of emotions is that the items are all
high-arousal feelings, and many are not considered emotions or feelings. For example,
the words "active" and "strong" need not refer to feelings (p.145).
The assessment is based on the amount of time the feelings were experienced
during the past 4 weeks. Therefore, responses might be more comparable across
respondents than is the intensity of feelings. The last 4 weeks is considered to be
short enough to allow the respondent to recall actual experiences, and is an
adequate time period to avoid assessing a short-term mood.
Reliability of the SPANE Scale
Internal consistency was found to be good (alpha: SPANE-P = .87, SPANE-N = .81,
SPANE-B = .89). The items with the lowest item-total correlations were "afraid" and
"angry". Those items assess specific emotions. Test-retest reliability coefficients,
assessed one month apart (N=257), were found to be low (r = .62, .63, and .68 for
SPANE-P, SPANE-N, and SPANE-B, respectively).
Validity of the SPANE Scale

Scales used for testing validity:
PANAS (Watson et al., 1988) - the most widespread measure of positive and
negative feelings;
SHS (Lyubomirsky & Lepper, 1999) - 4-item scale of happiness;
Fordyce's (1988) single-item measure of happiness;
The Satisfaction with Life Scale (Diener et al., 1985) - traditional subjective well-being measure;
LOT-R (Scheier, Carver, & Bridges, 1994) - assesses optimism;
The UCLA Loneliness Scale (Russell, 1996) - a measure of poor social relationships;
Cantril's Ladder.
The scales correlated at substantial levels with the other measures, except at a low
level with the Loneliness scale (r = -.32 (SPANE-P), r = -.29 (SPANE-N), r = -.34
(SPANE-B)).
Men and women did not score significantly differently on the scale.
Respondents in Singapore were reported to score lower than American students, but
it is not clear whether the difference was statistically significant.
Overall conclusion
The Flourishing Scale performed well, with high internal consistency, modest
test-retest reliability and high convergence with similar scales. Although it does not
assess the individual components of social-psychological well-being, the FS seems
to be a good assessment of overall self-reported psychological well-being. If an
overall psychological well-being score is needed, and a brief scale is desirable, the
FS appears to be adequate.
The Scale of Positive and Negative Experience performed well in terms of internal
consistency and convergent validity with other measures of emotion, well-being,
happiness, and life satisfaction. Temporal stability was found to be low. The authors
claim that the scale has advantages over other existing measures of feelings. The
scales assess all positive and negative feelings, not just specific feelings. The
SPANE is an improvement on existing scales by succinctly measuring a broad range
of feelings based on the recent experience and duration of those feelings. It is also
purportedly less culturally specific, which may increase its utility.
Authors' Suggestions for Future Research
The samples only included students. Broader samples should be a high priority for
future studies.
Establishing stability of the scales over longer time periods beyond 1 month is
required.
Validity studies should determine the associations of the scales with non-self-reported
assessments of the same concepts (e.g., from informants).
The scales should be tested for predicting non-self-reported behaviors.
The degree to which the new scales and existent scales differ and converge across
cultures and groups should be analysed.
Additional Suggestions for Future Research
The scales should be tested on a random sample drawn from a wider population.
"Angry" and "Afraid" were not as well correlated with other items in the SPANE-N
scale, and may warrant further analysis to see whether other words correlate more
consistently with the other items in the SPANE-N scale.
The test could be compared across groups known to have higher levels of the
concept against groups known to have lower levels of the concept.
The researchers did not assess the scale against unrelated constructs to examine
discriminant validity.
As responses to some FS items could be influenced by social desirability, it is
recommended that a social desirability measure is included to assess its impact on
the scores.

Module 1
Topic 6: Coding and Cleaning Survey Data
STA60004
Contents
Learning Objectives
Optional Reading
Coding Open-Ended Questions
Exercise 1
Exercise 2
Exercise 3
Exercise 4
Coding Missing Data
Checking for Coding Errors
Preparing Variables for Analysis
Changing Categories
Creating New Variables
Standardising Variables
Dealing with Missing Data
Bibliography
Solutions to Selected Exercises
Learning Objectives
On completion of this topic you will:
1. Understand the purpose of coding;
2. Be familiar with standard code-frames, such as those developed by the
Australian Bureau of Statistics;
3. Be able to create code-frames for open-response questions;
4. Be able to prepare variables for analysis;
5. Know how to change, collapse and reorder the categories of variables;
6. Know how to create new variables from existing ones;
7. Know how to deal with missing data.
Optional Reading
Chapters 9 and 10.
Coding Open-Ended Questions

Coding is the process of classifying answers to open-ended questions and
converting answers to numbers.
Open-ended response format is useful in the following situations:
To collect attribute information where the number of response options is too large
to precode:
"Where were you born?"
To collect attitudinal information where the response options are unknown, or
feedback is required:
"What aspects of this subject interest you the most?"
To get at general feelings;
To find out respondents' reasons for their opinions.
Some pre-coded questions have an "Other" category. Sometimes it is necessary to
create additional codes to separate "Other" responses into individual response
categories.
There is a trade-off between the detail given in a response, and the ability to group
and summarise different respondents' answers to the same question.
The more codes, the greater the detail of information that remains about the
responses;
The fewer the codes, the easier the data analysis. However, the danger is that
the summary codes may be too general to provide meaningful or useful
information.
Open-ended questions are coded by:
Using pre-existing coding schemes; or
Developing a coding scheme based on the responses given by respondents.
Using Pre-Existing Coding Schemes

For standard questions such as occupation, religion and country, there can be many
possible responses, and existing standardised coding schemes are very often used
for coding those questions.
Reasons for using standard coding schemes:
They are systematic and have been developed by experts;
They are publicly available and make the coding scheme more transparent;
They may reduce coder error;
They enable the same classification system to be used for repeated surveys;
They allow comparisons to be made.
Examples of standard coding schemes:
ABS website www.abs.gov.au
Go to Statistics, then Topic, then Statistical Classifications and Standards, then
Classifications.
Most standard classification schemes allow for coding at different levels of detail. For
example, the Standard Australian Classification of Countries has three levels of
classification: Major Groups/ Minor Groups/ Countries.
Developing a Coding Scheme Based on the Responses Given

Read through a selection of responses to review the content;
Summarise responses into themes;
Group themes into broad topics (if needed);
Calculate the number of responses associated with each theme (if needed);
Generate a frequency distribution for each theme (if needed).
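The last two steps above (counting responses per theme and producing a frequency distribution) can be sketched as follows. The theme labels and the coded responses are hypothetical; in practice they would come from the reading and summarising steps.

```python
from collections import Counter

# Hypothetical themes assigned by a coder (one list of themes per respondent)
coded_responses = [
    ["cost"],
    ["convenience", "cost"],
    ["convenience"],
    ["support"],
    ["convenience"],
]

# Count how many responses mention each theme
theme_counts = Counter(theme for themes in coded_responses for theme in themes)

# Frequency distribution: count and percentage of respondents mentioning each theme
n_respondents = len(coded_responses)
for theme, count in theme_counts.most_common():
    print(f"{theme:12s} {count:3d} {100 * count / n_respondents:5.1f}%")
```

Because one response can carry more than one theme, the percentages here are of respondents and need not add up to 100.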
Exercise 1
The following question was asked of respondents in a survey about mobile phones:
"Should children be allowed to bring mobile phones to school? Please give reasons."
A list of 10 people's answers to this question is presented below. Create a code-list
for this question.
ID   Answer given by respondent                                            Code(s)
1    No, mobiles can be distracting in class
2    Yes, of course. Mobile phones keep children safer. They can call their
     parents in case of an emergency.
3    No, children shouldn't be allowed to use mobile phones at all. There are
     possible health risks from using mobile phones. Some research suggests
     that the radio waves from mobile phones may harm people's brains.
4    Yes, in an emergency, kids can call for help quickly.
5    Children shouldn't be allowed to bring mobile phones to school. There
     have been many cases of students using mobiles to cheat in tests.
6    No, kids will be texting, playing games etc. instead of doing class work.
7    Mobile phones shouldn't be used in schools. They take students' attention
     away from their lessons.
8    No, mobile phones are a distraction from school work.
9    No, mobile phones are too expensive for children. Even if some models
     are cheap to buy, calls are expensive. Many kids run up big bills their
     parents have to pay.
10   Yes, why not. Mobile phones are now a normal part of modern life.
Code list:
1
2
..
Exercise 2
The following question was part of a survey for students who have used the
Blackboard Learning Management System during their studies. This survey
investigated students' access to the flexible provision of learning and their use of
Blackboard. This study was done several years ago when Blackboard was
introduced.
Question: "My Blackboard subject pages assist my learning"
Rating scale:
1. not at all
2. to some extent
3. significantly
Any comments?
Task: Code the following responses to the open-ended question "Any
comments?" and write a brief summary.
As long as the information is provided in a way that allows private study to attain
the required standard. Unfortunately this is not consistent across subjects
Some pages have assisted more than others
The Discussion Boards were very helpful.
Lecture notes.
Only with discussion boards to ask questions.
Quicker, more flexible and more convenient access to information relating to
subjects!
My assessment and schedule is available there, but my learning is mostly
facilitated by newsgroup interaction, textbook reading and CD (offline) content.
Depend on subjects and lecturers. Some subjects with discussion board would
be better, but the lecturers should visit the boards to discussion with students.
External links for reading etc. are great.
You can study anywhere if you have a hardcopy of the notes. reading off the
screen is detrimental to their health, it is also very annoying.
Course material is good to have on the web, but without explanation, it's
useless.
It actually takes quite a lot of time each week to download lecture notes and print
them out. I'd much rather have the lecture slides in a hardcopy book from the
bookshop, despite the cost.
Very helpful... keeps me up to date, helps with revision, keeps me on task more
than tutorials etc
Lecture notes etc helpful, however that is about it.
Maybe all online material should be made available on a CD-ROM, as well as
printed in the bookshop and sold for a small price to cover costs.
Depending on the lecturer and the availability of material on line
Availability of taped lectures from home or work would be really helpful.
Nothing is as good as learning at school in the class room
I have found the information related to the subjects that I have been studying
very, very helpful.
Using blackboard can be significantly slower and more frustrating than being
provided with printed copies of course notes. Printing your own notes costs more
than buying a copy from the bookshop, they doesn't last as long (not bound), get
lost more easily
Good additional resource to lecture notes etc
It depends on the subject and how well the material is presented.
Online access to subject pages is a definite bonus.
Amazing can be accessed anywhere
Need to train staff and lecturers or provide the resources to get content on site.
Need to train students in the use of it.
It saves time that I would otherwise have to spend traveling to Swinburne to
access information.
Allows me to get online information without going to library.
Exercise 3
Thematic Coding
Based on the research of Marwell and Schmitt (1967, 1990)
Marwell, G. & Schmitt, D.R. (1967). Dimensions of compliance-gaining behavior: An
empirical analysis. Sociometry, 30, 350-364.
Marwell, G. & Schmitt, D.R. (1990). An introduction. In J.P. Dillard (Ed.), Seeking
compliance: The production of interpersonal influence messages. Scottsdale, AZ:
Gorsuch Scarisbrick, pp. 3-5.
Hypothetical situation: Imagine that your teen-age son, Nick, who is a high school
student, has been getting poor grades. You want him to increase the amount of time
he spends studying from 6 to 12 hours a week.
Task: Try to describe and classify the following compliance-gaining strategies:
Example / Description of Strategy / Strategy
1. "You offer to increase Nick's allowance if he increases his studying."
   Description of Strategy: If you comply, I will reward you
   Strategy: Promise/ Reward
2. "You threaten to forbid Nick watching TV if he does not increase his studying."
3. "You point out to Nick that if he gets good grades he will be able to get into a
   university and get a good job."
4. "You point out to Nick that if he does not get good grades he will not be able to get
   into a university or get a good job."
5. "You try to be as friendly and pleasant as possible to get Nick in the 'right frame of
   mind' before asking him to study."
6. "You raise Nick's allowance and tell him you now expect him to study."
7. "You forbid Nick to watch TV and tell him he will not be allowed to watch his
   favourite programs until he studies more."
8. "You point out that you have sacrificed and saved to pay for Nick's education and
   that he owes it to you to get good enough grades to get into a good university."
9. "You tell Nick that it is morally wrong for anyone not to get as good grades as he
   can and that he should study more."
10. "You tell Nick he will feel proud if he gets himself to study more."
11. "You tell Nick he will feel ashamed of himself if he gets bad grades."
12. "You tell Nick that since he is a mature and intelligent boy he naturally will want
    to study more and get good grades."
13. "You tell Nick that only someone very childish does not study as he should."
14. "You tell Nick that you really want very badly for him to get into a university and
    that you wish he would study more as a personal favor to you."
15. "You tell Nick that the whole family will be very proud of him if he gets good grades."
16. "[...] very disappointed (in him) if he gets poor grades."
Exercise 4
Task: Using your classification or Marwell and Schmitt's classification (see
Solutions to Exercises), code the following examples (the topic now is
"Divorce").
Example of Compliance-Gaining Strategy / Strategy
1. "You'll see. You'll be a lot better off without me; you'll feel a lot better after
   the divorce."
2. "Only a cruel and selfish neurotic could stand in the way of another's
   happiness."
3. "If you don't give me a divorce, you'll never see the kids again."
4. "Only a selfish creep would force another person to stay in a relationship."
5. "You'll hate yourself if you don't give me this divorce."
6. "Any intelligent person would grant their partner a divorce when the
   relationship had died."
Coding Missing Data

The codes for missing data should be different from a valid code (a code which
represents an actual answer to the question). If available, -1, 0, or 9 are usually
used.
There are different reasons why people do not provide answers to questions.
Therefore, different codes are often given to different types of missing data.
Main types of non-response to questions:
The respondent was not required to answer the question;
Not ascertained: maybe the interviewer missed the question, or the respondent
missed the question, or it was not clear what someone's answer was;
The respondent refused to answer;
The respondent did not know the answer or did not have an opinion.
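A minimal sketch of using a separate code for each type of non-response, and of keeping those codes out of the calculations, is given below. The particular code values (-1 to -4), the labels and the responses are assumptions made for illustration only.

```python
# Hypothetical missing-data codes for a question answered on a 1-5 scale
MISSING_CODES = {
    -1: "not required to answer",
    -2: "not ascertained",
    -3: "refused",
    -4: "don't know / no opinion",
}

responses = [4, 5, -3, 2, -1, 3, -4, 1]

# Separate valid answers from missing ones before any calculation
valid = [r for r in responses if r not in MISSING_CODES]
missing_reasons = [MISSING_CODES[r] for r in responses if r in MISSING_CODES]

mean_valid = sum(valid) / len(valid)
print(mean_valid)        # mean of the valid answers only
print(missing_reasons)   # why each of the other answers is missing
```

If the missing codes were left in, the mean would be pulled down by values that are not answers at all.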
Checking for Coding Errors

Sources of Error:
Data was entered in the wrong columns for some cases;
Miscoding happened during
- data collection phase;
- manual coding of answers;
- data entry phase.
Methods for Checking for Coding Errors:

Valid Range Checks
Obtaining frequency distributions of all variables and checking whether all codes
are within the expected range.
Filter Checks
If contingency questions are asked, some questions should only be answered by
certain people depending on how they answered a previous question (e.g.,
someone who recorded they had no paid job should not answer questions about
their job satisfaction). Invalid responses can be detected by cross-tabulating the
paid job answers with the job satisfaction answers.
Logical Checks
Certain sets of responses will be illogical (e.g., if someone's age is coded as 16 it
seems illogical if that person's highest level of education is coded as PhD).
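The three kinds of check can be sketched on a small set of records. Everything here is hypothetical: the variable names, the use of -1 for "not required to answer", and the age threshold used in the logical check.

```python
# Hypothetical records; job_satisfaction is coded 1-5, with -1 for "not required to answer"
records = [
    {"id": 1, "age": 34, "education": "Bachelor", "paid_job": "yes", "job_satisfaction": 4},
    {"id": 2, "age": 16, "education": "PhD", "paid_job": "no", "job_satisfaction": -1},
    {"id": 3, "age": 51, "education": "Diploma", "paid_job": "no", "job_satisfaction": 3},
    {"id": 4, "age": 29, "education": "Bachelor", "paid_job": "yes", "job_satisfaction": 7},
]

VALID_SATISFACTION = {1, 2, 3, 4, 5, -1}

# Valid range check: codes outside the expected set
range_errors = [r["id"] for r in records
                if r["job_satisfaction"] not in VALID_SATISFACTION]

# Filter check: people without a paid job should not have a satisfaction score
filter_errors = [r["id"] for r in records
                 if r["paid_job"] == "no" and r["job_satisfaction"] != -1]

# Logical check: an implausible combination of age and education
logical_errors = [r["id"] for r in records
                  if r["age"] < 20 and r["education"] == "PhD"]

print(range_errors, filter_errors, logical_errors)  # [4] [3] [2]
```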
Preparing Variables for Analysis

Changing Categories
Decisions about the number of response categories are usually made when
constructing the questionnaire and when post-coding open-ended data. However,
sometimes you will need to refine these codes.
Collapsing Categories
Collapsing categories is used when the initial coding of a variable resulted in more
categories than we require. The advantage of the initial detail, though, is that it
provides the flexibility to enable us to collapse the categories in a variety of different
ways.
Reasons for collapsing categories of variables:
The detailed coding may not reflect the form of the variable which is relevant to
the research problem (e.g., we might recode detailed occupational codes into
blue-collar and white-collar categories).
If there are very few people in a category it is often better to combine the
category with another suitable category because very low frequencies can
produce misleading tables and statistics.
Collapsing categories can highlight patterns in the data.
There should always be a sound justification for collapsing categories. Care should
be taken not to combine the categories in such a way as to mask a relationship as is
shown in the Table:
An illustration of how recoding can mask a relationship (de Vaus, p. 165)

Unrecoded version
                     Male    Female
Strongly agree       50%     15%
Agree                10%     45%
Disagree             30%     5%
Strongly disagree    10%     35%
N                    500     500

Recoded version
                     Male    Female
Agree                60%     60%
Disagree             40%     40%
N                    500     500
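The masking effect in the table above can be reproduced by collapsing the four categories into two. The sketch below uses the percentages from the table; the variable names are illustrative.

```python
# Percentages from the unrecoded version of the table
unrecoded = {
    "Male": {"Strongly agree": 50, "Agree": 10, "Disagree": 30, "Strongly disagree": 10},
    "Female": {"Strongly agree": 15, "Agree": 45, "Disagree": 5, "Strongly disagree": 35},
}

# Collapse four categories into two
recode = {
    "Strongly agree": "Agree",
    "Agree": "Agree",
    "Disagree": "Disagree",
    "Strongly disagree": "Disagree",
}

recoded = {}
for sex, distribution in unrecoded.items():
    collapsed = {"Agree": 0, "Disagree": 0}
    for category, percent in distribution.items():
        collapsed[recode[category]] += percent
    recoded[sex] = collapsed

print(recoded)
```

After recoding, males and females have identical distributions (60% and 40%), even though their original answers differed markedly in strength.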
Approaches to Collapsing Categories:

Substantive Approach
Distributional approach
The Substantive Approach
This approach involves combining categories that have something in common.
For example, occupations could be collapsed into industry-based categories (e.g.,
health, transport, agriculture, construction etc.). Or occupations could be classified
according to the amount of training involved: occupations which require a degree are
put in one category, occupations requiring a diploma in another category.
With ordinal and interval variables which have ranked categories, collapsing is done
by establishing cutting points along a continuum. For example, we might divide a
nine-point scale (1-9) into three groups so that approximately the same number of
codes are contained in each category.
The Distributional Approach
This approach is restricted to ordinal and interval variables.
The meaning of a particular response to a question is sometimes better interpreted
in relative than in absolute terms. For example, is a person's income of $30 000
regarded as low, medium or high? It depends on the other incomes with which it is
compared. If most people earn less, then it is relatively high; if most people earn
more, then it is relatively low. This approach has the advantage of letting the data
define what is low, medium or high.
The distributional approach involves dividing the sample up into roughly equal sized
groups of cases. The substantive approach involves dividing the categories of the
variable into equal lots.
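The difference between the two approaches can be illustrated with a small, skewed set of incomes. The data, the 0 to 90 range and the function names are hypothetical; equal-width bands stand in for "equal lots" of the variable's values.

```python
from collections import Counter

incomes = [12, 15, 18, 20, 22, 25, 30, 45, 90]  # hypothetical, in $000


def equal_width_band(x, low=0, high=90):
    """Divide the range of values into three equal lots."""
    width = (high - low) / 3
    if x < low + width:
        return "low"
    if x < low + 2 * width:
        return "medium"
    return "high"


# Distributional approach: three roughly equal-sized groups of cases
ranked = sorted(incomes)
size = len(ranked) // 3
cut1, cut2 = ranked[size - 1], ranked[2 * size - 1]


def tercile_group(x):
    if x <= cut1:
        return "low"
    if x <= cut2:
        return "medium"
    return "high"


by_values = Counter(equal_width_band(x) for x in incomes)
by_cases = Counter(tercile_group(x) for x in incomes)
print(by_values)  # group sizes differ widely
print(by_cases)   # group sizes are equal
```

With skewed data the equal-width bands put most cases in the lowest group, while the distributional cut points let the data define what counts as low, medium or high.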
Rearranging categories
Rearranging involves placing the categories in a more logical order.
Reasons for rearranging categories:
Creating an order more appropriate to the focus of the analysis;
Making tables easier to read;
Changing the level of measurement of a variable and thus affecting the methods
of analysis that can be applied to the variable.
Imagine we have a variable indicating the industry in which a person works. Table 2
shows the initial order of industry categories. Suppose we want to perform analysis
that is focusing on unionization in the workplace and its impact on job satisfaction.
For this analysis it might be better to organize the industry categories according to
the level of unionization of the industry. This would provide a logical order to the
categories and make it easier to read tables later on. The table shows the revised
version in which the categories of the variable are rearranged in order to reflect the
unionization of the industry.
Table 2: Rearranging categories into a logical order appropriate to project
a) Original version (rows in original code order)
Industry                             % in unions
Agriculture, forestry and fishing    15
Mining                               54
Manufacturing                        40
Electricity, gas and water           59
Construction                         37
Wholesale and retail                 18
b) Revised version (rows in new code order)
Industry                             % in unions
Agriculture, forestry and fishing    15
Wholesale and retail                 18
Construction                         37
Manufacturing                        40
Mining                               54
Electricity, gas and water           59
Reverse coding
Reverse coding is mostly used when constructing scales. A scale is a composite
measure of a concept that is created by asking respondents a set of questions and
then combining answers to those questions into a single composite measure of the
underlying concept.
Each of the variables that constitute the composite measure should be scored in the
same direction. However, when constructing items for a scale it is normal to mix up
the direction of the statements to which people respond: some will be positive and
some will be negative. If we want to combine variables that are coded in different
directions we need to reverse code some variables so that they are all coded in the
same direction.
Suppose a person was asked to complete the following questionnaire (Vulnerability
Facet of Neuroticism Scale, Costa & McCrae, 1992):
Question (response options: Strongly disagree / Disagree / Neutral / Agree / Strongly agree)
1. I often feel helpless and want someone else to solve my problems.
2. I feel I am capable of coping with most of my problems. (R)
3. When I am under a great deal of stress, sometimes I feel like I'm going to pieces.
4. I keep a cool head in emergencies. (R)
5. It's often hard for me to make up my mind.
6. I can handle myself pretty well in a crisis. (R)
7. When everything seems to be going wrong, I can still make good decisions. (R)
8. I'm pretty stable emotionally. (R)
To calculate the person's Vulnerability score we need to add up all item scores:
Vulnerability Score = Q1 score + Q2 score + Q3 score + Q4 score + Q5 score + Q6
score + Q7 score + Q8 score.
Before doing this we would need to reverse code some items (Q2, Q4, Q6, Q7, and
Q8).
Vulnerability Score = Q1 + Q2 + Q3 + Q4 + Q5 + Q6 + Q7 + Q8 = 2 + 1 + 1 + 2 + 3
+ 2 + 2 + 2 = 15
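Reverse coding can be sketched as follows. The response coding (1 = strongly disagree to 5 = strongly agree) is an assumption, and the raw answers are hypothetical, so the total does not reproduce the worked score above.

```python
REVERSED_ITEMS = {"Q2", "Q4", "Q6", "Q7", "Q8"}  # items marked (R)
SCALE_MIN, SCALE_MAX = 1, 5                      # assumed response coding


def reverse_code(score):
    """Flip a score so that 1 becomes 5, 2 becomes 4, and so on."""
    return SCALE_MIN + SCALE_MAX - score


# Hypothetical raw answers from one respondent
raw = {"Q1": 2, "Q2": 4, "Q3": 3, "Q4": 5, "Q5": 2, "Q6": 4, "Q7": 3, "Q8": 4}

recoded = {q: reverse_code(s) if q in REVERSED_ITEMS else s for q, s in raw.items()}
vulnerability_score = sum(recoded.values())
print(recoded)
print(vulnerability_score)  # 17
```

After recoding, a high score on every item indicates high vulnerability, so the items can be summed.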
Creating New Variables

New variables can be created from existing ones by using information from a set of
questions. This is done in one of three ways:
1. Developing scales;
2. Using conditional transformations;
3. Using arithmetic transformations.
Developing scales was discussed in Topic 5 of this course.
Conditional transformations
Conditional transformation involves specifying a new variable and its categories and
then specifying the conditions a person must meet to be placed in a given category.
Example (de Vaus, p.169)
Suppose that in a study of marriages we want to create a variable that reflects the
marital history of both husband and wife. We would create three categories:
1) first-timer marriage; 2) mixture; 3) both previously married.
Conditional transformations are performed in most computer packages by using IF
statements.
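The marital history example can be written with IF statements as below. The function name and the two true/false input variables are hypothetical; the three category codes follow the example.

```python
def marital_history(husband_previously_married, wife_previously_married):
    """Return 1 = first-timer marriage, 2 = mixture, 3 = both previously married."""
    if not husband_previously_married and not wife_previously_married:
        return 1
    if husband_previously_married and wife_previously_married:
        return 3
    return 2


# Hypothetical couples: (husband previously married, wife previously married)
couples = [(False, False), (True, False), (False, True), (True, True)]
codes = [marital_history(h, w) for h, w in couples]
print(codes)  # [1, 2, 2, 3]
```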
Arithmetic transformations
Arithmetic transformations are used for interval level variables. New variables can be
created by various arithmetic computations.
Suppose we want to study whether the age difference between a husband and a wife
affects the degree of equality in their marriage. We can construct a new variable by
subtracting the wife's age from the husband's age to indicate the age difference.
Suppose we obtained information about respondents' annual income but for our
study we need to know their fortnightly income. This can be achieved by dividing
annual income by 26 (the number of fortnights in a year) to construct a new variable
indicating fortnightly income.
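Both arithmetic transformations are one-line computations. The sketch below uses hypothetical couples and variable names.

```python
# Hypothetical cases
couples = [
    {"husband_age": 40, "wife_age": 37, "annual_income": 78000},
    {"husband_age": 29, "wife_age": 31, "annual_income": 52000},
]

for couple in couples:
    # Age difference: husband's age minus wife's age (negative if the wife is older)
    couple["age_difference"] = couple["husband_age"] - couple["wife_age"]
    # Fortnightly income: annual income divided by 26 fortnights
    couple["fortnightly_income"] = couple["annual_income"] / 26

print([(c["age_difference"], c["fortnightly_income"]) for c in couples])
# [(3, 3000.0), (-2, 2000.0)]
```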
Standardizing Variables
Sometimes we are interested not in the exact scores people have on a variable but
their scores relative to other people in the sample. In this case we need to
standardize variables.
Some situations when standardization may be required:
1. Comparing and combining scores on variables with very different distributions;
2. Comparative studies where units of measurement (e.g., income) are
incomparable;
3. Change over time where the value of units changes over time (e.g., income
changes with inflation) so adjustments need to be made to express income in
some common unit that removes the effect of inflation. (de Vaus, p.171)
For interval-level variables raw scores are usually converted into z-scores. For
ordinal-level variables scores are usually converted into percentiles.
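A minimal sketch of both conversions is shown below. The scores are hypothetical, the z-scores use the sample standard deviation, and the percentile here is defined as the percentage of cases at or below a value (other definitions exist).

```python
import statistics

scores = [10, 12, 14, 16, 18]  # hypothetical raw scores

# z-score: distance from the mean in standard deviation units
mean = statistics.mean(scores)
sd = statistics.stdev(scores)
z_scores = [(x - mean) / sd for x in scores]


def percentile_rank(x, values):
    """Percentage of cases with a value at or below x."""
    return 100 * sum(v <= x for v in values) / len(values)


print([round(z, 2) for z in z_scores])
print(percentile_rank(14, scores))  # 60.0
```

Standardised scores have a mean of 0 and a standard deviation of 1, which is what makes variables with different units comparable.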
Dealing with Missing Data

Checking for Missing Data Bias
Sometimes people for whom we have missing values on a variable can be different
from those with valid values. For example, those who skipped questions about
income may have other characteristics such as ethnic background or education level
in common. Because of this the results of the analysis could be biased because
some types of people are under-represented in the analysis of that variable.
To assess whether missing data introduce bias, divide the sample into two groups:
those with missing values and those without missing values on a particular variable.
Then use cross-tabulation or comparison of means to investigate whether the two
groups answered other questions differently.
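The comparison of means described above can be sketched as follows. The cases and variable names are hypothetical, and with real data a significance test would normally accompany the comparison.

```python
import statistics

# Hypothetical cases: income is None when the respondent skipped the question
cases = [
    {"income": 52000, "years_education": 15},
    {"income": None, "years_education": 10},
    {"income": 61000, "years_education": 16},
    {"income": None, "years_education": 11},
    {"income": 48000, "years_education": 14},
    {"income": None, "years_education": 12},
]

# Split the sample on whether income is missing, then compare another variable
missing_group = [c["years_education"] for c in cases if c["income"] is None]
valid_group = [c["years_education"] for c in cases if c["income"] is not None]

mean_missing = statistics.mean(missing_group)
mean_valid = statistics.mean(valid_group)
difference = mean_valid - mean_missing
print(mean_missing, mean_valid, difference)  # 11 15 4
```

A large difference suggests that those who skipped the income question differ systematically from those who answered it.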
Methods for Dealing with Missing Data

Missing values reduce the number of cases available for analysis.
Approaches for dealing with missing data:
1. Deleting either cases or variables from the analysis;
2. Statistical imputation: substituting the missing values with a new, best-guess
value.
1. Deleting either cases or variables

List-wise deletion of cases;
Pair-wise deletion of cases;
Deletion of variables
List-wise deletion
Using this approach, any case that has missing data on any of the set of variables is
deleted.
Problem with this approach:
It can lead to the loss of a lot of data and reduction in sample size. Valid answers on
many questions will be lost because of a non-answer on one question.
Pair-wise deletion
In this approach, instead of deleting all cases with any missing value, the
researcher uses only the cases with complete responses for each calculation. For
example, in the case of correlations, the correlations between each pair of variables
are calculated from all cases having valid data for those two variables even if those
cases have missing values on other variables. As a result, different calculations in an
analysis may be based on different sample sizes.
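The effect of the two deletion approaches on sample size can be sketched with a small hypothetical data set in which None marks a missing value.

```python
# Hypothetical cases with three variables; None marks a missing value
cases = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": None, "c": 1},
    {"a": 3, "b": 4, "c": None},
    {"a": None, "b": 5, "c": 2},
    {"a": 5, "b": 6, "c": 4},
]
variables = ["a", "b", "c"]

# List-wise deletion: keep only cases that are complete on every variable
listwise = [c for c in cases if all(c[v] is not None for v in variables)]


def pairwise(x, y):
    """Pair-wise deletion: cases with valid data on the two variables involved."""
    return [(c[x], c[y]) for c in cases if c[x] is not None and c[y] is not None]


n_listwise = len(listwise)
n_pairs = {
    (x, y): len(pairwise(x, y))
    for i, x in enumerate(variables)
    for y in variables[i + 1:]
}
print(n_listwise)  # 2
print(n_pairs)     # 3 cases for each pair of variables
```

Each correlation under pair-wise deletion is based on three cases, but not the same three cases, whereas list-wise deletion leaves only two cases for every calculation.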
Deletion of variables
If a particular variable has a large number of missing values, that variable can be
omitted from the analysis.
Advantage: You do not lose any cases.
The advisability of this approach depends on how important that particular variable is
for the analysis.
2. Statistical imputation:
Substituting the missing values with a new, best guess value
Sample mean approach;
Group means approach;
Random assignment within groups;
Regression analysis
Sample mean approach

This approach involves replacing missing values with the value of the mean of that
variable.
It reduces the variability of the sample on the variable and hence reduces the
correlation between this and other variables.
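The loss of variability can be seen in a small sketch. The incomes are hypothetical and None marks a missing value.

```python
import statistics

incomes = [30, 40, None, 50, None, 60]  # hypothetical, in $000

observed = [x for x in incomes if x is not None]
sample_mean = statistics.mean(observed)

# Replace every missing value with the mean of the observed values
imputed = [sample_mean if x is None else x for x in incomes]

sd_observed = statistics.stdev(observed)
sd_imputed = statistics.stdev(imputed)
print(sample_mean)                                   # 45
print(round(sd_observed, 2), round(sd_imputed, 2))   # 12.91 10.0
```

The mean is unchanged, but the standard deviation shrinks because the imputed cases all sit exactly at the mean.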
Group means approach
This approach involves using group means rather than the overall sample mean.
- Divide the sample into groups on a background variable;
- Calculate the mean for the missing data variable within each category of the
background variable;
- Replace missing values with the corresponding group mean.

It exaggerates the extent to which people in a group are similar to one another.
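The group means steps can be sketched as follows, using education as a hypothetical background variable and None for missing incomes.

```python
import statistics
from collections import defaultdict

# Hypothetical cases; income in $000
cases = [
    {"education": "degree", "income": 70},
    {"education": "degree", "income": 90},
    {"education": "degree", "income": None},
    {"education": "school", "income": 40},
    {"education": "school", "income": 50},
    {"education": "school", "income": None},
]

# Collect the valid incomes within each category of the background variable
by_group = defaultdict(list)
for case in cases:
    if case["income"] is not None:
        by_group[case["education"]].append(case["income"])

group_means = {group: statistics.mean(values) for group, values in by_group.items()}

# Replace each missing value with the mean of that case's own group
imputed = [
    case["income"] if case["income"] is not None else group_means[case["education"]]
    for case in cases
]
print(group_means)  # {'degree': 80, 'school': 45}
print(imputed)      # [70, 90, 80, 40, 50, 45]
```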
Random assignment within groups
- Divide the sample into groups on a background variable;
- Locate a case with missing data on a particular variable;
- Find the value on the same variable of the nearest preceding case with a valid
code;
- Substitute this value for the missing value.
Advantage of this method:

Missing values are replaced by a variety of different values. Hence the variability of
the sample is not affected.
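A minimal sketch of these steps is given below. It assumes that "nearest preceding case" means the most recent case in file order within the same group, and that only genuinely valid values are used as donors; the data are hypothetical.

```python
# Hypothetical cases in file order; income in $000, None marks a missing value
cases = [
    {"education": "degree", "income": 70},
    {"education": "school", "income": 40},
    {"education": "degree", "income": None},
    {"education": "school", "income": 50},
    {"education": "school", "income": None},
    {"education": "degree", "income": 90},
]

last_valid = {}  # most recent valid value seen in each group
imputed = []
for case in cases:
    value = case["income"]
    if value is None:
        # Borrow from the nearest preceding valid case in the same group
        # (stays None if the group has no preceding valid case)
        value = last_valid.get(case["education"])
    else:
        last_valid[case["education"]] = value
    imputed.append(value)

print(imputed)  # [70, 40, 70, 50, 50, 90]
```

Because the order of cases in the file is effectively arbitrary, the donor value is a random pick from the group rather than its mean.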
Regression analysis
This method involves using regression to predict the values of missing data.
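As an illustration only, the sketch below fits a one-predictor least squares line on the complete cases and uses it to predict a missing income from years of education. The data and variable names are hypothetical, and real applications usually use several predictors.

```python
# Hypothetical cases: (years of education, income in $000); None marks a missing income
cases = [(10, 40), (12, 50), (14, 60), (16, 70), (13, None)]

complete = [(x, y) for x, y in cases if y is not None]
n = len(complete)
mean_x = sum(x for x, _ in complete) / n
mean_y = sum(y for _, y in complete) / n

# Ordinary least squares slope and intercept for a single predictor
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in complete)
    / sum((x - mean_x) ** 2 for x, _ in complete)
)
intercept = mean_y - slope * mean_x

# Replace each missing income with the value predicted by the regression line
imputed = [(x, y if y is not None else intercept + slope * x) for x, y in cases]
print(slope, intercept)
print(imputed[-1])
```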
Researchers tend to have mixed feelings about replacing missing values. If you
decide to use imputation, you should consider analysing the data both with and
without the missing values replaced and then comparing the results to make sure that
the method of replacement does not lead to a different interpretation of the data than
you would have come to otherwise.
Bibliography
Costa, P.T.Jr., & McCrae, R.R. (1992). NEO-PI-R professional manual. Odessa, FL:
Psychological Assessment Resources.
Marwell, G. & Schmitt, D.R. (1967). Dimensions of compliance-gaining behavior: An
empirical analysis. Sociometry, 30, 350-364.
Marwell, G. & Schmitt, D.R. (1990). An introduction. In J.P. Dillard (Ed.), Seeking
compliance: The production of interpersonal influence messages. Scottsdale, AZ:
Gorsuch Scarisbrick, pp. 3-5.
Solutions to Selected Exercises

Exercise 1
The following question was asked of respondents in a survey about mobile phones:
Should children be allowed to bring mobile phones to school? Please give reasons.
A list of 10 people's answers to this question is presented below. Create a code-list
for this question.
ID | Answer given by respondent | Code(s)
1 | No, mobiles can be distracting in class |
2 | Yes, of course. Mobile phones keep children safer. They can call their parents in case of an emergency. |
3 | No, children shouldn't be allowed to use mobile phones at all. There are possible health risks from using mobile phones. Some research suggests that the radio waves from mobile phones may harm people's brains. |
4 | Yes, in an emergency, kids can call for help quickly. |
5 | Children shouldn't be allowed to bring mobile phones to school. There have been many cases of students using mobiles to cheat in tests. |
6 | No, kids will be texting, playing games etc. instead of doing class work. |
7 | Mobile phones shouldn't be used in schools. They take students' attention away from their lessons. |
8 | No, mobile phones are a distraction from school work. |
9 | No, mobile phones are too expensive for children. Even if some models are cheap to buy, calls are expensive. Many kids run up big bills their parents have to pay. |
10 | Yes, why not. Mobile phones are now a normal part of modern life. |
Code list:
1. Yes, keep children safe
2. Yes, normal part of life
3. No, distracting in class
4. No, health risks
5. No, cheating in class
6. No, too expensive
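Once a code list exists, it can be stored as a lookup table and the coded answers tallied. In the sketch below the code list is the one above, but the assignment of codes to the ten respondents is one possible reading of their answers and is not given in the notes. It is included only to show how coded answers are counted.

```python
from collections import Counter

# The code list from the solution above.
code_list = {
    1: "Yes, keep children safe",
    2: "Yes, normal part of life",
    3: "No, distracting in class",
    4: "No, health risks",
    5: "No, cheating in class",
    6: "No, too expensive",
}

# Illustrative assignment (respondent ID -> list of codes). This is an
# assumption, not part of the notes; an answer may receive several codes.
assigned_codes = {
    1: [3], 2: [1], 3: [4], 4: [1], 5: [5],
    6: [3], 7: [3], 8: [3], 9: [6], 10: [2],
}

# Frequency of each code across all respondents.
code_counts = Counter(code for codes in assigned_codes.values() for code in codes)
for code, label in code_list.items():
    print(code, label, code_counts[code])
```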
Exercise 3
Try to describe the following compliance-gaining strategies.
Marwell and Schmitt classified the strategies as follows:
(You may have a different classification system.)
No. | Example of Compliance-Gaining Strategy | Description of Strategy | Strategy
1 | "You offer to increase Nick's allowance if he increases his studying." | If you comply, I will reward you | Promise/Reward
2 | "You threaten to forbid Nick watching TV if he does not increase his studying." | If you do not comply I will punish you | Threat
3 | "You point out to Nick that if he gets good grades he will be able to get into a university and get a good job." | If you comply you will be rewarded because of "the nature of things" | Expertise (Positive)
4 | "You point out to Nick that if he does not get good grades he will not be able to get into a university or get a good job." | If you do not comply you will be punished because of "the nature of things" | Expertise (Negative)
5 | "You try to be as friendly and pleasant as possible to get Nick in the right frame of mind before asking him to study." | Actor is friendly and helpful to get target in "good frame of mind" so that he will comply with request | Liking
6 | "You raise Nick's allowance and tell him you now expect him to study." | Actor rewards target before requesting compliance | Pre-Giving
7 | "You forbid Nick to watch TV and tell him he will not be allowed to watch his favourite programs until he studies more." | Actor continuously punishes target making cessation contingent on compliance | Aversive Stimulation
8 | "You point out that you have sacrificed and saved to pay for Nick's education and that he owes it to you to get good enough grades to get into a good university." | You owe me compliance because of past favors | Debt
9 | "You tell Nick that it is morally wrong for anyone not to get as good grades as he can and that he should study more." | You are immoral if you do not comply | Moral Appeal
10 | "You tell Nick he will feel proud if he gets himself to study more." | You will feel better about yourself if you comply | Self-Feeling (Positive)
11 | "You tell Nick he will feel ashamed of himself if he gets bad grades." | You will feel worse about yourself if you do not comply | Self-Feeling (Negative)
12 | "You tell Nick that since he is a mature and intelligent boy he naturally will want to study more and get good grades." | A person with good qualities would comply | Altercasting (Positive)
13 | "You tell Nick that only someone very childish does not study as he should." | Only a person with "bad" qualities would not comply | Altercasting (Negative)
14 | "You tell Nick that you really want very badly for him to get into a university and that you wish he would study more as a personal favor to you." | I need your compliance very badly, so do it for me | Altruism
15 | "... very proud of him if he gets good grades." | People you value will think better of you if you comply | Esteem (Positive)
16 | "... very disappointed (in him) if he gets poor grades." | People you value will think worse of you if you do not comply | Esteem (Negative)