
Chapter 1: Introduction

What is Statistics?
From the Latin root status, meaning "state".
In Sweden, the Tabellverket (Office of Tables) began recording
births and deaths in 1749.

Why Statistics?
So much data! Any information? Knowledge?
What? Where? When? Who? Why? How?

Statistics is a collection of procedures and principles
for gathering data and analyzing information to help
people make decisions when faced with uncertainty.

SQQS1013: MZ_L1 1
Statistics

Theoretical Statistics
Development, derivation and proof of theorems,
formulas, rules and laws.

Applied Statistics
Applications of those theorems, formulas, rules and
laws to solve real problems.

Descriptive Statistics
Methods for collecting, organizing, analyzing and
summarizing data.

Inferential Statistics
Methods that use results obtained from a sample to derive
conclusions about a population.

Descriptive vs. Inferential
Example 1:

Which of the following statements is descriptive in nature
and which is inferential?

1. Of 1000 people surveyed in Uganda, 50% are
below 15 years old.
2. Half of the people in Uganda are under 15 years old.

Population vs. Sample
Population
a collection of all individuals about which information is
desired.
finite population (limited): e.g. the books in a library.
infinite population (unlimited): e.g. the population of all
people who might use Panadol.

Sample
a subset of the population.
e.g. a group of students taking Elementary Statistics.

Parameter vs. Statistic
Parameter
a numerical value summarizing all the data of an entire
population.
- A Greek letter is often used to symbolize a parameter
(e.g. μ for a population mean).
e.g. The average age at time of admission for all students
who have ever attended our college.

Statistic
a numerical value summarizing the sample data.
- A letter of the English alphabet is used to symbolize a statistic
(e.g. x̄ for a sample mean).
e.g. The average height computed from a sample of 25
students.
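
The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of admission ages is randomly generated (hypothetical, not real college data), and the names `population`, `mu` and `x_bar` are chosen here for the example.

```python
import random
import statistics

random.seed(1013)  # fixed seed so the sketch is repeatable

# Hypothetical population: admission ages of every student who ever attended.
population = [random.randint(17, 30) for _ in range(5000)]

# Parameter: summarizes the ENTIRE population (usually unknown in practice).
mu = statistics.mean(population)

# Statistic: summarizes only a sample, here 25 students drawn at random.
sample = random.sample(population, 25)
x_bar = statistics.mean(sample)

print(f"parameter mu    = {mu:.2f}")
print(f"statistic x_bar = {x_bar:.2f}")
```

Drawing a different sample would generally give a different value of the statistic, while the parameter stays fixed for a given population.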
Variable, Data value
Variable
a characteristic of interest about each individual element of
a population or sample.
e.g. students' age at entrance into college,
colour of students' hair, etc.

Data value
the value of the variable associated with one element of a
population or sample.
It may be a number, a word, or a symbol.
e.g. Farah entered college at age 23; her hair is
brown, etc.

Basic Terms
Example 2:

A student is interested in finding out the average ringgit
value of cars owned by the faculty members.
i) population: the collection of all cars owned by all
faculty members at our university.
ii) sample: any subset of that population, e.g., the
cars owned by members of the Statistics department.
iii) variable: the ringgit value of each individual car.

iv) data value: the ringgit value of a particular car, e.g. Ali's
car is valued at RM 45,000.
v) data: the set of values that correspond to the
sample obtained (45,000; 55,000; 34,000; ...).
vi) parameter: the average value of all cars in the
population.
vii) statistic: the average value of the cars in the sample.
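
As a rough illustration of item vii), the statistic can be computed from the three data values shown in the example. The source list is cut off after these three values, so the sketch uses only them; the variable names are hypothetical.

```python
from statistics import mean

# Data: the three sample values (in RM) listed in Example 2.
# The original list is truncated, so only these three are used here.
sample_values_rm = [45_000, 55_000, 34_000]

# Statistic: the average value of the cars in the sample.
sample_mean_rm = mean(sample_values_rm)

print(f"sample mean = RM {sample_mean_rm:,.2f}")  # prints: sample mean = RM 44,666.67
```

The parameter (the average over all faculty cars) cannot be computed this way unless every car in the population is valued.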

Census: a survey which includes every element in the
population.
Sample survey: a survey which includes every element in the
selected sample only.
Types of Variables

Quantitative Variables
Discrete Variable
Assumes a countable number of values.
Its values correspond to isolated points along a line
interval; there is a gap between any two values.
Example 3:
Number of courses for which you are currently
registered.

Continuous Variable
Assumes an uncountable number of values.
It can take any value along a line interval, including every
possible value between any two values.
Example 4:
Weight of books and supplies you carry as you
attend classes today.

Qualitative Variables
Attribute, categorical variable
A variable that describes or categorizes an element of
a population.

Example 5:
A sample of four hair salon customers was surveyed
for their hair color, hometown and level of
satisfaction.

Exercise 1
1. Of the U.S. adult population, 36% has an
allergy. A sample of 1200 randomly selected
adults resulted in 33.2% reporting an allergy.
Describe the population.

What is the sample?

Describe the variable.

Identify the statistic and give its value.

Identify the parameter and give its value.

2. The faculty members at Universiti Utara Malaysia
were surveyed on the question "How satisfied are
you with this semester's schedule?"
Their responses were categorized as "very satisfied",
"somewhat satisfied", "neither satisfied nor
dissatisfied", "somewhat dissatisfied", or "very
dissatisfied".
Name the variable of interest.

Identify the type of variable.

3. Identify each of the following as an example of (1)
attribute (qualitative) or (2) numerical (quantitative)
variables.

The number of stop signs in towns of fewer than 500
people.

Whether or not a camera is defective.

The number of questions answered correctly on a
standardized test.

The length of time required to answer a telephone call at
a certain real estate office.

Common types of scales
Data also can be classified by how they are
categorized, counted or measured.
Nominal - categories only, unordered.
Ordinal - categories with some order.
Interval - differences but no natural starting point.
Ratio - differences and natural starting point.
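
The four scales can be summarized as a small lookup table. The sketch below is one possible encoding, not a standard API: the `Scale` class and its three field names are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    ordered: bool      # can the categories or values be ranked?
    differences: bool  # are differences between values meaningful?
    ratios: bool       # is there a natural zero, so ratios are meaningful?


SCALES = {
    "nominal":  Scale(ordered=False, differences=False, ratios=False),
    "ordinal":  Scale(ordered=True,  differences=False, ratios=False),
    "interval": Scale(ordered=True,  differences=True,  ratios=False),
    "ratio":    Scale(ordered=True,  differences=True,  ratios=True),
}

for name, s in SCALES.items():
    print(f"{name:<8} ordered={s.ordered!s:<5} "
          f"differences={s.differences!s:<5} ratios={s.ratios}")
```

Each scale keeps every property of the one before it and adds one more, which is why the four are usually listed in this order.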

Nominal
A qualitative variable that characterizes (or
describes/names) an element of a population.
Arithmetic operations are not meaningful for such data.
Order or rank cannot be assigned to the categories.
Examples:
Survey responses: Yes, No.
Gender: Male, Female.

Ordinal
A qualitative variable that incorporates an ordered
position, or ranking.
Differences between data values either cannot be
determined or are meaningless.
Examples:
Level of satisfaction: very satisfied, satisfied,
somewhat satisfied.
Course grades: A, B, C, D, or F.
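
A minimal sketch of what an ordinal scale permits, using the course grades example. The list of grades is hypothetical, and the helper names `GRADE_ORDER` and `rank` are introduced here only for illustration.

```python
# Ordered categories, from lowest to highest. The order is the only
# structure an ordinal scale has; the gaps between grades are undefined.
GRADE_ORDER = ["F", "D", "C", "B", "A"]
rank = {grade: position for position, grade in enumerate(GRADE_ORDER)}

# Hypothetical grades for seven students.
grades = ["B", "A", "F", "C", "B", "D", "A"]

# Sorting and picking the middle value only need the ordering.
ranked = sorted(grades, key=rank.__getitem__)
median_grade = ranked[len(ranked) // 2]

print(ranked)        # ['F', 'D', 'C', 'B', 'B', 'A', 'A']
print(median_grade)  # B
```

Ranking and the median are meaningful here, but an expression such as "A minus B" is not, because the scale does not say how far apart the grades are.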

Interval
Involves a quantitative variable.
A scale where distances between data are
meaningful.
Differences make sense, but ratios do not
(e.g., 30°C - 20°C = 20°C - 10°C, but 20°C is
not twice as hot as 10°C).
No natural zero.

Examples:
The Celsius temperature scale gives interval data: 25°C is
warmer than 20°C, and a 5°C difference has
some physical meaning.
Note that 0°C is arbitrary, so it does not make
sense to say that 20°C is twice as hot as 10°C.
The year 0 is arbitrary, and it is not sensible to
say that the year 2000 is twice as old as the
year 1000.

Ratio
A scale in which both intervals between values and
ratios of values are meaningful.
A real zero point.
Examples:
- Temperature measured in kelvins is on a ratio
scale because it has a meaningful zero point
(absolute zero).
- Physical measurements of height, weight, length are
typically ratio variables. It is now meaningful to say
that 10 m is twice as long as 5 m. This is because
there is a natural zero.
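
The contrast between an interval scale (Celsius) and a ratio scale (Kelvin) can be checked numerically. The sketch below uses the standard conversion K = °C + 273.15; the function and variable names are chosen here for illustration.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to kelvins."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences agree on both scales: a 10-degree gap either way.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios do not agree: the Celsius ratio depends on an arbitrary zero.
ratio_c = warm_c / cool_c  # 2.0, but not physically meaningful
ratio_k = warm_k / cool_k  # roughly 1.035, measured from absolute zero

print(f"difference: {diff_c:.2f} °C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

Measured from absolute zero, 20°C is only about 3.5% hotter than 10°C, which is why "twice as hot" is not a sensible statement on the Celsius scale.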
Exercise 2
Classify each type of data:
a. Ratings of newscasts in Malaysia.
(poor, fair, good, excellent)
b. Temperature of automatic popcorn
poppers.
c. Marital status of respondents to a survey
on savings accounts.
d. Age of students enrolled in a martial arts
course.
e. Salaries of cashiers of C-Mart stores.
Primary & Secondary Data
Sir Arthur Conan Doyle's Sherlock Holmes once said that
"it is a capital mistake to theorize before one has data."
Primary data:
Data obtained specifically for a particular study
conducted by the researcher.

Secondary data:
Pre-existing data, second-hand.
Data obtained from materials published by
governmental, industrial or individual sources.

Secondary Data
Secondary data are often used before gathering
primary data:
What is already known?
Have my questions been answered earlier?
Considerations:
Availability
Quality, Reliability
Validity, Suitability
Cost

Primary Data
Data are collected by researchers for a specific study.
1. Surveys:
describing, recording, analyzing and interpreting conditions that
exist or existed by asking respondents.
i. Face-to-face interview
ii. Phone interview
iii. Questionnaire
2. Observations:
the information is sought through the investigator's own direct
observation, without asking respondents.
3. Experiments:
investigators manipulate a variable to study its effects on
respondents.

Face-to-face interview
Two-way communication.
The researcher puts questions directly to the respondent.

Advantages:
Precise answer.
Minimizes non-responses.
Allows for in-depth questioning.

Disadvantages:
Expensive.
Interviewer might influence respondents' responses.
Respondents may refuse to answer sensitive or personal
questions.

Phone Interview
Advantages:
Fast.
Less costly.
Wider respondent coverage.
Less interviewer bias than in a personal interview.

Disadvantages:
Information obtained might not represent the whole population.
Limited interview duration.
Not appropriate for long or contemplative questions.
Low response rate (unanswered calls).

Questionnaire
A set of questions designed to obtain information relevant to
a study.
Sent to respondents by postal service, by email or through a website.
Advantages:
Wider respondent coverage.
Respondents have enough time to answer questions.
Minimizes interviewer bias.
Cost effective.

Disadvantages:
One-way interaction.
Low response rate.
Not suitable for numerous and hard questions.
Time consuming (faster on internet).
Questionnaire may be answered by unqualified respondents.
Observation
Observing and measuring specific characteristics without
attempting to modify the subjects being studied.
Record human behaviour, objects and situations without
asking the respondent.
E.g. In a study of consumer behaviour, the investigator,
instead of asking the respondent which make of car they use, looks at
the car directly.
Advantages:
Direct observation of actual situation.
Minimizes response bias.

Disadvantages:
Limited to specific observable subjects or behaviour.
May be time consuming.

Experiment
Investigators manipulate a variable to study its effects on
respondents.
e.g. A bank may conduct an experiment to find out what
attracts depositors: profit, security or liquidity.
Advantages:
Designed to suit purpose.
May use actual clients (field) or volunteers (lab).
Disadvantages:
May be costly.

You should now be able to:
define Statistics and its roles,
understand the basic terms,
describe and differentiate
the types of data,
the techniques of primary data collection.
Next Lesson: Read up and make notes!
Organizing & Graphing Qualitative Data
Frequency Distribution/Table
Relative Frequency & Percentage Distribution
Bar Chart (simple/ vertical, horizontal, component, multiple)
Pie Chart
Line Graph/Time Series Graph
