
Introduction to Biostatistics 1
By
Dr Babatunde, OA
MBBS, PgCertDPMIS, MPH, FWACP
Department of Community Medicine,
FMC, Ido-Ekiti

Outline

Definition (C-O-S-A-I-P)
Collection
Organization
Summarizing
Analyzing
Interpreting
Presenting

Applications of biostatistics
Dr Babatunde OA MBBS,
PGCertDPMIS, MPH, FWACP

1/17/17

Introduction

A variable is any parameter that can be observed or measured

Information collected on a variable is usually unrefined and is called data

The collection, analysis, interpretation and use of data is called statistics

The application of statistics to health-related fields is known as biostatistics [1]

Definition

Biostatistics = Medical statistics

Medical statistics is the scientific method of collecting, organizing, summarizing, analyzing, interpreting, and presenting medical data [1]

Biostatistics is statistics applied to the biological sciences and to medicine [2]


Curiosity killed the cat


Biostatistics is all about curiosity [3]
Biostatistics is about asking medically relevant questions and getting answers using statistical methods
Which age group dies most? (Mortality rate)
What proportion of University students use condoms during sexual intercourse?
Assignment 1: Each student should ask a
medically related question of personal
interest and submit it in the format below


Assignment 1 format (5 minutes)
Name:
Matriculation Number:
Medical question of personal interest
Submit it at the end of the lecture
Also document in your notebook because
we will always make reference to this
question throughout this class


Research

Research is the scientific investigation of facts and relationships to establish dependable solutions to problems through systematic collection, analysis, and interpretation of data

Research is described as systematic in that it involves an organized, formally structured methodology to obtain new knowledge

Biostatistics is the basis for research



Biostatistics is simple
It is a general phenomenon that many students do not have an interest in statistics
Many see it as too abstract to conceptualize
However, it is the simplest of all sciences, being practiced by both the literate and the illiterate
Grandmother statistics: A big stroke by a grandmother represents a birth while a small stroke represents a death (origin of the tally sheet in immunization)


What is data?
Biostatistics centers around data
Hence, what is data?
Data is information collected on an individual or group of individuals
When entered into a computer, it is called a dataset
Assignment 2: List 5 examples of data you can collect to answer your question in Assignment 1


Assignment 2: List 5 examples of data to answer your question
Example: How many students in this class use condoms during sexual intercourse?
5 data items:
1. Ever had sex
2. Age at 1st sexual intercourse
3. Number of acts of sexual intercourse in the last 3 months
4. Number of times a condom was used
5. Number of sexual partners since sexual initiation


Collecting data
Questionnaires
Observations (checklist)
Focus Group Discussion
Proforma
Records
Census
List other ways you can collect data


Collecting data requires measurement

4 levels of measurement are involved in data collection (N-O-I-R)

1. Nominal
2. Ordinal
3. Interval
4. Ratio


Nominal scale/level of measurement of data
Lowest level
Mutually exclusive, unordered categories
No notion of numerical magnitude
Any number assigned has no numerical value other than to distinguish one category from another.
Examples: Gender, Blood Group, Marital status
Assignment 3: List 5 more examples of Nominal scale


Ordinal scale/level of measurement of data
Ability to rank or order phenomena, in addition to the nominal property
It is defined by related (ordered) categories
Examples: Patients' pain conditions described as Mild, Moderate, Severe
Assignment 4: List 5 more examples of Ordinal scale of measurement


Interval Scale
Measurements are expressed in numbers
The starting point (zero) is arbitrary and depends largely on the units of measurement
It is possible to attach physical meaning to differences between 2 measurements (intervals) but not to their ratios
Examples: Temperature in degrees Centigrade or Fahrenheit


Ratio scale
Measurement on this scale has the properties of the 3 previously mentioned scales and, in addition, a true zero point
The ratio of any 2 measurements on the scale is physically meaningful
Examples: Height in cm, Weight in kg, Age in years.
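
A minimal sketch of why ratios are meaningful on a ratio scale but not on an interval scale. The temperatures and weights are made-up values chosen for illustration, not data from the lecture.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: the zero point is arbitrary, so the ratio changes with the unit.
temp_a_c, temp_b_c = 20.0, 10.0                      # hypothetical temperatures
ratio_c = temp_a_c / temp_b_c                        # 2.0 in Celsius
ratio_f = celsius_to_fahrenheit(temp_a_c) / celsius_to_fahrenheit(temp_b_c)  # 68 / 50

# Ratio scale: zero means "none", so the ratio is the same in any unit.
KG_TO_LB = 2.20462
weight_a_kg, weight_b_kg = 80.0, 40.0                # hypothetical weights
ratio_kg = weight_a_kg / weight_b_kg
ratio_lb = (weight_a_kg * KG_TO_LB) / (weight_b_kg * KG_TO_LB)

print(ratio_c, ratio_f)    # the two temperature ratios disagree
print(ratio_kg, ratio_lb)  # the two weight ratios agree
```

Saying that 20 degrees is "twice as hot" as 10 degrees depends entirely on the unit chosen, whereas 80 kg is twice 40 kg in any unit.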


Basic Definitions

Level | Summary | Example
Nominal | Categories only. Data cannot be arranged in an ordering scheme | Students' cars: 1 Ford, 2 Toyota, 3 BMW
Ordinal | Categories are ordered, but differences cannot be determined or they are meaningless | Students' cars: 1 Compact, 2 Mid-size, 3 Full size
Interval | Differences between values can be found, but there may be no inherent starting point. Ratios are not meaningful | Temperature: 45, 80, 90
Ratio | Like interval scale, but with an inherent starting point. Ratios are meaningful | Weights of football players: 200 lbs, 300 lbs, 400 lbs


Why does level of measurement matter?

Theoretical interest is not the primary reason why researchers and statisticians consider the level of measurement of a variable.
Level of measurement is important because the kinds of statistical procedures that can be appropriately used depend on the level of measurement of the variable studied.
Calculating the mean of a group of people's telephone numbers would be possible but meaningless, since a telephone number is a nominal-level variable.

Organization of data
Raw data are usually not very useful on their own
They have to be organized to make sense of them
This brings us to the types of statistics:

Descriptive: Frequency tables, Diagrams

Inferential: Use of statistical tests


Types of data

Primary data
Data obtained directly from the individuals concerned, e.g. the 2006 Census

Secondary data
Data obtained from an outside source, e.g. a study of hospital records [5]


Types of Data

A special type of discrete variable is the binary variable, which takes on exactly 2 possible values
Gender (M/F)
Pregnant? (Y/N)
Hypertensive? (Y/N)


Types of Data

Sometimes, discrete variables have a natural ordering to them
For example, names of consecutive days in a week (Mon, Tue, Wed, Thu, Fri, Sat, Sun)

Other types of discrete variables do not have a natural order and are called Nominal Variables
Race (African American, Caucasian, Asian, Hispanic etc.)


Types of Data
If in an experiment you measure a single
variable, it is called a Univariate experiment
If you measure 2 variables, it is called a
Bivariate experiment
And if you measure multiple variables, it is
called a Multivariate experiment


DESCRIPTIVE STATISTICS

Concerned with summarizing series of measurements or observations
A] Measures of Central tendency
B] Measures of Variability/Dispersion
C] Measures of Relative standing


Summarizing data: Descriptive Measures

Now that we have displayed our data, we want to be able to characterize it quantitatively

Measures of Central Tendency
Mean, Median, Mode

Measures of Variability
Range, Variance, Standard Deviation

Measures of Relative Standing
Z-Scores, Percentiles, Quartiles
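
The measures of relative standing are only named on this slide, so here is a rough sketch of two of them using Python's standard `statistics` module. The five values are borrowed from the variance example later in the lecture. Treating them as a whole population (hence `pstdev`) and using the module's default quartile method are assumptions made for illustration.

```python
import statistics

data = [9, 4, 2, 5, 10]

mean = statistics.mean(data)     # arithmetic mean of the values
sd = statistics.pstdev(data)     # population standard deviation (assumption)

# A z-score says how many standard deviations a value lies from the mean.
z_scores = [(x - mean) / sd for x in data]

# Quartiles are the three cut points that split the ordered data into four parts.
# The second quartile is the median.
quartiles = statistics.quantiles(data, n=4)

for x, z in zip(data, z_scores):
    print(f"value {x}: z-score {z:.2f}")
print("quartiles:", quartiles)
```

A positive z-score means the value is above the mean and a negative one means it is below.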


Measures of Central Tendency

Mean
Arithmetic average of a sample of data

Median
If you order the data from smallest to largest, the median is the middle value, assuming an odd number of data elements
If you have an even number of elements, it is the average of the 2 middle numbers.

Mode
The most common value in a set of values
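
The three definitions above can be sketched with Python's standard `statistics` module. The ages are hypothetical values invented for this illustration.

```python
import statistics

ages_odd = [21, 23, 23, 25, 30]   # hypothetical ages, odd number of values
ages_even = [21, 23, 25, 30]      # hypothetical ages, even number of values

mean_age = statistics.mean(ages_odd)            # sum of values divided by their count
median_odd = statistics.median(ages_odd)        # middle value of the ordered data
median_even = statistics.median(ages_even)      # average of the 2 middle values
mode_age = statistics.mode(ages_odd)            # most common value

print(mean_age, median_odd, median_even, mode_age)
```

With an even number of values there is no single middle value, so the median is the average of 23 and 25.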


Arithmetic mean
The arithmetic mean is different from other types of mean such as the geometric mean and the harmonic mean.
The arithmetic mean is simply the average, denoted by the symbols μ (mu) and x̄ (x-bar).
These symbols represent the arithmetic mean of a population [size N] and of a sample [size n] respectively.


Median

Median: Here the distribution is arrayed, i.e. arranged in order of magnitude.
Then look for the value that cuts this distribution into two equal parts.
The value in the array that divides it into two equal parts is called the median.


Mode
Mode: This is the most frequently
occurring value in a distribution.
Some distributions are described as
amodal because they have no mode.
A distribution with one mode is uni-modal
and that with two modes is called bimodal
distribution.
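
A small sketch of how the terms amodal, uni-modal and bimodal could be checked in code. The helper name `modes` and the sample lists are hypothetical, and the function assumes a non-empty list. Treating "no value occurs more than once" as amodal is one common convention, adopted here as an assumption.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes, or an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []          # every value occurs once: treated as amodal
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 3, 4]))      # no mode (amodal)
print(modes([1, 2, 2, 3]))      # one mode (uni-modal)
print(modes([1, 1, 2, 3, 3]))   # two modes (bimodal)
```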


A word for the wise

If you stop learning you are old, whether you are 20 or 80 years

Thank you

Introduction to
Biostatistics 2
By
Dr Babatunde, OA
MBBS, PgCertDPMIS, MPH, FWACP
Department of Community Medicine,
FMC, Ido-Ekiti

Measures of variability: Range
This is one of the simplest measures of variability.
It is simply the difference between the highest and the lowest values: R = X_H - X_L.
The range has the problem of looking at the two extremes alone and ignoring all other values.


Variance and Standard Deviation
In the following distribution: 9, 4, 2, 5, 10 [which has a mean of 6], the total deviation from the mean is always zero.
Since the total (or average) deviation from the mean is therefore uninformative, something is done to get around the problem.
Thus we square the deviations and sum them up, and we get 46.
The average of the squared deviations is obtained by dividing by the number of observations (N) for a population, or by n - 1 for a sample.
This is called the variance [s², σ²: sample and population variance respectively].
The standard deviation is the square root of the variance.
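
The arithmetic on this slide can be retraced step by step. The sketch below uses the slide's own five values. The variable names are illustrative, and the n - 1 divisor for the sample variance is the usual convention rather than something stated on the slide.

```python
import math
import statistics

data = [9, 4, 2, 5, 10]

mean = sum(data) / len(data)                    # 30 / 5
deviations = [x - mean for x in data]           # 3, -2, -4, -1, 4
total_deviation = sum(deviations)               # always zero
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / len(data)       # divide by N
sample_variance = sum_of_squares / (len(data) - 1)     # divide by n - 1
sample_sd = math.sqrt(sample_variance)                 # standard deviation

value_range = max(data) - min(data)             # highest minus lowest value

print(total_deviation, sum_of_squares)
print(population_variance, sample_variance, sample_sd, value_range)
```

The standard library functions `statistics.pvariance` and `statistics.variance` give the same population and sample variances respectively.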

PRESENTATION OF DATA
tables
charts
diagrams
graphs
pictures
special curves


Characteristics of a good table
Numbering, e.g. Table 1, Table 2, etc.
Title, which must be brief and self-explanatory
Headings of columns and rows should be clear and concise
Data must be presented according to size or importance, chronologically, alphabetically or geographically
If percentages or averages are to be compared, they must be placed as close together as possible
No table should be too large
Footnotes may be given where necessary


Presentation of data (contd)

Charts and diagrams:
These methods of presentation have a powerful impact on the imagination of people, so they are a popular medium for presenting statistical data
a. Bar charts: these are a way of presenting a set of numbers by the length of a bar, the length of the bar being proportional to the magnitude to be represented


Presentation of data contd

Simple bar chart: bars may be vertical or horizontal and are usually separated by appropriate spaces, with an eye on neatness and clear presentation

Multiple bar chart: here two or more bars are grouped together.

Component bar chart: here the bar may be divided into two or more parts. Each part represents a certain item and is proportional to the magnitude of that particular item.


Presentation of data contd

b. Histogram: this is a pictorial diagram of a frequency distribution

It consists of a series of blocks

The class intervals are given along the horizontal axis and the frequency on the vertical axis

The area of each block or rectangle is proportional to the frequency

The histogram is apt for representing continuous variables.


Characteristics of histogram

i. It is like the simple bar chart except that the bars of a histogram touch each other

ii. The height of each box is equal to the frequency of the class it represents (i.e. for equal class intervals)

iii. The interval with the highest box is called the modal interval, i.e. the interval that contains the mode.
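
Before a histogram can be drawn, the data have to be counted into class intervals. The following is a minimal sketch of that counting step for equal class intervals. The function name `frequency_table`, the ages and the class limits are all hypothetical.

```python
def frequency_table(values, start, width, n_classes):
    """Count values into equal class intervals [lower, upper)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:      # ignore values outside the classes
            counts[index] += 1
    return [((start + i * width, start + (i + 1) * width), c)
            for i, c in enumerate(counts)]

ages = [22, 25, 31, 34, 35, 38, 41, 44, 47, 52, 36, 29]   # hypothetical ages
table = frequency_table(ages, start=20, width=10, n_classes=4)

# With equal intervals, the modal interval is the class with the highest frequency.
modal_interval, modal_frequency = max(table, key=lambda row: row[1])

for (lower, upper), count in table:
    print(f"{lower}-{upper}: {'#' * count} ({count})")
print("modal interval:", modal_interval)
```

Each row of `#` marks plays the role of one histogram block turned on its side.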


PRESENTATION OF DATA contd

c. Frequency polygon: a frequency distribution may also be represented diagrammatically by the frequency polygon

It is obtained by joining the midpoints of the tops of the histogram blocks.


d. Pie charts: instead of comparing the lengths of bars, the areas of segments of a circle are compared. The area of each segment depends upon the angle. A circle of suitable size is divided into the number of components that make up the total, such that the area of each sector is proportional to the component it represents.
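
Since the area of a sector depends on its angle, each component's angle is its share of the total multiplied by 360 degrees. A small sketch, with a hypothetical function name and made-up blood group counts:

```python
def pie_angles(counts):
    """Return the sector angle in degrees for each component."""
    total = sum(counts.values())
    return {label: 360 * value / total for label, value in counts.items()}

blood_groups = {"O": 50, "A": 25, "B": 20, "AB": 5}   # hypothetical counts
angles = pie_angles(blood_groups)

for label, angle in angles.items():
    print(f"{label}: {angle:.0f} degrees")
```

The angles always add up to 360 degrees, whatever the counts.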


PRESENTATION OF DATA contd

e. Graphs / scatter diagrams: these come in when there are two different factors involved, e.g. age and height. If, after plotting the points, they cannot be joined by any line, then a graph does not apply and we have a scatter diagram.


Simple bar chart


Multiple bar chart


Component bar chart


Pie chart


Scattergram


Graph


Statistical testing

This refers to the application of statistical tests to study results with a view to ascertaining the presence of statistical significance

Suppose we find in a study on level of physical activity that 40% of the men included in the sample are physically active, whereas only 30% of the women qualified as active. How should one interpret this result?


Statistical testing-2
1. The observed difference of 10% might be a TRUE DIFFERENCE, which also exists in the total population from which the sample was drawn

2. This difference might also be DUE to CHANCE, i.e. in reality there is no difference between men and women, but the sample of men just happened to differ from the sample of women because of sampling variation

3. The observed difference of 10% might be due to a defect in the study design (bias), i.e. with an appropriate study design no such difference would have occurred

Statistical testing-3
Statistical tests estimate the probability that a result at least as extreme as the one observed would occur by chance alone
If this probability is less than 5%, the result is taken as evidence that a true difference exists and the notion of chance occurrence is rejected
This level of 5% is known as the alpha level, while the actual probability calculated is known as the P-value
In statistical terms, the assumption that in the total population no real difference exists between the groups is called the NULL HYPOTHESIS

Statistical testing-4
Once the alpha level has been set and the statistical test applied to the results, the P-value is obtained

If the P-value is lower than the alpha value, it is taken that a true difference exists; the Null Hypothesis is rejected and the result is said to be statistically significant

If the P-value is higher than the alpha value, the Null Hypothesis is not rejected (loosely, "accepted"); the result is regarded as compatible with chance and considered not significant
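
To make the decision rule concrete, here is a rough sketch applied to the earlier physical activity example (40% of men versus 30% of women). The lecture does not give the sample sizes or name a test, so the group sizes below and the choice of a two-sided z-test for two proportions are assumptions made purely for illustration.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                  # proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail probability
    return z, p_value

alpha = 0.05

# Assumed sample: 80 of 200 men (40%) and 60 of 200 women (30%) are active.
z_large, p_large = two_proportion_z_test(80, 200, 60, 200)

# Same percentages in an assumed smaller sample: 20 of 50 men and 15 of 50 women.
z_small, p_small = two_proportion_z_test(20, 50, 15, 50)

for label, p in [("200 per group", p_large), ("50 per group", p_small)]:
    decision = "reject" if p < alpha else "do not reject"
    print(f"{label}: P-value = {p:.3f}, {decision} the null hypothesis")
```

Under these assumed numbers, the same 10% difference is statistically significant with 200 people per group but not with 50 per group, which shows how the P-value depends on sample size as well as on the size of the difference.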

Statistical testing-5

If the Null hypothesis is rejected when it is true, i.e. no true difference exists (the P-value happened to fall below the alpha value), then a type I error is committed

If the Null hypothesis is not rejected when a true difference exists (the P-value happened to be higher than the alpha value), then a type II error is committed


Uses of Biostatistics in Medicine

Clinicians often have to evaluate and use new information throughout their practice lives.
The most important reasons for learning biostatistics include the following:
1. Assessing medical literature: evidence-based information is often made available in journals, and clinicians must understand biostatistics to be able to make sense of such information
2. Patient care: results of research work are often meant for patient care, and clinicians want to know the best diagnostic procedure, the optimal care and how treatment regimens should be designed and implemented

Uses of Biostatistics in Medicine


3. Use of vital statistics: effective diagnosis and treatment of patients requires an understanding of how to make sense of vital statistics, which result from the recording of vital events such as births and deaths
4. Deploying diagnostic procedures: knowing the appropriate diagnostic procedure to use in a given patient is essential for effective care. Clinicians should be conversant with the sensitivity, specificity, and positive and negative predictive values of a procedure
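
The four measures named in point 4 are not defined in the lecture. As a brief sketch using their standard definitions, they can be computed from the four cells of a 2 x 2 table comparing a test result with true disease status. The function name and the counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute standard diagnostic test measures from 2 x 2 table counts.

    tp: diseased, test positive      fp: not diseased, test positive
    fn: diseased, test negative      tn: not diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),   # diseased people correctly detected
        "specificity": tn / (tn + fp),   # healthy people correctly cleared
        "ppv": tp / (tp + fp),           # positive results that are truly diseased
        "npv": tn / (tn + fn),           # negative results that are truly healthy
    }

# Hypothetical screening study of 300 people, 100 of whom have the disease.
measures = diagnostic_measures(tp=90, fp=30, fn=10, tn=170)

for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, while the predictive values also depend on how common the disease is in the group tested.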

Uses of Biostatistics in Medicine


5. Assessing information on drugs and equipment: companies present information on their products in charts, graphs and clinical studies, and clinicians need a good knowledge of biostatistics to make sense of such presentations and information
6. Understanding epidemiologic problems: disease prevalence, variation by season and by location, and relationship to risk factors constitute epidemiological parameters of utmost importance to the clinician in practice.


Applications of Biostatistics
Public health (Epidemiology, Nutrition etc)
Clinical trials
Population genetics
Genomics analysis
Ecology/Ecological forecasting
Biological Sequence Analysis
Systems biology for gene network inference


References
1. Bamgboye EA. A companion of Medical statistics. Ibipress & Publishing Company, Ibadan, Nigeria. 1st Edition 2006: 116.
2. Dunn OJ. Basic statistics: A primer for the Biomedical Sciences. John Wiley and Sons Publishers. 2nd Edition: 1-11.
3. Kolawole EB. Statistical methods. Bolabay Publications, Lagos, Nigeria. 1st Edition 2006: 1-12.
4. Taofeek I. Research methodology and dissertation writing for allied professionals. Cress Global Link Limited, Abuja. 1st Edition 2006: 1-24.
5. Park K. Park's Textbook of Preventive and Social Medicine. M/s Banarsidas Bhanot Publishers. 18th Edition 2004: 608-615.


References (contd)
6. Dawson B, Trapp R. Introduction to Medical Research. In: Basic and Clinical Biostatistics. Fourth Edition. McGraw-Hill Companies Inc: USA, 2004; p1-6.
7. Prabhakara GN. Basics of Statistics. In: Biostatistics. JAYPEE: New Delhi, 2006; p11-16.
8. Dawson B, Trapp R. Summarising Data and Presenting Data in Tables and Graphs. In: Basic and Clinical Biostatistics. Fourth Edition. McGraw-Hill Companies Inc: USA, 2004; p23-60.


A word for the wise

What doesn't kill us makes us stronger
So see challenges as opportunities for personal growth

Thank you