Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.[1] In applying statistics to, e.g.,
a scientific, industrial, or societal problem, it is conventional to begin with a statistical population or a statistical model process to be
studied. Populations can be diverse topics such as "all persons living in a country" or "every atom composing a crystal". Statistics
deals with all aspects of data including the planning of data collection in terms of the design of surveys and experiments.[1]
Descriptive Statistics and Inferential Statistics
Every student of statistics should know about the different branches of statistics to correctly understand statistics from a more holistic
point of view. Often, the kind of job or work one is involved in hides the other aspects of statistics, but it is very important to know the
overall idea behind statistical analysis to fully appreciate its importance and beauty.
The two main branches of statistics are descriptive statistics and inferential statistics. Both of these are employed in scientific analysis
of data and both are equally important for the student of statistics.
Descriptive Statistics
Descriptive statistics deals with the presentation and collection of data. This is usually the first part of a statistical analysis. It is
usually not as simple as it sounds, and the statistician needs to be aware of designing experiments, choosing the right focus group, and
avoiding the biases that so easily creep into an experiment.
Different areas of study require different kinds of analysis using descriptive statistics. For example, a physicist studying turbulence in
the laboratory needs the average quantities that vary over small intervals of time. The nature of this problem requires that physical
quantities be averaged from a host of data collected through the experiment.
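The averaging described above can be sketched in a few lines of Python. This is only an illustration: the function name block_averages, the block size, and the velocity readings are all made-up assumptions, not data from any experiment.

```python
from statistics import mean


def block_averages(readings, block_size):
    """Average consecutive, non-overlapping blocks of readings.

    A trailing block shorter than block_size is dropped.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        mean(readings[i:i + block_size])
        for i in range(0, len(readings) - block_size + 1, block_size)
    ]


# Hypothetical velocity readings taken at short, equal time intervals.
velocity = [1.0, 3.0, 2.0, 4.0, 5.0, 7.0]
averages = block_averages(velocity, 2)
print(averages)
```

Each value in the result summarizes one short interval, which is the kind of reduced quantity a descriptive analysis would then report.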
Inferential Statistics
Inferential statistics, as the name suggests, involves drawing the right conclusions from the statistical analysis that has been performed
using descriptive statistics. In the end, it is the inferences that make studies important and this aspect is dealt with in inferential
statistics.
Most predictions of the future and generalizations about a population by studying a smaller sample come under the purview of
inferential statistics. Most social science experiments deal with studying a small sample population that helps determine how the
population in general behaves. By designing the right experiment, the researcher is able to draw conclusions relevant to the study.
While drawing conclusions, one needs to be very careful so as not to draw the wrong or biased conclusions. Even though this appears
like a science, there are ways in which one can manipulate studies and results through various means. For example, data dredging is
increasingly becoming a problem as computers hold loads of information and it is easy, either intentionally or unintentionally, to use
the wrong inferential methods.
Both descriptive and inferential statistics go hand in hand and one cannot exist without the other. Good scientific methodology needs
to be followed in both these steps of statistical analysis and both these branches of statistics are equally important for a researcher.
Sampling is concerned with the selection of a subset of individuals from within a statistical population to estimate characteristics of
the whole population. Each observation measures one or more properties (such as weight, location, color) of observable bodies
distinguished as independent objects or individuals. In survey sampling, weights can be applied to the data to adjust for the sample
design, particularly stratified sampling. Results from probability theory and statistical theory are employed to guide practice. In
business and medical research, sampling is widely used for gathering information about a population.
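As a rough illustration of how weights can adjust for a stratified design, the sketch below computes a weighted mean in Python. The strata sizes, sample values, and the choice of weight (stratum population size divided by stratum sample size) are assumptions made for the example; real surveys may use other weighting schemes.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical design: stratum A has 900 members (3 sampled),
# stratum B has 100 members (2 sampled).
stratum_a = [10, 12, 14]
stratum_b = [20, 30]

# Assumed weight: how many population members each sampled unit represents.
weight_a = 900 / len(stratum_a)  # 300.0
weight_b = 100 / len(stratum_b)  # 50.0

values = stratum_a + stratum_b
weights = [weight_a] * len(stratum_a) + [weight_b] * len(stratum_b)

estimate = weighted_mean(values, weights)
unweighted = sum(values) / len(values)
print(estimate, unweighted)
```

The unweighted mean over-represents the small stratum B, while the weighted estimate gives each stratum influence in proportion to its share of the population.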
In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical
population by a defined procedure.[1]
Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or
impossible. The sample usually represents a subset of manageable size. Samples are collected and statistics are calculated from the
samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information
from a sample is referred to as sampling. The data sample may be drawn from a population without replacement, in which case it is a
subset of a population; or with replacement, in which case it is a multisubset.[2]
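The difference between the two ways of drawing can be sketched with Python's standard random module. The population of numbered units and the sample size of 10 are arbitrary assumptions for the example.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 100 numbered units.
population = list(range(1, 101))

# Without replacement: no unit can be chosen twice, so the result is a subset.
without_replacement = rng.sample(population, 10)

# With replacement: a unit may be chosen more than once.
with_replacement = rng.choices(population, k=10)

print(sorted(without_replacement))
print(sorted(with_replacement))
```

Here random.sample never repeats a unit, whereas random.choices may return the same unit several times.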
Statistics/Different Types of Data/Quantitative and Qualitative Data
Qualitative data
Qualitative data is a categorical measurement expressed not in terms of numbers, but rather by means of a natural language
description. In statistics, it is often used interchangeably with "categorical" data.
For example: favorite color = "blue"
height = "tall"
Although we may have categories, the categories may have a structure to them. When there is not a natural ordering of the categories,
we call these nominal categories. Examples might be gender, race, religion, or sport.
When the categories may be ordered, these are called ordinal variables. Categorical variables that judge size (small, medium, large,
etc.) are ordinal variables. Attitudes (strongly disagree, disagree, neutral, agree, strongly agree) are also ordinal variables; however, we
may not know which value is the best or worst of these. Note that the distance between these categories is not something we can
measure.
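One practical consequence of this distinction can be sketched in Python: nominal data supports only counting (for example, finding the most frequent category), while ordinal data can also be sorted, so a middle category can be found. The response lists and helper names below are made up for illustration.

```python
from collections import Counter

# Assumed ordering of the attitude scale, lowest to highest.
LIKERT_ORDER = [
    "strongly disagree", "disagree", "neutral", "agree", "strongly agree",
]


def most_common_category(values):
    """Mode: meaningful for both nominal and ordinal data."""
    return Counter(values).most_common(1)[0][0]


def ordinal_median(values, order):
    """Middle category after sorting by rank (lower median for even counts)."""
    rank = {category: i for i, category in enumerate(order)}
    ranked = sorted(values, key=rank.__getitem__)
    return ranked[(len(ranked) - 1) // 2]


# Hypothetical survey answers.
colors = ["blue", "red", "blue", "green"]
attitudes = ["agree", "neutral", "strongly agree", "disagree", "agree"]

print(most_common_category(colors))
print(ordinal_median(attitudes, LIKERT_ORDER))
```

The sketch deliberately avoids averaging the ranks, since the distance between ordinal categories is not measurable.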
Quantitative data
Quantitative data is a numerical measurement expressed not by means of a natural language description, but rather in terms of
numbers. However, not all numbers are continuous and measurable. For example, a social security number is a number, but not
something that one can meaningfully add or subtract.
For example: molecule length = "450 nm"
height = "1.8 m"
Quantitative data are always associated with a scale measure.
Probably the most common scale type is the ratio scale. Observations of this type are on a scale that has a meaningful zero value and
also an equidistant measure (i.e., the difference between 10 and 20 is the same as the difference between 100 and 110). For
example, a 10-year-old girl is twice as old as a 5-year-old girl. Since you can measure zero years, age is a ratio-scale variable. Money
is another common ratio-scale quantitative measure. Observations that you count are usually ratio-scale (e.g., number of widgets).
Sampling techniques
Three main types of sampling strategy:
Random
Systematic
Stratified
Within these types, you may then decide on a point, line, or area method.
Random sampling
The least biased of all sampling techniques: there is no subjectivity, as each member of the total population has an equal chance of
being selected
Can be obtained using random number tables
Microsoft Excel has a function to produce random numbers
The function is simply:
=RAND()
Type that into a cell and it will produce a random number between 0 and 1 in that cell. Copy the formula throughout a selection of
cells and it will produce random numbers.
You can modify the formula to obtain whatever range you wish, for example if you wanted random numbers from one to 250, you
could enter the following formula:
=INT(250*RAND())+1
Where INT eliminates the digits after the decimal, 250* creates the range to be covered, and +1 sets the lowest number in the range.
Paired numbers could also be obtained using:
=INT(9000*RAND())+1000
These can then be used as grid coordinates, metre and centimetre sampling stations along a transect, or in any feasible way.
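For readers working outside Excel, the same arithmetic can be sketched in Python, where random.random() plays the role of RAND(). The function names are made up for this example, and splitting a four-digit number into two two-digit halves is only one assumed way of reading it as a "pair".

```python
import random


def random_1_to_n(n, rng):
    """Mirror of =INT(n*RAND())+1: a whole number from 1 to n."""
    return int(n * rng.random()) + 1


def random_four_digit(rng):
    """Mirror of =INT(9000*RAND())+1000: a whole number from 1000 to 9999."""
    return int(9000 * rng.random()) + 1000


def split_pair(number):
    """Assumed reading of a four-digit number as a pair, e.g. 4527 -> (45, 27)."""
    return divmod(number, 100)


rng = random.Random()
sample_ids = [random_1_to_n(250, rng) for _ in range(5)]
easting, northing = split_pair(random_four_digit(rng))
print(sample_ids, easting, northing)
```

Note that repeats are possible, just as they are when the Excel formula is copied across many cells.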
Methodology
A. Random point sampling
A grid is drawn over a map of the study area
Random number tables are used to obtain coordinates/grid references for the points
Sampling takes place as feasibly close to these points as possible
B. Random line sampling
Pairs of coordinates or grid references are obtained using random number tables, and marked on a map of the study area
These are joined to form lines to be sampled
C. Random area sampling
Random number tables generate coordinates or grid references which are used to mark the bottom left (south west) corner of
quadrats or grid squares to be sampled

Figure one: A random number grid showing methods of generating random numbers, lines and areas.
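The three random methods above can be sketched in Python as follows. The grid dimensions, quadrat size, and function names are assumptions for illustration, and a pseudo-random generator stands in for the random number tables.

```python
import random


def random_points(rng, n, width, height):
    """A. Random grid coordinates (x, y) for point sampling."""
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(n)]


def random_lines(rng, n, width, height):
    """B. Pairs of random coordinates joined to form lines."""
    lines = []
    while len(lines) < n:
        start, end = random_points(rng, 2, width, height)
        if start != end:  # skip degenerate lines with identical end points
            lines.append((start, end))
    return lines


def random_quadrats(rng, n, width, height, size=1):
    """C. South-west corners of quadrats that fit inside the grid."""
    return [
        (rng.randrange(width - size + 1), rng.randrange(height - size + 1))
        for _ in range(n)
    ]


rng = random.Random(7)  # fixed seed only so the sketch is repeatable
points = random_points(rng, 5, 10, 10)
lines = random_lines(rng, 3, 10, 10)
quadrats = random_quadrats(rng, 4, 10, 10, size=2)
print(points, lines, quadrats)
```

In the field, sampling would then take place as close to each generated location as is feasible.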
Advantages and disadvantages of random sampling
Advantages:
Can be used with large sample populations
Avoids bias
Disadvantages:
Can lead to poor representation of the overall parent population or area if large areas are not hit by the random numbers
generated. This is made worse if the study area is very large
There may be practical constraints in terms of time available and access to certain parts of the study area
Systematic sampling
Samples are chosen in a systematic, or regular way.
They are evenly/regularly distributed in a spatial context, for example every two metres along a transect line
They can be at equal/regular intervals in a temporal context, for example every half hour or at set times of the day
They can be regularly numbered, for example every 10th house or person
Methodology
A. Systematic point sampling
A grid can be used and the points can be at the intersections of the grid lines (A), or in the middle of each grid square (B). Sampling is
done at the nearest feasible place. Along a transect line, sampling points for vegetation/pebble data collection could be identified
systematically, for example every two metres or every 10th pebble
B. Systematic line sampling
The eastings or northings of the grid on a map can be used to identify transect lines (C and D). Alternatively, along a beach it could be
decided that a transect up the beach will be conducted every 20 metres along the length of the beach
C. Systematic area sampling
A 'pattern' of grid squares to be sampled can be identified using a map of the study area, for example every second/third grid square
down or across the area (E) - the south west corner will then mark the corner of a quadrat. Patterns can be any shape or direction as
long as they are regular (F)

Figure two: Systematic sampling grid showing methods of generating systematic points, lines and areas.
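The regular patterns described above are easy to generate programmatically. The Python sketch below is illustrative only: the function names, transect length, and grid size are assumptions.

```python
def every_kth(items, k, start=0):
    """Take every k-th item, beginning at index start (e.g. every 10th house)."""
    return items[start::k]


def transect_stations(length_m, interval_m):
    """Sampling stations at a fixed interval along a transect, in metres."""
    return list(range(0, length_m + 1, interval_m))


def grid_intersections(width, height, step):
    """Systematic point sampling at regularly spaced grid-line intersections."""
    return [
        (x, y)
        for y in range(0, height + 1, step)
        for x in range(0, width + 1, step)
    ]


houses = list(range(1, 31))                  # hypothetical street of 30 houses
print(every_kth(houses, 10, start=9))        # every 10th house
print(transect_stations(10, 2))              # every two metres along 10 m
print(grid_intersections(2, 2, 2))           # corners of a 2 x 2 grid
```

Unlike random sampling, every run produces the same locations, which is what makes the coverage even.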

In statistics and quantitative research methodology, various attempts have been made to classify variables (or types of data) and
thereby develop a taxonomy of levels of measurement or scales of measure. Perhaps the best known are those developed by the
psychologist Stanley Smith Stevens. He proposed four types: nominal, ordinal, interval, and ratio.
Statistics/Methods of Data Collection
The main portion of statistics is the display of summarized data. Data are initially collected from a given source, whether through
experiments, surveys, or observation, and are presented using one of four methods:
Textual Method
The reader acquires information through reading the gathered data.
Tabular Method
Provides a more precise, systematic and orderly presentation of data in rows or columns.
Semi-tabular Method
Uses both textual and tabular methods.
Graphical Method
The use of graphs is the most effective method of visually presenting statistical results or findings.
