
http://www.ignouassignmentguru.com/ IGNOU Assignment GURU Page No.

https://www.facebook.com/IGNOUAssignmentGURU https://twitter.com/IGNOU_Guru

IGNOU ASSIGNMENT GURU (2017-2018)


B.P.C.-4
Statistics in Psychology
Disclaimer/Special Note: These are sample answers/solutions to some of the questions given in the assignments. They are prepared by private teachers/tutors/authors to give students an idea of how to answer the questions given in the assignments. We do not claim 100% accuracy, as these sample answers are based on the knowledge and capability of the private teacher/tutor, and the chance of error cannot be ruled out. Any omission or error is highly regretted, though every care has been taken while preparing these sample answers/solutions. Please consult your own teacher/tutor before you prepare a particular answer and for up-to-date and exact information, data and solutions. Students must read and refer to the official study material provided by the university.

NOTE: All questions are compulsory

SECTION-A
Q. 1. What is descriptive statistics? Discuss the statistics techniques of organising data.
Ans. MEANING OF DESCRIPTIVE STATISTICS
Descriptive statistics describe raw data so that a meaningful interpretation can be drawn from them, using a set of procedures and statistical methods. For example, suppose two groups of students are given a problem-solving test. One group, the experimental group, receives training in problem solving, while the other, the control group, does not. The scores they obtained are given in the table below:

Table-1: Scores of 10 Students of Experimental and Control Groups
Experimental Condition (With Training): 6, 12, 8, 4, 5, 9, 12, 15, 5, 4
Control Condition (Without Training): 4, 8, 6, 2, 3, 4, 10, 12, 2, 2
The students in the control group scored lower than the students in the experimental group.
Description of data performs two operations: (i) Organising Data and (ii) Summarising Data.
ORGANISING DATA
Four major statistical techniques for organising the data are:
• Classification
• Tabulation


• Graphical Presentation
• Diagrammatical Presentation
Classification
Classification of data means arranging the data into categories for its most effective and efficient use. Data are grouped on the basis of their similarities so that conclusions can be drawn from them. Classification takes the researcher a step beyond the raw scores and towards concrete decisions. The objectives of classification of data are:
• It presents data in a condensed form.
• It explains the affinities and diversities of the data.
Classification may be qualitative or quantitative. The most common quantitative classification is the frequency distribution.
Frequency Distribution
A frequency distribution is a summary of the frequency of individual scores or ranges of scores for a variable: it lists each value of the variable together with the number of people who obtained that value.

A frequency distribution shows the number of cases in each range of scores, that is, which scores were obtained by a group of individuals and how frequently each score occurred.
A frequency distribution can be prepared for ungrouped data or for grouped data.
(i) In an ungrouped frequency distribution, the score values are arranged either from highest to lowest or from lowest to highest, and a tally mark (/) is placed beside each score every time it occurs. The frequency of each score is denoted by ‘f’.
(ii) A grouped frequency distribution is developed when there is a wide range of score values in the data. The objective is to get a clear picture of the data. It organises the data into classes, that is, groups of values describing a feature of the data, and shows the number of observations from the data set that fall into each class.

Construction of Frequency Distribution
Certain terms are used in constructing a frequency distribution, and we need to understand them.
One of them is variable. The phenomenon under study is called a variable. For example, the performance of students on a problem-solving task, or a method of teaching that could affect their performance, is a variable.
There are two types of variables:
(i) Continuous variable
(ii) Discrete variable.
A continuous variable can take any value within a specified range, for example, age, weight, height, etc. Its values are expressed in units of measurement.
Variables which cannot take all the possible values within the specified range are called discrete variables, for example, number of children or marks scored in a test.

Preparation of Frequency Distribution
To prepare a frequency distribution, we first find the range of the given data, that is, the difference between the highest and lowest scores. The following must be decided before developing any grouped frequency distribution:
1. The number of class intervals: It depends on the number of scores. If there are very few scores, a large number of class intervals is not required. The number of classes should be between 5 and 30.
2. Limits of each class interval: The class interval, that is, the size, width or range of the class, is the other factor that determines the number of classes. It is denoted by ‘i’.
For a frequency distribution with classes of the same size, the class interval should be of uniform width, which should also be a whole number divisible by a number such as 2, 3, 5, 10 or 20.
The class limits of a distribution can be described by three methods:
(i) Exclusive Method
(ii) Inclusive Method
(iii) True or Actual Class Method
(i) Exclusive Method: In this method the upper limit of a class becomes the lower limit of the next class. It is called exclusive because an item equal to the upper limit of a class is not placed in that class; it is placed in the next class, i.e., the upper limits of classes are excluded from them. For example, a person of age 20 years will not be included in the class interval (10–20) but in the next class (20–30), since the class interval (10–20) includes only ages from 10 up to, but not including, 20.
(ii) Inclusive Method: In this method the upper limit of a class interval is kept in the same class interval, and the upper limit of a class is 1 less than the lower limit of the next class interval (for example, 10–19, 20–29). In short, this method allows a class interval to include both its lower and upper limits.
(iii) True or Actual Class Method: In the inclusive method there is no continuity between the classes, since the upper limit of a class is not equal to the lower limit of the next class. With true or actual class limits, a score is treated as an interval extending from 0.5 units below to 0.5 units above the face value of the score, so the inclusive class 10–19 has the true limits 9.5–19.5 and continuity is restored.
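As a rough illustration of the exclusive method, the sketch below groups a few ages into classes of width 10. The ages, the class width and the helper name `class_interval` are assumptions made for this example and do not come from the assignment.

```python
from collections import Counter

def class_interval(score, width=10):
    """Return the (lower, upper) exclusive-method class containing score.

    The lower limit is included and the upper limit is excluded,
    so a score of 20 falls in (20, 30), not in (10, 20).
    """
    lower = (score // width) * width
    return (lower, lower + width)

# Hypothetical ages, chosen only to show the boundary case of 20.
ages = [12, 19, 20, 25, 30, 31, 10]

# Tally how many ages fall into each class interval.
grouped = Counter(class_interval(a) for a in ages)
for (lower, upper), f in sorted(grouped.items()):
    print(f"{lower}-{upper}: f = {f}")
```

The age 20 is counted in the class 20-30, which is the behaviour the exclusive method describes.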
Types of Frequency Distributions: The frequencies of a data array are arranged in different ways according to the requirement of the study or statistical analysis.
Relative Frequency Distribution: It shows the proportion of the total number of cases observed at each score value. Relative frequency is the fraction or proportion of the total number of items.
Cumulative Frequency Distribution: It is a summary of a set of data showing the frequency of items less than or equal to the upper class limit of each class. This definition holds for quantitative data and for categorical (qualitative) data.
Cumulative Relative Frequency Distribution: A cumulative relative frequency distribution is a tabular summary of a set of data showing the relative frequency of items less than or equal to the upper class limit of each class.
Given below are the ability scores of 24 students:
8, 13, 13, 25, 12, 16, 25, 17, 18, 20, 22, 23, 23, 24, 25, 18, 10, 12, 13, 16, 19, 20, 25, 25
These scores can be formed into a frequency distribution as follows:
Scores   Frequency   Cumulative Frequency   Relative Cumulative Frequency
8        1           1                      1/24
10       1           2                      2/24
12       2           4                      4/24
13       3           7                      7/24
16       2           9                      9/24
17       1           10                     10/24
18       2           12                     12/24
19       1           13                     13/24
20       2           15                     15/24
22       1           16                     16/24
23       2           18                     18/24
24       1           19                     19/24
25       5           24                     24/24
Total    24
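The tallying can also be done mechanically. The sketch below counts the ability scores exactly as they are listed above and derives the cumulative and relative cumulative frequencies; the variable names are illustrative only.

```python
from collections import Counter

# Ability scores exactly as listed in the text.
scores = [8, 13, 13, 25, 12, 16, 25, 17, 18, 20, 22, 23,
          23, 24, 25, 18, 10, 12, 13, 16, 19, 20, 25, 25]

freq = Counter(scores)   # frequency of each distinct score
n = len(scores)          # total number of scores

rows = []
cumulative = 0
for score in sorted(freq):
    cumulative += freq[score]            # running total of frequencies
    rows.append((score, freq[score], cumulative, cumulative / n))

print("Score  f  cf  relative cf")
for score, f, cf, rcf in rows:
    print(f"{score:>5} {f:>2} {cf:>3}  {rcf:.3f}")
```

The last cumulative frequency always equals the number of scores, which is a quick check on any hand-made table.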

Percentile: A percentile is a measure used in statistics indicating the value below which a given percentage of
observations in a group of observations fall. For example, the 20th percentile is the value below which 20 per cent of
the observations may be found.
Tabulation
Tabulation is the systematic arrangement of the information in rows and columns. The main purpose of the table
is to simplify the presentation and to facilitate comparisons.
Components of a Statistical Table
Table number, Title of the table, Caption, Stub, Body of the table, Head note, Footnote and Source of data are the
key components of a table.


TITLE
Stub Head     | Caption
              | Column Head I          | Column Head II
              | Sub-Head  | Sub-Head   | Sub-Head  | Sub-Head
Stub Entries  | MAIN BODY OF THE TABLE
Total         |
Footnote(s):
Source:
Graphical Presentation of Data
Graphical presentation of data means that frequencies are plotted on a pictorial platform formed of horizontal and vertical lines, called a graph. The purpose is to provide a systematic way of “looking at” and understanding the data.
A graph can take the form of a polygon, chart or diagram.
A graph is drawn on two mutually perpendicular lines called the X-axis and the Y-axis, on which appropriate scales are marked. The horizontal line is called the abscissa and the vertical line the ordinate. There are different types of graphs that improve scientific understanding. The commonly used ones are bar graphs, line graphs, pie charts, pictographs, etc.
Histogram: It represents tabulated frequencies as adjacent rectangles erected over discrete intervals, each with an area proportional to the frequency of the observations in the interval. The height of a rectangle is therefore equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. The total area of the histogram is equal to the number of observations. A histogram may also be normalized to display relative frequencies.

Frequency Polygon: Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the same purpose as histograms, but are especially helpful for comparing sets of data. Frequency polygons are also a good choice for displaying cumulative frequency distributions.
To create a frequency polygon, start just as for a histogram, by choosing a class interval. Then draw an X-axis representing the values of the scores in your data. Mark the middle of each class interval with a tick mark, and label it with the middle value represented by the class.
Draw the Y-axis to indicate the frequency of each class. Place a point in the middle of each class interval at the height corresponding to its frequency.
Finally, connect the points. You should include one class interval below the lowest value in your data and one above the highest value. The graph will then touch the X-axis on both sides.
Frequency Curve: To draw the frequency curve it is necessary first to draw the polygon. The polygon is then
smoothened out keeping in view the fact that the area of the curve should be equal to that of the histogram.
Cumulative Frequency Curve or Ogive
A cumulative frequency curve, or ogive, is the graph of a cumulative frequency distribution. There are two types of ogive: the ‘less than’ ogive and the ‘more than’ ogive.
(i) ‘Less than’ Ogive: The ‘less than’ cumulative frequencies are plotted against the upper class boundaries of the respective classes. It is a rising curve, sloping upward from left to right.
(ii) ‘More than’ Ogive: The ‘more than’ cumulative frequencies are plotted against the lower class boundaries of the respective classes. It is a falling curve, sloping downward from left to right.
Diagrammatic Presentations of Data
A diagram is used to present statistical data in a simple, readily comprehensible form. Diagrammatic presentation is used only to present the data in visual form, whereas graphical presentation can also be used for further analysis.
There are different types of diagram: Bar Diagram, Multiple Bar Diagram, Sub-Divided Bar Diagram, Pie Diagram and Pictogram.


Bar Diagram: A bar diagram is a chart with rectangular bars with lengths proportional to the values that they
represent. The bars can be plotted vertically or horizontally. One axis of the chart shows the specific categories being
compared, and the other axis represents a discrete value. Also called dimensional diagram, it is most useful for
categorical data.
Sub-divided Bar Diagram: A sub-divided bar diagram is used to represent data in which the total magnitude is divided into different parts or components. In this diagram, first we make simple bars for each class taking total magnitude
in that class and then divide these simple bars into parts in the ratio of various components. This type of diagram
shows the variation in different components within each class as well as between different classes. Sub-divided bar
diagram is also known as component bar diagram.
Multiple Bar Diagram: If the data is classified by attributes and if two or more characters or groups are to be
compared within each attribute we use multiple bar diagrams. The multiple bar diagram is simply the extension of
simple bar diagram. For each attribute two or more bars representing separate characters or groups are to be placed
side by side. Each bar within an attribute will be marked or coloured differently in order to distinguish them. Same
type of marking or colouring should be done under each attribute. A footnote has to be given explaining the markings

or colourings.

Pie Diagram: Also called a circular diagram, a pie diagram may be used in place of bar diagrams. It consists of one or more circles divided into a number of sectors. To construct a pie diagram, take the set of actual values or percentages and find the corresponding angles in degrees. A circle represents 360 degrees, so the 360 degrees are divided in proportion to the percentages. The angle for each part of the given magnitude is obtained from this formula:
Angle = (Percentage/100) × 360°
The different segments can be shaded with different colours.
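The angle formula can be checked with a short sketch. The four categories and their percentages below are hypothetical values invented for the example, and `pie_angles` is an illustrative helper name.

```python
def pie_angles(values):
    """Convert raw values or percentages to sector angles in degrees.

    Each value is divided by the total and multiplied by 360,
    so the angles always add up to a full circle.
    """
    total = sum(values.values())
    return {name: value / total * 360 for name, value in values.items()}

# Hypothetical percentages for four categories (they sum to 100).
shares = {"A": 40, "B": 30, "C": 20, "D": 10}

angles = pie_angles(shares)
for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```

Dividing by the total rather than by 100 means the same helper works whether the input is given as percentages or as actual values.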

Pictograms: A pictogram uses pictures to represent data. The number of pictures or the size of the picture is proportional to the values of the different magnitudes to be presented. For example, to show the population of human beings, human figures are used. One human figure can be used to represent one crore people.
Q. 2. Find out whether correlationship exists between the following sets of scores using Pearson’s Product
Moment Coefficient of Correlation.

Data X   23  23  32  34  34  45  43  22  12  12
Data Y   10  10  12  14  15  19  17  10  24  25

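Ans. Let Mx and My be the means of X and Y, and let dx = Mx – x and dy = My – y be the deviations from the means.
x      y      dx     dy      dx²     dy²      xy
23     10     5      5.6     25      31.36    230
23     10     5      5.6     25      31.36    230
32     12     –4     3.6     16      12.96    384
34     14     –6     1.6     36      2.56     476
34     15     –6     0.6     36      0.36     510
45     19     –17    –3.4    289     11.56    855
43     17     –15    –1.4    225     1.96     731
22     10     6      5.6     36      31.36    220
12     24     16     –8.4    256     70.56    288
12     25     16     –9.4    256     88.36    300
∑x = 280   ∑y = 156   ∑dx² = 1200   ∑dy² = 282.4   ∑xy = 4224

Mx = 280/10 = 28 and My = 156/10 = 15.6
\[ \sigma_x = \sqrt{\frac{\sum dx^2}{N}} = \sqrt{\frac{1200}{10}} = 10.95 \]
\[ \sigma_y = \sqrt{\frac{\sum dy^2}{N}} = \sqrt{\frac{282.4}{10}} = 5.31 \]
The correlation uses the sum of the products of the deviations, not the sum of the raw products. It can be obtained from ∑xy:
\[ \sum dx\,dy = \sum xy - N M_x M_y = 4224 - 10 \times 28 \times 15.6 = -144 \]
\[ r = \frac{\sum dx\,dy}{N \sigma_x \sigma_y} = \frac{-144}{10 \times 10.95 \times 5.31} = -0.25 \]
Thus r = –0.25, a weak negative correlation between the two sets of scores. With N = 10 (df = 8), this is smaller in size than the critical value of 0.632 at the .05 level, so the correlation is not statistically significant.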
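The arithmetic for Pearson's product-moment coefficient can be verified with a few lines of code. This is a minimal sketch using the deviation-score formula on the Data X and Data Y values of the question; the variable names are illustrative.

```python
import math

x = [23, 23, 32, 34, 34, 45, 43, 22, 12, 12]
y = [10, 10, 12, 14, 15, 19, 17, 10, 24, 25]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and of products of deviations.
sum_dx2 = sum((xi - mean_x) ** 2 for xi in x)
sum_dy2 = sum((yi - mean_y) ** 2 for yi in y)
sum_dxdy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)),
# which is the same as sum(dx*dy) / (N * sigma_x * sigma_y).
r = sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)
print(f"r = {r:.4f}")
```

Because the deviations are taken from the means, r computed this way always lies between –1 and +1, which is a useful check on a hand calculation.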
Q. 3. With the help of ‘t’ test find whether significant difference exists in the scores obtained by the two
groups of students on achievement motivation.

Group A 43 34 45 56 54 56 66 43 23 21
Group B 10 9 8 7 4 5 12 12 12 14
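Ans. The two groups consist of different students, so the t-test for independent samples (with pooled variance) is used.
Group A (x1)   Group B (x2)   x1²     x2²
43             10             1849    100
34             9              1156    81
45             8              2025    64
56             7              3136    49
54             4              2916    16
56             5              3136    25
66             12             4356    144
43             12             1849    144
23             12             529     144
21             14             441     196
∑x1 = 441   ∑x2 = 93   ∑x1² = 21393   ∑x2² = 963

M1 = 441/10 = 44.1 and M2 = 93/10 = 9.3
\[ SS_1 = \sum x_1^2 - \frac{(\sum x_1)^2}{N_1} = 21393 - \frac{441^2}{10} = 1944.9 \]
\[ SS_2 = \sum x_2^2 - \frac{(\sum x_2)^2}{N_2} = 963 - \frac{93^2}{10} = 98.1 \]
\[ s_p^2 = \frac{SS_1 + SS_2}{N_1 + N_2 - 2} = \frac{2043}{18} = 113.5 \]
\[ SE_D = \sqrt{s_p^2 \left( \frac{1}{N_1} + \frac{1}{N_2} \right)} = \sqrt{113.5 \times 0.2} = 4.764 \]
\[ t = \frac{M_1 - M_2}{SE_D} = \frac{44.1 - 9.3}{4.764} = 7.30 \]
With df = N1 + N2 – 2 = 18, the critical values of t are 2.10 at the .05 level and 2.88 at the .01 level. The obtained t of 7.30 exceeds both, so the difference between the two groups on achievement motivation is significant.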
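For checking the calculation, here is a minimal sketch of an independent-samples t-test on the Group A and Group B scores. It assumes the pooled-variance (equal variances) form of the test; `sum_of_squares` is an illustrative helper name.

```python
import math
from statistics import mean

group_a = [43, 34, 45, 56, 54, 56, 66, 43, 23, 21]
group_b = [10, 9, 8, 7, 4, 5, 12, 12, 12, 14]

def sum_of_squares(values):
    """Sum of squared deviations of the values from their own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

n_a, n_b = len(group_a), len(group_b)
df = n_a + n_b - 2

# Pooled variance: combined sum of squares divided by the degrees of freedom.
pooled_var = (sum_of_squares(group_a) + sum_of_squares(group_b)) / df

# Standard error of the difference between the two means.
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t = (mean(group_a) - mean(group_b)) / std_error
print(f"t = {t:.2f}, df = {df}")
```

The obtained t is then compared with the tabled critical value for the same degrees of freedom.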
SECTION-B
Answer the following questions in about 400 words (wherever applicable) each.

Q. 4. Compute mean, median, mode for the following data

23,34,45,43,32,23,43,23,43,45,56,65,67,77,34,23,12,23,45,43,12,23,23,23,23,23,34,34,
45,65,67,65,54,33,56,76,78,76,54,45,43,33,44,56,54,43,34
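Ans.
\[ \text{Mean} = \frac{\sum x}{n} = \frac{2015}{47} = 42.87 \]
Median: Arranged in ascending order, the scores (with their frequencies) are 12 (2), 23 (10), 32 (1), 33 (2), 34 (5), 43 (6), 44 (1), 45 (5), 54 (3), 56 (3), 65 (3), 67 (2), 76 (2), 77 (1), 78 (1).
\[ \text{Position of the median} = \frac{n + 1}{2} = \frac{47 + 1}{2} = 24\text{th item} \]
The cumulative frequency is 20 up to the score 34 and 26 up to the score 43, so the 24th item is 43.
Median = 43
Mode = 3 × Median – 2 × Mean
Mode = 3 × 43 – 2 × 42.87
Mode = 129 – 85.74
Mode = 43.26
This formula gives an empirical estimate of the mode. By inspection, the score that occurs most often is 23 (10 times), so the observed mode is 23.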
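The three measures can be cross-checked with Python's standard `statistics` module. The sketch below uses the 47 scores from the question; `empirical_mode` is an illustrative name for the 3 × Median – 2 × Mean estimate.

```python
from statistics import mean, median, mode

data = [23, 34, 45, 43, 32, 23, 43, 23, 43, 45, 56, 65, 67, 77, 34, 23,
        12, 23, 45, 43, 12, 23, 23, 23, 23, 23, 34, 34,
        45, 65, 67, 65, 54, 33, 56, 76, 78, 76, 54, 45, 43, 33, 44, 56,
        54, 43, 34]

data_mean = mean(data)
data_median = median(data)   # sorts the data internally before locating the middle item
data_mode = mode(data)       # the most frequently occurring score

# Empirical estimate of the mode from the median and the mean.
empirical_mode = 3 * data_median - 2 * data_mean

print(f"n = {len(data)}, mean = {data_mean:.2f}, median = {data_median}")
print(f"observed mode = {data_mode}, empirical mode = {empirical_mode:.2f}")
```

Note that the median must be located in the sorted data, not in the order in which the scores happen to be listed.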
Q. 5. Explain linear and nonlinear relationship with suitable diagram and discuss in detail the direction of
correlation with suitable examples.


Ans. Correlation: Linear and Non-linear Relationship: The relationship between two variables can be linear or non-linear.
Linear Relationship
In a linear relationship, the two variables are connected by a straight line: the independent variable is multiplied by a slope coefficient and a constant is added, which gives the dependent variable. Plotted on a graph, this is a straight line. Non-linear relationships cannot be represented by a straight line on the graph; they can be cubic, quadratic, polynomial, exponential, etc. Pearson’s product-moment correlation and Spearman’s rho are linear correlations.
The linear relationship can be expressed in the following equation:
Y = α + βX ...(i)
where
Y is the variable on the y-axis (often called the dependent variable),
α (alpha) is a constant, the Y-intercept of the straight line,
β (beta) is the slope of the line and
X is the variable on the x-axis (often called the independent variable).
The figure below shows the scatter of the data given in the table, together with the best-fit line for the data. It shows a linear relationship between two variables, extroversion and number of friends.

[Figure: scatter plot with best-fit line. X-axis: Extroversion (X); Y-axis: Number of Friends (Y); the intercept α and the slope β are marked on the line.]
Linear relationship between extroversion and number of friends
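To make equation (i) concrete, the sketch below fits a best-fit line by ordinary least squares. The extroversion and friends values are hypothetical numbers invented for this example (the original table is not reproduced here), and least squares is assumed as the fitting method.

```python
# Hypothetical (extroversion, number of friends) data, invented for illustration.
extroversion = [2, 4, 5, 7, 9]
friends = [3, 5, 6, 9, 10]
n = len(extroversion)

mean_x = sum(extroversion) / n
mean_y = sum(friends) / n

# Least-squares slope: sum of products of deviations over sum of squared X deviations.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(extroversion, friends))
s_xx = sum((x - mean_x) ** 2 for x in extroversion)

beta = s_xy / s_xx                 # slope of the line
alpha = mean_y - beta * mean_x     # Y-intercept of the line

print(f"Y = {alpha:.2f} + {beta:.2f} X")
```

A positive β means the line rises from left to right; the fitted line always passes through the point (mean of X, mean of Y).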

Non-Linear Relationship
A non-linear relationship is also called curvilinear. The relationship between stress and performance, known as the Yerkes-Dodson Law, is an example of a non-linear relationship. Performance is poor when stress is too low or too high, and it improves when stress is moderate. This relationship is plotted in the following figure:

Curvilinear relationship between stress and performance. The performance is poor at extremes and improves with moderate stress.
Curvilinear relationships can be of various types: cubic, quadratic, polynomial and exponential.


Q. 6. Compute Kendall’s tau for the following data:


A B C D E
Data X 5 1 2 3 4
Data Y 4 3 1 2 5
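Ans. Arrange the subjects in ascending order of X. Then, for each Y value, count the Y values to its right that are larger (concordant, C) and those that are smaller (discordant, D).
Subject   B   C   D   E   A    ΣC   ΣD
Data x    1   2   3   4   5
Data y    3   1   2   5   4
3             D   D   C   C    2    2
1                 C   C   C    3    0
2                     C   C    2    0
5                         D    0    1
                     ∑∑C = 7   ∑∑D = 3
\[ \tau = \frac{n_C - n_D}{\frac{n(n-1)}{2}} = \frac{7 - 3}{\frac{5(5-1)}{2}} = \frac{4}{10} = 0.40 \]
Thus tau = 0.40, a moderate positive association between the two sets of ranks.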
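Counting concordant and discordant pairs by hand is error-prone, so a short check is useful. This sketch compares every pair of subjects directly, which gives the same counts as first sorting on X; it assumes there are no tied ranks, as is the case in this data.

```python
from itertools import combinations

x = [5, 1, 2, 3, 4]   # Data X for subjects A to E
y = [4, 3, 1, 2, 5]   # Data Y for subjects A to E

concordant = 0
discordant = 0
for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
    # A pair is concordant when X and Y differ in the same direction.
    product = (x1 - x2) * (y1 - y2)
    if product > 0:
        concordant += 1
    elif product < 0:
        discordant += 1

n = len(x)
tau = (concordant - discordant) / (n * (n - 1) / 2)
print(f"C = {concordant}, D = {discordant}, tau = {tau:.2f}")
```

With n = 5 there are 10 pairs in all, so the concordant and discordant counts must add up to 10.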
Q. 7. Discuss the application of Chi-square test.

Ans. Application of Chi-square Test: We apply chi-square to categorical data comprising unordered qualitative categories like colours and political affiliation. The three important applications of the χ² test are:
(i) Test of goodness of fit,
(ii) Test of independence and
(iii) Test for homogeneity.

1. Test of Goodness of Fit
A decision-maker often needs to know whether an actual sample distribution matches a known theoretical distribution such as the Poisson, binomial or normal distribution. Researchers compare the observed frequencies characterising the several categories of the distribution with the frequencies expected under the hypothesis.
The goodness of fit test is a statistical test of how well the given data support an assumption about the distribution of a population or of a random variable of interest, that is, how well an assumed distribution fits the data.
In this test, we first hypothesise a theoretical distribution and then test whether the sample data could have come from a population with that distribution. The observed frequencies come from the sample and the expected frequencies from the hypothesised theoretical distribution. The test focuses on the difference between the observed and expected frequencies.
An example illustrates this point. Suppose we want to study four brands of glucose biscuits to find out whether there is a difference in the proportion of consumers who prefer the taste of each. The null hypothesis is that consumers prefer the four brands equally, so that the probability of preferring each brand is 1/4 = .25. In a sample of 100 people, 25 would be expected to prefer each brand.
Other hypotheses could be that the first brand is preferred and the other three are equally less preferred, or that the first two brands are preferred more than the other two.
The chi-square test finds out whether the relative frequencies observed in the several categories of our sample frequency distribution agree with the set of frequencies hypothesised.
To test the null hypothesis, we may let subjects taste each of the biscuits and state their preference, concealing the brand names and the order of presentation. Suppose we randomly select 100 people and obtain the observed frequencies of preference presented in the table below:


                       Brand A   Brand B   Brand C   Brand D
Observed Frequencies   20        18        30        32
The expected frequency for each brand is calculated by multiplying the proportion hypothesised to characterise that brand in the population by the sample size. The null hypothesis states that the expected proportion preferring each brand is 1/4, so the expected frequency for each biscuit is (1/4) × 100 = 25.
The observed frequencies will usually differ from the expected frequencies. We cannot simply add up these differences, because some are positive and some are negative and they would cancel each other out. The chi-square statistic therefore squares each difference and divides it by the expected frequency before adding. If the chi-square value is more than the critical value at the .05 or .01 level, then we reject the null hypothesis, and if it is less, we retain the null hypothesis.
The size of the difference between observed and expected frequencies determines the obtained value of chi-square. If the difference between observed and expected frequencies is large, the chi-square value will be large, and vice versa.
The value of chi-square also depends on the size of the discrepancy relative to the magnitude of the expected frequency. For example, suppose we toss a coin 12 times and 1000 times. Under the null hypothesis, the expected numbers of heads are 6 and 500 respectively. If we get 11 heads in 12 tosses, the difference is 5. If we get 505 heads in 1000 tosses, the difference is also 5, but it is far smaller relative to the expected frequency of 500 and so contributes much less to chi-square.
The number of differences involved in the calculation also affects chi-square. For example, if we use 2 brands of biscuits instead of four, there are fewer differences to influence the chi-square value, and the degrees of freedom change. With 4 differences the degrees of freedom are 3, and with 2 differences the degrees of freedom are 1. With 4 degrees of freedom, we reject the null hypothesis at the .05 level if the chi-square value is more than 9.48. A chi-square value of 7.81 with 4 degrees of freedom is below this, so we retain the null hypothesis at the .05 level.
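The biscuit example can be worked through numerically. The sketch below computes chi-square for the observed preferences of the four brands under the null hypothesis of equal preference; the critical value 7.815 (df = 3, .05 level) is taken from the standard chi-square table, and the variable names are illustrative.

```python
# Observed preferences for the four brands (sample of 100 people).
observed = {"A": 20, "B": 18, "C": 30, "D": 32}

n = sum(observed.values())
expected = n / len(observed)   # equal preference: 100 / 4 = 25 per brand

# Chi-square: sum of (observed - expected)^2 / expected over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1         # number of categories minus one
CRITICAL_05_DF3 = 7.815        # tabled critical value for df = 3 at the .05 level

reject_null = chi_square > CRITICAL_05_DF3
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```

Here chi-square works out to 5.92, which is below the critical value, so on these figures the hypothesis of equal preference would be retained.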

2. Test of Independence
In social science research, the chi-square test has a much broader use: testing whether one observed frequency distribution differs significantly from another observed frequency distribution. Here the chi-square test is used for the analysis of bivariate frequency distributions.
For example, a researcher wants to survey the attitudes of high school students to find out how much importance they give to getting a college degree. She questions a sample of 60 senior high school students, asking whether they believe that college education is becoming less important, more important, or staying the same, and whether boys respond differently from girls.
In a cross-tabulation, both nominal and ordinal variables can be presented. The focus is on the difference between groups, here between boys and girls, in terms of the dependent variable, here their opinion about the importance of college education.
With this objective, the data (observed frequencies) are classified in a bivariate distribution whose categories are mutually exclusive. The data are presented in the following table:
         More Important   Less Important   About the Same
Boys     25               6                8
Girls    10               4                7
This type of distribution is called a contingency table. Taking the two variables to be independent of each other in the population, the chi-square test compares the observed cell frequencies with those expected under the null hypothesis of independence. If the discrepancy is small, chi-square will be small, suggesting that the two variables are independent.
In such cases, the chi-square test is used to study the significance of the difference between the frequencies in two populations. The null hypothesis for the chi-square test states that the populations do not differ in the frequency of occurrence of a given characteristic.
If the null hypothesis is false, the chi-square value will tend to be larger than when the null hypothesis is true.
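The test of independence can be illustrated with the boys and girls contingency table. In this sketch each expected cell frequency is computed as (row total × column total) / grand total; the critical value 5.991 (df = 2, .05 level) comes from the standard chi-square table, and the variable names are illustrative.

```python
# Observed frequencies from the contingency table.
columns = ["More Important", "Less Important", "About the Same"]
observed = {
    "Boys": [25, 6, 8],
    "Girls": [10, 4, 7],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values())
              for j in range(len(columns))]
grand_total = sum(row_totals.values())

chi_square = 0.0
for group, counts in observed.items():
    for j, o in enumerate(counts):
        # Expected frequency under independence of sex and opinion.
        e = row_totals[group] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(columns) - 1)   # (rows - 1) x (columns - 1)
CRITICAL_05_DF2 = 5.991                         # tabled value for df = 2 at .05

retain_null = chi_square <= CRITICAL_05_DF2
print(f"chi-square = {chi_square:.3f}, df = {df}, retain H0: {retain_null}")
```

On these figures chi-square is well below the critical value, so the data give no evidence that boys and girls differ in their opinion.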


3. Test of Homogeneity
We use the test of homogeneity when we want to verify whether different populations are homogeneous with regard to some attribute of interest. For example, a cookie producer is bringing out a new product. To develop a marketing strategy, he wants to determine whether the product will appeal to a particular age group or equally to all age groups of the population. The null hypothesis is that all age groups have the same taste for the new product. Thus, the test of homogeneity tests the null hypothesis that different populations are homogeneous with regard to that attribute.
The following points make this test different from the test of independence:
(i) In this test, we want to know whether different samples are drawn from the same population, rather than whether two characteristics are independent.
(ii) Two or more independent samples are drawn, one from each population, rather than a single sample.
(iii) For this test, a random sample is first drawn from each population, and the proportion in each category is determined afterwards. However, the procedure for testing the hypothesis is the same as for the test of independence.
Q. 8. What is normal distribution? Describe deviation from normality.

Ans. The Normal Distribution: The normal distribution is very useful in statistical theory and practice.
The formula of the normal curve was discovered in 1733 by the French mathematician Abraham de Moivre. The normal curve was rediscovered independently by Gauss and Laplace. Gauss’s interest was in problems of astronomy, which led to the consideration of a theory of errors of observation. The normal curve is also called the bell-shaped curve and the Gaussian curve. In the mid-19th century, the applicability of the normal curve was promoted by Quetelet, who was the first to believe that the normal curve could be extended to problems of anthropology, sociology and human affairs.
In the later 19th century, Sir Francis Galton started the first serious study of individual differences and found that most of the physical and psychological traits of human beings conformed to the normal curve. He thereby extended the applicability of the normal curve.
A normal curve graphically represents the normal distribution. In a normal distribution, the majority of the cases fall in the middle of the scale and a few cases fall at both extremes of the scale.
If we test the intelligence of a group of students, the greatest proportion of IQ scores will lie between 85 and 115. Few will score more than 145 and few will score lower than 55.
If we measure physical human characteristics, most adults will be 5 to 6 feet tall, with very few shorter than 5 feet or taller than 6 feet.
The normal probability distribution is a continuous probability distribution. A variable is normally distributed when the laws of chance govern its occurrence.
The normal curve is a theoretical or ideal model developed from a mathematical equation and not from research or gathered data.
The normal curve is based on the law which states that the greater the deviation of an event from the mean value in a series, the less frequently it occurs. In the social sciences we study samples and not the entire population. Thus, a slightly deviated or distorted bell-shaped curve is also accepted as the normal curve.
DEVIATION FROM NORMALITY
Some of the variables in social sciences deviate from the normal distribution. This deviation happens in two ways: Skewness and Kurtosis.
Skewness
Skewness means lack of symmetry. A normal curve has a balance between the right and left halves of the curve; it is always symmetrical, and the mean, median and mode fall at the same point. In a skewed curve, the mean and median fall at different points and the balance is shifted to the left or to the right. Look at the figure given below:


The following are the properties of a normal curve:
(i) The normal curve is a possible model of a probability distribution.
(ii) The normal curve is not a single curve but a family of infinitely many possible curves, all described by the same algebraic expression.
Normal curves are similar in shape and are bilaterally symmetrical. The tails of a normal curve never touch the X-axis. Most of the area under a normal curve falls within a limited range of the number line. The total area under a normal curve is 1.00, so the area in each half of the distribution is 0.5.
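These area properties can be checked numerically with `statistics.NormalDist` from the Python standard library. The sketch assumes an IQ scale with mean 100 and standard deviation 15, which is consistent with the 85 to 115 range quoted in the answer but is not stated there explicitly.

```python
from statistics import NormalDist

# Assumption: IQ scores follow a normal curve with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

# Area to the left of the mean: half of the total area of 1.00.
below_mean = iq.cdf(100)

# Area between 85 and 115, i.e., within one SD of the mean.
within_one_sd = iq.cdf(115) - iq.cdf(85)

# Area below 55 or above 145, i.e., beyond three SDs from the mean.
beyond_three_sd = 1 - (iq.cdf(145) - iq.cdf(55))

print(f"below the mean: {below_mean:.2f}")
print(f"between 85 and 115: {within_one_sd:.4f}")
print(f"below 55 or above 145: {beyond_three_sd:.4f}")
```

About two-thirds of the area lies within one standard deviation of the mean, while only a tiny fraction lies beyond three, which matches the description of most cases falling in the middle of the scale.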
SECTION-C
Answer the following in about 50 words each.

Q. 9. Statistics
Ans. Meaning of Statistics: The word statistics is derived from the Latin word ‘status’ or the Italian ‘Statista’, which means statesman. It was used in the 18th century by Professor Gottfried Achenwall. These words were used for the political state during the early period. ‘Statista’ was used for keeping census records or data on a state’s wealth. Its meaning and usage have gradually changed.

Statistics conveys different meanings in the singular and plural senses.
Statistics in Singular Sense
In the singular sense, it is a branch of science that deals with classification, tabulation and analysis of numerical facts and makes decisions on that basis. It includes statistical methods for collection, classification, analysis and interpretation of data.

Statistics in Plural Sense
In the plural sense, statistics means quantitative information or available ‘data’. For instance, information on the population or demographic features of a country, or the enrolment of students in a college, are statistics.
Webster defines statistics as the classified facts on the conditions of the people in a State, facts which can be presented in numbers, in tables of numbers or in a classified arrangement.
In the plural sense, Horace Secrist describes statistics “as aggregates of facts affected to a marked extent by multiplicity of causes numerically expressed, enumerated or estimated as per the reasonable standard of accuracy, collected in a systematic manner for a pre-determined purpose and placed in relation to each other.” Thus, statistics should have the following characteristics:
1. They must be aggregate of facts i.e., no individual figure is regarded as statistics.
2. They are affected by a multiplicity of factors. For example, the yield of a crop is affected by various circumstances, i.e., soil, seed, rainfall, temperature, etc.
3. They must be enumerated or estimated according to reasonable standards of accuracy. However, the degree of accuracy depends on the nature of the data. Whatever standard of accuracy is once adopted, it should be maintained throughout the whole study.
4. They must be collected in a systematic manner for a predetermined purpose i.e., the data must be properly
arranged.
5. They must be placed in relation to each other i.e., the facts should be comparable regarding time, space or
condition.
Thus, we can say that all statistics are numerical statements of facts, but all numerical statements of facts cannot
be called statistics.
Q. 10. Measures of dispersion.
Ans. Measures of Dispersion: Measures of dispersion describe how similar a set of scores are to each other.
The more similar the scores are to each other, the lower the measure of dispersion will be. The less similar the scores
are to each other, the higher the measure of dispersion will be. In general, the more spread out a distribution is, the
larger the measure of dispersion will be.


A measure of dispersion is useful because even if we know the mean, median or mode, we do not have a whole picture of a set of data. We cannot tell how the scores or measurements are arranged in relation to the center. Two sets of data with equal mean or median may differ with regard to their variability. Measures of these variations help in understanding how far the observations are scattered from each other. Range, average deviation, quartile deviation, variance and standard deviation are the main measures of dispersion.
Range: Range is a measure of dispersion. It is defined as the difference between the highest and the lowest
values. It provides an indication of statistical dispersion. It is most useful in representing the dispersion of small data
sets. It is designated by ‘R’.
It does not provide any information on the values in between the extreme values. A smaller value indicates lesser
dispersion among the scores, while a large value of range shows greater dispersion.
Merits: The range is the only measure that is technically meaningful if the data are at the ordinal level. If the distribution is not much skewed, the range can be a good measure.
Average Deviation: The average deviation is one of several indices of variability used to characterize the dispersion among the measures in a given population. To calculate the average deviation of a set of scores, it is first necessary to compute their mean and then specify the distance between each score and that mean without regard to whether the score is above or below the mean. The average deviation is defined as the mean of these absolute values. Average deviation is denoted as AD. When we sum all the deviations from the mean, the positive (+) or negative (−) signs are not taken into account.
Merits: It is a better measure for comparing the form of different distributions. As compared to standard deviation, it is less affected by extreme values.
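The range and the average deviation can be sketched in a few lines. The two groups of scores below are hypothetical and were chosen only so that both have the same mean (50) but different spread; the function names are our own.

```python
def value_range(scores):
    """Range R: difference between the highest and the lowest value."""
    return max(scores) - min(scores)

def average_deviation(scores):
    """AD: mean of the absolute deviations of the scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)

# Hypothetical scores: both groups have a mean of 50.
group_a = [48, 49, 50, 51, 52]
group_b = [30, 40, 50, 60, 70]

for name, group in (("A", group_a), ("B", group_b)):
    print(name, value_range(group), average_deviation(group))
```

Group B gives the larger range and the larger average deviation, which shows that two sets of data with an equal mean may still differ in their variability.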

Q. 11. Point estimation and interval estimation.
Ans. Point Estimation is the use of sample data to calculate a single value which is to serve as a best guess or best estimate of an unknown (fixed or random) population parameter.
More formally, it is the application of a point estimator to the data. Point estimation should be contrasted with interval estimation: such interval estimates are typically either confidence intervals in the case of frequentist inference, or credible intervals in the case of Bayesian inference.
Interval Estimation is the use of sample data to calculate an interval of possible values of an unknown population parameter, in contrast to point estimation, which gives a single number. In interval estimation, we are concerned with confidence, so instead of trying to get a point estimate of the parameter we try to establish that it lies within some region with some prescribed probability.
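The difference between the two kinds of estimate can be illustrated as follows. This is only a sketch: the scores are hypothetical, the interval is an approximate 95% confidence interval built with the normal value z = 1.96, and for a sample this small a t value would usually be preferred.

```python
import math
import statistics

# Hypothetical sample of 10 scores.
scores = [52, 48, 55, 60, 47, 51, 58, 49, 53, 57]

n = len(scores)
point_estimate = statistics.mean(scores)                   # single best guess
standard_error = statistics.stdev(scores) / math.sqrt(n)   # sample SD / sqrt(n)

z = 1.96  # assumed normal multiplier for about 95% confidence
lower = point_estimate - z * standard_error
upper = point_estimate + z * standard_error

print("Point estimate:", point_estimate)
print("Interval estimate:", round(lower, 2), "to", round(upper, 2))
```

The point estimate is one number (the sample mean), while the interval estimate is a range of values around it.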

Q. 12. Level of significance

S
Ans. Level of Significance
The level of significance (α) is the low probability of obtaining at least as extreme results given that the null

A S U
hypothesis is true. It is an integral part of statistical hypothesis testing where it helps investigators to decide if a null
hypothesis can be rejected. The results of the experiment are considered significant (p ≤ α) when the probability of

G
chance occurrence of observed results up to and below which the probability ‘p’ of the null hypothesis being correct
is considered too low. Observed results are not considered significant (p > α) if p exceeds α, the null hypothesis (H0)
cannot be rejected because the probability of it being correct is considered quite high. The researcher decided the
selection of level of significance. Generally, 5% or 1%, i.e., α = .05 or α = .01 is taken level of significance. The
results are considered significant if out of 100 such trials only 5 or less number of the times the observed results may
arise from the accidental choice in the particular sample by random sampling. If null hypothesis is rejected at .05
level, it means that the results are considered significant so long as the probability ‘p’ of getting it by mere chance of
random sampling works out to be 0.05 or less (p < .05).
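The decision rule can be written as a one-line comparison. The sketch below assumes that the probability p has already been obtained from some statistical test; the function name and the p value of .03 are hypothetical.

```python
def is_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the null hypothesis is rejected at the given level (p <= alpha)."""
    return p_value <= alpha

p = 0.03  # hypothetical probability from a test
print(is_significant(p, alpha=0.05))  # rejected at the .05 level
print(is_significant(p, alpha=0.01))  # not rejected at the .01 level
```

The same result can therefore be significant at the .05 level and not significant at the stricter .01 level.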
Q.13. Pictograms.
Ans. Pictograms: A pictogram uses pictures to represent data. The number of pictures or the size of the picture is proportional to the values of the different magnitudes to be presented. For example, to show the population of human beings, human figures are used. One human figure can be used to represent one crore people.
Q. 14. Scatter diagram.
Ans. Scatter Diagram: A scatter diagram is a plot of pairs of values on a graph. It is also called a scattergram, scatter graph or scatterplot. It shows the relationship between two variables.


How to Make Scatter Diagram?


Step-1: We need a graph on which we draw the x-axis (horizontal) and the y-axis (vertical). One variable goes on the x-axis and the other on the y-axis. For correlation analysis, either variable can be plotted on either axis. However, if the two variables share a cause-effect relationship, the causal variable should be on the x-axis and the effect variable on the y-axis. Correlation does not necessarily imply causality. In our example, we plot the number of hours spent in studies on the x-axis and the marks obtained on the y-axis.
Step-2: We decide the range of values. The lowest score can be zero, but if the values start well above zero, the axis can start from a value higher than zero. In our example, 55 is the lowest mark, so we start the y-axis from 50. The axis continues up to the highest value.
Step-3: We get the pairs of values from the given data. For example, Anita’s score on the first variable (number of hours spent in studies) is 2, and the corresponding value on the other variable (marks obtained) is 55. So the pair is 2 and 55. We have four other pairs.
Step-4: We locate the pairs in the graph. For each pair, we find the intersection point of x and y in the graph and mark it with a clear dot. Then we take the second pair, and so on. For example, Javed’s six hours are plotted with his 60 marks, and so is the case with the others.
The table below has experience and salaries for 10 employees.

Table-Experience in months and salary

Employee    Experience (in months)    Salary (in Rupees)
1           5                         1000
2           10                        2000
3           15                        3000
4           20                        4000
5           25                        5000
6           30                        6000
7           35                        7000
8           40                        8000
9           45                        9000
10          50                        10000

The data shows a positive relationship, since the salary of the employee increases with increase in experience. The data has been plotted in the following graph:

Fig-Scatter plot shows relationship between experience and salary
The table below provides extroversion scores and the number of friends. We will plot a scattergram with this data.


Table-Extroversion and No. of Friends

Extroversion Scores Number of Friends


10 7
16 1
12 2
14 5
20 7
12 4

The data shows that there is no one-to-one relationship between extroversion and number of friends. It implies imperfection in the relationship. There is no trend of the number of friends increasing with increase in extroversion. The data is plotted in the following figure:

Fig-Relationship between Extroversion and Number of Friends

The Table below presents the data for intelligence and mistakes on reasoning task. We will plot a scatter plot on this information.

Table 4: Intelligence and mistakes on reasoning task

Individual    Intelligence (IQ)    Mistakes on reasoning task
1             100                  7
2             105                  6
3             110                  5
4             115                  4
5             120                  3
6             125                  2
7             130                  1
8             135                  0

The relationship between intelligence and the number of mistakes on reasoning tasks is plotted in the following
scatter diagram:


Fig.-Intelligence and no. of mistakes on reasoning task.

Here the relationship is different from the one in other figures. Here as intelligence is increasing, the mistakes
appear to be reducing.
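The direction and strength of the relationships seen in the three scatter diagrams can also be expressed as a number. The sketch below uses the data of the three tables above and computes the Pearson product-moment correlation; the choice of this coefficient and the function name are our own assumptions, since the answer itself only describes the plots.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Data copied from the three tables above.
experience = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
salary = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000]
extroversion = [10, 16, 12, 14, 20, 12]
friends = [7, 1, 2, 5, 7, 4]
iq = [100, 105, 110, 115, 120, 125, 130, 135]
mistakes = [7, 6, 5, 4, 3, 2, 1, 0]

r_salary = pearson_r(experience, salary)     # perfect positive relationship
r_friends = pearson_r(extroversion, friends) # close to zero: no clear trend
r_mistakes = pearson_r(iq, mistakes)         # perfect negative relationship

print(round(r_salary, 2), round(r_friends, 2), round(r_mistakes, 2))
```

A value near +1 corresponds to dots rising along a straight line, a value near -1 to dots falling along a straight line, and a value near 0 to dots with no clear trend.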

Q. 15. Outliers
Ans. Outliers: An extreme score on one or both of the variables has a distorting impact on the correlation value. Outliers affect the strength and degree of the correlation. Suppose we compute the correlation between height and weight, and one individual has a low score on weight and a high score on height. Without the outlier, the correlation is 0.95. The presence of the outlier drastically reduces the correlation coefficient to 0.45. Figure 2 shows the impact of an outlier.
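The effect of a single outlier can be demonstrated with a small calculation. The heights and weights below are hypothetical and are not the data behind the figure, so the coefficients will differ from 0.95 and 0.45; the sketch only shows the direction of the effect, using the Pearson correlation as an assumed measure.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: height in cm and weight in kg for six people.
height = [150, 155, 160, 165, 170, 175]
weight = [50, 54, 59, 63, 68, 72]

r_without = pearson_r(height, weight)

# Add one outlier: a very tall person with a very low weight.
r_with = pearson_r(height + [190], weight + [45])

print(round(r_without, 2), round(r_with, 2))
```

One extreme pair is enough to pull a very high correlation down sharply, which is why scatter diagrams should be inspected for outliers before a correlation is interpreted.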

Q. 16. Point-Biserial Correlation.
Ans. Point-Biserial Correlation (rpb)
The point biserial correlation coefficient (rpb) is a correlation coefficient used when one variable is dichotomous
and the other variable is continuous. For example, if we want to correlate marital status with satisfaction with life,
we take marital status at two levels–married and unmarried. The satisfaction can be measured by using a standardised
test of satisfaction with life. Satisfaction with life can be taken as a continuously measured variable and marital
status as a dichotomous variable. Here we will use Point-Biserial Correlation (rpb).
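A minimal sketch of the calculation is given below. It assumes the usual textbook formula rpb = ((M1 - M0)/s) × √(pq), where M1 and M0 are the means of the two groups, s is the standard deviation of all scores (dividing by N), and p and q are the proportions of cases in the two groups. The coding (1 = married, 0 = unmarried) and the satisfaction scores are hypothetical.

```python
import math

def point_biserial(groups, scores):
    """Point-biserial r between a 0/1 variable and a continuous variable."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / n   # proportion coded 1
    q = 1 - p           # proportion coded 0
    mean_all = sum(scores) / n
    # Standard deviation of all scores, dividing by N
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_1 = sum(ones) / len(ones)
    mean_0 = sum(zeros) / len(zeros)
    return (mean_1 - mean_0) / sd_all * math.sqrt(p * q)

# Hypothetical data: 1 = married, 0 = unmarried; satisfaction with life scores.
married = [1, 1, 1, 1, 0, 0, 0, 0]
satisfaction = [24, 28, 22, 26, 20, 25, 18, 21]

r_pb = point_biserial(married, satisfaction)
print(round(r_pb, 2))
```

A positive value here would mean that the group coded 1 has the higher mean satisfaction; the sign depends only on which level is coded 1.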
Q. 17. Probability.
Ans. Definitions of Probability: Probability underlies the process of testing hypotheses through analysis of data. Beri defines probability as the chance that a particular event will happen. For example, if a coin is tossed, we may ask what the chance is that a head appears.


Levin and Fox define probability as “the relative likelihood of occurrence of any given outcome or event.”
Probability of an event = the number of ways in which the event can occur/the total number of possible outcomes.
For example, if there are five men and seven women in a room, the probability that the next person coming out of the room is a man would be 5 in 12.
Probability of a man coming out next = number of men in the room/total number of men and women in the room = 5/12 = .42
The converse rule of probability gives the probability of an event not occurring, which is 1 minus the probability of the event. Here, the probability that the next person is not a man (i.e., is a woman) = 1 - 5/12 = 7/12 = .58
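The room example can be worked through with exact fractions. This is a small sketch using the figures given in the example (five men and seven women); the variable names are our own.

```python
from fractions import Fraction

men, women = 5, 7
total = men + women

p_man = Fraction(men, total)   # 5/12
p_not_man = 1 - p_man          # converse rule: 1 - 5/12 = 7/12

print(p_man, round(float(p_man), 2))
print(p_not_man, round(float(p_not_man), 2))
```

The probability of a man coming out next and the probability of a man not coming out next add up to 1, since one of the two must happen.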
Types of Probability
Probability can be of two types: theoretical and empirical.
Theoretical probabilities show the operation of chance and the assumptions that we make about the events. For example, when we toss a coin the probability of getting a head is .5 (1/2 = .5). The probability of giving the correct answer to a five-option multiple choice question by guessing is .20 (1/5).
Empirical probabilities depend on observation to determine their value. For example, the probability that the Indian hockey team wins a match is about .6 (6 out of 10 matches). A probability of zero means the event is impossible and a probability of 1.00 means certainty. Probability is usually expressed as a decimal between 0 and 1.00 rather than as a percentage.

Probability Distribution
A probability distribution is like a frequency distribution. The only difference is that a frequency distribution is based on empirical data and a probability distribution is based on probability theory. In a probability distribution, the possible values of a variable are listed first and we calculate the probability associated with each. Three commonly used probability distributions are the Binomial distribution, the Poisson distribution and the Normal distribution.

Q. 18. Standard error.
Ans. Standard Error
Standard error measures sampling error and error of measurement. Suppose we know the true mean of the population. If we randomly select 100 representative samples from the population and compute the mean of each, the standard deviation of these sample means is called the standard error of the mean. The following formula is used to get the standard error of the mean:
SEm or σm = σ/√N
where,
• σ is the standard deviation of the scores
• N is the number of cases in the sample.
• If the standard error of the mean is large, it means considerable sampling error.
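The standard error of the mean is the standard deviation divided by the square root of the number of cases, which can be sketched as below. The standard deviation of 15 and the sample sizes of 100 and 400 are hypothetical values chosen only for illustration.

```python
import math

def standard_error_of_mean(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of N."""
    return sd / math.sqrt(n)

sem_100 = standard_error_of_mean(15.0, 100)  # hypothetical SD = 15, N = 100
sem_400 = standard_error_of_mean(15.0, 400)  # same SD, four times as many cases

print(sem_100, sem_400)
```

Making the sample four times larger halves the standard error, so larger samples give sample means that vary less around the population mean.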
