
DATA

Figures do not lie, but liars do figure!

PROCESSED THROUGH COMPUTER AND STATISTICAL METHODS

INFORMATION

Lots of data are generated by various activities every day

Statistics today is not an ornament but a necessary tool!

Data does not mean numerical facts alone; it includes text, pictures and voice as well!
Someone has to compile that large volume of data and turn it into valuable information.
Statistics and computers help in this process.

Data flows from sources like:

Village Level
Mandal Level
District Level
State Level
Region Level
Country Level
Global Level

Data flows like flood water! A database is like a check dam!

A database is a collection of records (or data files) combined and treated as a unit for information retrieval.

Statistical Databases!

We have statistical databases on various aspects like:

Food grains
Blood banks
Tax payers
Agricultural output
Share market
and many more.

The DATA should be converted into information (reports) by applying Data Analysis Tools

What is Data Analysis?

Making Figures Speak (the truth!)

Examining data for its relevance
Preparation of tables
Graphic display of information
Estimating the unknown (example: agricultural output by crop-cutting experiments)
Establishing a functional relationship between causes and effects
Computing growth rates
Understanding trends and making forecasts
Preparing a document stating the methodology and interpreting the results
and many more!

How to do?

The Common and Old Method

Physical counting of cases from data sheets
Hand calculations
Reference to statistical books for formulae
Bypassing complex calculations and reporting the easy-to-do things alone!

The Contemporary Method


Get data into the computer
Use a statistical software package
Prepare the document using a word processor

A survey on health insurance

A new health insurance scheme is introduced by a company for its employees.
The management wishes to know the reaction of its employees to the new scheme.
Opinions were collected from 50 employees on several aspects: age, gender, marital status, education level, present arrangements for health check-ups, monthly income and concept rating.

A questionnaire has been designed and used for collecting data

Collection of data with suitable coding

Opinions were sought on a five-point scale (multiple choice, tick one only).

Coding of responses is as follows:

Response                 Code
Extremely interested     5
Interested               4
Indifferent              3
Not interested           2
Not at all interested    1

Coding for personal factors

Factor                 Category                          Code
Age                    (initially no coding)             actual years
Gender                 Male                              M
                       Female                            F
Marital Status         Married                           M
                       Single                            S
Monthly income         Less than Rs.1000                 1
                       Rs.1000 to Rs.2999                2
                       Rs.3000 to Rs.4999                3
                       Rs.5000 & above                   4
Present Arrangement    Private doctor (own expenses)     1
                       Government/Corporate hospitals    2
                       Partial reimbursement             3
                       Full reimbursement                4
Education              Below Higher Secondary            1
                       Higher Secondary                  2
                       Graduation                        3
                       Post-graduation                   4
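The coding scheme can be written down as lookup tables before any data entry begins. The following is a minimal Python sketch, not part of the original slides: the code values come from the coding tables, while the field names, the `encode_response` helper and the sample questionnaire are invented for illustration.

```python
# Code tables copied from the slides; everything else here is illustrative.
RATING_CODES = {
    "Extremely interested": 5,
    "Interested": 4,
    "Indifferent": 3,
    "Not interested": 2,
    "Not at all interested": 1,
}
INCOME_CODES = {
    "Less than Rs.1000": 1,
    "Rs.1000 to Rs.2999": 2,
    "Rs.3000 to Rs.4999": 3,
    "Rs.5000 & above": 4,
}


def encode_response(raw):
    """Return a coded copy of one filled-in questionnaire (hypothetical field names)."""
    return {
        "sno": raw["sno"],
        "age": raw["age"],        # age is kept in actual years, no coding
        "gender": raw["gender"],  # already recorded as M or F
        "income": INCOME_CODES[raw["income"]],
        "rating": RATING_CODES[raw["rating"]],
    }


# A made-up questionnaire, only to show the mapping at work.
sample = {
    "sno": 1,
    "age": 34,
    "gender": "F",
    "income": "Rs.3000 to Rs.4999",
    "rating": "Interested",
}
coded = encode_response(sample)
print(coded)
```

A misspelt category raises a `KeyError` at entry time, which is usually preferable to discovering an unknown code during analysis.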

Analyze the Data!

Analysis is based on the questions for which the data is expected to provide answers.

Some questions:

Identify how many are interested in the new scheme and how many are either indifferent or not interested.
Cross-tabulate them by gender, age, education, marital status, etc.
Is there any relationship between the income level and the type of response?
Which factors influence the adoption of the new scheme?
What else does the data say?

Data Entry: The First Step
Analysis with Software: The Second Step

The physical structure of data

The data collected from the field consists of filled-in questionnaires or sheets.
Each sheet must have a serial number.
The sheets should be converted into a data file for use on the computer.
The work can be divided into more than one file and assigned to several Data Entry Operators.
The data entry design should be well planned and common to all operators.
These data files can be pooled, if necessary, to make a project data file.

Data should be arranged as separate records, one for each individual (entity).
The data should be numeric for carrying out any analysis.
Names and other labels will not go in for analysis but can be used for reporting.
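Pooling the operators' files into one project data file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the two in-memory "files" and the `pool_files` helper are hypothetical, and the rows are made up. The serial number on each sheet is what lets us detect a sheet entered twice.

```python
import csv
import io

# Two hypothetical operator files sharing one column layout (made-up rows).
operator_a = "sno,age,gender,rating\n1,34,F,4\n2,45,M,2\n"
operator_b = "sno,age,gender,rating\n3,29,M,5\n4,51,F,3\n"


def pool_files(*texts):
    """Merge CSV texts into one list of records, rejecting duplicate serial numbers."""
    pooled = []
    seen = set()
    for text in texts:
        for row in csv.DictReader(io.StringIO(text)):
            sno = int(row["sno"])
            if sno in seen:
                raise ValueError(f"duplicate serial number {sno}")
            seen.add(sno)
            pooled.append(row)  # one record per individual
    return pooled


project = pool_files(operator_a, operator_b)
print(len(project), "records pooled")
```

With real files, each `io.StringIO(text)` would be replaced by an opened file; the common data entry design is what makes the column names line up.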

TAKING DATA FROM BOOK TO COMPUTER

Suitable coding should be defined before entering data in the computer

Software for data entry and data analysis

There are many packages for data entry, like:

FoxPro
Lotus
MS-Excel
MS-Access
Oracle
On-line formats

Packages for statistical analysis:

SPSS
SAS
MINITAB
SYSTAT

A VISIT TO EXCEL

MAKING A DATA FILE

Open Excel.
On the title bar of the Excel window, the file name appears as Microsoft Excel Book1.
The workbook usually contains three sheets named Sheet1, Sheet2 and Sheet3.
In Sheet1, start entering the data from cell A1.
Reserve the first row for column headings like Sno, Age, Gender, etc.
Key in the data row-wise or column-wise (press the ENTER key after each entry).
Save the file with a suitable name in a folder meant for this project.

A SAMPLE DATA SHEET

File Name: Food
Folder: D:\Statman

NOT THE CORRECT STYLE OF DATA ENTRY

THE RIGHT WAY!

DATA SHEET OF HEALTH INSURANCE

ANALYTICAL FEATURES IN EXCEL

Finding sums
Data sorting and filtering
Making one-dimension tables
Cross tabulations
Creating different types of graphs
Making abstracts from worksheets
Changing the styles of presenting data
Linking an Excel report to a document

SOME TIPS IN DATA HANDLING

Selecting a part of data
Sorting
Filtering
Column width
Cut, Copy & Paste
Auto Fill
Paste Special
Freeze Panes
Exporting Excel data to Word

DATA ANALYSIS PAK

A free package of simple statistical tools is available in Excel.
It is called the Data Analysis Pak (listed among Excel's add-ins as the Analysis ToolPak).
It provides for analyses like:

Summary statistics
Comparison of groups
Correlations
Regression analysis
Statistical tests of hypothesis
and many more.

Data Prepared In Word Table

SNO  NAME           GENDER  CASTE  ENGLISH  MATHS  SCIENCE
1    RAJA. M        B       SC     60       27     45
2    ANITHA. R      G       SC     55       44     36
3    NEELIMA. K     G       ST     46       54     65
4    SIVARAJAN. A   B       OC     35       47     28
5    MUTHU. B       G       OC     20       46     35
6    GOPAL. R       B       OC     54       50     45
7    BEENA. A       G       BC     63       46     64
8    ACHUTAN. S     B       BC     54       52     65
9    PRADEEP. M     B       BC     35       40     54
10   PERUMAL. S     B       OC     25       36     45
11   VARADAN. D     B       OC     28       40     38
12   DIVYA. T       G       BC     64       56     37
13   VASUMATHI. D   G       BC     37       45     54
14   ANDAL. B       B       SC     63       44     36
15   JAYA. L        G       ST     56       52     63
16   RAMAN. N       B       BC     45       48     54
17   MUREGESH. M    B       ST     50       46     68
18   GANESH. L      B       ST     35       38     65
19   SASIKALA. R    G       BC     52       50     54
20   VALLI. M       G       SC     41       55     58

It is enough to copy the Word Table and Paste in Excel!

We have got it in Excel!

Soft Skill

Can we make a table of counts (frequencies) from this data?

WHY NOT ? USE PIVOT TABLES OPTION

Skill

Make Frequency Tables!

You can make one-way and two-way frequency tables from an Excel sheet.
Use the Data menu and select the Pivot Table and Chart sub-menu.
Follow the Wizard steps.
You will get the required tables.

Frequency distribution of students by caste (one-way table)


Count of SNO
CASTE          Total
BC             7
OC             5
SC             4
ST             4
Grand Total    20

Frequency distribution of students by Caste and Gender (two-way table)


Count of SNO   GENDER
CASTE          B     G     Grand Total
BC             3     4     7
OC             4     1     5
SC             2     2     4
ST             2     2     4
Grand Total    11    9     20
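For readers who want to check the pivot-table counts outside Excel, the same one-way and two-way tables can be reproduced with a short Python sketch. This is an illustrative alternative, not the method used in the slides; the gender and caste values are copied from the 20-student data sheet in serial-number order, and the variable names are my own.

```python
from collections import Counter

# (gender, caste) for the 20 students, in serial-number order.
students = [
    ("B", "SC"), ("G", "SC"), ("G", "ST"), ("B", "OC"), ("G", "OC"),
    ("B", "OC"), ("G", "BC"), ("B", "BC"), ("B", "BC"), ("B", "OC"),
    ("B", "OC"), ("G", "BC"), ("G", "BC"), ("B", "SC"), ("G", "ST"),
    ("B", "BC"), ("B", "ST"), ("B", "ST"), ("G", "BC"), ("G", "SC"),
]

# One-way table: count of students per caste.
one_way = Counter(caste for _, caste in students)

# Two-way table: count of students per (gender, caste) pair.
two_way = Counter(students)

print("CASTE  B  G  Total")
for caste in sorted(one_way):
    boys = two_way[("B", caste)]
    girls = two_way[("G", caste)]
    print(f"{caste:<6} {boys}  {girls}  {one_way[caste]}")
print("Grand total:", sum(one_way.values()))
```

A `Counter` keyed on a tuple is the simplest stand-in for a cross tabulation; with thousands of cases the code stays the same, which is the point the slides make about hand calculation.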

Can we do this with hand calculations if there are thousands of cases? Not impossible but difficult to do!

Soft Skill

Can we make a Frequency table with given class intervals?

CERTAINLY !

USE STATISTICAL FUNCTIONS

ENGINEERING FUNCTIONS

Built-in Functions In Excel

STATISTICAL FUNCTIONS

Built-in Functions In Excel

ACQUIRE SKILL BY DOING

DEMO FOLLOWS..

Making a Frequency Table


Body length (cm) of 120 fish

16.7 16.9 14.3 13.8 16.9 15.3 15.6 15.6 12.7 19.5 16.9 12.9

12.6 13.7 18.3 13.2 15.0 18.9 18.0 15.4 14.1 14.3 12.4 13.5

15.1 16.0 18.3 13.7 17.2 14.8 15.8 12.6 12.2 16.2 15.4 15.1

13.4 14.4 16.6 18.4 14.5 16.0 15.7 15.4 16.6 15.9 17.6 14.2

16.7 15.3 13.2 17.1 13.6 18.5 20.6 17.2 17.0 16.8 16.2 15.3

17.7 16.4 17.5 13.9 16.6 13.3 13.5 15.1 15.6 15.3 14.4 14.8

14.6 12.8 16.9 20.5 13.0 19.2 16.3 14.1 14.7 17.3 18.8 15.2

18.0 11.5 15.2 13.2 17.9 16.2 15.1 13.1 18.7 13.1 13.5 14.4

15.8 13.4 14.0 14.9 18.8 14.4 14.3 15.4 18.3 12.3 14.2 16.1

14.8 16.0 17.7 17.4 17.9 17.8 10.7 13.5 13.2 17.0 14.8 18.2

Prepare a frequency table using Excel

min     max     range   interval
10.7    20.6    9.9     2

We use the Paste Function option and choose FREQUENCY.

lower limit   upper limit   upper bound (BIN)   Class      freq
10            12.0          11.9                10 - 12    2
12            14.0          13.9                12 - 14    26
14            16.0          15.9                14 - 16    43
16            18.0          17.9                16 - 18    31
18            20.0          19.9                18 - 20    16
20            22.0          21.9                20 - 22    2
Total                                                      120
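The counting rule behind FREQUENCY can be sketched in Python: each value is placed in the first class whose upper bound (BIN) is greater than or equal to it, and one extra slot collects anything above the last bound. The sketch below is an assumption-laden illustration rather than Excel's own code; the `frequency` helper is hypothetical, and to keep it short it uses only the first 12 of the 120 body lengths, so its counts are for that subset and not the full table.

```python
import bisect

# First 12 of the 120 body lengths (cm); a subset used only to keep the sketch short.
lengths = [16.7, 16.9, 14.3, 13.8, 16.9, 15.3, 15.6, 15.6, 12.7, 19.5, 16.9, 12.9]

# Upper bounds of the classes, as in the BIN column.
bins = [11.9, 13.9, 15.9, 17.9, 19.9, 21.9]


def frequency(values, upper_bounds):
    """Count values per class; a value v goes to the first bound with v <= bound."""
    counts = [0] * (len(upper_bounds) + 1)  # last slot: values above the top bound
    for v in values:
        counts[bisect.bisect_left(upper_bounds, v)] += 1
    return counts


counts = frequency(lengths, bins)
labels = ["10 - 12", "12 - 14", "14 - 16", "16 - 18", "18 - 20", "20 - 22", "above 22"]
for label, count in zip(labels, counts):
    print(f"{label:<9} {count}")
```

Replacing `lengths` with all 120 values gives the complete frequency table; the range and class interval follow from `max(lengths) - min(lengths)` in the same way as on the worksheet.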

Learn more by doing it yourself

You can also construct a Bar Chart

Class      freq
10 - 12    2
12 - 14    26
14 - 16    43
16 - 18    31
18 - 20    16
20 - 22    2
TOTAL      120

ADVANCED FEATURES

Data Analysis Pak


Body Mass Index of Tribal Groups

The t-test

Is the average BMI the same for the two groups?

Sugali   Yanadi
20.43    17.7
22.51    21.4
18.99    20.7
20.49    19.3
23.12    21
25.63    17.9
18.08    18.6
20.63    18.5
22.55    18.2
22.43    20.3
22.77
23.23

t-test output

t-Test: Two-Sample Assuming Equal Variances

                               Sugali     Yanadi
Mean                           21.73833   19.36
Variance                       4.319215   1.898222
Observations                   12         10
Pooled Variance                3.229768
Hypothesized Mean Difference   0
df                             20
t Stat                         3.090767
P(T<=t) one-tail               0.002882
t Critical one-tail            1.724718
P(T<=t) two-tail               0.005764
t Critical two-tail            2.085962
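The t Stat, df and Pooled Variance lines of this output can be reproduced by hand or with a few lines of Python. The sketch below uses the standard pooled-variance formula for two independent samples and only the standard library; the `pooled_t` helper is my own naming, and the p-values and critical values are deliberately left out because they need a t-distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

# BMI values for the two tribal groups, copied from the data slide.
sugali = [20.43, 22.51, 18.99, 20.49, 23.12, 25.63,
          18.08, 20.63, 22.55, 22.43, 22.77, 23.23]
yanadi = [17.7, 21.4, 20.7, 19.3, 21.0, 17.9, 18.6, 18.5, 18.2, 20.3]


def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances (n - 1 denominators).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t_stat = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, df, pooled_var


t_stat, df, pooled_var = pooled_t(sugali, yanadi)
print(f"pooled variance = {pooled_var:.6f}")
print(f"df = {df}")
print(f"t = {t_stat:.6f}")
```

Since the t statistic (about 3.09) exceeds the two-tail critical value of 2.085962 in the output, the hypothesis of equal average BMI is rejected at the 5% level.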

p-p Plot

WIDE RANGE OF APPLICATIONS


Control charts
Forecasting
Curve fitting
Solver for optimization
College admissions
Evaluation of test scores & ranking
and many more!

The best way of learning Excel is to work with Excel

Statistics Made Simple: Do It Yourself on PC
By K.V.S. Sarma, Prentice Hall India

Thank you
