MB0040-STATISTICS FOR MANAGEMENT

Q1. What are the functions of Statistics? Distinguish between Primary data and Secondary data.

Statistics as a discipline is considered indispensable in almost all spheres of human knowledge. There is hardly any branch of study that does not use statistics. Scientific, social and economic studies all use statistics in one form or another. These disciplines make use of observations, facts and figures, enquiries and experiments, all of which rely on statistics and statistical methods. Statistics studies almost all aspects of an enquiry. It mainly aims at simplifying the complexity of the information collected in an enquiry. It presents data in a simplified form so as to make them intelligible, analyses the data and facilitates the drawing of conclusions. Some of the important functions of statistics are discussed briefly below.

Presents facts in simple form: Statistics presents facts and figures in a definite form. That makes a statement more logical and convincing than mere description. It condenses a whole mass of figures into a single figure, which makes the problem intelligible.

Reduces the complexity of data: Statistics simplifies the complexity of data. Raw data are unintelligible. We make them simple and intelligible by using different statistical measures. Commonly used measures include graphs, averages, dispersion, skewness, kurtosis, correlation and regression. These measures help in interpretation and in drawing inferences. Statistics therefore enables one to enlarge the horizon of one's knowledge.

Facilitates comparison: Comparison between different sets of observations is an important function of statistics. Comparison is necessary to draw conclusions. As Professor Boddington rightly points out, the object of statistics is to enable comparison between past and present results, to ascertain the reasons for the changes that have taken place and the effect of such changes in the future. So to determine the efficiency of any measure, comparison is necessary.
Statistical devices like averages, ratios and coefficients are used for the purpose of comparison.

Testing hypothesis: Formulating and testing hypotheses is an important function of statistics. This helps in developing new theories. So statistics examines the truth and helps in generating new ideas.

Formulation of policies: Statistics helps in formulating plans and policies in different fields. Statistical analysis of data forms the beginning of policy formulation. Hence, statistics is essential for planners, economists, scientists and administrators in preparing different plans and programmes.

Forecasting: The future is uncertain. Statistics helps in forecasting trends and tendencies. Statistical techniques are used for predicting the future values of a variable. For example, a producer forecasts his future production on the basis of present demand conditions and his past experience. Similarly, planners can forecast the future population by considering present population trends.

Derives valid inferences: Statistical methods mainly aim at deriving inferences from an enquiry. Statistical techniques are often used by scholars, planners and scientists to evaluate different projects. These techniques are also used to draw inferences regarding population parameters on the basis of sample information.

Statistics is very helpful in business, research, education and other fields. Some of the uses of statistics are:

Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
Statistics helps in collecting appropriate quantitative data.
Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for easy comprehension of the data.
Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
Statistics helps in drawing valid inferences about the population parameters from the sample data, along with a measure of their reliability.

Any statistical data can be classified under two categories depending upon the sources utilized. These categories are:
1. Primary data
2. Secondary data

Primary Data: Primary data is data collected by the investigator himself for the purpose of a specific inquiry or study. Such data is original in character and is generated by surveys conducted by individuals, research institutions or other organisations.
1. The collection of data by personal survey is possible only if the area covered by the investigator is small. Collecting data by sending enumerators is bound to be expensive. Extra care should be taken that the enumerators correctly record the information provided by the informants.
2. Collection of primary data by framing schedules, or by distributing and collecting questionnaires by post, is less expensive and can be completed in a shorter time.
3. If the questions are embarrassing, complicated, or probe into the personal affairs of individuals, the schedules may not be filled in with accurate and correct information, and hence this method is unsuitable.
4. The information collected as primary data is more reliable than that obtained from secondary data.

The importance of primary data cannot be neglected. Research can be conducted without secondary data, but research based only on secondary data is less reliable and may carry biases, because secondary data has already been handled and processed by others. In statistical surveys it is necessary to get information from primary sources and work on primary data: for example, the statistical records of the female population in a country cannot be based on newspapers, magazines and other printed sources. First, such sources are old; second, they contain limited information and can be misleading and biased.

Secondary Data: Secondary data are data that have already been collected and analysed by some earlier agency for its own use, and are later used by a different agency. According to W.A. Neiswanger, "A primary source is a publication in which the data are published by the same authority which gathered and analysed them. A secondary source is a publication, reporting the data which have been gathered by other authorities and for which others are responsible."
1. Secondary data is cheap to obtain. Many government publications are relatively cheap, and libraries stock quantities of secondary data produced by the government, by companies and by other organizations.
2. Large quantities of secondary data can be obtained through the internet.
3. Much of the secondary data available has been collected over many years and can therefore be used to plot trends.
4. Secondary data is of value to:
 - The government, to help in making decisions and planning future policy.
 - Business and industry, in areas such as marketing and sales, in order to appreciate the general economic and social conditions and to provide information on competitors.
 - Research organizations, by providing social, economic and industrial information.

Secondary data can be less valid, but it is still important. Sometimes it is difficult to obtain primary data; in these cases getting information from secondary sources is easier and possible. Sometimes primary data does not exist; in such a situation one has to confine the research to secondary data.
Sometimes primary data exists but the respondents are not willing to reveal it; in such cases, too, secondary data can suffice. For example, if the research is on the psychology of transsexual people, it is first difficult to locate respondents, and second, they may not be willing to give the information needed for the research, so data can be collected from books or other published sources.

Q2. Draw a histogram for the following distribution:

Age            0-10  10-20  20-30  30-40  40-50
No. of people     5     10     15      8      2

[Histogram: frequency (vertical axis, 0 to 16) against age class (horizontal axis: 0-10, 10-20, 20-30, 30-40, 40-50)]

Q3. Find the median value of the following set of values: 45, 32, 31, 46, 40, 28, 27, 37, 36, 41

Arranging in ascending order: 27, 28, 31, 32, 36, 37, 40, 41, 45, 46

With ten values, the median is the mean of the 5th and 6th values:

Median = (36 + 37)/2 = 36.5
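The two answers above can be reproduced with a short Python sketch. It is only an illustration: the helper name text_histogram and the text-bar rendering are assumptions made here, not part of the original answer, and the median comes from Python's standard statistics module.

```python
from statistics import median

# Q2 data as given: age classes and the number of people in each class.
age_classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
people = [5, 10, 15, 8, 2]


def text_histogram(labels, counts):
    """Return one line per class, with a bar of '#' as long as the frequency."""
    return [f"{label:>5} | {'#' * count} ({count})"
            for label, count in zip(labels, counts)]


for line in text_histogram(age_classes, people):
    print(line)

# Q3 data as given. With an even number of values, the median is the
# mean of the two middle values of the sorted list.
values = [45, 32, 31, 46, 40, 28, 27, 37, 36, 41]
ordered = sorted(values)
print(ordered)
print(median(values))
```

A proper histogram would draw adjoining bars whose widths match the class intervals; the text bars here only show the relative frequencies.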

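Q4. Calculate the standard deviation of the following data:

Marks            78-80  80-82  82-84  84-86  86-88  88-90
No. of students      3     15     26     23      9      4

For a grouped frequency distribution, the standard deviation is computed from the class mid-points x weighted by the frequencies f, not from the frequencies alone.

Mid-point x      79     81     83     85     87     89
Frequency f       3     15     26     23      9      4
fx              237   1215   2158   1955    783    356

N = 3+15+26+23+9+4 = 80
Arithmetic mean M = (sum of fx)/N = 6704/80 = 83.8

(x-M)          -4.8   -2.8   -0.8    1.2    3.2    5.2
(x-M)2        23.04   7.84   0.64   1.44  10.24  27.04
f(x-M)2       69.12  117.60  16.64  33.12  92.16 108.16

Sum of f(x-M)2 = 436.8

S = square root of (436.8/80) = square root of 5.46 = 2.34 marks (approximately). Dividing by n-1 = 79 instead of N gives 2.35.

Q5. An unbiased coin is tossed six times. What is the probability that the tosses will result in: (i) exactly two heads (ii) at least five heads

Q6. Explain briefly the types of sampling.

Sampling methods fall into two categories: probability samples and non-probability samples.

PROBABILITY SAMPLES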

The idea behind this type is random selection. More specifically, each sample from the population of interest has a known probability of selection under a given sampling scheme. There are four categories of probability samples, described below.

SIMPLE RANDOM SAMPLING

The most widely known type of random sample is the simple random sample (SRS). It is characterized by the fact that the probability of selection is the same for every case in the population. Simple random sampling is a method of selecting n units from a population of size N such that every possible sample of size n has an equal chance of being drawn.

An example may make this easier to understand. Imagine you want to carry out a survey of 100 voters in a small town with a population of 1,000 eligible voters. With a town this size, there are "old-fashioned" ways to draw a sample. For example, you could write the name of each voter on a slip of paper, put all the slips into a box and draw 100 at random. You shake the box, draw a slip and set it aside, shake again, draw another, set it aside, and so on until you have 100 slips of paper. These 100 form your sample. This sample would be drawn through a simple random sampling procedure - at each draw, every name in the box had the same probability of being chosen.

In real-world social research, designs that employ simple random sampling are difficult to come by. We can imagine some situations where it might be possible - say you want to interview a sample of doctors in a hospital about work conditions. You get a list of all the physicians who work in the hospital, write their names on slips of paper, put the slips in a box, shake and draw. But in most real-world instances it is impossible to list every member of the population on a slip of paper, put the slips in a box, and draw at random until the desired sample size is reached.
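The slips-in-a-box procedure can be imitated in a few lines of Python. This is a minimal sketch under stated assumptions: the voter labels are invented placeholders, the function name simple_random_sample is hypothetical, and the seed is fixed only so that the draw can be repeated. The standard library's random.sample draws without replacement, which corresponds to setting each drawn slip aside.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a private generator, so the seed affects only this draw
    return rng.sample(population, n)


# Hypothetical list of the 1,000 eligible voters in the town.
voters = [f"voter_{i:04d}" for i in range(1, 1001)]

sample = simple_random_sample(voters, 100, seed=42)
print(len(sample))   # number of voters drawn
print(sample[:5])    # first few names drawn
```

The sketch also shows why SRS is hard in practice: it needs the complete list of voters before any draw can be made.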
STRATIFIED RANDOM SAMPLING

In this form of sampling, the population is first divided into two or more mutually exclusive segments based on some categories of variables of interest in the research. It is designed to organize the population into homogeneous subsets before sampling, and then to draw a random sample within each subset. With stratified random sampling the population of N units is divided into subpopulations of N1, N2, ..., NL units respectively. These subpopulations, called strata, are non-overlapping, and together they comprise the whole of the population. When the strata have been determined, a sample is drawn from each, with a separate draw for each stratum. The sample sizes within the strata are denoted by n1, n2, ..., nL respectively. If an SRS is taken within each stratum, the whole sampling procedure is described as stratified random sampling.
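The procedure described so far can be sketched in Python: group the units into strata, then take a separate simple random sample inside each one. Everything specific here is an assumption for illustration: the three regional strata, their sizes, the proportional allocation of the sample and the function name stratified_sample are not taken from the text.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, sizes, seed=None):
    """Draw a separate simple random sample of sizes[h] units from each stratum h."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # strata are non-overlapping by construction
    return {h: rng.sample(strata[h], n_h) for h, n_h in sizes.items()}


# Hypothetical population of N = 1,000 units in three strata.
population = (
    [("north", i) for i in range(600)]
    + [("south", i) for i in range(300)]
    + [("east", i) for i in range(100)]
)

# Hypothetical stratum sample sizes, here 10% of each stratum.
sizes = {"north": 60, "south": 30, "east": 10}

sample = stratified_sample(population, lambda unit: unit[0], sizes, seed=7)
for stratum, drawn in sample.items():
    print(stratum, len(drawn))
```

Because each stratum is sampled on its own, even the smallest stratum is guaranteed its 10 units, which a single draw from the whole population would not guarantee.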

The primary benefit of this method is to ensure that cases from smaller strata of the population are included in sufficient numbers to allow comparison. Stratification is a common technique. There are many reasons for this, such as:
1. If data of known precision are wanted for certain subpopulations, then each of these should be treated as a population in its own right.
2. Administrative convenience may dictate the use of stratification; for example, the agency conducting a survey may have regional offices, each of which can supervise the survey for a part of the population.
3. Sampling problems may be inherent in certain subpopulations, such as people living in institutions (e.g. hotels, hospitals, prisons).
4. Stratification may improve the estimates of characteristics of the whole population. It may be possible to divide a heterogeneous population into subpopulations, each of which is internally homogeneous. If these strata are homogeneous, i.e. the measurements vary little from one unit to another, a precise estimate of any stratum mean can be obtained from a small sample in that stratum. These estimates can then be combined into a precise estimate for the whole population.
5. There is also a statistical advantage in the method, as a stratified random sample nearly always results in a smaller variance for the estimated mean or other population parameters of interest.

SYSTEMATIC SAMPLING

This method of sampling is at first glance very different from SRS. In practice, it is a variant of simple random sampling that involves some listing of elements - every nth element of the list is then drawn for inclusion in the sample. Say you have a list of 10,000 people and you want a sample of 1,000. Creating such a sample involves three steps:
1. Divide the number of cases in the population by the desired sample size. In this example, dividing 10,000 by 1,000 gives a value of 10.
2. Select a random number between one and the value obtained in Step 1. In this example, we choose a number between 1 and 10 - say we pick 7.
3. Starting with the case number chosen in Step 2, take every tenth record (7, 17, 27, etc.).
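The three steps above translate almost line for line into code. The sketch below is illustrative only: the function name systematic_sample and its optional start argument are assumptions, added so that the text's example of starting at case 7 can be reproduced exactly.

```python
import random


def systematic_sample(population, n, start=None, seed=None):
    """Take every k-th unit after a random start among the first k units."""
    k = len(population) // n  # Step 1: population size divided by sample size
    if start is None:
        start = random.Random(seed).randint(1, k)  # Step 2: random start in 1..k inclusive
    # Step 3: 1-based positions start, start + k, start + 2k, ...
    positions = range(start, len(population) + 1, k)
    return [population[pos - 1] for pos in positions][:n]


cases = list(range(1, 10001))  # case numbers 1 to 10,000

sample = systematic_sample(cases, 1000, start=7)
print(sample[:3], len(sample))  # prints [7, 17, 27] 1000
```

Leaving start unset picks the starting case at random, as in Step 2; only that one random number is needed for the whole sample.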

More generally, suppose that the N units in the population are numbered 1 to N in some order (e.g., alphabetic). To select a sample of n units, we take a unit at random from the first k units and every k-th unit thereafter. The advantages of the systematic sampling method over simple random sampling include:
1. It is easier to draw a sample and often easier to execute without mistakes. This is a particular advantage when the drawing is done in the field.
2. Intuitively, you might think that systematic sampling might be more precise than SRS. In effect it stratifies the population into n strata, consisting of the 1st k units, the 2nd k units, and so on. Thus, we might expect the systematic sample to be about as precise as a stratified random sample with one unit per stratum. The difference is that with the systematic sample the units occur at the same relative position in each stratum, whereas with the stratified sample the position in the stratum is determined separately by randomization within each stratum.

CLUSTER SAMPLING

In some instances the sampling unit consists of a group or cluster of smaller units that we call elements or subunits (these are the units of analysis for your study). There are two main reasons for the widespread application of cluster sampling. Although the first intention may be to use the elements as sampling units, it is found in many surveys that no reliable list of elements in the population is available and that it would be prohibitively expensive to construct such a list. In many countries there are no complete and updated lists of the people, the houses or the farms in any large geographical region. Even when a list of individual houses is available, economic considerations may point to the choice of a larger cluster unit. For a given size of sample, a small unit usually gives more precise results than a large unit.
For example, an SRS of 600 houses covers a town more evenly than 20 city blocks containing an average of 30 houses apiece. But greater field costs are incurred in locating 600 houses and in traveling between them than in covering 20 city blocks. When cost is balanced against precision, the larger unit may prove superior.

Important things about cluster sampling:
1. Most large-scale surveys are done using cluster sampling;
2. Clustering may be combined with stratification, typically by clustering within strata;
3. In general, for a given sample size n, cluster samples are less accurate than the other types of sampling, in the sense that the parameters you estimate will have greater variability than with an SRS, stratified random or systematic sample.

NONPROBABILITY SAMPLING

Social research is often conducted in situations where a researcher cannot select the kinds of probability samples used in large-scale social surveys. For example, say you wanted to study homelessness - there is no list of homeless individuals, nor are you likely to create such a list. However, you need to get some kind of sample of respondents in order to conduct your research. To gather such a sample, you would likely use some form of non-probability sampling. To reiterate, the primary difference between probability methods of sampling and non-probability methods is that in the latter you do not know the likelihood that any element of a population will be selected for study. There are four primary types of non-probability sampling methods:

AVAILABILITY SAMPLING

Availability sampling is a method of choosing subjects who are available or easy to find. This method is also sometimes referred to as haphazard, accidental, or convenience sampling. The primary advantage of the method is that it is very easy to carry out, relative to other methods. A researcher can merely stand on his/her favorite street corner or in his/her favorite tavern and hand out surveys. One place this used to show up often is in university courses. Years ago, researchers often would conduct surveys of students in their large lecture courses. For example, all students taking introductory sociology courses would have been given a survey and compelled to fill it out. There are some advantages to this design - it is easy to do, particularly with a captive audience, and in some schools you can obtain a large number of interviews through this method.
The primary problem with availability sampling is that you can never be certain what population the participants in the study represent. The population is unknown, the method for selecting cases is haphazard, and the cases studied probably don't represent any population you could come up with. However, there are some situations in which this kind of design has advantages - for example, survey designers often want to have some people respond to their survey before it is given out in the "real" research setting, as a way of making certain the questions make sense to respondents. For this purpose, availability sampling is not a bad way to get a group to take a survey, though in this case researchers care less about the specific responses given than whether the instrument is confusing or makes people feel bad.

Despite the known flaws with this design, it's remarkably common. A familiar version is the media call-in or online poll: ask a provocative question, give a telephone number and web site address ("Vote now at CNN.com"), then announce the results of the poll. This method provides some form of statistical data on a current issue, but it is entirely unknown what population the results of such polls represent. At best, a researcher could make some conditional statement about people who were watching CNN at a particular point in time and who cared enough about the issue in question to log on or call in.

QUOTA SAMPLING

Quota sampling is designed to overcome the most obvious flaw of availability sampling. Rather than taking just anyone, you set quotas to ensure that the sample you get represents certain characteristics in proportion to their prevalence in the population. Note that for this method, you have to know something about the characteristics of the population ahead of time. Say you want to make sure you have a sample proportional to the population in terms of gender - you have to know what percentage of the population is male and what percentage is female, then collect sample cases until yours matches. Marketing studies are particularly fond of this form of research design.

The primary problem with this form of sampling is that even when we know that a quota sample is representative of the particular characteristics for which quotas have been set, we have no way of knowing if the sample is representative in terms of any other characteristics. If we set quotas for gender and age, we are likely to obtain a sample with good representativeness on age and gender, but one that may not be very representative in terms of income, education or other factors.
Moreover, because researchers can set quotas for only a small fraction of the characteristics relevant to a study, quota sampling is really not much better than availability sampling. To reiterate, you must know the characteristics of the entire population to set quotas; otherwise there's not much point in setting up quotas. Finally, interviewers often introduce bias when allowed to self-select respondents, which is usually the case in this form of research. In choosing males aged 18-25, interviewers are more likely to choose those who are better dressed or who seem more approachable or less threatening. That may be understandable from a practical point of view, but it introduces bias into research findings.

PURPOSIVE SAMPLING

Purposive sampling is a sampling method in which elements are chosen based on the purpose of the study. Purposive sampling may involve studying the entire population of some limited group (sociology faculty at Columbia) or a subset of a population (Columbia faculty who have won Nobel Prizes). As with other non-probability sampling methods, purposive sampling does not produce a sample that is representative of a larger population, but it can be exactly what is needed in some cases - the study of an organization, a community, or some other clearly defined and relatively limited group.

SNOWBALL SAMPLING

Snowball sampling is a method in which a researcher identifies one member of some population of interest, speaks to him/her, then asks that person to identify others in the population that the researcher might speak to. Each person identified is in turn asked to refer the researcher to yet another person, and so on. Snowball sampling is very good for cases where members of a special population are difficult to locate. For example, several studies of Mexican migrants in Los Angeles have used snowball sampling to get respondents. The method also has an interesting application to group membership - if you want to look at the pattern of recruitment to a community organization over time, you might begin by interviewing fairly recent recruits, asking them who introduced them to the group. Then interview the people named, asking them who recruited them to the group.

The method creates a sample with questionable representativeness. The researcher cannot be sure who is in the sample. In effect, snowball sampling often leads the researcher into a realm he/she knows little about. It can be difficult to determine how the sample compares to a larger population. There is also the issue of whom respondents refer you to - friends refer you to friends, and are less likely to refer you to people they don't like, fear, etc.

Q1. Explain the following terms with respect to statistics:

(i) Sample

In statistics, a sample is a subset of a population. Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling.

A complete sample is a set of objects from a parent population that includes ALL such objects that satisfy a set of well-defined selection criteria. For example, a complete sample of Australian men taller than 2m would consist of a list of every Australian male taller than 2m. But it wouldn't include German males, or tall Australian females, or people shorter than 2m. So compiling such a complete sample requires a complete list of the parent population, including data on height, gender and nationality for each member of that parent population. In the case of human populations, such a complete list is unlikely to exist, but such complete samples are often available in other disciplines, such as complete magnitude-limited samples of astronomical objects.

An unbiased sample is a set of objects chosen from a complete sample using a selection process that does not depend on the properties of the objects. For example, an unbiased sample of Australian men taller than 2m might consist of a randomly sampled subset of 1% of Australian males taller than 2m. But one chosen from the electoral register might not be unbiased since, for example, males aged under 18 will not be on the electoral register.
In an astronomical context, an unbiased sample might consist of that fraction of a complete sample for which data are available, provided the data availability is not biased by individual source properties.

The best way to avoid a biased or unrepresentative sample is to select a random sample, also known as a probability sample. A random sample is defined as a sample where each individual member of the population has a known, non-zero chance of being selected as part of the sample.

Types of random samples include simple random samples, systematic samples, stratified random samples and cluster random samples.

(ii) Variable

A variable is a characteristic that may assume more than one set of values to which a numerical measure can be assigned. Height, age, amount of income, province or country of birth, grades obtained at school and type of housing are all examples of variables. Variables may be classified into various categories, some of which are outlined in this section.

Categorical variables: A categorical variable (also called a qualitative variable) is one for which each response can be put into a specific category. These categories must be mutually exclusive and exhaustive. Mutually exclusive means that each possible survey response should belong to only one category, whereas exhaustive requires that the categories cover the entire set of possibilities. Categorical variables can be either nominal or ordinal.

Nominal variables: A nominal variable is one that describes a name or category. Contrary to ordinal variables, there is no 'natural ordering' of the set of possible names or categories.

Ordinal variables: An ordinal variable is a categorical variable for which the possible categories can be placed in a specific order or in some 'natural' way.

Numeric variables: A numeric variable, also known as a quantitative variable, is one that can assume a number of real values, such as age or number of people in a household. However, not all variables described by numbers are considered numeric. For example, when you are asked to assign a value from 1 to 5 to express your level of satisfaction, you use numbers, but the variable (satisfaction) is really an ordinal variable. Numeric variables may be either continuous or discrete.

Continuous variables: A variable is said to be continuous if it can assume an infinite number of real values.
Examples of a continuous variable are distance, age and temperature. The measurement of a continuous variable is restricted by the methods used, or by the accuracy of the measuring instruments. For example, the height of a student is a continuous variable because a student may be 1.6321748755... metres tall.

Discrete variables: As opposed to a continuous variable, a discrete variable can take only a finite number of real values. An example of a discrete variable would be the score given by a judge to a gymnast in competition: the range is 0 to 10 and the score is always given to one decimal place (e.g., a score of 8.5).

(iii) Population

A statistical population is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population. For example, if we were interested in generalizations about crows, then we would describe the set of crows that is of interest. Notice that if we choose a population like all crows, we will be limited to observing crows that exist now or will exist in the future. Probably, geography will also constitute a limitation, in that our resources for studying crows are also limited.

Population is also used to refer to a set of potential measurements or values, including not only cases actually observed but also those that are potentially observable. Suppose, for example, we are interested in the set of all adult crows now alive in the county of Cambridgeshire, and we want to know the mean weight of these birds. For each bird in the population of crows there is a weight, and the set of these weights is called the population of weights.

A subset of a population is called a subpopulation. If different subpopulations have different properties, the properties and response of the overall population can often be better understood if it is first separated into distinct subpopulations. For instance, a particular medicine may have different effects on different subpopulations, and these effects may be obscured or dismissed if such special subpopulations are not identified and examined in isolation. Similarly, one can often estimate parameters more accurately if one separates out subpopulations: the distribution of heights among people is better modeled by considering men and women as separate subpopulations, for instance. Populations consisting of subpopulations can be modeled by mixture models, which combine the distributions within subpopulations into an overall population distribution.

Q2. What are the types of classification of Data?

According to Nature
1. Quantitative data - information obtained from numerical variables (e.g. age, bills, etc.)
2. Qualitative data - information obtained from variables in the form of categories, characteristics, names or labels, or alphanumeric variables (e.g. birthdays, gender, etc.)

According to Source
1. Primary data - first-hand information (e.g. autobiography, financial statement)
2. Secondary data - second-hand information (e.g. biography, weather forecast from newspapers)

According to Measurement
1. Discrete data - countable numerical observations: whole numbers only, at equal whole-number intervals, obtained through counting (e.g. corporate stocks, etc.)
2. Continuous data - measurable observations: decimals or fractions, obtained through measuring (e.g. bank deposits, volume of liquid, etc.)

QUALITATIVE DATA

Qualitative data is a categorical measurement expressed not in terms of numbers, but rather by means of a natural language description. In statistics, the term is often used interchangeably with "categorical" data. For example:
favorite color = "yellow"
height = "tall"

Although we may have categories, the categories may have a structure to them. When there is no natural ordering of the categories, we call them nominal categories. Examples might be gender, race, religion or sport. When the categories can be ordered, they are called ordinal variables. Categorical variables that judge size (small, medium, large, etc.) are ordinal variables. Attitudes (strongly disagree, disagree, neutral, agree, strongly agree) are also ordinal variables; however, we may not know which value is the best or worst of these. Note that the distance between these categories is not something we can measure.

QUANTITATIVE DATA

Quantitative data is a numerical measurement expressed not by means of a natural language description, but rather in terms of numbers. However, not all numbers are continuous and measurable. For example, a social security number is a number, but not something that one can add or subtract. For example:
favorite color = "450 nm"
height = "1.8 m"

Quantitative data are always associated with a scale of measurement. Probably the most common scale type is the ratio scale. Observations of this type are on a scale that has a meaningful zero value and also an equidistant measure (i.e., the difference between 10 and 20 is the same as the difference between 100 and 110). For example, a 10-year-old girl is twice as old as a 5-year-old girl. Since you can measure zero years, age is a ratio-scale variable. Money is another common ratio-scale quantitative measure. Observations that you count are usually ratio-scale (e.g., number of widgets). A more general quantitative measure is the interval scale.
Interval scales also have an equidistant measure. However, the doubling principle breaks down on this scale. A temperature of 50 degrees Celsius is not "half as hot" as a temperature of 100 degrees Celsius, but a difference of 10 degrees indicates the same difference in temperature anywhere along the scale. The Kelvin temperature scale, however, constitutes a ratio scale, because zero on the Kelvin scale indicates absolute zero in temperature, the complete absence of heat. So one can say, for example, that 200 kelvin is twice as hot as 100 kelvin.

PRIMARY DATA

Primary data is original data that has been collected specifically for the purpose in mind. An authorized organization, investigator or enumerator collects the data for the first time from the original source. Data collected in this way is called primary data.

SECONDARY DATA

Secondary data is data that was collected for another purpose. When we apply a statistical method to primary data that was gathered for a different purpose, we refer to it as secondary data. In other words, one purpose's primary data is another purpose's secondary data. Secondary data is data that is being reused, usually in a different context.

Q3. Find the (i) arithmetic mean and (ii) range of the following data: 15, 77, 22, 21, 19, 26, 20

Arithmetic mean = (15+77+22+21+19+26+20)/7 = 200/7 = 28.57 (approximately)
Range = highest value - lowest value = 77 - 15 = 62

Q4. Suppose two houses in a thousand catch fire in a year and there are 2000 houses in a village. What is the probability that: (i) none of the houses catches fire (ii) at least one house catches fire

Q5. (i) What are the characteristics of Chi-square test?

1. It is not symmetric.
2. The shape of the chi-square distribution depends upon the degrees of freedom, just like Student's t-distribution.
3. As the number of degrees of freedom increases, the chi-square distribution becomes more symmetric, as is illustrated in Figure 1.
4. The values are non-negative. That is, the values of chi-square are greater than or equal to 0.

5. This is not a test, but a distribution. The chi-square distribution is derived from the Normal distribution. It is the distribution of a sum of squared standard Normal variables. That is, if all Xi are independent and all have an identical, standard Normal distribution, then X^2 = X1*X1 + X2*X2 + X3*X3 + ... + Xv*Xv is chi-square distributed with v degrees of freedom, with mean = v and variance = 2*v. The importance of the chi-square distribution stems from the fact that it describes the distribution of the variance of a sample taken from a normally distributed population (after scaling: (n-1) times the sample variance, divided by the population variance).
6. Chi-square is non-negative. It is the ratio of two non-negative values and must therefore be non-negative itself.
7. There are many different chi-square distributions, one for each number of degrees of freedom.
8. The degrees of freedom when working with a single population variance is n-1.

Since the chi-square distribution is not symmetric, the method for looking up left-tail values is different from the method for looking up right-tail values.

Area to the right - just use the area given.
Area to the left - the table requires the area to the right, so subtract the given area from one and look this area up in the table.
Area in both tails - divide the area by two. Look up this area for the right critical value and one minus this area for the left critical value.
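The properties listed above (non-negative values, mean equal to the degrees of freedom v, variance equal to 2*v) can be checked with a small simulation. This is only an illustrative sketch, not part of the original answer: the function name `simulate_chi_square`, the number of draws and the seed are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_chi_square(v, draws=20000, seed=0):
    """Draw chi-square values as sums of v squared standard Normal variables.

    The draw count and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v)) for _ in range(draws)]


if __name__ == "__main__":
    for v in (2, 5, 10):
        sample = simulate_chi_square(v)
        # The sample mean should be close to v and the sample variance close to 2*v.
        print(
            "v =", v,
            "mean =", round(statistics.mean(sample), 2),
            "variance =", round(statistics.variance(sample), 2),
            "all non-negative:", min(sample) >= 0,
        )
```

Because the values are simulated, the printed mean and variance only approximate v and 2*v; the agreement improves as the number of draws grows.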

(ii) The table below shows production in three shifts and the number of defective goods turned out over three weeks. Test at the 5% level of significance whether the weeks and shifts are independent.

Shift       I     II    III   Total
1st week    15    20    25    60
2nd week    5     10    15    30
3rd week    20    20    20    60
Total       40    50    60    150

Q6. Find Karl Pearson's correlation coefficient for the data given in the table below:

X     Y
20    22
16    14
12    4
8     12
4     8
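Q4, Q5 (ii) and Q6 above are stated without worked answers, so the following is a hedged sketch of how they could be checked numerically, not the original solution. It rests on three assumptions: Q4 is treated with the Poisson approximation (mean = 2000 x 2/1000 = 4); the Q6 figures are read as the (X, Y) pairs (20, 22), (16, 14), (12, 4), (8, 12), (4, 8); and 9.488 is taken as the standard table value of chi-square for 4 degrees of freedom at the 5% level. All function names are hypothetical.

```python
import math


def poisson_fire_probabilities(houses=2000, rate=2 / 1000):
    """Q4 under the Poisson approximation (an assumption): mean = houses * rate."""
    mean = houses * rate
    p_none = math.exp(-mean)          # P(no house catches fire)
    return p_none, 1.0 - p_none       # P(at least one house catches fire)


def chi_square_independence(observed):
    """Q5 (ii): chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


def pearson_r(xs, ys):
    """Q6: Karl Pearson's correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    p_none, p_at_least_one = poisson_fire_probabilities()
    print("Q4:", round(p_none, 4), round(p_at_least_one, 4))

    defects = [[15, 20, 25], [5, 10, 15], [20, 20, 20]]  # rows = weeks, columns = shifts
    statistic, dof = chi_square_independence(defects)
    critical_5pct_df4 = 9.488  # standard table value, assumed here rather than computed
    print("Q5 (ii):", round(statistic, 4), "df =", dof,
          "reject independence:", statistic > critical_5pct_df4)

    x = [20, 16, 12, 8, 4]   # assumed pairing of the flattened table
    y = [22, 14, 4, 12, 8]
    print("Q6:", round(pearson_r(x, y), 4))
```

Under these assumptions the probabilities in Q4 come out at roughly 0.018 and 0.982, the chi-square statistic in Q5 (ii) is about 3.65 on 4 degrees of freedom (below 9.488, so independence of weeks and shifts is not rejected at the 5% level), and the correlation coefficient in Q6 is about 0.70.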
