
ASSIGNMENT 01/02

NAME :
REGISTRATION NUMBER :
LEARNING CENTRE : MANIPAL INSTITUTE OF TECHNOLOGY
LEARNING CENTRE CODE : 00952
COURSE : MBA HCS
SUBJECT : STATISTICS FOR MANAGEMENT
SEMESTER : 1ST SEMESTER
MODULE NUMBER :
DATE OF SUBMISSION : 20TH DECEMBER, 2011
MARKS AWARDED :

Q1. 1a. Statistics is the backbone of decision making. Comment on it. [5]

Due to advanced communication networks, rapid changes in consumer behavior, the varied expectations of a variety of consumers and new market openings, modern managers face the difficult task of making quick and appropriate decisions. Therefore, they need to depend more upon quantitative techniques such as mathematical models, statistics, operations research and econometrics. A general manager who analyses data to solve a problem and to increase profits is using Statistics.

Decision making is a key part of our day-to-day life. Even when we wish to purchase a television, we like to know the price, quality, durability and maintainability of various brands and models before buying one. In this scenario we are collecting data and making an optimum decision. In other words, we are using Statistics. Again, suppose a company wishes to introduce a new product. It has to collect data on market potential, consumer preferences, availability of raw materials and the feasibility of producing the product. Hence, data collection is the backbone of any decision making process.

Many organizations find themselves data-rich but poor at drawing information from that data. Therefore, it is important to develop the ability to extract meaningful information from raw data to make better decisions. Statistics plays an important role in this respect.

Statistics is broadly divided into two categories:

STATISTICS
• Descriptive Statistics: collecting, organizing, summarizing and presenting data
• Inferential Statistics: making inferences, hypothesis testing, determining relationships and making predictions

• Descriptive Statistics: Descriptive statistics is used to present a general description of data, summarized quantitatively. This is mostly useful in clinical research, when communicating the results of experiments.
• Inferential Statistics: Inferential statistics is used to make valid inferences from data, which help managers and professionals make effective decisions. Statistical methods such as estimation, prediction and hypothesis testing belong to inferential statistics. Researchers draw conclusions from the collected samples about the characteristics of the larger population from which the samples are taken.

So, we can say that Statistics is the backbone of decision making.

1b. Give the plural meaning of the word statistics. [5]

The word statistics is used as the plural of the word "statistic", which refers to a numerical quantity like the mean, median or variance calculated from sample values. For example: if we select 15 students from a class of 80 students, measure their heights and find the average height, this average would be a statistic.

In the plural sense, the word statistics refers to numerical facts and figures collected in a systematic manner with a definite purpose in any field of study. In this sense, statistics are aggregates of facts expressed in numerical form. For example: statistics on industrial production, or statistics on the population growth of a country in different years.

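Q2. 2a. In a bivariate data on 'x' and 'y', variance of 'x' = 49, variance of 'y' = 9 and covariance (x, y) = -17.5. Find coefficient of correlation between 'x' and 'y'. [5]

We know that:

r = Cov(x, y) / (σx × σy), where σx = √49 = 7 and σy = √9 = 3

r = -17.5 / (7 × 3) = -17.5 / 21 ≈ -0.83

Hence, there is a high negative correlation between 'x' and 'y'.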
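The coefficient of correlation in 2a can be checked numerically. The following is a minimal sketch using only the three figures given in the question; the variable names are illustrative.

```python
import math

# Figures given in the question
var_x = 49.0     # variance of x
var_y = 9.0      # variance of y
cov_xy = -17.5   # covariance of x and y

# Coefficient of correlation: covariance divided by the product
# of the two standard deviations
r = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

print(round(r, 4))  # -0.8333
```

A value close to -1 indicates a strong negative linear relationship between 'x' and 'y'.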

2b. Enumerate the factors which should be kept in mind for proper planning. [5]

PLANNING A STATISTICAL SURVEY: The relevance and accuracy of the data obtained in a survey depend upon the care exercised in planning. A properly planned investigation can lead to the best results with the least cost and time.

STEPS INVOLVED IN PLANNING A STATISTICAL SURVEY:
Step 1 The nature of the problem to be investigated should be clearly defined in an unambiguous manner.
Step 2 The objectives of the investigation should be stated at the outset. Objectives could be to:
• Obtain certain estimates.
• Establish a theory.
• Verify an existing statement.
• Find a relationship between characteristics.
Step 3 The scope of the investigation has to be made clear. This refers to the area to be covered, identification of the units to be studied, the nature of the characteristics to be observed, accuracy of measurements, analytical methods, cost and other resources required.
Step 4 Whether to use data collected from a primary or a secondary source should be determined in advance.
Step 5 The organization of the investigation is the final step in the process. It encompasses the determination of the number of investigators required, their training, the supervision work needed and the funds required.

Q3. The percentage sugar content of Tobacco in two samples is represented in the table below. Test whether their population variances are same. [10]

Percentage sugar content of Tobacco in two samples:

Sample A   Sample B
2.4        2.7
2.7        3.0
2.6        2.8
2.1        3.1
2.5        2.2
           3.6

Solution: Sample A:

Sample B:
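The worked figures for this solution did not survive in the text, so the following is only a sketch of how a variance-ratio (F) test could be carried out on the data. It assumes the table reads row by row, giving five values for Sample A and six for Sample B, and it follows the common textbook convention of putting the larger sample variance in the numerator and comparing with the upper 5% point of the F distribution. The critical value 6.26 is an assumed table value for (5, 4) degrees of freedom.

```python
import statistics

# Assumed reading of the table: pairs are (Sample A, Sample B) row by row,
# with the final value 3.6 belonging to Sample B only.
sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]

# Unbiased sample variances (divisor n - 1)
var_a = statistics.variance(sample_a)
var_b = statistics.variance(sample_b)

# Larger variance goes in the numerator so that F >= 1
if var_a >= var_b:
    f_ratio = var_a / var_b
    df_num, df_den = len(sample_a) - 1, len(sample_b) - 1
else:
    f_ratio = var_b / var_a
    df_num, df_den = len(sample_b) - 1, len(sample_a) - 1

# Assumed table value of F at the 5% level for (5, 4) degrees of freedom
F_CRITICAL = 6.26

reject_equal_variances = f_ratio > F_CRITICAL

print(f"variance A = {var_a:.3f}, variance B = {var_b:.3f}")
print(f"F = {f_ratio:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("Reject equal variances" if reject_equal_variances
      else "No evidence that the population variances differ")
```

Under these assumptions the computed F is smaller than the table value, so the hypothesis of equal population variances would not be rejected.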

Q4. 4a. Explain the characteristics of business forecasting. [5]

The CHARACTERISTICS OF BUSINESS FORECASTING are as follows:
• Based on past and present conditions: Business forecasting is based on the past and present economic condition of the business. To forecast the future, data, information and facts concerning the past and present economic condition of the business are analyzed.
• Based on mathematical and statistical methods: The process of forecasting includes the use of statistical and mathematical methods. By using these methods, the trend which may take place in the future can be forecast.
• Period: Forecasts can be made for the long term, short term, medium term or any specific period.
• Estimation of future: The purpose of business forecasting is to estimate probable future economic conditions.
• Scope: Forecasting can be physical as well as financial.

4b. Differentiate between prediction, projection and forecasting. [5]

A great amount of confusion seems to have grown up around the use of the words 'forecast', 'prediction' and 'projection'.
• A prediction is an estimate based solely on past data of the series under investigation. It is purely mechanical extrapolation.
• A projection is a prediction where the extrapolated values are subject to certain numerical assumptions.
• A forecast is an estimate which relates the series in which we are interested to external factors.

Forecasts are made by estimating future values of the external factors by means of prediction, projection or forecast and from these values calculating the estimate of the dependent variable.

Q5. What are the components of time series? Bring out the significance of moving average in analyzing a time series and point out its limitations. [10]

COMPONENTS OF TIME SERIES: The behavior of a time series over periods of time is called the movement of the time series. The time series is classified into the following four components:
• Long term trend or secular trend
• Seasonal variations
• Cyclic variations
• Random variations

METHOD OF MOVING AVERAGES: The moving averages method is used for smoothing a time series. That is, it smooths out the fluctuations of the data.

When period of moving average is odd: The procedure for determining the trend when the period of the moving average is odd is as follows:
Step 1 Obtain the time series.
Step 2 Select a period of moving average such as 3 years, 5 years and so on.
Step 3 Compute moving totals according to the length of the period of moving average. If the length of the period is 3, that is, a 3-yearly moving average is to be calculated, compute moving totals as follows: a + b + c, b + c + d, c + d + e, d + e + f and so on. Similarly, for a 5-yearly moving average, moving totals are computed as follows: a + b + c + d + e, b + c + d + e + f, c + d + e + f + g and so on. Place each moving total at the center of the time span from which it is computed.
Step 4 Compute moving averages by dividing the moving totals in step (3) by the length of the period of moving average and place them at the center of the time span from which the moving totals are computed. These moving averages are also called the trend values.

By plotting these trend values (if desired) you can obtain the trend curve, which shows whether the trend is increasing or decreasing. If needed, you can also compute short-term fluctuations by subtracting the trend values from the actual values.

When period of moving averages is even: The procedure for determining the trend when the period of the moving average is even (such as 4 years) is as follows:
Step 1 Obtain the time series.
Step 2 Obtain the length of the period of moving average. Let the length of the moving averages period be 4 years.
Step 3 Compute 4-yearly moving totals and place them at the center of the time span. The 4-yearly moving totals are computed as follows: a + b + c + d, b + c + d + e, c + d + e + f and so on.
Step 4 Compute 4-yearly moving averages and place them at the center of the time span. Note that this placement is inconvenient, because a moving average so placed falls between two time periods and does not coincide with an original time period.
Step 5 Take a two-period moving average of the moving averages and place each value at the middle of the two periods it covers. This process is called centering of moving averages.
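The two procedures above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `moving_average_trend` and the sample series are hypothetical, and positions with no trend value are marked with `None`.

```python
def moving_average_trend(values, period):
    """Return trend values aligned with the original series.

    Positions at the start and end that have no trend value are None.
    For an even period the moving averages are centered by averaging
    each pair of adjacent moving averages.
    """
    n = len(values)
    half = period // 2

    # Moving totals divided by the period give the moving averages
    averages = [sum(values[i:i + period]) / period
                for i in range(n - period + 1)]

    trend = [None] * n
    if period % 2 == 1:
        # Odd period: each average sits at the middle of its time span
        for i, avg in enumerate(averages):
            trend[i + half] = avg
    else:
        # Even period: average two adjacent moving averages (centering)
        for i in range(len(averages) - 1):
            trend[i + half] = (averages[i] + averages[i + 1]) / 2
    return trend


# Hypothetical yearly series, for illustration only
series = [2, 4, 6, 8, 10, 12, 14]

print(moving_average_trend(series, 3))
# [None, 4.0, 6.0, 8.0, 10.0, 12.0, None]
print(moving_average_trend(series, 4))
# [None, None, 6.0, 8.0, 10.0, None, None]
```

The `None` entries make one limitation of the method visible: with a 3-yearly average one value is lost at each end, and with a centered 4-yearly average two values are lost at each end.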

Merits and Demerits of moving averages:

MERITS
• This is a simple method.
• This method is objective in the sense that anybody working on a problem with this method will get the same results.
• This method is used for determining seasonal, cyclic and irregular variations besides the trend values.
• This method is flexible enough to allow more figures to be added to the data, because the entire calculation does not change.
• If the period of moving averages coincides with the period of cyclic fluctuations in the data, such fluctuations are automatically eliminated.

DEMERITS
• There is no functional relationship between the values and the time. Thus, this method is not helpful in forecasting and predicting values on the basis of time.
• There are no trend values for some years at the beginning and some at the end. For example, for a 5-yearly moving average, there will be no trend values for the first two years and the last two years.
• In the case of a non-linear trend, the values obtained by this method are biased in one direction or the other.
• Selecting the period of the moving average is a difficult task. Hence, great care has to be taken in period selection, particularly when there is no business cycle during that time.

Q6. 6a. List down various measures of central tendency and explain the difference between them. [5]

Graphical representation is a good way to represent summarized data. However, graphs provide only an overview and thus may not be usable for further analysis. Hence, we use summary statistics such as averages to analyze the data. Mass data, which is collected, classified, tabulated and presented systematically, is analyzed further to reduce it to a single representative figure. This single figure is found in the central part of the range of all values and represents the entire data set. Hence, it is called the measure of central tendency.

In other words, the tendency of data to cluster around a figure in a central location is known as central tendency. A measure of central tendency, or average of first order, describes the concentration of a large number of values around a particular value. It is a single value which represents all units.

• Arithmetic mean: The arithmetic mean is defined as the sum of all values divided by the number of values.

Merits and demerits of arithmetic mean:
Merits
• It is simple to calculate and easy to understand.
• It is based on all values.
• It is rigidly defined.
• It is more stable.
• It is capable of further algebraic treatment.
Demerits
• It is affected by extreme values.
• It cannot be determined for distributions with open-ended class intervals.
• It cannot be graphically located.
• Sometimes it is a value that is not in the series.

• Median: The median of a set of values is the middle-most value when they are arranged in ascending order of magnitude.

Merits and demerits of median:
Merits
• It can be easily computed and understood.
• It is not affected by extreme values.
• It can be determined graphically (ogives).
• It can be used for qualitative data.
• It can be calculated for distributions with open-ended classes.
Demerits
• It is not based on all values.
• It is not capable of further algebraic treatment.

• Mode: The mode is the value which has the highest frequency. The modal value is most useful for business people. For example, shoe and readymade garment manufacturers like to know the modal size of people to plan their operations. For discrete data, with or without frequencies, it is the value corresponding to the highest frequency.

Merits and demerits of mode:
Merits
• In many cases, it can be found by inspection.
• It is not affected by extreme values.
• It can be calculated for distributions with open-ended classes.
• It can be located graphically.
• It can be used for qualitative data.
Demerits
• It is not based on all values.
• It is not capable of further mathematical treatment.
• It is much affected by sampling fluctuations.

The most commonly preferred measure of central tendency is the arithmetic mean. It is defined as the value obtained by dividing the sum of all the observations by their number, that is:

mean = (Sum of all the observations) / (Number of the observations)

The arithmetic mean is used because it is simple to understand and easy to interpret. It is quickly and easily calculated. It is amenable to mathematical treatment. It is relatively stable in repeated sampling experiments.

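The difference between the three measures shows up most clearly when an extreme value is added to a data set. The sketch below uses Python's standard `statistics` module; the two data sets are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical data set, and the same data with one extreme value added
data = [20, 22, 22, 25, 31]
data_with_outlier = data + [300]

for label, values in [("original", data), ("with outlier", data_with_outlier)]:
    mean = statistics.mean(values)      # sum of values / number of values
    median = statistics.median(values)  # middle value of the sorted data
    mode = statistics.mode(values)      # most frequent value
    print(f"{label}: mean={mean}, median={median}, mode={mode}")
```

In this illustration the mean moves from 24 to 70 when the extreme value is added, while the median changes only from 22 to 23.5 and the mode stays at 22. This matches the merits and demerits listed above: the mean is based on all values and therefore affected by extreme ones, whereas the median and mode are not.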

6b. What is a confidence interval, and why is it useful? What is a confidence level? [5]

CONFIDENCE INTERVAL: In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest if the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level or confidence coefficient.

A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval would deliver a confidence interval that includes the true value of the parameter in the proportion of cases set by the confidence level. More specifically, the meaning of the term "confidence level" is that, if confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals.

A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. (An interval intended to have such a property, called a credible interval, can be estimated using Bayesian methods; but such methods bring with them their own distinct strengths and weaknesses.)

The confidence level sets the boundaries of a confidence interval; this is conventionally set at 95% to coincide with the 5% convention of statistical significance in hypothesis testing. In some studies narrower (e.g. 90%) or wider (e.g. 99%) confidence intervals will be required. This depends upon the nature of your study. You should consult a statistician before using CIs other than 95%.

You will hear the terms confidence interval and confidence limit used. The confidence interval is the range Q-X to Q+Y, where Q is the value that is central to the study question, Q-X is the lower confidence limit and Q+Y is the upper confidence limit.

Alternative CI interpretations:
• Common: A 95% CI is the interval that you are 95% certain contains the true population value as it might be estimated from a much larger study. The value in question can be a mean, a difference between two means, a proportion etc. The CI is usually, but not necessarily, symmetrical about this value.
• Pure Bayesian: The Bayesian concept of a credible interval is sometimes put forward as a more practical concept than the confidence interval. For a 95% credible interval, the value of interest (e.g. size of treatment effect) lies with a 95% probability in the interval. This interval is then open to subjective molding of interpretation. Furthermore, the credible interval can only correspond exactly to the confidence interval if the prior probability is so-called "uninformative".
• Pure frequentist: Most pure frequentists say that it is not possible to make probability statements, such as the common CI interpretation, about the study values of interest in hypothesis tests.
• Neymanian: A 95% CI is the interval which will contain the true value on 95% of occasions if a study were repeated many times using samples from the same population. Neyman originated the concept of CI as follows: if we test a large number of different null hypotheses at one critical level, say 5%, then we can collect all of the null hypotheses that are not rejected into one set. This set usually forms a continuous interval that can be derived mathematically, and Neyman described the limits of this set as confidence limits that bound a confidence interval. If the critical level (probability of incorrectly rejecting the null hypothesis) is 5% then the interval is 95%. Any values of the treatment effect that lie outside the confidence interval are regarded as "unreasonable" in terms of hypothesis testing at the critical level.
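The repeated-sampling (Neymanian) interpretation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation, the 95% interval is built as sample mean ± 1.96 × σ/√n, and the population mean, σ, sample size and number of trials are all hypothetical values.

```python
import random


def ci_coverage(true_mean=50.0, sigma=10.0, n=30, trials=2000,
                z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean.

    Assumes a normal population with known sigma, so each interval is
    sample mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample and compute its mean
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        # Does this sample's interval contain the true mean?
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / trials


coverage = ci_coverage()
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

Each simulated study gives a different interval, and the proportion of intervals that contain the true mean should come out close to 0.95. No single interval carries a 95% probability of its own; the 95% describes the long-run behaviour of the procedure.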