MATHEMATICAL MODEL SUMMARY

Thus we can say that the mean, median and mode are essential tools in any statistical analysis. Measures of central tendency help in summarising the data and presenting it in a simple form.

HARMONIC MEAN

The harmonic mean of n observations, none of which is zero, is defined as the reciprocal of the arithmetic mean of their reciprocals.
Calculation of Harmonic Mean
(a) Individual series
If there are n observations X1, X2, ...... Xn, their harmonic mean is defined as
HM = n / (1/X1 + 1/X2 + ...... + 1/Xn)
Example : A train travels 50 kms at a speed of 40 kms/hour, 60 kms at a speed of 50 kms/hour and 40 kms at a speed of 60 kms/hour. Calculate the weighted harmonic mean of the speed of the train, taking the distances travelled as weights. Verify that this harmonic mean represents an appropriate average of the speed of the train.
Solution: Taking the distances as weights, the weighted harmonic mean of the speeds is
HM = (50 + 60 + 40) / (50/40 + 60/50 + 40/60) = 150/3.1167 = 48.13 kms/hour .... (1)
Verification : Average speed = Total distance travelled / Total time taken. We note that the numerator of Equation (1) gives the total distance travelled by the train. Further, its denominator represents the total time taken by the train in travelling 150 kms, since 50/40 is the time taken by the train in travelling 50 kms at a speed of 40 kms/hour.
Similarly, 60/50 and 40/60 are the times taken by the train in travelling 60 kms and 40 kms at the speeds of 50 kms/hour and 60 kms/hour respectively. Hence, the weighted harmonic mean is the most appropriate average in this case.
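The train example can be cross-checked with a short Python sketch. The function name weighted_harmonic_mean is our own choice for illustration and does not refer to any library routine; the sketch simply divides the sum of the weights by the sum of weight/value.

```python
def weighted_harmonic_mean(values, weights):
    """Return sum(w) / sum(w / x) for strictly positive values x."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(x <= 0 for x in values):
        raise ValueError("the harmonic mean needs strictly positive values")
    return sum(weights) / sum(w / x for x, w in zip(values, weights))


speeds = [40, 50, 60]      # kms/hour
distances = [50, 60, 40]   # kms, used as weights

average_speed = weighted_harmonic_mean(speeds, distances)

# Verification: total distance divided by total time.
total_time = sum(d / s for d, s in zip(distances, speeds))  # hours
direct_average = sum(distances) / total_time

print(round(average_speed, 2))   # 48.13
print(round(direct_average, 2))  # 48.13
```

Both figures agree because the weighted harmonic mean, with distances as weights, is algebraically the same expression as total distance over total time.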
Example : Ram goes from his house to office on a cycle at a speed of 12 kms/hour and returns at a speed of 14 kms/hour. Find his average speed.
Solution: Since the distances of travel at various speeds are equal, the average speed of Ram will be given by the simple harmonic mean of the given speeds.
∴ Average speed = 2/(1/12 + 1/14) = 168/13 = 12.92 kms/hour
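As a hedged illustration, the simple harmonic mean for Ram's journey can be computed with harmonic_mean from Python's standard statistics module. The one-way distance of 7 kms used in the cross-check is purely an assumption; the result does not depend on it.

```python
from statistics import harmonic_mean

speeds = [12, 14]  # kms/hour, outward and return journeys

hm = harmonic_mean(speeds)

# Cross-check: total distance / total time for an assumed one-way distance.
one_way = 7.0  # kms; hypothetical, any positive value gives the same answer
total_time = one_way / 12 + one_way / 14
average_speed = 2 * one_way / total_time

print(round(hm, 2))             # 12.92
print(round(average_speed, 2))  # 12.92
```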
Choice between Harmonic Mean and Arithmetic Mean
The harmonic mean, like arithmetic mean, is also used in averaging of rates like price per unit, kms per hour, work done per hour, etc., under certain conditions. To explain the method of choosing an appropriate average, consider the following illustration.
Let the price of a commodity be Rs 3, 4 and 5 per unit in three successive years. If we take the A.M. of these prices, i.e., (3 + 4 + 5)/3 = 4, then it will denote the average price when equal quantities of the commodity are purchased in each year. To verify this, let us assume that 10 units of the commodity are purchased in each year.
Total expenditure on the commodity in 3 years = 10*3 + 10*4 + 10*5.
∴ Average price = (10*3 + 10*4 + 10*5)/(10 + 10 + 10) = 120/30 = Rs 4,
which is the arithmetic mean of the prices in the three years.
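Further, if we take the harmonic mean of the given prices, i.e., 3/(1/3 + 1/4 + 1/5) = 180/47 = 3.83, it will denote the average price when equal amounts of money are spent on the commodity in the three years. To verify this, let us assume that Rs 100 is spent in each year on the purchase of the commodity.
∴ Average price = Total expenditure/Total quantity purchased = 300/(100/3 + 100/4 + 100/5) = 300/78.33 = Rs 3.83.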
Next, we consider a situation where different quantities are purchased in the three years. Let us assume that 10, 15 and 20 units of the commodity are purchased at prices of Rs 3, 4 and 5 respectively.
Average price = Total expenditure/Total quantity purchased = (3*10 + 4*15 + 5*20)/(10 + 15 + 20) = 190/45 = Rs 4.22,
which is the weighted arithmetic mean of the prices taking respective quantities as weights. Further, if Rs 150, 200 and 250 are spent on the purchase of the commodity at prices of Rs 3, 4 and 5 respectively, then
Average price = (150 + 200 + 250)/(150/3 + 200/4 + 250/5) = 600/150 = Rs 4, where 150/3, 200/4 and 250/5 are the quantities purchased in respective situations. The above average price is equal to the weighted harmonic mean of prices taking money spent as weights.
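The four situations of this illustration can be checked together with a small Python sketch. The helper average_price is a name introduced here only for illustration; it always computes total money spent divided by total quantity bought, which reduces to an AM when quantities are given and to an HM when amounts of money are given.

```python
def average_price(prices, quantities=None, amounts=None):
    """Total money spent divided by total quantity bought.

    Pass either the quantities bought at each price or the amounts of
    money spent at each price, not both.
    """
    if (quantities is None) == (amounts is None):
        raise ValueError("give exactly one of quantities or amounts")
    if quantities is not None:
        money = sum(p * q for p, q in zip(prices, quantities))
        units = sum(quantities)
    else:
        money = sum(amounts)
        units = sum(a / p for p, a in zip(prices, amounts))
    return money / units


prices = [3, 4, 5]  # Rs per unit

simple_am = average_price(prices, quantities=[10, 10, 10])
simple_hm = average_price(prices, amounts=[100, 100, 100])
weighted_am = average_price(prices, quantities=[10, 15, 20])
weighted_hm = average_price(prices, amounts=[150, 200, 250])

print(round(simple_am, 2))    # 4.0
print(round(simple_hm, 2))    # 3.83
print(round(weighted_am, 2))  # 4.22
print(round(weighted_hm, 2))  # 4.0
```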
Therefore, to decide about the type of average to be used in a given situation, the first step is to examine the rate to be averaged. It may be noted here that a rate represents a ratio, e.g., price = money/quantity, speed = distance/time , work done per hour = work done/time taken etc.
We have seen above that arithmetic mean is appropriate average of prices (Money/quantity) when quantities, which appear in the denominator of the rate to be averaged, purchased in different situations are given. Similarly, harmonic mean will be appropriate when sums of money, that appear in the numerator of the rate to be averaged, spent in different situations are given.
To conclude, we can say that the average of a rate, defined by the ratio p/q, is given by the arithmetic mean of its values in different situations if the conditions are given in terms of q and by the harmonic mean if the conditions are given in terms of p. Further, if the conditions are same in different situations, use simple AM or HM and otherwise use weighted AM or HM.
Example : An individual purchases three qualities of pencils. The relevant data are given below:
Example : In a 400 metre athlete competition, a participant covers the distance as given below. Find his average speed.


Example : Peter travelled by car for four days. He drove 10 hours each day. He drove on the first day at the rate of 45 kms/hour, on the second day at 40 kms/hour, on the third day at 38 kms/hour and on the fourth day at 37 kms/hour. What was his average speed?
Solution: Since the rate to be averaged is speed= (Distance/time) and the conditions are given in terms of time, therefore AM will be appropriate. Further, since Peter travelled for equal number of hours on each of the four days, simple AM will be calculated.
∴ Average speed = (45 + 40 + 38 + 37)/4 = 40 kms/hour
Example : In a certain factory, a unit of work is completed by A in 4 minutes, by B in 5 minutes, by C in 6 minutes, by D in 10 minutes and by E in 12 minutes. What is their average rate of working? What is the average number of units of work completed per minute? At this rate, how many units of work each of them, on the average, will complete in a six hour day? Also find the total units of work completed.
Solution: Here the rate to be averaged is time taken to complete a unit of work, i.e., time/units of work done. Since we have to determine the average with reference to a (six hours) day, therefore, HM of the rates will give us appropriate average.
Thus, the average rate of working = 5/(1/4 + 1/5 + 1/6 + 1/10 + 1/12) = 5/0.8 = 6.25 minutes per unit.
The average number of units of work completed per minute = 1/6.25 = 0.16. The average number of units of work completed by each person = 0.16 * 360 = 57.6. Total units of work completed by all the five persons = 57.6 * 5 = 288.0.
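The figures of this example can be verified with the following Python sketch, which also counts the units completed by each worker directly. It uses harmonic_mean from the standard statistics module; the variable names are our own.

```python
from statistics import harmonic_mean

minutes_per_unit = [4, 5, 6, 10, 12]  # workers A, B, C, D and E
day_minutes = 6 * 60                  # a six hour day

avg_minutes_per_unit = harmonic_mean(minutes_per_unit)
units_per_minute = 1 / avg_minutes_per_unit
units_per_worker = units_per_minute * day_minutes
total_units = units_per_worker * len(minutes_per_unit)

# Direct count: a worker needing t minutes per unit completes day_minutes / t units.
direct_total = sum(day_minutes / t for t in minutes_per_unit)

print(round(avg_minutes_per_unit, 2))  # 6.25
print(round(units_per_minute, 2))      # 0.16
print(round(units_per_worker, 1))      # 57.6
print(round(total_units, 1))           # 288.0
print(round(direct_total, 1))          # 288.0
```

The agreement between total_units and direct_total is what justifies the use of HM here: the time available is fixed, so the condition is stated in terms of the numerator of the rate time/units.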
Example : A scooterist purchased petrol at the rate of Rs 14, 15.50 and 16 per litre during three successive years. Calculate the average price of petrol (i) if he purchased 150, 160 and 170 litres of petrol in the respective years and (ii) if he spent Rs 2,200, 2,500 and 2,600 in the three years.
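Solution: The rate to be averaged is expressed as Money/litre.
(i) Since the condition is given in terms of different litres of petrol in the three years, weighted AM will be appropriate.
∴ Average price = (14*150 + 15.50*160 + 16*170)/(150 + 160 + 170) = 7300/480 = Rs 15.21 per litre.
(ii) Since the condition is given in terms of different sums of money spent in the three years, weighted HM will be appropriate.
∴ Average price = (2200 + 2500 + 2600)/(2200/14 + 2500/15.50 + 2600/16) = 7300/480.93 = Rs 15.18 per litre.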
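A brief Python sketch of both parts of the petrol example is given below, as a check on the arithmetic rather than as a prescribed method. Part (i) weights the prices by litres purchased (weighted AM) and part (ii) weights them by rupees spent (weighted HM).

```python
prices = [14, 15.50, 16]      # Rs per litre in the three years
litres = [150, 160, 170]      # case (i): quantities purchased
rupees = [2200, 2500, 2600]   # case (ii): money spent

# (i) Weighted AM with litres as weights.
weighted_am = sum(p * q for p, q in zip(prices, litres)) / sum(litres)

# (ii) Weighted HM with money spent as weights.
weighted_hm = sum(rupees) / sum(m / p for p, m in zip(prices, rupees))

print(round(weighted_am, 2))  # 15.21
print(round(weighted_hm, 2))  # 15.18
```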
Merits and Demerits of Harmonic Mean
Merits
It is a rigidly defined average and its value is always definite.
Its value is based on all the observations in a given series.
It is capable of further algebraic treatment.
It is not much affected by sampling fluctuations.
In problems relating to time and rates, it gives better results as compared to other averages.
Harmonic mean gives the best result when the distances covered are the same, but the speed of coverage varies.
Demerits
It is not easily understood and hence its application is often ignored.
It is not easy to calculate as it involves reciprocal values (the use of calculators can help to remove this difficulty).
It gives undue weight to small items and ignores bigger items. This restricts its use in the analysis of economic data.
In case of zero or negative values, it cannot be computed.
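Relationship among AM, GM and HM
If all the observations of a variable are same, all the three measures of central tendency coincide, i.e., AM = GM = HM. Otherwise, for positive observations, we have AM > GM > HM.
Example : Show that for any two positive numbers a and b, AM ≥ GM ≥ HM.
Solution: The three averages of a and b are:
AM = (a + b)/2, GM = √(ab) and HM = 2ab/(a + b).
Since (√a - √b)^2 ≥ 0, we have a + b ≥ 2√(ab), i.e., AM ≥ GM. Further, HM = ab/AM = GM^2/AM, so that GM/HM = AM/GM ≥ 1, i.e., GM ≥ HM. Hence AM ≥ GM ≥ HM, with equality only when a = b.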
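The inequality for two positive numbers can be illustrated numerically. The pairs of values below are arbitrary choices made for this sketch; a numerical check of this kind illustrates the result but is no substitute for the algebraic proof.

```python
from math import sqrt


def three_means(a, b):
    """Return (AM, GM, HM) of two positive numbers a and b."""
    am = (a + b) / 2
    gm = sqrt(a * b)
    hm = 2 * a * b / (a + b)
    return am, gm, hm


for a, b in [(4, 9), (1, 100), (7, 7)]:
    am, gm, hm = three_means(a, b)
    print(a, b, round(am, 3), round(gm, 3), round(hm, 3))

# 4 9 6.5 6.0 5.538
# 1 100 50.5 10.0 1.98
# 7 7 7.0 7.0 7.0
```

The last pair shows the case of equal observations, where the three averages coincide.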
Exercise with Hints
A train runs 25 miles at a speed of 30 m.p.h., another 50 miles at a speed of 40 m.p.h., then due to repairs of the track, 6 miles at a speed of 10 m.p.h. What should be the speed of the train to cover additional distance of 24 miles so that the average speed of the whole run of 105 miles is 35 m.p.h?
Hint: Let x be the speed to cover a distance of 24 miles, then 105/(25/30 + 50/40 + 6/10 + 24/x) = 35. Solve for x.
Prices per share of a company during first five days of a month were Rs 100, 120, 150, 140 and 50.
(i) Find the average daily price per share.
(ii) Find the average price paid by an investor who purchased Rs 20,000 worth of shares on each day.
(iii) Find the average price paid by an investor who purchased 100, 110, 120, 130 and 150 shares on respective days.
Hint: Find simple HM in (ii) and weighted AM in (iii).
Typist A can type a letter in five minutes, B in ten minutes and C in fifteen minutes. What is the average number of letters typed per hour per typist?
Hint: Since we are given conditions in terms of per hour, therefore, simple HM of speed will give the average time taken to type one letter. From this we can obtain the average number of letters typed in one hour by each typist.
Ram paid Rs 15 for two dozens of bananas in one shop, another Rs 15 for three dozens of bananas in second shop and Rs 15 for four dozens of bananas in third shop. Find the average price per dozen paid by him.
Hint: First find the prices per dozen in three situations and since equal money is spent, HM is the appropriate average.
A country accumulates Rs 100 crores of capital stock at the rate of Rs 10 crores/year, another Rs 100 crores at the rate of Rs 20 crores/year and Rs 100 crores at the rate of Rs 25 crores/year. What is the average rate of accumulation?
Hint: Since Rs 100 crores, each, is accumulated at the rates of Rs 10, 20 and 25 crores/year, simple HM of these rates would be most appropriate.
A motor car covered a distance of 50 miles 4 times. The first time at 50 m.p.h., the second at 20 m.p.h., the third at 40 m.p.h. and the fourth at 25 m.p.h. Calculate the average speed.
Hint: Use HM.
The interest paid on each of the three different sums of money yielding 10%, 12% and 15% simple interest p.a. is the same. What is the average yield percent on the sum invested?
Hint: Use HM.
Quadratic Mean
Quadratic mean is the square root of the arithmetic mean of squares of observations. If X1, X2 ...... Xn are n observations, their quadratic mean is given by
QM = √[(X1^2 + X2^2 + ...... + Xn^2)/n]
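A minimal Python sketch of the quadratic mean, following the verbal definition (square root of the arithmetic mean of the squares), is given below. The function name quadratic_mean and the sample observations are our own illustrative choices.

```python
from math import sqrt


def quadratic_mean(values):
    """Square root of the arithmetic mean of the squared observations."""
    values = list(values)
    if not values:
        raise ValueError("at least one observation is required")
    return sqrt(sum(x * x for x in values) / len(values))


# Hypothetical observations: squares are 1, 25 and 49, whose mean is 25.
print(quadratic_mean([1, 5, 7]))  # 5.0
```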
Moving Average
This is a special type of average used to eliminate periodic fluctuations from the time series data.
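To see how a moving average removes periodic fluctuations, consider the sketch below. The series is hypothetical and repeats with a period of three; taking the moving average over three periods flattens it completely. The function name moving_average is our own.

```python
def moving_average(series, window):
    """Simple moving averages of the given window length."""
    if window < 1 or window > len(series):
        raise ValueError("window must lie between 1 and the series length")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical data with a regular 3-period cycle around a level of 12.
sales = [10, 14, 12, 10, 14, 12, 10, 14, 12]

print(moving_average(sales, 3))
# [12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0]
```

The cycle disappears only because the window equals the period of the fluctuation; with a different window some fluctuation would generally remain.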
Progressive Average
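A progressive average is a cumulative average which is computed by taking all the available figures in each succeeding year. If a1, a2, a3, ...... are the figures of successive years, the averages for different periods are a1 for the first year, (a1 + a2)/2 for the second year, (a1 + a2 + a3)/3 for the third year, and so on.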
This average is often used in the early years of a business.
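A progressive average can be sketched in Python as a running total divided by the number of years so far. The profit figures are hypothetical and the function name progressive_average is our own.

```python
from itertools import accumulate


def progressive_average(values):
    """Cumulative averages: the k-th entry averages the first k figures."""
    return [total / k for k, total in enumerate(accumulate(values), start=1)]


profits = [20, 30, 40, 50]  # hypothetical figures for four successive years

print(progressive_average(profits))  # [20.0, 25.0, 30.0, 35.0]
```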
Composite Average

GEOMETRIC MEAN

The geometric mean of a series of n positive observations is defined as the nth root of their product.
Calculation of Geometric Mean
(a) Individual series
If there are n observations, X1, X2, ...... Xn, such that Xi > 0 for each i, their geometric mean (GM) is defined as
GM = (X1 × X2 × ...... × Xn)^(1/n)
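In practice the nth root of a product is usually computed through logarithms, since the product itself may become very large. The following Python sketch does this; the function name geometric_mean is our own (the standard statistics module also offers a function of the same name in recent Python versions), and the observations are chosen only for illustration.

```python
from math import exp, log


def geometric_mean(values):
    """nth root of the product of n positive observations, via logarithms."""
    values = list(values)
    if not values:
        raise ValueError("at least one observation is required")
    if any(x <= 0 for x in values):
        raise ValueError("all observations must be positive")
    return exp(sum(log(x) for x in values) / len(values))


print(round(geometric_mean([5, 25, 125, 625]), 1))  # 55.9
print(round(geometric_mean([4, 9]), 6))             # 6.0
```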
Average Rate of Growth of Population
The average rate of growth of price, denoted by r in the above section, can also be interpreted as the average rate of growth of population. If P0 denotes the population in the beginning of the period and Pn the population after n years, using Equation (2), we can write the expression for the average rate of change of population per annum as
r = (Pn/P0)^(1/n) - 1
Similarly, Equation (4), given above, can be used to find the average rate of growth of population when its rates of growth in various years are given.
Remarks: The formulae of price and population changes, considered above, can also be extended to various other situations like growth of money, capital, output, etc.
Example : The population of a country increased from 2,00,000 to 2,40,000 within a period of 10 years. Find the average rate of growth of population per year.
Solution: Let r be the average rate of growth of population per year for the period of 10 years. Let P0 be initial and P10 be the final population for this period. We are given P0 = 2,00,000 and P10 = 2,40,000.
∴ 1 + r = (P10/P0)^(1/10) = (2,40,000/2,00,000)^(1/10) = (1.2)^(1/10) = 1.018
Thus, r = 1.018 - 1 = 0.018. Hence, the percentage rate of growth = 0.018 × 100 = 1.8% p.a.
Example : The gross national product of a country was Rs 20,000 crores 5 years ago. If it is Rs 30,000 crores now, find the annual rate of growth of G.N.P.
Solution: Here P5 = 30,000, P0 = 20,000 and n = 5.
∴ 1 + r = (30,000/20,000)^(1/5) = (1.5)^(1/5) = 1.084
Hence r = 1.084 - 1 = 0.084. Thus, the percentage rate of growth of G.N.P. is 8.4% p.a.
Example : Find the average rate of increase of population per decade, which increased by 20% in first, 30% in second and 40% in the third decade.
Solution: Let r denote the average rate of growth of population per decade, then
1 + r = (1.20 × 1.30 × 1.40)^(1/3) = (2.184)^(1/3) = 1.297
Hence, the percentage rate of growth of population per decade is 29.7%.
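The three growth-rate examples above can be reproduced with the Python sketch below. The two function names are our own; the first covers the case where the initial and final values are known, the second the case where the growth rates of individual periods are known.

```python
def average_growth_rate(initial, final, periods):
    """Average rate r per period such that final = initial * (1 + r) ** periods."""
    return (final / initial) ** (1 / periods) - 1


def average_of_growth_rates(rates):
    """Average rate per period from the rates of individual periods (as fractions)."""
    product = 1.0
    for r in rates:
        product *= 1 + r
    return product ** (1 / len(rates)) - 1


population_rate = average_growth_rate(200_000, 240_000, 10)
gnp_rate = average_growth_rate(20_000, 30_000, 5)
decade_rate = average_of_growth_rates([0.20, 0.30, 0.40])

print(round(100 * population_rate, 1))  # 1.8
print(round(100 * gnp_rate, 1))         # 8.4
print(round(100 * decade_rate, 1))      # 29.7
```

Note that the simple arithmetic mean of 20%, 30% and 40% would be 30%, slightly more than the 29.7% obtained from the geometric mean.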
Suitability of Geometric Mean for Averaging Ratios
It will be shown here that the geometric mean is more appropriate than arithmetic mean while averaging ratios. Let there be two values of each of the variables x and y, as given below:
We note that the product of the arithmetic means of the ratios x/y and y/x is not equal to unity. However, the product of their respective geometric means, i.e., 1/√6 and √6, is equal to unity. Since it is desirable that a method of average should be independent of the way in which a ratio is expressed, it seems reasonable to regard geometric mean as more appropriate than arithmetic mean while averaging ratios.
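The point can be illustrated with a short Python sketch. The values of x and y used here are assumptions made for this sketch, chosen only so that the geometric means of the two sets of ratios come out as 1/√6 and √6; they are not taken from the original table.

```python
from math import prod

# Hypothetical values, chosen so that x/y takes the values 1/2 and 1/3.
x = [1, 1]
y = [2, 3]

x_over_y = [a / b for a, b in zip(x, y)]
y_over_x = [b / a for a, b in zip(x, y)]


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return prod(values) ** (1 / len(values))


am_product = arithmetic_mean(x_over_y) * arithmetic_mean(y_over_x)
gm_product = geometric_mean(x_over_y) * geometric_mean(y_over_x)

print(round(am_product, 4))  # 1.0417
print(round(gm_product, 4))  # 1.0
```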
Properties of Geometric Mean
As in the case of arithmetic mean, where the sum of deviations of values from the AM is zero, the sum of deviations of logarithms of values from the log GM is equal to zero. This property implies that the product of the ratios of GM to each observation that is less than it is equal to the product of the ratios of each observation that is greater than it to GM. For example, if the observations are 5, 25, 125 and 625, their GM = 55.9. The above property implies that 55.9/5 × 55.9/25 = 125/55.9 × 625/55.9.
Similar to the arithmetic mean, where the sum of observations remains unaltered if each observation is replaced by their AM, the product of observations remains unaltered if each observation is replaced by their GM.
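Both properties can be checked numerically for the observations 5, 25, 125 and 625 with the following Python sketch. The variable names are our own, and small rounding differences are to be expected in floating-point arithmetic.

```python
from math import log, prod

values = [5, 25, 125, 625]
gm = prod(values) ** (1 / len(values))

# Property 1: deviations of the logarithms from log GM add up to zero.
log_deviation_sum = sum(log(x) - log(gm) for x in values)

# Equivalent form: product of GM/x over smaller items equals
# product of x/GM over larger items.
below = prod(gm / x for x in values if x < gm)
above = prod(x / gm for x in values if x > gm)

# Property 2: replacing every observation by GM leaves the product unchanged.
product_with_gm = gm ** len(values)

print(round(gm, 1))                      # 55.9
print(abs(log_deviation_sum) < 1e-9)     # True
print(round(below, 6), round(above, 6))  # 25.0 25.0
print(round(product_with_gm))            # 9765625
```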
Merits, Demerits and Uses of Geometric Mean
Merits
It is based on all the items of the data.
It is rigidly defined. It means different investigators will find the same result from the given set of data.
It is a relative measure and gives less importance to large items and more to small ones, unlike the arithmetic mean.
Geometric mean is useful in ratios and percentages and in determining rates of increase or decrease.
It is capable of algebraic treatment. It means we can find out the combined geometric mean of two or more series.
Demerits
It is not easily understood and therefore is not widely used.
It is difficult to compute as it involves the knowledge of ratios, roots, logs and antilogs.
It becomes indeterminate in case any value in the given series happens to be zero or negative.
With open-end class intervals of the data, geometric mean cannot be calculated.
Geometric mean may not correspond to any value of the given data.
Uses
It is most suitable for averaging ratios and exponential rates of changes.
It is used in the construction of index numbers.
It is often used to study certain social or economic phenomena.
Exercise with Hints
A sum of money was invested for 4 years. The respective rates of interest per annum were 4%, 5%, 6% and 8%. Determine the average rate of interest p.a.
The number of bacteria in a certain culture was found to be 4 × 10^6 at noon of one day. At noon of the next day, the number was 9 × 10^6. If the number increased at a constant rate per hour, how many bacteria were there at the intervening midnight?
Hint: The number of bacteria at midnight is GM of 4 × 10^6 and 9 × 10^6.
If the price of a commodity doubles in a period of 4 years, what is the average percentage increase per year?
Hint
A machine is assumed to depreciate by 40% in value in the first year, by 25% in second year and by 10% p.a. for the next three years, each percentage being calculated on the diminishing value. Find the percentage depreciation p.a. for the entire period.
Hint
A certain store made profits of Rs 5,000, Rs 10,000 and Rs 80,000 in 1965, 1966 and 1967 respectively. Determine the average rate of growth of its profits.
Hint
An economy grows at the rate of 2% in the first year, 2.5% in the second, 3% in the third, 4% in the fourth ...... and 10% in the tenth year. What is the average rate of growth of the economy?
Hint
The export of a commodity increased by 30% in 1988, decreased by 22% in 1989 and then increased by 45% in the following year. The increase/decrease, in each year, being measured in comparison to its previous year. Calculate the average rate of change of the exports per annum.
Hint
Show that the arithmetic mean of two positive numbers a and b is at least as large as their geometric mean.
Hint: We know that the square of the difference of two numbers is never negative, i.e., (a - b)^2 ≥ 0. Make adjustments to get the inequality (a + b)^2 ≥ 4ab and then get the desired result, i.e., AM ≥ GM.
If population has doubled itself in 20 years, is it correct to say that the rate of growth has been 5% per annum?
Hint
The weighted geometric mean of 5 numbers 10, 15, 25, 12 and 20 is 17.15. If the weights of the first four numbers are 2, 3, 5, and 2 respectively, find weight of the fifth number.
Hint: Let x be the weight of the 5th number, then log 17.15 = (2 log 10 + 3 log 15 + 5 log 25 + 2 log 12 + x log 20)/(12 + x). Solve for x.

RELATION BETWEEN MEAN, MEDIAN AND MODE

The relationship between the above measures of central tendency will be interpreted in terms of a continuous frequency curve. If the number of observations of a frequency distribution is increased gradually, then accordingly, we need to have more number of classes, for approximately the same range of values of the variable, and simultaneously, the width of the corresponding classes would decrease. Consequently, the histogram of the frequency distribution will get transformed into a smooth frequency curve, as shown in the following figure.
For a given distribution, the mean is the value of the variable which is the point of balance or centre of gravity of the distribution. The median is the value such that half of the observations are below it and remaining half are above it. In terms of the frequency curve, the total area under the curve is divided into two equal parts by the ordinate at median. Mode of a distribution is a value around which there is maximum concentration of observations and is given by the point at which peak of the curve occurs. For a symmetrical distribution, all the three measures of central tendency are equal, i.e., X̄ = Md = Mo, as shown in the following figure.
Imagine a situation in which the symmetrical distribution is made asymmetrical or positively (or negatively) skewed by adding some observations of very high (or very low) magnitudes, so that the right hand (or the left hand) tail of the frequency curve gets elongated.
Consequently, the three measures will depart from each other. Since mean takes into account the magnitudes of observations, it would be highly affected. Further, since the total number of observations will also increase, the median would also be affected but to a lesser extent than mean. Finally, there would be no change in the position of mode. More specifically, we shall have Mo < Md < X̄ when skewness is positive and X̄ < Md < Mo when skewness is negative, as shown in the following figure.
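The effect of adding extreme observations can be seen in a small numerical experiment. The data below are hypothetical; the sketch uses mean, median and mode from Python's standard statistics module.

```python
from statistics import mean, median, mode

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]       # hypothetical symmetrical data
right_tail = symmetric + [9, 10, 12, 15]      # a few very high values added
left_tail = symmetric + [-3, -4, -6, -9]      # a few very low values added

for name, data in [
    ("symmetrical", symmetric),
    ("positive skew", right_tail),
    ("negative skew", left_tail),
]:
    print(name, mode(data), median(data), round(mean(data), 2))

# symmetrical 3 3 3
# positive skew 3 4 5.62
# negative skew 3 2 0.38
```

In each skewed case the mode stays at 3, the median moves by one unit and the mean moves the most, in line with the ordering stated above.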
Empirical Relation between Mean, Median and Mode
Empirically, it has been observed that for a moderately skewed distribution, the difference between mean and mode is approximately three times the difference between mean and median, i.e.,
Mean - Mode = 3 (Mean - Median) or Mode = 3 Median - 2 Mean.
This relation can be used to estimate the value of one of the measures when the values of the other two are known.
Example :
(a) The mean and median of a moderately skewed distribution are 42.2 and 41.9 respectively. Find mode of the distribution.
(b) For a moderately skewed distribution, the median price of men's shoes is Rs 380 and modal price is Rs 350. Calculate mean price of shoes.
Solution:
(a) Here, mode will be determined by the use of the empirical formula.
Mode = 3 Median - 2 Mean = 3 × 41.9 - 2 × 42.2 = 125.7 - 84.4 = 41.3
(b) Using the same relation, Mean = (3 Median - Mode)/2 = (3 × 380 - 350)/2 = Rs 395.
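The empirical relation, Mean - Mode = 3 (Mean - Median), can be rearranged to estimate any one of the three measures from the other two. The Python sketch below does this for the two parts of the example; the function names are our own, and the results are only approximations valid for moderately skewed distributions.

```python
def estimate_mode(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def estimate_mean(median, mode):
    """Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


def estimate_median(mean, mode):
    """Median = (2 Mean + Mode) / 3."""
    return (2 * mean + mode) / 3


print(round(estimate_mode(42.2, 41.9), 1))  # 41.3  (part a)
print(estimate_mean(380, 350))              # 395.0 (part b)
print(estimate_median(395, 350))            # 380.0 (check on part b)
```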
Choice of a Suitable Average
The choice of a suitable average, for a given set of data, depends upon a number of considerations which can be classified into the following broad categories:
(a) Considerations based on the suitability of the data for an average.
(b) Considerations based on the purpose of investigation.
(c) Considerations based on various merits of an average.
(a) Considerations based on the suitability of the data for an average:
The nature of the given data may itself indicate the type of average that could be selected. For example, the calculation of mean or median is not possible if the characteristic is neither measurable nor can be arranged in certain order of its intensity. However, it is possible to calculate mode in such cases. Suppose that the distribution of votes polled by five candidates of a particular constituency are given as below:
Since the above characteristic, i.e., name of the candidate, is neither measurable nor can be arranged in the order of its intensity, it is not possible to calculate the mean and median. However, the mode of the distribution is D and hence, it can be taken as the representative of the above distribution.
If the characteristic is not measurable but various items of the distribution can be arranged in order of intensity of the characteristics, it is possible to locate median in addition to mode. For example, students of a class can be classified into four categories as poor, intelligent, very intelligent and most intelligent. Here the characteristic, intelligence, is not measurable. However, the data can be arranged in ascending or descending order of intelligence. It is not possible to calculate mean in this case.
If the characteristic is measurable but class intervals are open at one or both ends of the distribution, it is possible to calculate median and mode but not a satisfactory value of mean. However, an approximate value of mean can also be computed by making a certain assumption about the width of the class(es) having open ends.
If the distribution is skewed, the median may represent the data more appropriately than mean and mode.
If various class intervals are of unequal width, mean and median can be satisfactorily calculated. However, an approximate value of mode can be calculated by making class intervals of equal width under the assumption that observations in a class are uniformly distributed. The accuracy of the computed mode will depend upon the validity of this assumption.
(b) Considerations based on the purpose of investigation:
The choice of an appropriate measure of central tendency also depends upon the purpose of investigation. If the collected data are the figures of income of the people of a particular region and our purpose is to estimate the average income of the people of that region, computation of mean will be most appropriate. On the other hand, if it is desired to study the pattern of income distribution, the computation of median, quartiles or percentiles, etc., might be more appropriate. For example, the median will give a figure such that 50% of the people have income less than or equal to it.
Similarly, by calculating quartiles or percentiles, it is possible to know the percentage of people having at least a given level of income or the percentage of people having income between any two limits, etc.
If the purpose of investigation is to determine the most common or modal size of the distribution, mode is to be computed, e.g., modal family size, modal size of garments, modal size of shoes, etc. The computation of mean and median will provide no useful interpretation of the above situations.
(c) Considerations based on various merits of an average:
The presence or absence of various characteristics of an average may also affect its selection in a given situation.
If the requirement is that an average should be rigidly defined, mean or median can be chosen in preference to mode because mode is not rigidly defined in all the situations.
An average should be easy to understand and easy to interpret. This characteristic is satisfied by all the three averages.
It should be easy to compute. We know that all the three averages are easy to compute. It is to be noted here that, for the location of median, the data must be arranged in order of magnitude. Similarly, for the location of mode, the data should be converted into a frequency distribution. This type of exercise is not necessary for the computation of mean.
It should be based on all the observations. This characteristic is met only by mean and not by median or mode.
It should be least affected by the fluctuations of sampling. If a number of independent random samples of same size are taken from a population, the variations among means of these samples are less than the variations among their medians or modes. These variations are often termed as sampling variations.
Therefore, preference should be given to mean when the requirement of least sampling variations is to be fulfilled. It should be noted here that if the population is highly skewed, the sampling variations in mean may be larger than the sampling variations in median.
It should not be unduly affected by the extreme observations. The mode is most suitable average from this point of view. Median is only slightly affected while mean is very much affected by the presence of extreme observations.
It should be capable of further mathematical treatment. This characteristic is satisfied only by mean and, consequently, most of the statistical theories use mean as a measure of central tendency.
It should not be affected by the method of grouping of observations. Very often the data are summarized by grouping observations into class intervals. The chosen average should not be much affected by the changes in size of class intervals.
It can be shown that if the same data are grouped in various ways by taking class intervals of different size, the effect of grouping on mean and median will be very small particularly when the number of observations is very large. Mode is very sensitive to the method of grouping.
It should represent the central tendency of the data. The main purpose of computing an average is to represent the central tendency of the given distribution and, therefore, it is desirable that it should fall in the middle of distribution. Both mean and median satisfy this requirement but in certain cases mode may be at (or near) either end of the distribution.
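The statement above regarding fluctuations of sampling can be examined by simulation. The sketch below draws repeated random samples from a hypothetical symmetrical (normal) population with mean 50 and standard deviation 10, and compares the spread of the sample means with the spread of the sample medians. The population, the sample size of 25, the 2000 repetitions and the seed are all assumptions made for this illustration.

```python
import random
from statistics import mean, median, pstdev


def sampling_spread(statistic, draw, sample_size, n_samples, rng):
    """Standard deviation of a statistic over repeated random samples."""
    estimates = [
        statistic([draw(rng) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return pstdev(estimates)


def normal_draw(rng):
    # Hypothetical symmetrical population: mean 50, standard deviation 10.
    return rng.gauss(50, 10)


rng = random.Random(12345)  # fixed seed so that the run is repeatable
spread_of_means = sampling_spread(mean, normal_draw, 25, 2000, rng)
spread_of_medians = sampling_spread(median, normal_draw, 25, 2000, rng)

print(round(spread_of_means, 2), round(spread_of_medians, 2))
print(spread_of_means < spread_of_medians)
```

For a normal population, theory suggests a spread of about 10/√25 = 2 for the means and a somewhat larger spread for the medians. For a highly skewed population the comparison may go the other way, as noted above.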