
ESOMAR

Cross Media Conference, Montreal, June 2005


www.esomar.org

Estimation method for media audience duplication


Patricio Moyano Galdames and Orlando Muñoz Balmaceda
Time Ibope, Chile
Elias Selman Carranza
Ibope Time Pacific, Chile
OVERVIEW
Modeling the duplication of vehicle audiences has a long history in our field, with both successes and failures, from Agostini's method,1 with its
controversial K constant, to the more broadly accepted Metheringham method (beta-binomial distribution).2 But this outstanding pioneer in the study of
audience duplication left us with a serious problem: the estimated reach can decline when a spot is added whose rating is lower than the average rating of
the previous spots.3 The discussion of this problem has not advanced significantly. This undesirable effect (the decline of reach) led many of our
colleagues to improve their estimations using proprietary models of a similar type.4 However, these experiences are all linked principally to readership
estimations used to solve the problem of advertising in print media.
In television, the considerations are somewhat different, as people's exposure depends on day parts and programming schedules. Therefore, it is
necessary to analyze intra- and inter-channel duplication. Initial investigations assumed constant duplication5 in the set of media outlets analyzed.
One approach that simplifies the analysis, and which has been used for a long time, is to assume that duplication is a random event and that
consumption of a media outlet, program, etc., is an independent phenomenon. But this solution is questionable with respect to the estimation of media
reach.
With the rise of personal computers, there has been an explosive growth in the analysis of media plans and of software used for this purpose with
evaluation functionalities6 that provide statistics on reach, average frequency, exposure distributions, etc. It should be pointed out that these systems
work by calculating real duplication using a raw database produced by research, especially using the People Meter system. In some cases a final
adjustment is performed in order to match the published GRPs (daily) with those calculated using a constant sample panel that is generally formed on the
middle day7 of the period being evaluated.
The need for combined media assessments has led market researchers to design a methodology generically known as single source.8 When inquiring
about the consumption of different media in a single interview, it is possible to use this same data to perform multimedia evaluations using the same
sample. This is unquestionably an adequate solution, but the information it provides is more for media strategy (long-term), while the purchase of space
in vehicles is more closely related to media tactics (short-term), particularly in television, where the fight for audience occurs on a daily basis and
specialized studies, such as those of readership or those using the People Meter system, provide more accurate and detailed data.
Another alternative, which is both new and promising, is data fusion. This interesting technique uses common elements to match up the contents of
various databases, thereby creating one single database. It is also possible to assume that a multimedia estimation is an adequate approximation,
except that the complexity involved in matching more than two databases requires the additional assumption that the matching variables are sufficient to
establish a consistent fusion.
In short, there are different approaches to the problem of evaluating media plans, especially mixed plans that combine several media according to the
different needs of communication campaigns for products and services.
Our approach takes into account the fact that specialized studies for audience measurement provide the highest-quality and most detailed data, and that
they are used intensively in the purchase of spaces. What is lacking is a consistent link that would allow consolidation of the results of globally-viewed
advertising media plans with high precision and low information loss.
THE MODEL
We define f(x) as the frequency distribution of a given schedule in media outlet A, and g(y) as the frequency distribution of another
schedule in media outlet B. The problem is then to create an intermediate joint distribution, which we will call h(x,y), that describes the combined effect
of the two schedules on the campaign's target group. In the following stage, this joint distribution is consolidated into a new distribution that we will call T(z).
The distributions of X and Y are thus the marginal distributions of the joint distribution obtained from the algorithm that we will describe below.
The expected value E(X) of each distribution is equal to the GRPs of its schedule (see Equation 1).
The final distribution T(z) must then satisfy the following conditions: its GRPs must equal the sum of the GRPs of the two distributions, and its
total reach must be equal to or greater than the larger of the two reaches and no greater than the reach produced under the hypothesis of independence.
Conditions:
1. GRPs(T) = GRPs(X) + GRPs(Y)
2. Reach(T) ≥ Max(Reach(X), Reach(Y))
3. Reach(T) ≤ Reach(X) + Reach(Y) − Reach(X) × Reach(Y)
Where Reach = 1 − f(0), that is, 1 minus the proportion of those who do not see the schedule.
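The three conditions can be checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes that a frequency distribution is given as a list of proportions indexed by the number of exposures, and that GRPs are expressed as a proportion rather than in percentage points. The function names and the two sample distributions are hypothetical.

```python
def grps(dist):
    """Expected number of exposures, E(X) = sum of x * f(x)."""
    return sum(x * p for x, p in enumerate(dist))


def reach(dist):
    """Reach = 1 - f(0): the proportion exposed at least once."""
    return 1.0 - dist[0]


def reach_bounds(reach_x, reach_y):
    """Interval allowed for Reach(T) by conditions 2 and 3."""
    lower = max(reach_x, reach_y)                   # one audience contained in the other
    upper = reach_x + reach_y - reach_x * reach_y   # independent audiences
    return lower, upper


# Hypothetical distributions f(x) and g(y); each sums to 1.
f = [0.50, 0.30, 0.20]
g = [0.60, 0.25, 0.15]

low, high = reach_bounds(reach(f), reach(g))
print(f"GRPs(T) = {grps(f):.2f} + {grps(g):.2f} = {grps(f) + grps(g):.2f}")
print(f"Reach(T) must lie in [{low:.2f}, {high:.2f}]")
```

Any candidate value for the consolidated reach that falls outside this interval would violate condition 2 or condition 3.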
This procedure begins by calculating the new total reach, which is assumed to be within the interval of the conditions described above.
The Limits of the Reach from the Point of View of Set Theory
1. The maximum reach of both schedules (see Figure 4)
For the minimum level, we assume that the audience reached by schedule A is contained in (is a subset of) the audience reached by schedule B, which means
that the reach of A does not exceed the reach of B (see Equation 2).
This means that when consolidating both schedules, the total reach is equal to the reach of B.
2. The schedule B is independent from A (see Equation 3)
This means that the intersection of schedules A and B is the product of their probabilities (see Equation 4).
This means that the maximum level is equal to random duplication.
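Equations 2 to 4 are not reproduced in this text. Under the two assumptions just described, the limits follow from elementary set identities. The block below is a reconstruction from the prose, not the authors' typeset equations; the notation (A and B for the audiences reached by each schedule, P for a proportion of the target group) is assumed here.

```latex
\begin{align*}
A \subseteq B
  &\;\Rightarrow\; P(A \cup B) = P(B) = \max\{P(A),\, P(B)\} \\
A \text{ independent of } B
  &\;\Rightarrow\; P(A \cap B) = P(A)\,P(B) \\
  &\;\Rightarrow\; P(A \cup B) = P(A) + P(B) - P(A)\,P(B)
\end{align*}
```

The first line gives the lower limit of the consolidated reach and the last line gives the upper limit, matching conditions 2 and 3 of the model.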
CREATING THE FACTORS
First, the factors that determine the maximum and minimum levels of reach are calculated; the interval of the solution of the total reach (the result of
consolidating both schedules) is defined here.
Factor: Maximum Reach of Both Schedules. This corresponds to the factor with which the reach of the lower limit of the consolidated schedule is obtained
(see Equation 5).
Factor: Random Reach (Independence). This corresponds to the factor with which the reach of the upper limit of the consolidated schedule is obtained
(see Equation 6).
New Factor or Probabilistic Factor. The factor most likely to occur (see Equation 7).
The new total reach or mixed reach of the consolidated model is determined using the probabilistic factor (see Equation 8).
Figure 1 depicts the curves of the three factors as a function of increases in the GRPs. They make the reach increase, but with decreasing returns. The
factor acts to decelerate the reach as a function of increases in the GRPs, as they are basically the OTS minus the GRPs.
The step that follows the estimation of the mixed reach of the schedules being consolidated is to calculate the ratio that allows the joint distribution of the
distributions to be created.
Estimation of the Ratio: This Ratio will make it possible to distribute the joint proportions of the schedule distributions that are being consolidated (see
Equation 9).9
Where random duplication (Duprdm) is: (see Equation 10)
Within the concept of probabilities, this duplication involves independence between the media that are being consolidated. In other words, the
consumption of one media outlet has no influence on the consumption of the other.
Where actual duplication is: (see Equation 11)
In the following section we present an example that illustrates the application of this method.
ILLUSTRATION
In order to illustrate the methodology that consolidates the frequency distributions of various media, two schedules were created: a television schedule
and a print media schedule. The schedules were then evaluated using the software currently available in the Chilean market, TVdata and PrintPlan, which
are used for television and print media schedules, respectively.
The target group used here is all people in Chile between the ages of 25 and 54; this universe comprises 2,149,519 people. Figure 2
shows the output of the software when applied to the two schedules being studied.
The frequency distributions of individual media are presented in Table 1 (also see Table 2).
Estimation of the Maximum Factor. This is the factor that corresponds to the lower limit of the consolidated schedule. In this case it gives the minimum reach
possible in the new distribution, that is, the larger of the reaches of the two media when evaluated independently. Let Reachmax be the maximum reach of the
two schedules, let GRPtv be the GRP obtained in the television schedule, let GRPpr be the GRP obtained in the print media schedule, and let GRPmix be
GRPtv + GRPpr (see Equation 12).
Estimation of the Random Factor. This is the factor that corresponds to the upper limit of the consolidated schedule. In this case, it corresponds to the
random duplication, which means assuming independence between the media. Let Reachrdm be the reach when assuming random duplication (Duprdm)
(see Equation 13).

Estimation of the Probabilistic Factor. This is an average of the Maximum and Random factors, weighted according to their respective reaches (see
Equation 14).
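The weighting described above can be sketched as follows. Equations 12 to 14 are not reproduced in this text, so the sketch takes the two factors as given inputs and illustrates only one plausible reading of "weighted according to their respective reaches". The function name and the numeric values are hypothetical and are not taken from the paper's tables.

```python
def probabilistic_factor(factor_max, reach_max, factor_rdm, reach_rdm):
    """Average of the two factors, each weighted by its own reach."""
    weighted_sum = factor_max * reach_max + factor_rdm * reach_rdm
    return weighted_sum / (reach_max + reach_rdm)


# Hypothetical inputs, chosen only to show the mechanics.
weighted = probabilistic_factor(factor_max=0.30, reach_max=0.60,
                                factor_rdm=0.40, reach_rdm=0.80)
print(round(weighted, 4))
```

Because the result is a convex combination, it always falls between the Maximum and Random factors.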
Estimation of the Reach of the Consolidated Schedule: The weighted factor is used to determine the reach of the consolidated schedule (Reachmix). It is
calculated as follows in Equation 15.
Calculation of Actual Duplication. Once the reach of the consolidated schedule has been determined, the actual duplication of the television and print
media schedules can be determined. We designate this duplication as Dupact. We know that the reach of the consolidated schedule (Reachmix) is the
sum of the reaches of the television schedule (Reachtv) and the print media schedule (Reachpr), minus their duplication (Dupact); that is: (see Equation 16)
This produces: (see Equation 17)
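The relation stated above is the inclusion-exclusion identity, so the actual duplication follows by rearranging it. The sketch below uses hypothetical reach values, since Table 2 is not reproduced in this text. It also computes the random duplication as the product of the two reaches, as stated in the model section. The function names are illustrative.

```python
def actual_duplication(reach_tv, reach_pr, reach_mix):
    """Dupact = Reachtv + Reachpr - Reachmix (inclusion-exclusion)."""
    return reach_tv + reach_pr - reach_mix


def random_duplication(reach_tv, reach_pr):
    """Duprdm = Reachtv * Reachpr, the overlap under independence."""
    return reach_tv * reach_pr


# Hypothetical reaches, expressed as proportions of the target group.
reach_tv, reach_pr, reach_mix = 0.60, 0.50, 0.75

dup_act = actual_duplication(reach_tv, reach_pr, reach_mix)
dup_rdm = random_duplication(reach_tv, reach_pr)
print(f"actual: {dup_act:.4f}, random: {dup_rdm:.4f}")
```

Since the consolidated reach cannot exceed the reach under independence, the actual duplication is never smaller than the random duplication.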
Estimation of the Frequency Distribution of the Consolidated Schedule: In order to estimate the consolidated frequency distribution, the proportion of
non-impacted individuals (zero frequency) in both distributions (television and print media) must be recalculated.
1. Estimation of the Ratio: (see Equation 18)
2. Re-estimation of the proportion of individuals not reached by each schedule: (see Equation 19)
Where:
P*(·,0): Re-estimation of the proportion of people not reached by the television schedule.
P(j=0): Original proportion of people not reached by the television schedule.
P*(0,·): Re-estimation of the proportion of people not reached by the print media schedule.
P(i=0): Original proportion of people not reached by the print media schedule (see Tables 3 and 4).
The distribution of the consolidated frequency is obtained from the matrix in which the frequency distributions of the media are combined with their
recalculated zero frequency.
Let:
P(i,j): Matrix cell i,j contains the proportion of individuals exposed i times to the print media model and j times to the television model, where i=0,1,...,11+
and j=0,1,...,9+.
P(·,j): Frequency distribution of television with the recalculated zero frequency.
P(i,·): Frequency distribution of print media with the recalculated zero frequency.
Matrix cell 0,0 contains the proportion of individuals not exposed to either of the two schedules, meaning the people who are not exposed to the
consolidated schedule. The method of calculating this is as follows in Equation 20.
The remaining cells in the matrix are calculated the same way. To exemplify our methodology, the calculation of cell (1,0) is shown (see Equation 21).
Cell (i,j) indicates the proportion of individuals who were impacted i times by the print media schedule and j times by the television schedule. Adding up
the diagonals of the matrix produces the respective frequencies of the consolidated schedule. By way of illustration, Table 4 shows the diagonal that
corresponds to frequency 4 of the consolidated distribution (0.0056 + 0.0284 + 0.0387 + 0.0349 + 0.0145 = 0.1221).
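The diagonal summation can be sketched in a few lines. The joint matrix below is hypothetical, because Table 4 is not reproduced in this text. Only the five values of the frequency-4 diagonal are taken from the paper. The function name is illustrative.

```python
def consolidate(matrix):
    """Sum the anti-diagonals of a joint matrix P[i][j] into T(z), z = i + j."""
    rows, cols = len(matrix), len(matrix[0])
    total = [0.0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            total[i + j] += matrix[i][j]
    return total


# Hypothetical 3 x 3 joint matrix (rows: print exposures, columns: TV exposures).
joint = [
    [0.30, 0.10, 0.05],
    [0.15, 0.10, 0.05],
    [0.10, 0.10, 0.05],
]
t = consolidate(joint)
print([round(p, 4) for p in t])

# The frequency-4 diagonal quoted in the text above.
diagonal_4 = [0.0056, 0.0284, 0.0387, 0.0349, 0.0145]
print(round(sum(diagonal_4), 4))
```

In the paper the highest frequency classes are open-ended (11+ and 9+), so a full implementation would also need a rule for the top class of the consolidated distribution; the sketch does not address this.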
Table 5 shows the frequency distribution after consolidating the television and print media distributions, which produces a reach of 75.04 (100 − 24.96)
and a GRP of 247. This coincides with the sum of the GRPs obtained by the television and print media schedules. Figure 3 shows a graph of the consolidated
frequency distribution.
Table 6 summarizes different evaluations of combined schedules, comparing the proposed method with the results obtained by treating the data as a
single source.
CONCLUSION
This model incorporates a methodology which allows consistent evaluations to be performed using diverse data sources, especially those from
specialized media studies, due to the fact that it uses the final data of each distribution.
Table 6 compares the results of this method with those obtained by processing the data as a single source.
This confirms that the differences are not statistically significant, which validates the model.
FOOTNOTES
1. Agostini, J.M. How to Estimate Unduplicated Audiences. JAR, March 1963.
2. Metheringham, R. A. Measuring the Net Cumulative Coverage of a Print Campaign. JAR, December 1964.
3. Leckenby, J.D. and M.D. Rice. The Declining Reach Phenomenon in Exposure Distribution Models. Journal of Advertising (15), 1986.
4. Metrex, TruCume and MetherPlus are a few examples of improved estimation models.
5. See Goodhart and Ehrenberg's 1969 papers.
6. Time Ibope provides the TV data software that was used to carry out these evaluations.
7. This adjustment generally uses probabilistic negative binomial distributions.
8. For example, TGI (Target Group Index) and EGM (Estudio General de Medios, General Media Study).
9. For more information, see Katz, Lancaster. Strategic Media Planning.

NOTES & EXHIBITS

EQUATION 1

FIGURE 4

EQUATION 2

EQUATION 3

EQUATION 4

EQUATION 5

EQUATION 6

EQUATION 7

EQUATION 8

FIGURE 1: SOLUTION OF THE REACH COMBINED

EQUATION 9

EQUATION 10

EQUATION 11

FIGURE 2: RESULTS PRODUCED BY TVDATA AND PRINTPLAN SOFTWARE

TABLE 1: DISTRIBUTION OF TELEVISION AND PRINT MEDIA FREQUENCY

TABLE 2: SUMMARY OF THE RESULTS OF EVALUATIONS OF A TELEVISION SCHEDULE AND A PRINT MEDIA SCHEDULE

EQUATION 12

EQUATION 13

EQUATION 14

EQUATION 15

EQUATION 16

EQUATION 17

EQUATION 18

EQUATION 19

TABLE 3: FREQUENCY DISTRIBUTIONS OF TELEVISION AND PRINT MEDIA WITH RECALCULATED ZERO FREQUENCIES

TABLE 4: COMBINATION OF THE FREQUENCY DISTRIBUTIONS OF TELEVISION AND PRINT MEDIA

EQUATION 20

EQUATION 21

TABLE 5: FREQUENCY DISTRIBUTION AFTER CONSOLIDATING TELEVISION AND PRINT MEDIA DISTRIBUTIONS

FIGURE 3: CONSOLIDATED FREQUENCY DISTRIBUTION

TABLE 6: SUMMARY OF EVALUATIONS OF COMBINED SCHEDULES

Copyright ESOMAR 2005

