
Internet vs. TV Advertising: A Brand-Building Comparison


Michaela Draganska, The Wharton School
Wesley R. Hartmann, Stanford GSB
Gena Stanglein, Google

Abstract

A key issue for media planners determining the share of their advertising budgets to spend on Internet advertising is whether Internet advertising can build brands as effectively as television advertising. To address this question, we extend traditional brand-message recall measurement to facilitate comparisons between Internet formats and television. Specifically, we supplement brand-message surveys conducted during the campaign with a set of pre-campaign surveys to control for pre-existing brand knowledge, and use a matching procedure to ensure the pre-campaign sample is comparable to the in-flight one. For our analysis, we use a rich data set comprising 20 campaigns, across multiple industries ranging from consumer packaged goods to telecommunications. We find substantial cross-brand variation in pre-existing knowledge as well as variation across advertising formats. In particular, individuals exposed to Internet display ads have significantly lower levels of pre-existing brand knowledge than television viewers. Such differences in initial conditions suggest biases in comparisons between Internet and television ads, and possibly a more general failure of the brands to establish lasting associations among individuals shifting media consumption from TV to the Internet. Incorporating these pre-existing differences between media formats results in brand lift measures for Internet ads that are statistically indistinguishable from comparable television lift measures.

Keywords: advertising, display, television, Internet.
The authors would like to thank Oscar Mitnik and Amogh Vasekar for valuable assistance, as well as Rawley Cooper, Brent Davis and Scott McKinley at Nielsen for their help in executing the study. Draganska and Hartmann served as consultants during the survey design and administration phases of the project. Email: dragansk@wharton.upenn.edu Email: hartmann_wesley@gsb.stanford.edu Email: gstanglein@google.com

1 Introduction

Over the past decade advertising expenditures have shifted from traditional media to the Internet. In 2011, online advertising in the United States alone reached $32 billion and is projected to reach $62 billion by 2016 (eMarketer, February 2012 report). Internet portals determined to use their inventory to substitute for traditional advertising formats have turned to quantitative metrics to illustrate the advantages of online advertising. They are armed with the ability to readily observe behavioral responses on the web, such as click-through rates, and to conduct large-scale online experiments to provide the most accurate measure of the effectiveness of the ads in driving consumer purchasing decisions (Lewis & Reilley 2011, Goldfarb & Tucker 2011).

Nevertheless, many advertisers still hesitate to shift spending from television campaigns to the Internet, pointing to the established role of TV advertising in building brands. The solid experimental evidence quantifying the behavioral response to Internet advertising does not seem to be a sufficient reason, because no direct comparison to the effectiveness of TV as a brand-building medium is available. In general, TV experiments are costly and thus not scalable for widespread application to allow for a comparative study of the effectiveness of online and offline campaigns. Older experimental studies on TV advertising, most notably by Lodish, Abraham, Kalmenson, Livelsberger, Lubetkin, Richardson & Steve (1995a) and Lodish, Abraham, Kalmenson, Livelsberger, Lubetkin, Richardson & Steve (1995b), do not have Internet data. Moreover, given the typical sample sizes of such experiments, they frequently fail to detect a significant effect of advertising on sales for lack of power (Lewis & Reilley 2011), so we cannot rely on them. For that reason and because of the perception that TV advertising is the main medium for brand building, the metrics typically used to assess its effects are brand awareness and preference.
The rationale is that, although direct links to eventual purchase are sometimes possible, brand advertising on television is primarily aimed at influencing the mindset of a customer who may purchase anytime within a reasonably long horizon (Assmus, Farley & Lehmann 1984). By contrast, the effect of online advertisements has been measured mostly on outcomes such as click-through rates and generated sales. A few recent studies have questioned the emphasis on sales measures and have pointed to the brand-building potential of the Internet (Briggs & Hollis 1997, Dreze & Hussherr 2003). In one of the earliest studies of online advertising, Briggs & Hollis (1997) show that banner ads can also have an effect on brand awareness and image, even in the absence of a behavioral response such as a click-through. Using eye-tracking devices in conjunction with a large-scale survey, Dreze & Hussherr (2003) find that consumers avoid looking at banners, but there is still an effect on brand recall measures, suggesting a pre-attentive level of processing. This research implies that attitudinal measures may be appropriate not just for assessing the effectiveness of TV commercials but that of online advertising as well.

To date, however, no field data have been available to enable media planners to conduct an apples-to-apples comparison of the advertising effectiveness of online and offline media in terms of creating brand awareness and establishing brand associations. This paper seeks to fill this gap and measure Internet advertising performance, specifically the performance of various non-search advertising formats, according to the metrics advertisers have historically relied on for their television campaigns.[1]

We have a unique data set of 20 advertising campaigns spanning a wide variety of product categories and industries. In addition to TV commercials, we have data for Internet banner, rich media and video ads. The advertising campaigns use the online and TV advertising formats concurrently, and the effect of the commercials is assessed using the same brand-recall measure (ability of respondent to correctly link creative to brand) for all advertising formats, thus providing the data for a valid comparison of the effectiveness of the different media.
[1] At the start of 2011, Google, The ARF, Nielsen, Stanford, and Wharton collaborated on an initiative to enhance the media planning and buying process. The goal was to quantify cross-media ad-format effectiveness, and derive the relative impact of ad formats. The first phase of this project was a pilot measuring the brand cut-through (i.e., ability of consumers to correctly link a brand to a creative) of ad formats across ad campaigns.

The performance of an ad campaign is gauged relative to a baseline and is referred to as lift. One could posit, as is common in the industry, that absent advertising consumers would randomly associate brands with commercials. However, especially for mature brands, assuming consumers do not have any pre-existing brand knowledge due to exposure to past advertising, word of mouth, or other experiences with the brand is naive. In addition, for us to compare advertising effects for a given campaign across formats, potential customers who extensively use the Internet and those who predominantly watch TV need to have the same level of pre-existing familiarity with the brand. If the existing stock of past advertising differs by media behavior, and is thus dependent on the type of ad format to which an individual is likely to be exposed, using a constant baseline across formats would no longer yield a valid comparison.

To account for such potential disparities in the pre-existing familiarity with the brand across media, we have modified the traditional television recall methodology to include a pre-campaign survey to obtain the initial conditions of the advertising stocks for consumers with different media-consumption habits. We avoid a testing bias by employing a repeated cross-section design rather than a true panel; that is, we measure pre-campaign brand recall and in-flight (during the campaign) brand recall for separate sets of consumers. We ensure the comparability of the pre-campaign and in-flight survey groups by employing a nearest-neighbor matching procedure (Abadie & Imbens 2012). This technique allows us to select only those individuals from the pre-campaign sample who exhibit media consumption behavior similar to that of the individuals surveyed during the campaign. Having this pre-campaign measure for an equivalent group gives us a much more accurate baseline to establish the lift of a campaign than assuming random guessing, as is typically done in the industry.
We find substantial differences in the pre-existing levels of brand knowledge both across campaigns and across advertising formats. In particular, respondents who were exposed predominantly to the Internet formats had a lower level of pre-existing brand knowledge than TV viewers. Our analysis further reveals that ignoring the initial conditions results in different conclusions regarding the relative effectiveness of TV versus online formats. Comparing the impact of the three online formats - banner ads, rich media, and video - to commercials aired on TV using the traditional measure of in-flight brand recall, we find TV is superior to the Internet. Upon adjusting for the pre-existing differences in brand knowledge by format, however, we find that Internet ad performance is statistically indistinguishable from TV.

In the next section, we provide some background on the traditional brand-recall measures and define the conditions under which the methodology can be interpreted causally. Section 3 explains the data-collection procedure and provides a description of the variables used in the analysis. We proceed by outlining our empirical strategy in section 4 and then present the findings in section 5. Section 6 concludes with directions for future research.

2 Traditional Recall Methodology

A long-established practice in advertising research is to survey individuals who were exposed to an ad the previous day to determine the extent to which they recall the ad message and the brand, and can link the message to the corresponding brand (Rossiter & Bellman 2005). Although these attitudinal measures only approximate the effect on purchase behavior that advertisers are ultimately interested in, they have gained wide acceptance and usage. In an early study, Wells (1964) compared the different recognition, recall and rating scales employed in practice and concluded that recall scores, which reflect the advertisement's ability to register the sponsor name and to deliver a meaningful message to the consumer, are particularly trustworthy. More recently, Krishnan & Chakravarti (1999) review existing memory tests for assessing advertising effectiveness and underscore their value across a wide range of advertising objectives.

The ad message could inform the viewer about the existence or functional attributes of the brand, or establish non-functional brand associations. It produces memory traces about brand-specific and message-specific information, about the product category, evaluative reactions, and brand identification (Hutchinson & Moore 1984). For the message to have an effect, the consumer needs to know which brand is being advertised. Empirical studies have shown that this is a nontrivial task, as only about 40% of consumers who have viewed a commercial recall the sponsor of the message (Franzen 1994). Establishing brand-message links therefore is a critical input to brand building. In the present research, we focus specifically on the question of whether a respondent prompted with a description of the ad can recall the brand.

Message-recall studies have a few reasons for selecting individuals who viewed the ad the day before. First, the goal is to assess the creative's ability to link the brand and message, and not necessarily to assess the quality of the message itself. For example, one could imagine assessing recall several days after the individuals view the ad to see whether the ad sticks. This measure, however, says more about the memorability of the message than the creative's ability to help the brand cut through and get viewers' attention. Second, exposure is traditionally inferred based on self-reported viewing of a TV program during which a commercial was aired (opportunity to see). That is, respondents have been required to recall their program viewership to establish exposure. Doing so for a longer period of time can result in too much error. Passive measurement of exposure, for example, through a meter installed on the TV set or through other tracking devices, can resolve this problem, but such measurement is not available at a large enough scale for all advertising.

To describe the traditional methodology, we begin by introducing some notation. Let $Y_s$ be an indicator for whether respondent $s$ can correctly select the brand from a multiple-choice list after being prompted with a description of the advertising message.
$X_s$ is an indicator for whether respondent $s$ was exposed to the ad. Finally, let $y_{s0}$ be a probabilistic assessment of how well respondent $s$ could have guessed the associated brand before the ad was run. Then the traditional recall methodology defines $\tau = E[Y - y_0 \mid X = 1]$ to be the expected lift among the exposed population in linking the brand with the message. The estimator for $\tau$ is

$$\hat{\tau} = \sum_{\{s \mid X_s = 1\}} (Y_s - y_{s0})\, w_s, \tag{1}$$

where the summation conditions on respondents who have been exposed to the ad and $w_s$ weights respondents based on how representative they are of the entire population exposed to the ad.[2] If respondents are randomly drawn from the exposed population, there is no need for the weight $w_s$. This term arises here because of selection issues market research organizations have in recruiting their panels. The baseline $y_{s0}$ captures consumers' past interactions with the brand and provides a measure of the extent to which its advertising has established an association between brand and message. One might expect successful brands to already have a reasonably high baseline association with the message because the message is probably related to associations they have previously communicated. On the other hand, a new brand may have no pre-existing associations that could be tied to the message, resulting in a small baseline. In practice, $y_{s0}$ is typically a predetermined constant, such as the success rate of guessing at random, that is the same for all respondents. Obtaining a more accurate measure of the baseline $y_{s0}$ by establishing an initial condition for the campaign is important both for the lift measurement above and for providing the advertiser with information about how well past campaigns have imprinted a brand image. Furthermore, if $\tau$ is to be compared across ad formats, recognizing that individuals exposed to different formats may systematically differ in their level of pre-campaign associations is critical.

[2] The canonical message-recall measure focuses on assessing a single airing of an ad. Yet practitioners often group together multiple ads in a day as well as ads aired across multiple weeks of the campaign. Nevertheless, the estimator in equation (1) is still applied, but the meaning may change because responses for ads later in the campaign could involve more campaign exposures than those responses for ads earlier in the campaign. Practitioners have attempted to account for multiple campaign exposures by considering the build and/or decay in the brand associations throughout the campaign. With enough surveys, one could repeat the above analysis at each point in time, but more often the researcher tries to estimate how the responses vary with how far along the campaign is in terms of either time or total exposures.

The traditional recall methodology can be characterized as trying to measure a treatment effect on the treated population (Heckman, Ichimura & Todd 1997, Imbens 2004). This terminology arises from the focus on only measuring the effect for those individuals who were exposed, that is, conditioning on $X = 1$ in $\tau = E[Y - y_0 \mid X = 1]$. The primary challenge to a causal interpretation of recall studies is the establishment of a control condition. Because the same individual cannot be simultaneously exposed and unexposed, measuring $y_{s0}$ for a respondent who is exposed is typically impossible. To clarify the problem, we separate the lift measure, $\tau = E[Y - y_0 \mid X = 1]$, into two separate expectations: $\tau = E[Y \mid X = 1] - E[y_0 \mid X = 1]$. The first component of this expression, the probability of correctly identifying the brand if an individual was exposed to the ad campaign, $E[Y \mid X = 1]$, can be easily obtained from observed recall and exposure data. The latter component, $E[y_0 \mid X = 1]$, requires assessing the control outcomes, that is, assessing whether a respondent would have correctly linked the creative to the brand without seeing the ad. This measurement requires experimentation and presents a particular challenge for media such as television. In section 4 we propose a method to obtain this measure by augmenting the traditional methodology described above with a pre-campaign survey. Before we proceed, we first introduce the data set in the next section.
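To make equation (1) concrete, the following is a minimal Python sketch of the weighted lift estimator among exposed respondents. The function name `weighted_lift`, the toy responses, the made-up pre-campaign baseline of 0.40, and the choice to rescale the weights so they sum to one are all illustrative assumptions rather than part of the study.

```python
from typing import Sequence


def weighted_lift(recalled: Sequence[float],
                  baseline: Sequence[float],
                  weights: Sequence[float]) -> float:
    """Weighted average of (Y_s - y_s0) over exposed respondents.

    Assumption: weights are rescaled to sum to one, so equal weights
    reduce to a simple average.
    """
    if not (len(recalled) == len(baseline) == len(weights)):
        raise ValueError("inputs must have the same length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * (y - b)
               for y, b, w in zip(recalled, baseline, weights)) / total


# Hypothetical exposed respondents (X_s = 1); 1 means the brand was linked correctly.
recalled = [1, 0, 1, 1]
weights = [1.0, 1.0, 1.0, 1.0]

# Industry convention: random guessing among four brands, y_s0 = 0.25.
lift_constant = weighted_lift(recalled, [0.25] * 4, weights)

# Alternative: a measured pre-campaign baseline (made-up value of 0.40).
lift_measured = weighted_lift(recalled, [0.40] * 4, weights)

print(round(lift_constant, 2), round(lift_measured, 2))
```

With a recall rate of 0.75 in this toy sample, the constant baseline gives a lift of 0.50 while the higher measured baseline gives 0.35, which illustrates how the choice of baseline alone changes the estimated lift.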

3 Data

3.1 Data Collection and Variables

The data-collection effort employs Nielsen's TV Brand Effect panel. This panel consists of a large number of participants who reveal their advertising exposures across Internet and television formats and answer creative and brand-recall survey questions on rewardtv.com. The panel consists of more than six million registered members, with a weekly average of 26,000 participants. On average, a panelist would visit the rewardtv.com site 1.5 times a week and take 1.7 surveys per visit. Approximately 83% of panelists are new each month. Because we rarely observe the same individual for longer stretches of time, the panel is best considered a repeated cross section, which limits our ability to make before-after comparisons of the same individual. However, repeatedly asking a respondent about the same ad and set of brands may lead to conditioning effects (testing bias), so not having a long time series is not necessarily a negative feature in this setting.

Nielsen recruits panelists across various Internet portals and sites and through word of mouth. To maximize daily participation, the site provides a lot of entertainment content, along with sweepstakes, auctions, and discounts. The incentives are soft, though, thus ensuring a high turnover and minimizing the potential for conditioning effects. Nielsen conducts periodic checks to ensure the panelists exhibit the same TV viewing and Internet usage behavior as other Nielsen panelists, and uses weights to ensure the representativeness of each surveyed individual. The first survey for new panelists is eliminated to allow for a training/experimentation period, and any abnormal participation and response patterns are carefully examined.

In addition to TV commercials, we investigate three online formats - banner ads, rich media, and video. Banner ads are ads with or without animation with which the user cannot interact. Examples include overlays on video content, companion banners, wallpapers, and skyscrapers. Video is any streaming video, pre-roll, post-roll, or in-roll. Rich media are any ads with which the mouse can interact without necessarily activating a click-through, such as expandable ads, interactive game ads, and corner peels.

To record online ad exposure, online ad creatives are tagged and then linked to the panelists via cookies on their computers. Provided cookies are not erased and the user does not change computers, Internet exposures are complete for the duration of the campaign irrespective of an individual logging in on rewardtv.com. Television exposure is inferred when a respondent logs in to rewardtv.com and states that on the preceding day, she watched a program that is known to have run an advertisement from the campaign (opportunity to see).
For TV exposures, we thus do not observe exposures an individual may have had prior to logging onto rewardtv.com.

When an individual logging in is identified as having been exposed to an ad, she is presented with a description of a scene from a commercial (for an example, see Table 1). This description often comes in the form of a question assessing whether the respondent can recall the creative. Next, the respondent is asked to indicate which of four listed brands the commercial was for.
Table 1: Example for creative-recall and brand-recall questions

In a commercial during this show, who spoke directly to the camera and said, "I just bought stock you just saw me buy stock," as he sat at a computer keyboard?
    Well-spoken baby who eventually spat up all over the place
    Monkey wearing a custom-tailored suit and a fine silk tie
    Simple peasant from the past who came from a rural village
    Alien from outer space who did not speak earth language

What was this a commercial for?
    E*Trade
    TD Ameritrade
    Scottrade
    Charles Schwab

Questions are asked in the same way for all formats. Brand recall, however, is only measured conditional on creative recall in the case of TV, as opposed to rich media, video, and banner ads, where all responses are recorded. To keep the data comparable, we retain individuals who answered the creative-recall question correctly for all formats. The sample sizes by format and campaign are reported in Table 2.

We collected data for 20 advertising campaigns run in 2011 across several industries: telecom, food and beverage, beauty, financial services, and pharmaceuticals. For confidentiality reasons, we cannot share the brand names that were advertised, but Table 3 gives some information about each campaign and the brand advertised. We see that the campaigns vary considerably in terms of duration, with the shortest campaign being four weeks and the longest, 36 weeks.

Table 2: Sample sizes for in-flight sample (first number) and unmatched pre-campaign sample (second number) by survey question format and campaign.

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner 307/1723 225/2313 157/3562 78/1199 146/826 468/1023 78/1225 189/1400 258/2135 820/1980 53/1277 73/1271 1386/1570 53/1723

rich media 93/3419 36/2339

90/1031

245/1320 467/2123

36/3419

TV 1893/1729 3909/2327 334/919 721/3546 2338/1199 239/804 366/807 83/1068 2518/3348 2269/5966 959/1255 84/1409 1955/2935 131/2123 3875/3396 407/1083 75/957 352/964 380/1254 87/971 2426/1602 1108/438 57/1425 3658/1453 2118/2241 648/1729

video 92/1695 141/2312
Table 3: Duration of advertising campaigns, penetration of advertised brand in its respective product category, and share of TV GRPs for the four quarters prior to current campaign.

campaign   TV weeks   online weeks   penetration   TV GRP share
1          12         15             0.33          0.48
2          8          25             0.33          0.47
3          8          8              0.08          0.36
4          8          8              0.15          0.00
5          10         10             new           0.16
6          19         19             0.02          0.32
7          32         32             0.12          0.22
8          36         36             0.19          0.55
9          12         12             0.21          1.00
10         27         27             0.17          0.45
11         12         30             new           0.00
12         12         30             0.03          0.68
13         4          4              0.19          0.15
14         6          6              0.35          0.30
15         4          4              0.01          0.43
16         14         14             0.23          0.44
17         8          8              0.12          0.16
18         14         14             0.12          0.08
19         7          7              0.36          0.31
20         11         11             0.36          0.27

The percentage of US households who are buying a certain CPG brand or using a service (penetration of the brand) varies widely across campaigns: we have a new brand (campaign 11), a new line extension (campaign 5), along with several category leaders with a high penetration of more than 30% (campaigns 1, 2, 14, 19 and 20). The level of advertising in the four quarters prior to the current campaign also exhibits substantive variation: from non-existent (campaigns 4 and 11) to 100% of the TV GRPs in the category for campaign 9.

3.2 Recall Measures: In-Flight Sample

The brand-recall analysis as described in section 2 consists only of respondents' correct or incorrect associations of the brand with the message. To collect these data, Nielsen deploys surveys while an ad campaign is running. When individuals report they have viewed a TV program that aired a commercial for the focal campaign or when they have visited a web page featuring an online ad, they are presented with the brand-recall question. Table 4 displays the average of the responses from the in-flight survey by campaign and format. Because these estimates do not include an adjustment for a baseline response, they are calculated as in equation (1), except that $y_{s0}$ is set to zero:

$$\sum_{\{s \mid X_s = 1\}} Y_s\, w_s.$$

Table 4: Percentage of correct linkages of brand and creative across formats and campaigns for all individuals surveyed in-flight. Standard deviations are reported in parentheses.

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner rich media video TV 0.40 (0.49) 0.30 (0.46) 0.39 (0.49) 0.40 (0.49) 0.39 (0.49) 0.24 (0.43) 0.41 (0.49) 0.37 (0.48) 0.53 (0.50) 0.37 (0.48) 0.35 (0.48) 0.15 (0.35) 0.31 (0.46) 0.44 (0.50) 0.43 (0.50) 0.41 (0.49) 0.35 (0.48) 0.34 (0.48) 0.44 (0.50) 0.42 (0.49) 0.79 (0.41) 0.85 (0.36) 0.78 (0.42) 0.51 (0.50) 0.50 (0.50) 0.80 (0.40) 0.68 (0.47) 0.36 (0.48) 0.36 (0.48) 0.59 (0.49) 0.49 (0.50) 0.58 (0.49) 0.48 (0.50) 0.38 (0.49) 0.18 (0.39) 0.48 (0.50) 0.84 (0.37) 0.34 (0.47) 0.60 (0.49) 0.48 (0.50) 0.55 (0.50) 0.46 (0.50) 0.55 (0.50) 0.55 (0.50) 0.53 (0.50) 0.39 (0.49) 0.49 (0.51) 0.38 (0.49)
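Because each response is a 0/1 indicator, the standard deviations in parentheses follow almost mechanically from the means. The short Python check below illustrates this under the assumption of unweighted responses; `bernoulli_sd` is a hypothetical helper, and the reported figures may reflect survey weights, so small discrepancies are possible.

```python
import math


def bernoulli_sd(p: float) -> float:
    """Standard deviation of a 0/1 indicator whose mean is p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p * (1.0 - p))


# Three (mean, sd) pairs as reported in Table 4.
reported = [(0.40, 0.49), (0.24, 0.43), (0.85, 0.36)]

for mean, sd in reported:
    # Compare the implied standard deviation with the reported one.
    print(f"mean={mean:.2f} implied sd={bernoulli_sd(mean):.2f} reported sd={sd:.2f}")
```

For these three pairs the implied and reported values agree to two decimals, so the parenthesized figures describe the spread of individual binary responses and are not standard errors of the averages.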

Looking at the average brand recall rates in Table 4, we see many substantial brand-message links. There is also substantial variation both across formats and campaigns. Although the numbers in the table cannot be directly interpreted as a lift measure because the baseline has not been removed, we can subtract the one traditionally used in practice, $y_{s0} = 0.25$, from the reported numbers to get an estimate of the lift. It is notable that although many campaigns have a positive lift, quite a few format-campaign combinations (e.g., banners in campaign 5, rich media in campaign 2, and TV in campaign 13) are below the baseline of 0.25. These numbers could be indicative of a poor campaign that broke previously established brand-message links, or, as we will explore with our initial-conditions methodology, of cases in which the baseline should actually be lower.

To formally assess the differences between recall rates for Internet formats and television, we aggregate across campaigns. Table 5 reports the results of comparing the average recall rates for campaigns that used Internet formats to the recall rates for TV for these campaigns. For campaigns that ran some banner ads, the average brand-message recall of banners is 0.45, whereas it is 0.50 for TV ads, with the difference having a p-value of 0.01. Similarly, among the campaigns running rich media, the recall is 0.37 for rich media, but significantly greater at 0.46 for TV. The video ads' recall is significantly greater than TV (0.50 versus 0.44) in those campaigns airing some video ads. Based on these data, we might therefore conclude that TV outperforms banner ads and rich media in terms of brand recall, whereas video outperforms TV.
Table 5: Comparison of average recall rates for Internet formats vs. TV across campaigns in in-flight sample. Campaigns that do not use a given online format were excluded.

format       avg. recall   t-stat   p-value
banner       0.45          -3.24    0.01
TV           0.50
rich media   0.37          -3.92    0.00
TV           0.46
video        0.50          2.88     0.00
TV           0.44

3.3 Recall Measures: Pre-Campaign Sample

For this research project, we augmented the in-flight data collection with a set of surveys, which were deployed before the advertising campaign was run, to account for pre-existing differences in respondents' abilities to link the brand and message. As we describe in section 4, these pre-campaign surveys can be used to measure the lift the ad campaign provides relative to an initial condition more accurately than by simply assuming that, absent advertising, consumers would randomly guess.
Table 6: Percentage of correct linkages of brand and creative across formats and campaigns in pre-campaign survey sample. Standard deviations are reported in parentheses.

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner rich media video TV 0.38 (0.48) 0.39 (0.49) 0.39 (0.49) 0.37 (0.48) 0.35 (0.48) 0.35 (0.48) 0.38 (0.48) 0.36 (0.48) 0.49 (0.50) 0.27 (0.44) 0.29 (0.45) 0.18 (0.39) 0.17 (0.37) 0.26 (0.44) 0.25 (0.44) 0.27 (0.45) 0.48 (0.50) 0.43 (0.49) 0.47 (0.50) 0.53 (0.50) 0.54 (0.50) 0.41 (0.49) 0.50 (0.50) 0.45 (0.50) 0.41 (0.49) 0.47 (0.50) 0.47 (0.50) 0.10 (0.30) 0.09 (0.29) 0.10 (0.31) 0.08 (0.28) 0.43 (0.50) 0.40 (0.49) 0.16 (0.37) 0.17 (0.38) 0.31 (0.46) 0.34 (0.47) 0.23 (0.42) 0.30 (0.46) 0.34 (0.47) 0.33 (0.47) 0.17 (0.38) 0.18 (0.39) 0.19 (0.39) 0.27 (0.44) 0.24 (0.43) 0.23 (0.42) 0.26 (0.44)

Preliminary examination of the average pre-campaign brand-recall rates in Table 6 reveals that the recall rates vary substantially across campaigns and that large deviations from a random guess rate of $y_{s0} = 0.25$ are present. As expected, the correct linkages for the new products (campaigns 5 and 11) are quite low. In line with our intuition, the pre-existing brand knowledge for campaign 5, which is a line extension, is somewhat higher than the entirely new brand in campaign 11. Campaign 18, which has a low share of TV GRPs (8%), is also characterized by a low level of creative-brand association. By contrast, campaigns with a relatively high penetration and share of TV GRPs have higher creative-brand associations. We do not have enough data to fully document a relationship between the campaign characteristics and the probability of correctly linking a creative to a brand, but sufficient evidence exists to suggest that subsequent analyses should account for, and possibly attempt to explain, the presence of systematic variation.
Table 7: Comparison of average recall rates for Internet formats vs. TV across campaigns in pre-campaign survey sample. Campaigns that do not use a given online format were excluded.

format       avg. recall   t-stat   p-value
banner       0.31          -3.82    0.01
TV           0.33
rich media   0.32          -4.78    0.00
TV           0.35
video        0.30          -0.93    0.39
TV           0.31

One notable difference in the pre-campaign recall rates reported in Table 6, relative to the in-flight recall rates in Table 4, is that much less variation is present across formats. This lack of variation is to be expected because the differences across formats in Table 6 are only in the question asked, not in the respondents' past or future exposure to a given format (the questions were asked before the campaign had begun, so the respondents could not have been exposed to the ad). Table 7 reports a direct comparison of average recall for each Internet format to that for TV. Both banners and rich media perform slightly worse relative to TV (a difference of -0.02 for banners and -0.03 for rich media), whereas video is statistically indistinguishable from TV.

3.4 Comparison of In-Flight and Pre-Campaign Samples

For the summary statistics of the pre-campaign sample to be considered a valid baseline to calculate the lift of a campaign, we need to ensure the respondents included in the pre-campaign sample are comparable to the ones surveyed during the campaign. This may not be the case, however, for a number of reasons. First, we are only interested in the effect of the campaign on the exposed individuals, and therefore respondents who are not exposed should receive a weight of zero in our analysis. Survey respondents in the pre-campaign sample by definition have not been exposed to the ad at the time they are surveyed, but their subsequent exposures (if any) have been recorded. We can therefore examine their various media exposures to verify they saw the commercial in the focal campaign and format. Table 8 reports the percentage of the pre-campaign sample eventually exposed to an ad in the focal format and campaign. Although this percentage is quite high for TV, many pre-campaign respondents in the Internet formats were never exposed to the campaign. By contrast, all in-flight respondents have by definition been exposed. To make the samples comparable, we thus need to focus only on individuals who were eventually exposed ($X_s = 1$).
Table 8: Percentage of pre-campaign sample who are eventually exposed to focal format (i.e., were asked about the respective format).

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner 0.70 0.63 0.59 0.53 0.69 0.70 0.63 0.53 0.50 0.76 0.78 0.50 0.84 0.67

rich media 0.47 0.41

video 0.54 0.49

0.48

0.46 0.40

0.64 0.73

0.41 0.46 0.60

0.24 0.64

TV 0.83 0.88 0.96 0.94 0.94 0.87 0.85 0.98 0.94 0.89 0.82 0.86 0.91 0.91 0.98 0.98 1.00 0.84 0.99 0.94
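The restriction to eventually exposed pre-campaign respondents ($X_s = 1$) can be illustrated with a minimal Python sketch. The record layout (`PreCampaignSurvey`, `format_asked`, `eventual_exposures`) is hypothetical and chosen only for illustration; it is not the structure of the study's data.

```python
from dataclasses import dataclass


@dataclass
class PreCampaignSurvey:
    # All field names are illustrative assumptions, not the study's schema.
    respondent_id: str
    format_asked: str          # format the survey question referred to
    eventual_exposures: dict   # format -> exposures recorded after the survey


def eventually_exposed(surveys, focal_format):
    """Keep surveys about the focal format whose respondent later saw it (X_s = 1)."""
    return [
        s for s in surveys
        if s.format_asked == focal_format
        and s.eventual_exposures.get(focal_format, 0) > 0
    ]


def share_exposed(surveys, focal_format):
    """Fraction of surveys about the focal format with X_s = 1 (a Table 8 style figure)."""
    asked = [s for s in surveys if s.format_asked == focal_format]
    if not asked:
        return float("nan")
    return len(eventually_exposed(surveys, focal_format)) / len(asked)
```

Respondents who fail the filter simply drop out of the baseline, which corresponds to giving them a weight of zero.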


A second issue is the extent to which the exposed pre-campaign sample and the in-flight sample are similar in terms of exposures to the different advertising formats. As can be seen from the averages for both groups reported in Table 9, even those respondents who were eventually exposed to an ad in the focal campaign have a different rate of exposure than the respondents included in the in-flight sample. In general, those in the pre-campaign group have a much higher exposure to TV relative to the in-flight group.

One possible explanation for these differences in media exposures is the different sampling time frames. Whereas the in-flight surveys were collected for the entire duration of the campaign (anywhere between 4 and 36 weeks), the pre-campaign measures were typically collected within a week. To obtain the necessary sample size to ensure we would have an adequate group of individuals who are eventually exposed to the focal campaign, the selection of respondents had to be much more aggressive, thus yielding a potentially different sample. For example, the high TV exposures among the pre-campaign sample could be attributed to a greater number of professional survey takers that might have overstated TV exposure rates in order to earn more points on rewardtv.com. Using a matching methodology, we remove these outliers and create a sample comparable to the in-flight group.

Table 9: Average number of exposures to different ad formats by campaign. Comparison of pre-campaign (left column) and in-flight (right column) samples.

banner pre-camp. in-flight 5.61 4.74 3.79 3.08 6.27 3.41 3.72 5.6 3.4 2.82

rich media pre-camp. in-flight 3.22 2.94 2.08 1.68

video pre-camp. in-flight 2.45 1.98 2.87 2.04

4.89 3.19 5.7 6.58

19 3.68 9.47 8.78 5.55 5.48 4.39 4.25 3.02 1.84 7.76 5.22 8.63 6.77 4.8 6.46 4.49

5.79 7.26 3.28 4.18

3.44 6.81 1 1.85 3.65 3.57 4.5

3.82 4.03 2.57 1.67 3.48 4.38

2.31

2.33

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign 3.23

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

9.91

6.12

TV pre-camp. 10.75 16.67 21.96 7.34 13.91 4.29 11.51 1.56 11.45 17.16 10.65 4.23 4.33 2.95 1.85 11.54 5 22.89 2.17 11.31

in-flight 2.87 3.94 13.7 1.48 6.23 2.31 6.41 11.49 2.65 2.51 2.5 1.46 2.56 2.7 1.23 8.55 2.2 15.81 12.02 9.42

4 Matching Methodology

Given our pre-campaign survey data, we conceptualize lift as $E[Y_{s1} - Y_{s0} \mid X = 1]$, where $Y_{s1}$ indicates correct association of the message and brand during the campaign by respondent $s$ and $Y_{s0}$ indicates correct association before the campaign (see footnote 3). Numbering the surveys before the campaign as $\{1, \ldots, S_0\}$ and those during the campaign as $\{S_0 + 1, \ldots, S_1 + S_0\}$, we would ideally measure the lift as

$$\sum_{\{s \mid s > S_0,\, X_s = 1\}} Y_{s1} w_{s1} \;-\; \sum_{\{s \mid s \le S_0,\, X_s = 1\}} Y_{s0} w_{s0}, \tag{2}$$

where the weights $w_{s1}$ and $w_{s0}$ ensure the surveyed in-flight and pre-campaign individuals are representative of the population of exposed individuals. We cannot, however, estimate the above equation because we do not observe $w_{s0}$; that is, the weights are only calculated for the individuals surveyed during the campaign. Furthermore, the analysis and discussion in section 3 indicate the pre-campaign group is systematically different from the in-flight group for which we observe the weights. To prune the non-representative pre-campaign respondents, we employ a matching procedure that restricts the analysis to each in-flight survey and its nearest neighbor from the pre-campaign group.
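To make the ideal measure in equation (2) concrete, the following Python sketch computes it for toy data. It assumes, purely for illustration, that pre-campaign weights were available (in the study they are not observed) and that the weights of the exposed respondents sum to one within each group; the tuple layout is hypothetical.

```python
def ideal_lift(in_flight, pre_campaign):
    """Equation (2): weighted in-flight recall minus weighted pre-campaign recall.

    Each argument is a list of (y, w, x) tuples, where y is 1 for a correct
    brand association, w is the respondent weight, and x is 1 if the
    respondent is (eventually) exposed. Unexposed respondents contribute zero.
    """
    during = sum(y * w for y, w, x in in_flight if x == 1)
    before = sum(y * w for y, w, x in pre_campaign if x == 1)
    return during - before


# Toy data: two exposed in-flight respondents and three pre-campaign
# respondents, one of whom is never exposed and therefore ignored.
example = ideal_lift(
    in_flight=[(1, 0.5, 1), (0, 0.5, 1)],
    pre_campaign=[(1, 0.25, 1), (0, 0.75, 1), (1, 1.0, 0)],
)
```

Here the weighted recall is 0.5 during the campaign and 0.25 before it, so the sketch returns a lift of 0.25.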

4.1 The Matching Estimator

We match each in-flight respondent $s$ surveyed during the campaign with a set $M_s$ of pre-survey respondents based on a set of variables $Z_s$ that we describe below. Then we estimate the lift as

$$\sum_{\{s \mid s > S_0,\, X_s = 1\}} \; \sum_{m \in M_s} \left(Y_{s1} - Y_{m0}\right) \frac{w_{s1}}{|M_s|}. \tag{3}$$

In the above expression, $M_s$ is the set of pre-campaign respondents that are matched to in-flight respondent $s$, and $Y_{m0}$ indicates whether the $m$th matched pre-survey respondent correctly recalled the brand. We divide by the number of matched respondents, $|M_s|$, such that the total weight for each in-flight respondent $s$ is equal to that respondent's reported weight, $w_{s1}$. The assumption underlying this estimator is

$$E[Y_0 \mid s > S_0, Z] = E[Y_0 \mid s \le S_0, Z]. \tag{4}$$

In words, we assume that conditional on the matching variables, $Z_s$, the expected response to the pre-campaign survey is invariant to whether the individual was surveyed before or after the campaign began. The assumption therefore guarantees our estimator removes any systematic sampling differences between the pre-campaign and in-flight groups.

Footnote 3: Our notation for $s$ equates surveys and respondents. Given the sampling approach described in section 3, a given respondent could potentially fill out multiple surveys in the repeated cross section. We currently cannot separate such cases to treat them specially.
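The matching estimator in equation (3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the study: the dictionary keys (`id`, `y`, `w`) and the container shapes are hypothetical, and the in-flight list is assumed to contain only exposed respondents.

```python
def matched_lift(in_flight, matches, pre_recall):
    """Equation (3): weight-preserving difference between in-flight and matched recall.

    in_flight  : list of dicts with keys "id", "y" (0/1 recall), "w" (weight)
    matches    : dict mapping an in-flight id to its list of matched
                 pre-campaign ids (the set M_s); every list must be non-empty
    pre_recall : dict mapping a pre-campaign id to its 0/1 recall
    """
    total = 0.0
    for r in in_flight:
        m_s = matches[r["id"]]
        for m in m_s:
            # Each match receives an equal share of the respondent's weight,
            # so the shares add up to w for every in-flight respondent.
            total += (r["y"] - pre_recall[m]) * r["w"] / len(m_s)
    return total


example = matched_lift(
    in_flight=[{"id": "a", "y": 1, "w": 0.6}, {"id": "b", "y": 0, "w": 0.4}],
    matches={"a": ["p1", "p2"], "b": ["p3"]},
    pre_recall={"p1": 1, "p2": 0, "p3": 0},
)
```

In the toy call, respondent "a" has two tied matches that each carry half of the weight 0.6, so only the match with no prior recall contributes, and the result is 0.3.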

4.2 Matching Variables

Given that our goal is to compare advertising effectiveness across various media formats, we decided to focus on media consumption as the most relevant descriptor of the surveyed individuals. The matching variables $Z_s$ include the total number of campaign exposures for each of the three Internet formats, as well as the total number of TV exposures across all campaigns in our data.

The Internet formats provide valuable match variables because they are passively observed and thus do not suffer from self-reporting issues. Furthermore, they are highly reflective of the type of individual. Specifically, exposure to the campaign's advertisements signifies the individual is in the campaign's target, and the number of exposures provides a measure of the intensity of viewership of the targeted medium.

We do not match on television exposures within the campaign because they are not passively observed. A pre-campaign respondent could have been exposed to TV even if we do not observe TV exposure. However, we include the total television exposures across all campaigns so that we are matching on a measure of television viewership intensity. The number of total television exposures also helps us separate out individuals that might take many surveys, because reported television exposures give the respondent the opportunity to take more surveys. Such individuals are down-weighted in Nielsen's estimate of each in-flight respondent's weight, but we need to match on this characteristic to ensure similar down-weighting of pre-campaign respondents who might have reported many exposures.

We use a nearest-neighbor matching approach (Abadie, Drukker, Herr & Imbens 2004, Abadie & Imbens 2012) in which we find at least one pre-campaign survey to match to each in-flight survey. As Abadie & Imbens (2012) show, allowing individual observations to be used as a match more than once lowers the bias of the estimates. We seek exact matches on the campaign's passively observed Internet exposures and allow the overall television exposures to sort among ties in terms of shortest distance. If ties are still present, we include all tied matches, which accounts for $|M_s|$ in equation (3) being greater than 1 for in-flight survey $s$. As per equation (3), we include the additional matches based on their share of the total matches to $s$. If no exact match exists, we find the nearest neighbor in terms of the distance between the vector $Z_s$ for the in-flight survey and the corresponding vector for the pre-campaign survey.

Our procedure worked well. For the TV format question, we are able to match exactly 96% of the in-flight respondents on the passively observed Internet exposures. For banners the percentage is 84%, followed by video at 75% and rich media at 69%. Note that we exclude any pre-campaign respondents that do not match in-flight respondents, because our in-flight respondent weights sum to form the true distribution of exposed individuals.
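The matching rule described above (exact match on the Internet exposures, total TV exposures as tie-breaker, all remaining ties kept, nearest neighbor as fallback) might be sketched as follows. The vector layout `(banner, rich media, video, total TV)` and the use of Euclidean distance in the fallback are assumptions made for illustration; the text does not state the distance metric.

```python
import math


def find_matches(z_in, pre_pool):
    """Return (sorted ids of matched pre-campaign surveys, exact-match flag).

    z_in     : (banner, rich_media, video, tv_total) for one in-flight survey
    pre_pool : dict mapping a pre-campaign id to a tuple of the same layout
    """
    if not pre_pool:
        raise ValueError("pre-campaign pool is empty")

    # Step 1: exact match on the three passively observed Internet exposures.
    exact = {pid: z for pid, z in pre_pool.items() if z[:3] == z_in[:3]}
    if exact:
        # Step 2: total TV exposures break ties by shortest distance.
        dist = {pid: abs(z[3] - z_in[3]) for pid, z in exact.items()}
    else:
        # Fallback: nearest neighbor on the full vector (Euclidean, assumed).
        dist = {pid: math.dist(z, z_in) for pid, z in pre_pool.items()}

    # Step 3: keep every survey tied at the minimum distance (the set M_s).
    best = min(dist.values())
    return sorted(pid for pid, d in dist.items() if d == best), bool(exact)


pool = {"p1": (2, 0, 1, 10), "p2": (2, 0, 1, 14), "p3": (2, 0, 1, 6), "p4": (3, 0, 1, 8)}
tied = find_matches((2, 0, 1, 8), pool)       # exact match with a TV-distance tie
fallback = find_matches((5, 0, 0, 8), pool)   # no exact match available
```

Because a pre-campaign survey can be returned for several in-flight surveys, the sketch matches with replacement, in line with the approach the text attributes to Abadie & Imbens (2012).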

4.3 Causal Interpretation

Because pre-campaign surveys are conducted well before most of the in-flight surveys (given that some campaigns last 4-5 months), time-varying unobservables could make a causal interpretation difficult. Moreover, although matching ensures pre-campaign and in-flight respondents are comparable in terms of ad exposure and media consumption over the entire time frame of the data, it cannot make up for the time gap between the two surveys. Because our goal is to compare exposures across different media formats, our primary concern arises from time-varying unobservables that differ based on the media format to which a respondent is exposed.

One source of time-varying unobservables we know exists is unobserved television exposures. Due to the inability to passively measure television exposures, we only observe a subset of the actual exposures to TV ads. However, in trying to assess whether Internet formats can build brands comparably to television, unobserved television exposures would likely overstate television effects relative to Internet effects. This overstatement is likely to occur because we should expect individuals exposed to television to watch more television on average than individuals exposed to the Internet, giving television-exposed individuals relatively more unobserved exposures to the ad campaign.

Other sources of time-varying unobservables include non-advertising marketing activity by the firm or its competitors. For example, in-store displays do not include messages that would increase association of the message with a brand but could increase the salience of the brand in the mind of the customer and therefore increase the chance that the focal brand is chosen when respondents guess at random. We have no a priori reason to believe television- or Internet-intensive media consumers should see a firm's non-advertising marketing activity at a systematically higher or lower rate. Competitors are likely to target their marketing activity at the same targets as those chosen by the focal brand, and these competitive actions could lead to systematically higher or lower levels of associations as the time since the pre-campaign survey increases. If competitive advertising creates biases in favor of one format over another, we should expect these biases to be increasing with time since the pre-survey. We therefore consider our effects separately for different progressions of our campaigns, measured as the number of previous exposures respondents have to the campaign.

5 Findings

We discuss the results from the above matching procedure in the context of two separate yet related research questions. First, by examining the pre-campaign brand recall of the exposed population, we can evaluate whether past advertising or brand experiences have led to a divergence in brand associations between Internet- and television-intensive targets. We find that banner- and rich-media-intensive targets have systematically lower levels of brand recall, which suggests either past advertising was insufficient or less effective for Internet media. Second, the pre-campaign brand-recall measures derived after the matching procedure serve as the baseline in our lift measures. The matched pre-campaign sample allows for a more accurate measure of the campaign lift, and it ensures that the lift is more comparable across formats because it takes into account any pre-existing cross-format differences in brand knowledge.

5.1 Existing Brand Knowledge across Formats

As some consumers have shifted their media consumption away from television toward various online formats, a concern arises as to whether brand-building activities can be transferred easily across formats. Before we examine the effectiveness of various ad platforms, we consider the lasting effects of past campaigns. Specifically, we measure pre-campaign brand knowledge separately by the media format to which a respondent is eventually exposed (and presumably favors). Although our data do not allow us to infer why, for instance, an Internet-exposed individual may have had less pre-campaign knowledge of the brand than a television-exposed individual, two explanations for the difference in baseline brand knowledge are possible: (i) brands may have devoted fewer past exposures to the Internet formats the individual views, or (ii) past Internet exposures had less persistent effects.

We are able to assess pre-campaign associations by exposure format because we observe pre-campaign respondents' eventual exposures to the campaign. Table 10 reports the initial conditions based on the matched pre-campaign surveys. These initial conditions differ from the ones reported in Table 6 in that they reflect the responses for only those individuals who are exposed to the format-campaign combination and are matched to an in-flight respondent.
Table 10: Percent of correct brand associations before each campaign in matched pre-campaign sample.

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner rich media video TV 0.31 (0.46) 0.34 (0.48) 0.32 (0.47) 0.30 (0.46) 0.26 (0.44) 0.24 (0.44) 0.35 (0.48) 0.32 (0.47) 0.32 (0.47) 0.36 (0.48) 0.36 (0.48) 0.24 (0.43) 0.08 (0.28) 0.24 (0.43) 0.23 (0.42) 0.14 (0.35) 0.42 (0.49) 0.38 (0.49) 0.50 (0.50) 0.62 (0.49) 0.59 (0.49) 0.30 (0.46) 0.63 (0.48) 0.53 (0.50) 0.45 (0.50) 0.47 (0.50) 0.66 (0.47) 0.07 (0.25) 0.05 (0.21) 0.12 (0.32) 0.07 (0.25) 0.42 (0.49) 0.42 (0.49) 0.11 (0.32) 0.13 (0.34) 0.32 (0.47) 0.66 (0.47) 0.14 (0.35) 0.22 (0.42) 0.41 (0.49) 0.18 (0.39) 0.16 (0.36) 0.43 (0.50) 0.16 (0.37) 0.20 (0.40) 0.11 (0.32) 0.08 (0.27) 0.16 (0.37)

The primary change in Table 10 relative to Table 6 is that substantial variation in brand recall now exists across formats within a campaign. For example, campaign 7 has a TV baseline of 0.62, but the baseline is 0.50 or less for the three Internet formats. Alternatively, campaign 2 has a high baseline on video and TV at 0.35 and 0.32, respectively, but is close to 0.25 for banners and rich media. Although the campaign-by-campaign measures are illustrative, our focus is on the averages across campaigns and within format, where the aggregated sample sizes allow us more conclusive inference.

Table 11 compares average pre-campaign brand recall for each Internet format to the average pre-campaign brand recall for TV. It also compares the matched estimates with the unmatched estimates. For campaigns running banner ads, we see that the initial condition for those exposed to banner ads dropped to 0.28 with matching, which is significantly lower than the 0.36 for TV. The matched initial condition for rich media, at 0.26, is also significantly lower than the 0.35 for TV. Video is indistinguishable from TV in both the matched and unmatched samples. We suspect video and TV may be similar because many of the video ads were for online viewership of episodes from television series (e.g., through Hulu).

Table 11: Difference across formats in the percentage of correct brand associations in pre-campaign sample. Comparison between matched and unmatched samples. Asterisk denotes a significant difference at the 5% level.

format        unmatched   matched   exact matches
banner        0.31        0.28      84%
TV            0.33        0.36      97%

rich media    0.32        0.26      69%
TV            0.35        0.35      96%

video         0.30        0.32      75%
TV            0.31        0.30      97%

The banner and rich-media differences from TV are worth considering. The fact that the target audience for the online ad campaigns has a lower level of existing brand knowledge than the target audience exposed to TV suggests advertisers' efforts to reach this population have been ineffective thus far. The TV population is more familiar with the brand message and is thus better able to correctly link the commercial to the corresponding brand. This finding could be the result of insufficient or ineffective past advertising to Internet-intensive media viewers.

5.2 Comparison of Advertising Lift across Formats

The metric we use to compare the performance of the different advertising formats is the campaign lift, calculated as the difference in the brand-recall measure between the matched pre-campaign sample and the in-flight sample (see equation (3)). Table 12 reports the lift by campaign and format. We observe a dramatic effect across all formats for the new brand in campaign 12. Similarly, we find large and significant effects for campaign 21 (banners, rich

Table 12: Adjusted lift by campaign and format. Asterisk denotes significance at the 5% level.

campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign campaign

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

banner 0.10 0.13 0.00 -0.09 0.20 -0.08 0.54 -0.02 0.30 0.17 0.15 0.38 0.30 0.28

rich media -0.05 0.00

video 0.07 0.06

-0.04

0.20 -0.05

0.05 0.31

0.33 0.48 0.27

0.12 0.41

TV 0.11 0.05 0.21 -0.01 0.23 0.27 -0.20 0.20 0.15 0.02 0.43 0.06 0.05 0.18 0.20 0.07 0.37 0.39 0.33 0.22


Table 13: Aggregate lift comparison across formats.

format        avg. lift   t-value   p-value
banner        0.17        1.56      0.12
TV            0.14

rich media    0.12        0.38      0.71
TV            0.10

video         0.19        1.60      0.11
TV            0.14

media, and TV), campaign 19 (banners and TV), campaign 15 (banners and TV), campaign 7 (banners and TV), and campaign 6 (TV and video). Some campaigns have a much greater banner lift than TV (e.g., 9 and 16). Rich media provides the highest lift in campaign 20. Video outperforms other formats in campaigns 10 and 13. These differences suggest that, once data on a larger number of campaigns become available, further work is needed to relate campaign characteristics to the effectiveness of the media vehicles.

Comparing the average performance of the advertising formats across the campaigns, we find all Internet formats perform slightly better than TV, with video having the highest relative lift at 0.05, banners at 0.03, and rich media at 0.01. However, the p-values for video and banners are only 0.11 and 0.12. Thus, accounting for the differences in pre-existing brand knowledge by format leads to a different inference regarding the relative performance of TV versus the online advertising formats. By only comparing in-flight recall rates, TV appears to be the most impactful medium, but adjusting for the initial conditions, Internet formats perform just as well or perhaps even better.

One question that arises in comparing in-flight lift across campaigns is whether respondents were exposed the same number of times across campaigns at the time they are surveyed. Table 14 reports the average number of exposures to the surveyed format for each Internet versus TV comparison. Exposure rates among the banner- and rich-media exposed/surveyed are significantly greater than TV exposures for the same campaigns. Recall, however, that not all TV exposures are observed. Video exposures are also greater than TV exposures, though the difference is not statistically significant.

Table 14: Average number of exposures to focal format vs. TV at the time a respondent takes a survey in focal format.

format        exposures   t-value   p-value
banner        2.57        4.91      0.00
TV            1.86

rich media    2.58        3.78      0.00
TV            1.86

video         2.24        1.40      0.16
TV            2.01

Table 15 reports the difference in lift between TV and each Internet format separately by the total number of exposures to the campaign. Once we condition on exposures, we do not see any format performing systematically better. Even for a given exposure level, only one comparison yields a significant result (banner lift 0.23 greater than TV at six exposures). Overall this finding suggests that the ability of Internet exposures to produce lift measures comparable to those of TV is not due to systematically greater numbers of exposures to the campaign.

Table 15: Adjusted lift by number of exposures prior to survey for the pairwise comparison to TV.

                      banner   rich media   video
1 exposure            0.01     -0.04        0.06
2 exposures           0.01     0.07         0.08
3 exposures           0.03     0.03         -0.08
4 exposures           -0.02    -0.03        -0.08
5 exposures           -0.14    -0.18        -0.18
6 exposures           0.23     0.13         -0.07
avg. diff. in lift    0.03     0.01         0.05

Note: Markers denote significance at the 10% level and at the 5% level.
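As a small worked check on the aggregate comparison, the Python sketch below recomputes the gap between each Internet format and TV from the rounded average lifts in Table 13. The dictionary and function names are illustrative only. The rounded averages give a rich-media gap of 0.02, whereas the text reports 0.01; presumably the reported gaps were computed before rounding, but that explanation is an assumption rather than something the text states.

```python
from decimal import Decimal

# (average lift of the Internet format, average TV lift in the same comparison),
# copied from the rounded values in Table 13.
TABLE_13 = {
    "banner": (Decimal("0.17"), Decimal("0.14")),
    "rich media": (Decimal("0.12"), Decimal("0.10")),
    "video": (Decimal("0.19"), Decimal("0.14")),
}


def lift_gaps(table):
    """Internet-format lift minus TV lift for each pairwise comparison."""
    return {fmt: internet - tv for fmt, (internet, tv) in table.items()}


gaps = lift_gaps(TABLE_13)
# gaps: banner 0.03, rich media 0.02, video 0.05 (from rounded inputs)
```

`Decimal` is used instead of floating point so that differences of two-decimal figures stay exact.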

6 Conclusions

In this research, we propose a methodology for establishing a format-specific baseline to assess the lift in brand recall due to an advertising campaign. We supplement the in-flight brand-message surveys with a set of pre-campaign surveys and match the pre-campaign respondents to those eventually exposed to the campaign in order to control for pre-existing brand knowledge. The rich data set we have, tracking the response to TV and Internet advertising for 20 campaigns across a variety of industries, provides us with comparable measures to assess the relative performance of the different advertising formats.

We find a systematically lower level of brand knowledge among individuals who are surveyed about banner and rich media. Without a format-specific baseline, a researcher might therefore draw the wrong conclusion and ascribe too much importance to TV's effect on brand recall. Once the difference in pre-existing knowledge is taken into account, there is no significant difference in the effectiveness of TV and Internet ads in terms of correct brand identification. This result underscores the importance of pre-campaign surveys and our matching methodology for comparing ad performance across media formats.

The goal of our research was to assess the widely held belief that TV outperforms Internet formats as a brand-building platform, and we therefore focused on head-to-head comparisons of TV to the Internet formats. Nevertheless, as advertisers decide how to use these various formats, knowledge of the complementarities between media will be important. Researchers in marketing have long explored the potential synergies in multimedia communications (see, e.g., Naik & Raman (2003) or Dijkstra, Buijtels & van Raaij (2005) for recent examples), but the empirical study of the phenomenon in a field setting is still challenging. Studies that randomly vary TV and Internet pulses across geographic markets may be best suited to disentangle the optimal combination and sequencing of ad formats. This more detailed analysis was not possible in our context, where most advertising was at the national level, so a focus on brands involved in geo-targeted campaigns may be the most promising approach.

Another avenue for future research would be to investigate more formally the link between category characteristics and effectiveness of different types of campaigns. Our study and the existing literature on advertising point to a number of potentially relevant brand and category factors such as the maturity level of the category, the stage of the product life cycle (new introduction versus established brand), and the amount of previous advertising, possibly as share of voice in the category. In addition, the type of consumer decision making in the product category, whether a low-involvement or a high-involvement process, will also likely play a role in determining what media format will be most effective.

Finally, our research can be extended by practitioners to include cost measures in comparing the relative performance across ad formats and guiding the media budget allocation decisions. As of now, online advertising still appears to be more cost effective. We anticipate, though, that once the brand-building potential of Internet formats has been firmly established, the prices for online advertising will increase to reflect their relative performance.


References
Abadie, A., Drukker, D., Herr, J. & Imbens, G. (2004). Implementing matching estimators for average treatment effects in Stata, The Stata Journal 4(3): 290-311.
Abadie, A. & Imbens, G. (2012). Bias-corrected matching estimators of average treatment effects, Journal of Business and Economic Statistics 29(1): 1-11.
Assmus, G., Farley, J. U. & Lehmann, D. (1984). How advertising affects sales: Meta analysis of econometric results, Journal of Marketing Research 21(1): 65-74.
Briggs, R. & Hollis, N. (1997). Advertising on the web: Is there response before clickthrough?, Journal of Advertising Research pp. 33-45.
Dijkstra, M., Buijtels, H. & van Raaij, F. (2005). Separate and joint effects of medium type on consumer responses: A comparison of television, print, and the internet, Journal of Business Research 58(3): 377-386.
Dreze, X. & Hussherr, F.-X. (2003). Internet advertising: Is anybody watching?, Journal of Interactive Marketing 17(4): 8-23.
Franzen, G. (1994). Advertising Effectiveness: Findings from empirical research, NTC Publications, Henley-on-Thames, U.K.
Goldfarb, A. & Tucker, C. (2011). Online advertising, Advances in Computers, Vol. 81, Elsevier.
Heckman, J., Ichimura, H. & Todd, P. (1997). Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme, Review of Economic Studies 64: 605-654.
Hutchinson, W. & Moore, D. (1984). Issues surrounding the examination of delay effects of advertising, in T. Kinnear (ed.), Advances in consumer research, Vol. 11, Provo, UT: Association for Consumer Research, pp. 650-655.
Imbens, G. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review, The Review of Economics and Statistics 86(1): 4-29.
Krishnan, S. & Chakravarti, D. (1999). Memory measures for pretesting advertisements: An integrative conceptual framework and a diagnostic template, Journal of Consumer Psychology 8(1): 1-37.
Lewis, R. & Reiley, D. (2011). Does retail advertising work?, Technical report, Yahoo! Research.
Lodish, L. M., Abraham, M., Kalmenson, S., Livelsberger, J., Lubetkin, B., Richardson, B. & Stevens, M. E. (1995a). How advertising works: A meta-analysis of 389 real world split cable tv advertising experiments, Journal of Marketing Research 32: 125-139.
Lodish, L. M., Abraham, M., Kalmenson, S., Livelsberger, J., Lubetkin, B., Richardson, B. & Stevens, M. E. (1995b). A summary of fifty-five in-market experimental estimates of the long-term effects of advertising, Marketing Science 14(3): G133-G140.
Naik, P. & Raman, K. (2003). Understanding the impact of synergy in multimedia communications, Journal of Marketing Research 40(4): 375-388.
Rossiter, J. & Bellman, S. (2005). Marketing Communications: Theory and Applications, Pearson Education.
Wells, W. (1964). Recognition, recall and rating scales, Journal of Advertising Research 4(3): 2-8.
