As audience-based linear TV campaigns become more common, we are seeing that the default goal of campaign optimization is shifting from maximizing advanced target impressions to maximizing advanced target reach (or at least, to have more control over it). Therefore, it is important to understand what reach optimization really is, why it is a difficult computational problem, and what approaches can be taken to approximate a solution in a timely fashion. In this post, I focus on the problem itself. Later on, I will focus on solution approaches.

The way we approach problems in life is informed by our past experiences. In my case, my take on the reach optimization problem starts with what happened at the beginning of the 2013-2014 academic year at the University of Delaware. I was a postdoctoral researcher there at the time, and all the community including faculty, staff, and students were very excited. Our excitement was not due to the return to classes, or because we would see friends again. We were excited because a Starbucks coffee shop had opened on campus!

I was not surprised that this momentous event in the life of small Newark, DE, had finally happened. But I was a bit surprised that the owners of the shop in question decided to open the shop inside a lecture hall (Smith Hall). Why was that a reasonable decision?

I will come back to answering this question, but first, I want to say that I find it very satisfying to discover connections between apparently disparate problems. As it turns out, the problem the owners of the coffee shop faced, and the problem of maximizing reach of a linear TV campaign are essentially the same, though of a very different scale.

The analysis that led to the decision to open the coffee shop in that lecture hall presumably considered many factors, but one of those factors was most likely the need to reach as many possible customers as possible. In this context, reach means giving the opportunity to a passerby to go buy a cup of coffee on their way to/from class without spending too much time in doing so.

Mathematically, we could model the problem of choosing where to open a coffee shop on a university campus as follows:

First, each building is a node on a graph:

Second, we connect two buildings (i, j) if the time it takes to walk from building i to building j is less than, say, 5 minutes (or the amount of time deemed reasonable to get coffee) :

Third, for each building, we determine the subset of buildings that are reachable from it:

Building | Reachable Buildings |

A | {A, B, E, F} |

B | {A, B, E, F} |

C | {C, G} |

D | {D, E} |

E | {A, B, D, E, F} |

F | {A, B, E, F, G} |

G | {C, F, G} |

At this point, there are two problems and their respective solutions:

*Set Cover Problem*: In this problem, we want to choose the smallest number of subsets so that their union cover the whole set of buildings (the universe). This problem would be appropriate if we wanted to open one or more coffee shops so that all people on campus are reachable and therefore can buy coffee near to their location regardless of where they are on campus.

In our example, there are two solutions: Solution 1, consists of buildings E and C, that is sets {A, B, D, E, F} U {C, G} , and Solution 2 consists of buildings E and G. Note how Solution 2 duplicates building F. Thus, if we wanted to offer the opportunity to buy coffee at a nearby location to everyone on campus, we could do that with two coffee shops located on building E and either building C or G.

*Maximum Coverage Problem*: In this problem, we first fix the number of subsets we can select and look for the combination that covers the greatest number of elements in the universe. In our example, If we fixed the number of coffee shops to one, then again we have two solutions. Solution 1 consists of building E and Solution 2 consists of building F. These buildings are solutions because with one single selection, we cover five out of the seven buildings on campus. No other selection results in more than five reachable buildings.

Assuming that the owners of the coffee shop had budget for only one coffee shop, we can imagine that they solved an instance of the maximum coverage problem in order to determine in which building to open the shop so as to maximize reach.

So how is this problem the same as maximizing reach of a linear TV campaign?

The equivalent of the whole set of buildings in the coffee shop problem is the whole set of members of a target audience, that is, the target universe:

It is very unlikely that every member of the target segment watches the same commercial spot, just as it is unlikely that all faculty, staff, and students of a university are hosted in the same building. Therefore, then we need to select commercial spots (S1 through S6 in the example) such that the union of their viewers maximizes the coverage.

In this example, the set cover problem does not have a solution because there are two viewers who never watch a commercial spot. The maximum coverage problem, however, does have at least one solution.

If, for example, we could buy only three spots, then in order to maximize reach, we would need to buy spots S4, S5, and S6, for a total of 10 impressions, and a reach of 9/15 = 60%.

S4 provides four impressions, as does S6. However, one respondent is exposed to both S4 and S6, and thus it contributes two impression count but only one for reach.

The Maximum Coverage Problem is known to be NP-hard, which means that in the worst case scenario, an instance of the problem may require any algorithm devised for the task to try all the possible solution combinations. In our case, the number of combinations grows exponentially as a function of the number of sets of selectable spots, that is, *|S|! / k!(|S| – k)!*, where is the number of spots, and is the number of selectable spots determined by the campaign budget and other constraints.

For our example, the number of ways we can choose three spots from the set of six available spots are {S1, S2, S3}, {S1, S2, S4}, {S1, S2, S5}, and so on. In total, there are 20 such subsets. For a real case problem, with tens of thousands of available spots and a few hundreds or thousands of purchasable spots, the number of possible combinations is just mind boggling.

We have seen how seemingly unrelated situations call for the same mathematical formulation, and how difficult it is to solve the problem of maximizing reach exactly. Any solution in the market that claims to maximize reach of a TV advertising campaign must therefore do so approximately.

In the next post, we will explore a few approximation approaches to reach optimization. But first, let’s grab a cup of coffee.