comScore Sets Metrics Record Straight


Following all the hubbub over online ratings, log files, metrics firms, discrepancies and other online measurement mania, comScore CEO Magid Abraham drafted a letter in response, which we'll reprint here:

An Open Letter to the Industry

A recent article in Media Week and Ad Week discusses a recurring theme in the online industry, asserting that panel-based audience metrics are inaccurate because they do not match Web server logs. Since Web logs record a site's every visit, visitor and page request, it makes intuitive sense that those metrics might be viewed as the gold standard. When third party estimates do not match Web logs, it is easy to view this as a reflection of weaknesses in panel-based measurement.

However, as always, the devil is in the details. Scrutinize them, and the answer to why Web log and panel-based data don't always match up is ... "it depends." In fact, the reasons for discrepancies depend on a number of factors: the panel data could be wrong, the Web log data could be wrong - or, more often, both are right given the exact definition of what they measure. But those definitions can be vastly different.

Not All Measures are Created Equal

The most obvious metric where Web logs should be correct is page view counts (PVs). PVs should be simple to compare because, unlike visitor metrics, page views are not confounded by problematic issues such as whether a server's count of cookies is equivalent to counting individual people (it's not!) or how many cookies are deleted by the average user (no one seems to agree on that answer!). Here again, however, it depends! Web logs measure specific URLs requested from the Web server.
At comScore, we call those hits. Hits need to be filtered properly to get a proper PV count consistent with IAB definitions. Sometimes this can be done systematically, but at other times it is difficult to do.
Consider these challenging factors:

1. Multiple frames in one page can result in multiple hits.

2. Web servers record that a page has been served, but cannot tell whether the page was actually loaded on the user's screen. Panel-based measurement systems can tell the difference and therefore record pages actually loaded.

3. Pop-up ads are counted as hits by Web logs, but are filtered from comScore's PV measure because they are not requested by the consumer.

4. Ad requests and ad tracking beacons that resolve at a site's domain are recorded as hits and need to be properly filtered.

A Web log-generated PV count is likely to fall somewhere between Web hits and real PVs, and will almost always be higher than real PVs. But the magnitude of the difference can be astounding. Consider the following example: comScore data shows 3 times more hits than PVs for Google. In fact, a difference of up to 300 percent between hits and PVs - which on the surface appears extraordinary - may be perfectly legitimate when comparing Web log and panel-based data, because we are comparing the proverbial apples to oranges. The difference between a PV and a hit can be so subtle that detecting it is more akin to telling the difference between a cucumber and a pickle! It's no wonder that people get confused!
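The four filtering factors above can be sketched in a few lines of code. This is purely an illustrative toy, not comScore's actual pipeline: the log records, field names and filter rules below are all hypothetical.

```python
# Hypothetical server log: each "hit" is a request the Web server saw.
# Real logs would carry URLs, referrers, user agents, etc.
hits = [
    {"url": "/index.html",   "kind": "page",   "popup": False},
    {"url": "/frame_a.html", "kind": "frame",  "popup": False},  # frame within index (factor 1)
    {"url": "/frame_b.html", "kind": "frame",  "popup": False},  # another frame (factor 1)
    {"url": "/promo.html",   "kind": "page",   "popup": True},   # pop-up ad, not user-requested (factor 3)
    {"url": "/pixel.gif",    "kind": "beacon", "popup": False},  # ad-tracking beacon (factor 4)
]

def is_page_view(hit):
    """Keep only consumer-requested, top-level page loads."""
    if hit["kind"] in ("frame", "beacon"):  # filter frames and beacons
        return False
    if hit["popup"]:                        # filter pop-ups
        return False
    return True

page_views = sum(1 for h in hits if is_page_view(h))
print(f"{len(hits)} hits -> {page_views} page view(s)")
```

Even in this tiny example, five raw hits collapse to a single real page view, which is why hit-based counts can legitimately exceed panel-based PV counts by multiples rather than percentage points.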

Are We Even in the Same Universe?

Differences in the measurement universe can add to the discrepancies. For example, comScore's panel does not measure usage from public or office-based shared machines (e.g., shared machines at schools, libraries, Internet cafes, group PCs at work, etc.), nor does it include usage from college dorms, government offices, the military, school/university offices, or mobile phones/PDAs. However, this is all included in Web logs. Another very basic and common mistake is to compare a U.S. panel-based PV estimate to a server log-based count of global hits. Sometimes, content that is delivered by newsletters, RSS, or other content aggregators may be recorded by Web logs but not recorded in a panel unless the user explicitly clicks on an embedded link to the site. Finally, technologies like AJAX are making it increasingly challenging to even define a page view and will increasingly become a potential source of discrepancy.

Cookies are not People

Comparing visitor counts is even trickier. Web logs count cookies and panels count people. Cookies are typically, but not always, unique to each PC. This means that two people can share the same cookie if they use the same PC. Conversely, one person can be counted as two cookies if he or she uses two different PCs at home - or different PCs at home and at the office. Even more confounding, one person using one PC can appear to the Web server as multiple cookies if the cookies on that machine are deleted or reset. Users on PCs that do not accept cookies at all may be counted anew every time they visit the site. In addition, non-user-requested traffic such as robot traffic is a problem in Web logs unless it is properly filtered. In one recent case, it was determined that robot traffic accounted for 72 percent of all Web log records. Finally, one must be careful to compare the same geography - for a U.S.-based site, matching U.S. panel traffic against the site's U.S. logs rather than its logs from all countries. That difference alone could be more than 100 percent.
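The divergence between cookies and people can be made concrete with a toy example. The names and cookie IDs below are invented for illustration; the point is only that the two tallies come apart in both directions at once.

```python
# Each record is (person, cookie_seen_by_server) for one visit.
visits = [
    ("alice", "cookie_home_1"),  # Alice on the shared home PC
    ("alice", "cookie_work_1"),  # Alice again, from her office PC: a 2nd cookie
    ("bob",   "cookie_home_1"),  # Bob on the same home PC: same cookie as Alice
    ("carol", "cookie_c1"),      # Carol's first visit
    ("carol", "cookie_c2"),      # Carol after clearing cookies: counted again
]

people  = {person for person, _ in visits}   # what a panel tries to count
cookies = {cookie for _, cookie in visits}   # what a Web log actually counts
print(f"{len(people)} people appear in the log as {len(cookies)} cookies")
```

Three people generate four distinct cookies here, and with different sharing and deletion patterns the cookie count could just as easily come out below the people count.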

To illustrate the magnitude of these differences without releasing confidential client data, I will refer to an article by Paul Boutin published on Feb. 27, 2006, titled "How many readers does Slate really have?"

Slate's Web logs show 8 million unique users based on cookie counts, while comScore and Nielsen NetRatings both report 4.6 million unique visitors (UVs) for Slate. However, comScore estimates Slate's worldwide unique visitors at 6.05 million, which is a more relevant comparison if the 8 million number from Slate's Web logs includes international visitors.

Even more interesting, when comScore counts the number of unique computers (UC) that visit Slate (the unduplicated number of computers from which a visit to Slate was initiated), we estimate 7.4 million computers in the U.S. and 8.9 million computers on a worldwide basis.
So, which of these comScore metrics should be used as a comparison to Slate's internal 8 million visitor estimate? At first glance, the difference between 4.6 million and 8 million is huge. However, UCs should be more comparable to cookies, albeit still not a perfect match.
In order to form a fair comparison, we ought to be comparing Slate's 8 million cookies to the 7.4 million comScore U.S. UC estimate, or even the 8.9 million worldwide UC estimate depending on whether the Slate numbers counted international visitors. Either way, the difference versus Slate's internal number is now much smaller and probably within an acceptable range given all the other factors at play. Is the panel data wrong? Probably not! Is this confusing? Certainly! But, this example demonstrates that it helps to know what the metrics are actually measuring before jumping to conclusions.
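The arithmetic behind this comparison is worth spelling out. Using only the figures quoted in the letter (in millions), a quick sketch shows how the apparent gap shrinks as the comScore metric gets closer in definition to what Slate's logs actually count:

```python
# Figures as quoted in the letter, in millions.
slate_cookies = 8.0   # Slate's internal Web log count (unique cookies)
metrics = [
    ("U.S. unique visitors (UV)",       4.6),
    ("worldwide unique visitors (UV)",  6.05),
    ("U.S. unique computers (UC)",      7.4),
    ("worldwide unique computers (UC)", 8.9),
]

for label, value in metrics:
    gap_pct = (slate_cookies - value) / value * 100
    print(f"{label}: {value}M vs {slate_cookies}M cookies -> {gap_pct:+.0f}% gap")
```

Against U.S. UVs the gap is roughly +74 percent, but against U.S. UCs - the metric closest in definition to a cookie count - it falls to about +8 percent, and against worldwide UCs Slate's own number is actually the smaller of the two.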

The need for transparency is often mentioned as a potential solution for this confusion. comScore strongly supports transparency and has a number of initiatives under way to provide the industry with greater transparency in what we do, including the MRC pre-audit. Transparency, however, must go both ways.

At the most basic level, transparency requires publishers to disclose what metrics they are using and how they are calculated. All too often, discrepancies are either explained away or largely vanish once we conduct an in-depth review - one that enables us to compare cucumbers to cucumbers. We also frequently report on e-commerce dollars transacted on a site.
Dollars are tougher to estimate than page views because transactions occur far less frequently than page views, which means that there are fewer sample points to use in measuring them. Nevertheless, it is remarkable that the dollar differences we see between our data and the clients' data are typically smaller than page view differences - perhaps because there is no confusion regarding the definition of a dollar!

Finally, the concern about panel quality suffering from cost reductions is just not warranted as far as comScore is concerned. We have never felt better about the size and quality of our panel than we do today.

The Numbers Seem Right ... When there's No Other Comparison

Another issue discussed in the recent Media Week and Ad Week article is the contrast in the use of ratings between TV and the Web. The author correctly points out that agencies buy Web advertising inventory based on ad impressions delivered by the ad servers, instead of relying on third party audience measurements as in the TV world. This is entirely appropriate. An exact count of impressions delivered by an ad server is more precise than any panel-based metric. On the other hand, the TV industry does not have TV server logs to count how many people view individual TV ads. The only available estimate is provided by panel-based TV ratings - and advertisers have no choice but to use them as a basis for payment.

This is precisely why TV ratings are considered a currency but Internet ratings are not. However, this does not mean TV ratings are more accurate. In fact, the very absence of audience census data for TV gives the illusion of fewer data problems since there are no differences to scrutinize every time one looks at the ratings numbers.

One might liken this to using a single watch to measure time. With only one watch, you really don't know if the time is off - or by how much - so you have a false sense of accuracy even though the time could be significantly off. On the other hand, if you have two watches, you are almost always going to see a difference between the two time estimates, which leads you to question what time it really is... and which watch is right.

The TV industry measures the world with ONE watch. While that may be comforting, it is not necessarily accurate. As Paul Boutin says in the Slate article:

"The more I dig into how Web ratings work, the more I realize people in other media are in denial. Internet publishing is the most finely measurable medium ever invented; broadcast, movie, and print companies have no way of monitoring individual transactions from their end. Yet, while the Web guys admit they could be off by half, Nielsen claims its television ratings have a margin of error of 4 percent."

He is absolutely right!

Magid Abraham, PhD
President & CEO
comScore Networks, Inc.

by Steve Hall    Sep-29-06   
Topic: Tools   
