A technology white paper for webmasters and web developers

by David Levine, Adtoniq CTO, August 2019

 

 

Background & History

The Ad Block Army

Ad blockers use four main techniques

Ad block technique #1: Block by URL

Ad block technique #2: Element Hiding using CSS Selectors

Ad block technique #3: Disable JavaScript

Ad block technique #4: Execute JavaScript designed to disable anti-ad-blockers

The “stealth” ad blocker ideal

The Publisher Army

Technology

Content Protection

Providing teaser content

Detecting Ad Blockers

Bypassing CSS Selectors

Ad Block Analytics

Ad block demographics

Re-inflating hidden content

The future battlefield

Background & History

It’s been three years since we released our first technology white paper describing the state of the battle between publishers and ad blockers. Since then, the U.S. ad block user penetration rate has increased from 21.5% to 26.4%, and it is continuing to grow.

 

This white paper is an update to reflect the current state of the battle as of 2019, and describes the technology being used.

Ad blockers should be more accurately thought of as generic “blockers” because they block more than just ads: they also block analytics, social sharing widgets, optimization tools, personalization tools, single sign-on tools, and much more. Ad block users wish to block advertising technology in order to make websites less annoying, improve their performance, protect their privacy, and improve their security. Publishers wish to freely offer their content in exchange for monetizing that content using advertising as an alternative to paid subscriptions. And herein lies the battlefield.

Publishers started the war by chasing advertising revenue to the detriment of their end user experience. End users responded by inventing ad blockers to improve their experience, thus breaking the implicit agreement between publishers and their audience which used to be this: Publishers provide their content for free, and in exchange their audience will see advertisements to cover the costs of providing that content. In order for publishers to adapt to this new reality, that implicit contract must be renegotiated, using a combination of technical counter defenses and an open dialog between publishers and their audience that offers their audience simple choices.

This white paper describes the two armies battling it out in the ad block battle theater, their technical strengths and weaknesses, and prospects for the future. At stake is the future of the internet as we know it. We’ll start with an analysis of the ad block army, then move on to the publisher army, and finish with a prediction for the future.

 

The Ad Block Army

 

The goals of the Ad Block Army are:

  1. Reduce annoying advertising clutter, like auto-play videos and pop ups. Everyone hates it when you open a web page and some audio starts playing, annoying everyone around you.
  2. Protect privacy by preventing behavioral tracking and targeting of users. It’s annoying to see that pair of boots you looked at once following you all around the web.
  3. Be secure against malvertising and related threats, such as phishing scams that tell you your computer is infected with a virus when it really isn’t.
  4. Improve site load times and decrease bandwidth usage, especially on mobile devices where people pay for bandwidth.

 

Although ad blocking started as a desktop/laptop browser extension to block annoying advertising, that threat has broadened as companies like Apple and Google integrate ad blockers into their own technology, new mobile browsers like Brave and Ghostery emerge with advanced built-in ad blocking and privacy protection, and network-based ad blocking emerges at ISPs, mobile carriers, and government and corporate firewalls. We also see consumer network-based ad blockers emerging in forms like Pi-hole, and on-device proxy servers like AdGuard that filter traffic at the network level rather than just the browser level. New ad block solutions emerge every few months, requiring publishers to stay abreast of the latest threats and potentially make technical changes to their web sites to address them and prevent their revenue from declining.

Ad blockers rely on a constantly evolving, community-maintained set of filter lists to determine which content to block. The main filter list used by Adblock Plus and AdBlock, two of the most popular ad blockers, is called EasyList, and it is maintained by a dedicated team of contributors using the Mercurial source code management system. As of July 2019, EasyList is 2,701,056 lines long and growing. And this is just one of many filter lists. Another popular filter list is called EasyPrivacy.

Ironically, Google and other ad networks and publishers are among the largest financial contributors to the development of the filter lists, because they pay fees that allow their ads through even though the user has installed Adblock Plus. This is called the “Acceptable Ads” program, and the revenue Adblock Plus receives from it is used to fund the development of its core ad blocking technology.

 

Ad blockers use four main techniques

 

Despite the plethora of ad blockers on the market, most ad blockers employ only a few basic techniques, and a thorough understanding of these techniques and their counter-defenses is necessary to overcome the harm done by ad blockers. As the battlefield evolves, more techniques will be developed requiring new counter-defenses – and these will be described at the end of this paper – but in the meantime, the majority of the ad block threat can be met by understanding how to cope with these few techniques.

 

Ad block technique #1: Block by URL

 

Ad blockers prevent browsers from loading out-of-line resources by using filter list rules that target patterns in URLs that should be blocked. These filter list rules then leverage hooks in the underlying browser (also known as the “client”) to prevent the network connection from being processed. For example, this rule from the EasyList blocks Google tracking pixels:

||googleadservicepixel.com^

 

The first two vertical bars introduce a domain name that should be blocked, and the final caret terminates the domain name. When your browser blocks URLs, it will display error messages in your browser console. You can see this even on adblockplus.org (yes, their own ad blocker blocks stuff on their own site). Each error message represents a filter list rule that was used to block a resource from being loaded by targeting some portion of its URL.
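To make the semantics of a rule such as ||googleadservicepixel.com^ concrete, here is a minimal sketch of how a URL could be checked against it. This is a simplification for illustration, not the matching logic of any real ad blocker: the function name matchesDomainRule is hypothetical, and only this one rule form is handled.

```javascript
// Simplified sketch: match a URL against one "||domain^" filter rule.
// Real filter engines support many more rule forms and options.
function matchesDomainRule(rule, url) {
  const parsed = /^\|\|([a-z0-9.-]+)\^$/i.exec(rule);
  if (!parsed) {
    throw new Error("This sketch only handles rules of the form ||domain^");
  }
  const blockedDomain = parsed[1].toLowerCase();
  const host = new URL(url).hostname.toLowerCase();
  // The rule covers the domain itself and any of its subdomains.
  return host === blockedDomain || host.endsWith("." + blockedDomain);
}

const rule = "||googleadservicepixel.com^";
console.log(matchesDomainRule(rule, "https://www.googleadservicepixel.com/p.gif")); // true
console.log(matchesDomainRule(rule, "https://example.com/article.html"));           // false
```

Note that the check is made against the host name only, so a look-alike domain that merely ends in the same letters is not matched.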

 

Browser extensions and dedicated ad blocking browsers are not the only place in the network ecosystem where network connections are blocked. Ad blocking DNS servers block network connections by augmenting the DNS resolution function: if the requested domain is on the filter list, a failure is returned; otherwise a standard recursive resolver looks up the requested DNS entry, starting from the root DNS servers. This supports ad blocking for all types of devices (mobile, desktop, etc.) and all types of software, including native mobile applications. What is particularly interesting about this approach is that it does not require the end user to install any software. In fact, an end user behind a router that uses an ad blocking DNS server has no way to disable ad blocking, short of changing network connections. This is a big problem when a website blocks access to its content until the user disables their ad blocker, because the user can’t easily do that when they are behind this kind of ad blocker.
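The resolver behavior described above can be sketched as follows. This is an illustration under stated assumptions rather than the code of any particular product: createBlockingResolver is a hypothetical name, upstreamResolve stands in for a normal recursive resolver, and the result shape is invented for the example.

```javascript
// Sketch of an ad blocking DNS resolver. "upstreamResolve" stands in for a
// normal recursive resolver and is an assumption of this example.
function createBlockingResolver(blockedDomains, upstreamResolve) {
  const blocked = new Set(blockedDomains.map((d) => d.toLowerCase()));

  // A name is blocked if it, or any of its parent domains, is on the list.
  function isBlocked(name) {
    const labels = name.toLowerCase().replace(/\.$/, "").split(".");
    for (let i = 0; i < labels.length; i++) {
      if (blocked.has(labels.slice(i).join("."))) return true;
    }
    return false;
  }

  return async function resolve(name) {
    if (isBlocked(name)) {
      // Report a failure instead of asking the upstream resolver.
      return { status: "NXDOMAIN", addresses: [] };
    }
    return { status: "NOERROR", addresses: await upstreamResolve(name) };
  };
}
```

Because the decision is made on the network rather than in the browser, nothing in the page can switch it off, which is why a “please disable your ad blocker” message is of little use to these users.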

Firewalls are also an easy place to implement network-based ad blocking, from the firewall built into the cheapest home router to the professional grade enterprise firewalls for large corporate networks. All one needs to do is copy and paste one of many publicly available filter lists into a “block” firewall rule, and the network connections will be blocked for all users behind the firewall.

When blocking a network connection, ad blockers can do this in one of two fundamentally different ways, leading to two very different ad campaign metrics, and it’s important to understand which kind of ad blocking is happening in order to understand how accurate publisher reporting is. Misunderstanding this can lead a publisher to bill an advertiser for impressions that were never seen by ad blocked users.

If the ad blocker prevents the network connection from being opened, the ad server  will never even know that a user was there, and it would not report an impression. On the other hand, the ad blocker can open the network connection to the resource, but deliver an empty result to the browser so that these resources are never actually usable within the browser. This could mislead the server into believing that it had served an impression, when it hasn’t. Even if measurement or view-ability code is not blocked, that code may not “realize” that there’s no ad there and report an impression that was never actually seen.

 

Ad block technique #2: Element Hiding using CSS Selectors

 

While URL rules can block entire services from operating, such as ad servers, analytics systems, and multivariate testing tools, advertisements are often site-served or otherwise embedded in HTML content that is not blocked by a URL. In these cases, the advertising content is embedded inside other content that the end user wants to see, which leads to the requirement that the ad blocker must be able to selectively target content within the page. CSS selectors are typically used by ad blockers to identify nested HTML content that should be blocked. If the content is not ultimately rendered as HTML, CSS selectors can’t block it.

In cases where content is generated dynamically by JavaScript after the page loads, ad blockers hook into events that are fired when the DOM content is modified by JavaScript so that they can continuously monitor for new content that should be blocked.

Typically, CSS selectors are used to select DOM elements with classes or ids matching values that often indicate ads, such as “adchoices” or “leaderboard”. However, many more sophisticated CSS selectors are used as well, looking for combinations of attributes, or child elements within specified parent elements. For example, many ad blockers block images that use standard IAB ad unit sizes, such as the banner ad format of 728 x 90. There are also some limitations to using CSS selectors. A selector cannot be too generic, as otherwise it would generate an unacceptable number of false positives, which would break websites.

Identifying the content to be removed is the first step. Once identified, the content must be removed from the visible page. In many cases, the ad blocker cannot simply delete all the DOM elements matching the CSS selector because that would break the site. Instead, ad blockers generally hide elements by setting style properties such as display and visibility, or the element attribute disabled. This is important, as we’ll see later, because counter defenses can recover and leverage these hidden elements for various purposes.
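The hide-and-watch behavior described in this section can be sketched in a few lines of browser JavaScript. This is an illustration only, not the code of any real ad blocker: hideMatches and watchAndHide are hypothetical names, and real blockers often inject style sheets rather than set inline styles.

```javascript
// Sketch of element hiding; not the code of any real ad blocker.
function hideMatches(root, selectors) {
  let hiddenCount = 0;
  for (const el of root.querySelectorAll(selectors.join(", "))) {
    // Hide rather than delete, so the page's own scripts keep working.
    el.style.setProperty("display", "none", "important");
    hiddenCount++;
  }
  return hiddenCount;
}

// Re-run the hiding pass whenever scripts add new content to the page.
function watchAndHide(root, selectors, Observer = MutationObserver) {
  hideMatches(root, selectors);
  const observer = new Observer(() => hideMatches(root, selectors));
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}
```

In a browser this might be called as watchAndHide(document, [".adchoices", "#leaderboard"]), using the example class and id values mentioned above.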

 

Ad block technique #3: Disable JavaScript

 

JavaScript is usually used to power advertising, analytics, tracking software, and anti-ad blocking software, so ad blockers have evolved advanced ways to disable JavaScript. Out-of-line JavaScript can be disabled by URL using ad block technique #1 described above. To counter that, publishers moved their scripts inline, where they were loaded with the page and could not be blocked by URL. Ad blockers responded by evolving the ability to block inline JavaScript with a rule like this:

##script:contains(“isAdBlockedUser”)

This would disable inline JavaScript that contains the string isAdBlockedUser anywhere within the script, which in this example would be an anti-ad-block script that tries to bypass ad blockers.

Another way to block JavaScript is to use an ad block rule that specifies a content security policy (CSP). For example, the following ad block rule only allows JavaScript to be loaded from the same origin as the website, but blocks scripts loaded from other domains, and prevents inline JavaScript from executing.

||site.com^$csp=script-src 'self';

 

A variation of this only permits JavaScript from a CDN, and blocks all other JavaScript from executing including inline JavaScript. You could extend this to whitelist only specific scripts.

 

||site.com^$csp=script-src cdn.site.com;

A publisher can try to combat this by combining anti-ad-block logic into the same JavaScript resource as other code that is necessary for the site to function properly. This takes more work on the part of the publisher, introduces issues when updating the JavaScript, and also means they risk having critical site functionality disabled by ad blockers if they disable that JavaScript.

 

Ad block technique #4: Execute JavaScript designed to disable anti-ad-blockers

 

Sometimes the declarative approaches described so far are not sufficient to disable anti-ad-blocking software deployed by publishers, so a procedural approach is required. Arbitrary JavaScript code may be executed to detect and disable anti-ad-blocking techniques deployed by publishers. Userscript managers such as Greasemonkey and Tampermonkey can run so-called “anti-adblock killer” scripts, which are targeted at detecting and disabling anti-ad-block software deployed by publishers. Some ad blockers like uBlock also have the option to execute JavaScript.

 

The “stealth” ad blocker ideal

 

Ad blockers would ideally like to operate “stealthily”, meaning that the publisher can not detect that the user is using an ad blocker. When ad blockers can achieve this, then the publisher will not attempt to protect any content and will treat the user as if they had no ad blocker, even though ads are being blocked. These publishers may be unaware of the extent of their ad block threat.

 

The Publisher Army

 

The publisher army consists of a number of different vendors who offer anti-ad blocking solutions, of which Adtoniq is one. Some publishers also have the necessary technical people available and choose to implement their own anti-ad blocking solution. However, because this requires ongoing development resources, most publishers prefer to buy rather than build here. The goal of the publisher army is to monetize their ad blocked users in order to sustain their advertising-based business model, which provides valuable and free content in exchange for an implicit agreement to see advertisements.

 

Technology

 

Publishers usually try to communicate with their ad blocked audience, which starts with detecting that the user has an ad blocker and then showing them a message, typically asking them to take some action like disabling their ad blocker, purchasing an ad-free subscription, or agreeing to see ads. Unfortunately there are filter lists that specifically target this kind of messaging, such as the anti-adblock list. To work around these lists, publishers must stay away from using DOM elements that are targeted by these lists.

Publishers can also use anti-ad block technology which bypasses the ad blocker and shows ads even with ad blockers enabled. This kind of technology is on the front lines of the ad block battle, and sometimes one side is on top and sometimes it’s the other.

Ironically, one of the leading forms of anti-ad block technology comes from none other than Adblock Plus and its parent company Eyeo. Adblock Plus will show its users so-called “acceptable ads,” from which it actually makes a profit by charging publishers a fee to show ads in the spots that it just blocked. Some publishers view this as extortion, because Adblock Plus first blocks their ads and then offers them back for a fee. For example, Google paid Adblock Plus to allow advertisements on the Google search page, so a search for Chevrolet shows an advertisement even with Adblock Plus enabled.

 

A better kind of anti-ad block technology bypasses all ad blockers, not just a select few ad blockers, and for any kind of ad, not just one source of ads. One approach to bypass ad blockers is to proxy ad networks either through the website itself or domains that are not on the filter list. If these domains are ever added to the filter list, the domain for the proxy can be easily changed. Another approach is to site-serve the ads in such a way that they are indistinguishable from normal website content, making it impossible to develop rules for the filter list.

 

Content Protection

 

One question every publisher should ask themselves is whether they want to prevent ad block users from seeing all or part of their site. This approach works best when the site has valuable or unique content, or loyal users willing to support the site by seeing its advertisements or paying for a subscription.

Some publishers implement a site-wide lockout policy, locking ad block users out of the site unless they disable their ad blocker or opt in to seeing ads.  Other publishers allow viewing the home page with an ad blocker, but require users to disable their ad blocker if they want to click into the site.

A more sophisticated approach hides specific content within the page, such as selected videos or images, from ad blocked users, while asking them to opt in to seeing advertisements in order to see the hidden content.

 

Providing teaser content

 

Rather than taking an all or nothing approach with respect to hiding the site or individual DOM elements from ad blocked users, the publisher can choose to present some of the protected content as a tease with an offer like “disable your ad blocker to see more.”  This can take several forms including: blurring content, dimming content, or only showing the beginning of a video.

When using client-side protection, JavaScript code can apply the necessary transformations to DOM elements to achieve the desired teasing effect. With server-side protection, two versions of the assets must be prepared: the tease version and the full version. To fully protect these assets, they should be accessed either through a server-generated nonce, or a frequently changing random URL.
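As one possible way to implement the server-generated nonce, the following Node.js sketch issues a random, single-use, expiring nonce when the page is rendered and checks it when the full asset is requested. The names createNonceStore, issue, and redeem are hypothetical, and the in-memory Map is an assumption that would need to be replaced with shared storage on a multi-server site.

```javascript
const { randomBytes } = require("crypto");

// Sketch of a single-use, expiring nonce store for protected assets.
function createNonceStore(ttlMs = 60000) {
  const issued = new Map(); // nonce -> { assetId, expiresAt }

  return {
    // Called while rendering the page for a user entitled to the full asset.
    issue(assetId, now = Date.now()) {
      const nonce = randomBytes(16).toString("hex");
      issued.set(nonce, { assetId, expiresAt: now + ttlMs });
      return nonce;
    },
    // Called by the asset endpoint; each nonce can be redeemed only once.
    redeem(nonce, assetId, now = Date.now()) {
      const entry = issued.get(nonce);
      issued.delete(nonce);
      return Boolean(entry) && entry.assetId === assetId && now <= entry.expiresAt;
    },
  };
}
```

The asset URL embedded in the page would then carry the nonce, and a request without a valid nonce would receive the tease version instead of the full version.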

This presents a special challenge when streaming content off commercial CDNs because they are not able to easily validate a nonce or be fronted by a random URL, without adding an edge trigger which validates the nonce.

Video is a special case because it provides an excellent opportunity to stream the first part of the video to ad blocked users, getting them hooked, and just at the “right” moment, publishers can force the video to pause and then show a message like “to see the rest of this video, please disable your ad blocker.” This is implemented by creating two separate video streams: a teaser stream for ad blocked users, and the full one for non-ad blocked users.

 

Detecting Ad Blockers

 

In order to communicate with the ad blocked audience, the publisher must be able to detect that an ad blocker is in use and then report that fact into an analytics system. Because analytics software like Google Analytics and Omniture is blocked by ad blockers, publishers can’t use their standard tools to collect this kind of information.

Publishers can detect ad blockers by writing JavaScript code that detects the two main side effects of ad blockers: (1) URLs are blocked, and (2) elements are hidden on the page. The code then sends the results of the detection to an analytics collector.

A small JavaScript code snippet can create a resource such as an image that the user won’t see using an image source URL that appears on the filter list. Without an ad blocker the image should load, but an ad blocker will prevent it from loading. Other resource types can be used as bait too, such as JavaScript code.

To detect element hiding, a JavaScript code snippet can insert an element on the page matching a CSS selector on the filter list, and then check to see whether its display style property is subsequently set to ‘none.’
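The two detection approaches just described, a bait resource and a bait element, can be sketched as follows. This is an illustration under assumptions, not a complete detector: the function names are hypothetical, the bait URL and class name must be chosen by the publisher so that they appear on a filter list, and the 100 millisecond delay is an arbitrary value. A real network failure will also make the image probe report “blocked,” so results should be treated as a signal rather than proof.

```javascript
// Probe 1: request a bait image whose URL appears on the filter list.
// Resolves true when the request appears to have been blocked.
function probeBlockedUrl(baitUrl, ImageCtor = Image) {
  return new Promise((resolve) => {
    const img = new ImageCtor();
    img.onload = () => resolve(false);
    img.onerror = () => resolve(true);
    img.src = baitUrl;
  });
}

// Probe 2: insert a bait element whose class appears on the filter list,
// then check whether something has set its display to "none".
function probeElementHiding(baitClass, doc = document, win = window) {
  return new Promise((resolve) => {
    const bait = doc.createElement("div");
    bait.className = baitClass;
    bait.style.cssText = "position:absolute;left:-9999px;width:1px;height:1px;";
    doc.body.appendChild(bait);
    // Give the ad blocker a moment to apply its hiding rules.
    win.setTimeout(() => {
      const hidden = win.getComputedStyle(bait).display === "none";
      bait.remove();
      resolve(hidden);
    }, 100);
  });
}
```

A page might call probeElementHiding("adchoices"), reusing one of the class names mentioned earlier, and combine the result with the image probe before reporting it.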

A fundamental problem with detecting ad blockers is that this relies on executing JavaScript. It is not possible to detect ad blockers in the browser without using JavaScript, so if the JavaScript performing the ad block detection is disabled by the ad blocker, the publisher may never know that an ad blocked user is there.

To make it harder for ad blockers to disable the detection script, publishers can use a technique called polymorphic script encryption, which scrambles the ad block detection script differently every time it is embedded into the publisher’s page, so that no fixed text pattern in a filter list rule can reliably match it. This borrows a page from polymorphic virus technology, allowing the script to transmute itself into a practically unlimited number of variations. However, even this can be defeated if JavaScript is disabled entirely by the blocker.

 

Bypassing CSS Selectors

 

Ad blockers use CSS selectors to hide ads, messages to ad block users, and other content embedded within the publisher’s site. Publishers can bypass these rules by randomizing the various ids, class names, and other details that can be targeted by CSS selectors, so that the selectors can never get a solid match. Unfortunately, most websites are not designed with this in mind, so it is usually preferable for a publisher to build or buy a new subsystem that automates this.

To implement this, new server-side logic is required to generate random class names and ids for DOM elements, and then use those names and ids in all the appropriate places such as within JavaScript or CSS classes.
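A minimal sketch of such server-side logic is shown below. It assumes a hypothetical {{name:...}} placeholder syntax in the site’s HTML, CSS, and JavaScript templates, and the function names are invented for the example. A new randomizer would be created for each response, so names differ from one page view to the next but stay consistent within a page. The sketch does not guard against two logical names receiving the same random name, which is unlikely but possible.

```javascript
// Sketch of per-response name randomization. Names are random but stable
// within one response so HTML, CSS and JavaScript stay in sync.
function createNameRandomizer(length = 8) {
  const letters = "abcdefghijklmnopqrstuvwxyz";
  const alphabet = letters + "0123456789";
  const names = new Map(); // logical name -> randomized name

  function randomName() {
    // Start with a letter so the result is a valid class name or id.
    let name = letters[Math.floor(Math.random() * letters.length)];
    while (name.length < length) {
      name += alphabet[Math.floor(Math.random() * alphabet.length)];
    }
    return name;
  }

  function nameFor(logicalName) {
    if (!names.has(logicalName)) names.set(logicalName, randomName());
    return names.get(logicalName);
  }

  // Replace {{name:logical}} placeholders in an HTML, CSS or JS template.
  function render(template) {
    return template.replace(/\{\{name:([\w-]+)\}\}/g, (_, logical) => nameFor(logical));
  }

  return { nameFor, render };
}
```

For example, rendering the templates `<div class="{{name:ad-slot}}">` and `.{{name:ad-slot}} { width: 300px; }` with the same randomizer yields matching class names in the HTML and the style sheet.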

In addition, the values of other attributes need to be adjusted as needed. For example, some filter list rules look for specific custom attribute names or attribute values, or children of other elements, or they can even target popular ad unit element dimensions such as the 300 x 250 ad unit.

As new filter list rules emerge, the randomizing code must be upgraded to bypass the latest threats. This is an area where publishers may be better served by outsourcing this maintenance to a vendor that can amortize the cost of maintaining this randomizing code across many publishers, rather than building it in-house.

 

Ad Block Analytics

 

Increasingly publishers are tracking new Key Performance Indicators (KPIs) to measure the extent of their ad block problem including:

  1. Percent of ad blocked page views and users
  2. Advertising revenue lost to ad blockers
  3. Percent of users who whitelist the site or opt in to seeing ads

Publishers can use these KPIs to drive both technical and business process changes to improve these metrics. Ideally publishers build multivariate testing on top of these KPIs to further speed up the optimization.

While publishers should be looking at these new KPIs, it’s critical to understand the effect that any ad block strategy can have on other KPIs. For example, if publishers take a hard core stance and require their audience to disable their ad blocker to view the site, the publisher will dramatically reduce their percent of ad blocked page views for sure, but they will have fewer overall page views because some percent of their audience will not accept that deal. That could result in less social sharing, and could increase or decrease total ad revenue, so publishers should understand which is happening and why and be prepared to react quickly.

Once a determination has been made as to whether the user has an ad blocker, JavaScript must execute within the browser to send the analytics data to an analytics collector to record whether this page view is affected by an ad blocker. Some solutions on the market leverage leading analytics solutions such as Google Analytics or Omniture to capture this information, but if the ad blocker is blocking network connections to those servers (as is often the case), then the publisher will not receive any data regarding these ad blocked users.

To reliably get analytics data from the browser, the JavaScript must send the data from the browser to a server, using a URL that can not be on the ad blocker’s filter list. This could be the website itself, or another domain set up to collect analytics. Once collected by a server that is not on the filter list, the analytics data can be forwarded to a mainstream analytics system like Google Analytics or Omniture, or anywhere else.
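In the browser, sending the result to a first-party collector might look like the sketch below. The endpoint path /collect/pageview is a hypothetical URL on the publisher’s own domain, and the payload fields are assumptions for illustration. The standard navigator.sendBeacon call is used as the default transport.

```javascript
// Sketch: report ad block status to a first-party collector.
// "/collect/pageview" is a hypothetical endpoint on the publisher's own domain.
function reportPageView(adBlocked, send = (url, body) => navigator.sendBeacon(url, body)) {
  const payload = JSON.stringify({
    page: typeof location !== "undefined" ? location.pathname : null,
    adBlocked: Boolean(adBlocked),
    timestamp: Date.now(),
  });
  return send("/collect/pageview", payload);
}
```

The server behind that endpoint can then forward the record to whichever mainstream analytics system the publisher already uses.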

Another approach to determine the ad block rate is to use a combination of two tracking pixels: one that will not be blocked and counts all page views, and another that is designed to be blocked and therefore counts unblocked page views only. By subtracting the unblocked page views from the total page views, you can calculate the number of ad blocked page views, and hence the ad block rate, without using any JavaScript.
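The arithmetic behind the two-pixel approach is simple: blocked page views equal total page views minus unblocked page views, and the ad block rate is blocked page views divided by total page views. A small sketch, using made-up counts for illustration and a hypothetical function name:

```javascript
// Ad block rate from two pixel counters (counts below are illustrative).
function adBlockRate(totalPageViews, unblockedPageViews) {
  if (totalPageViews <= 0) return 0;
  const blockedPageViews = totalPageViews - unblockedPageViews;
  return blockedPageViews / totalPageViews;
}

console.log(adBlockRate(1000, 736)); // 0.264
```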

 

Ad block demographics

 

The demographics of a publisher’s ad blocked audience are likely to be more valuable as compared with non-blocked users. For example, they are likely to be skewed towards younger, male, more technical and affluent users. They may also spend more time on site, and have a higher click through rate and therefore RPM. Ad block rates are higher in Europe than in the United States, and even higher in parts of Asia.

 

This valuable audience segment is also an audience that advertisers are unable to reach today. All of these factors combine to make ad blocked users among the most valuable segment of a publisher’s total audience, if they can be reached, and reached in the right way.

 

Re-inflating hidden content

 

 

When ad blockers find inline content to block, most often ad units, they typically hide the matching DOM elements rather than delete them. This is done because deleting the elements often breaks the web site.

Because the DOM elements are hidden and not deleted, they are available to be “re-visualized” which means that they are made visible again. This is easy to do simply by targeting the DOM element and resetting its attributes to be visible.
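A minimal sketch of resetting those attributes is shown below. The function name reinflate is hypothetical, and the sketch assumes the ad blocker hid the element using the display and visibility style properties or the hidden and disabled attributes. Depending on how a particular ad blocker injects its styles, its hiding rule may still take precedence, so this should be read as an illustration rather than a guaranteed bypass.

```javascript
// Sketch: make an element that was hidden by an ad blocker visible again.
function reinflate(el, display = "block") {
  el.hidden = false;
  el.removeAttribute("disabled");
  // Inline !important declarations outrank ordinary page style sheet rules.
  el.style.setProperty("display", display, "important");
  el.style.setProperty("visibility", "visible", "important");
  return el;
}
```

Once the element is visible again, the publisher’s code can replace its contents with whatever it chooses to show in that space.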

Once made visible again, it’s up to the publisher to decide what to do with this space that was just revealed again after being hidden by the ad blocker. They have a space on their web site where they know they can communicate with their ad blocked audience. They can choose to:

  1. Serve another ad, so long as that ad will not be blocked.
  2. Engage in a dialog with the ad blocked user, motivating them with a call to action to either disable their ad blocker, sign up for a subscription, or take some other action.

 

The future battlefield

 

In the long term as ad blockers evolve, publishers must move more towards a Digital Rights Management (DRM) model, giving them better control over their content, including who gets to see it and under what conditions. DRM is based on cryptography and relies on using a secure lock-box on the client to mediate access to the content.

Such a long term future is not easy to deploy today because the web and content management systems in general were not designed with DRM in mind. However market forces are driving the industry towards new web standards and software that will effectively reintermediate the entire advertising ecosystem and technology stack.

With DRM in place, publishers can configure rights policies that include subscription or micro-payment policies, time-based policies, and ad block status policies, among others.

But the bigger question is: How will publishers and their audience renegotiate the broken contract that used to be free web sites in exchange for advertising?  Will this model continue or will another model dominate the Internet? What do you think? Drop me a line and let me know at david@adtoniq.com.