Third Party Web
Data on third party entities and their impact on the web.
This document is a summary of which third party scripts are most responsible for excessive JavaScript execution on the web today.
Table of Contents
- Goals
- Methodology
- NPM Module
- Updates
- Data
- Summary
- How to Interpret
- Third Parties by Category
- Ads
- Analytics
- Social
- Video
- Developer Utilities
- Hosting Platforms
- Marketing
- Customer Success
- Content & Publishing
- Libraries
- Mixed / Other
- Third Parties by Total Impact
- Future Work
- FAQs
- Contributing
Goals
- Quantify the impact of third party scripts on the web.
- Identify the third party scripts on the web that have the greatest performance cost.
- Give developers the information they need to make informed decisions about which third parties to include on their sites.
- Incentivize responsible third party script behavior.
- Make this information accessible and useful.
Methodology
HTTP Archive is an initiative that tracks how the web is built. Twice a month, ~4 million sites are crawled with Lighthouse on mobile. Lighthouse breaks down the total script execution time of each page and attributes the execution to a URL. Using BigQuery, this project aggregates the script execution to the origin level and assigns each origin to the responsible entity.
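The two aggregation steps can be sketched in plain JavaScript. This is only an illustration of the idea, not the project's BigQuery query: the row shape, the `sumByOrigin` and `sumByEntity` helpers, and the host-to-entity table are all assumptions made for the example.

```javascript
// Hypothetical per-script rows, e.g. flattened from Lighthouse reports.
const sampleRows = [
  { url: "https://www.google-analytics.com/analytics.js", executionMs: 90 },
  { url: "https://www.google-analytics.com/analytics.js", executionMs: 60 },
  { url: "https://connect.facebook.net/en_US/sdk.js", executionMs: 120 },
  { url: "https://cdn.unknown.example/widget.js", executionMs: 40 },
];

// Hypothetical host -> entity table standing in for data/entities.json.
const sampleEntityByHost = new Map([
  ["www.google-analytics.com", "Google Analytics"],
  ["connect.facebook.net", "Facebook"],
]);

// Step 1: roll script-level execution time up to the origin level.
function sumByOrigin(rows) {
  const byOrigin = new Map();
  for (const { url, executionMs } of rows) {
    const origin = new URL(url).origin;
    const entry = byOrigin.get(origin) || { occurrences: 0, totalMs: 0 };
    entry.occurrences += 1;
    entry.totalMs += executionMs;
    byOrigin.set(origin, entry);
  }
  return byOrigin;
}

// Step 2: assign each origin to an entity and sum again.
function sumByEntity(byOrigin, entityByHost) {
  const byEntity = new Map();
  for (const [origin, stats] of byOrigin) {
    const name = entityByHost.get(new URL(origin).hostname) || "Unattributed";
    const entry = byEntity.get(name) || { occurrences: 0, totalMs: 0 };
    entry.occurrences += stats.occurrences;
    entry.totalMs += stats.totalMs;
    byEntity.set(name, entry);
  }
  return byEntity;
}

const entityTotals = sumByEntity(sumByOrigin(sampleRows), sampleEntityByHost);
console.log(entityTotals);
```

Origins with no known entity fall into a single "Unattributed" bucket here, which is one possible way to model the unassigned remainder.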
NPM Module
The entity classification data is available as an NPM module.
const { getEntity } = require("third-party-web");
const entity = getEntity("https://d36mpcpuzc4ztk.cloudfront.net/js/visitor.js");
console.log(entity);
Updates
2019-02-01 dataset
Huge props to WordAds for reducing their impact from ~2.5s to ~200ms on average! A few entities are showing considerably less data this cycle (Media Math, Crazy Egg, DoubleVerify, Bootstrap CDN). Perhaps they've added new CDNs/hostnames that we haven't identified, or the basket of sites in HTTP Archive has shifted away from their usage.
Data
Summary
Across the top ~1 million sites, ~800 origins account for ~65% of all script execution time, with the top 100 entities already accounting for ~59%. Third party scripts make up the majority of script execution on the web today, and it's important to make informed choices.
How to Interpret
Each entity has a number of data points available.
- Usage (Total Number of Occurrences) - how many scripts from their origins were included on pages
- Total Impact (Total Execution Time) - how many seconds were spent executing their scripts across the web
- Average Impact (Average Execution Time) - on average, how many milliseconds were spent executing each script
- Category - what type of script is this
Third Parties by Category
This section breaks down third parties by category. The third parties in each category are ranked from first to last based on the average impact of their scripts. Perhaps the most important comparisons lie here. You always need to pick an analytics provider, but at least you can pick the most well-behaved analytics provider.
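As a rough illustration of the ranking used in the category tables, the sketch below sorts a few rows copied from the Analytics table by average impact, lowest first. The `rankByAverageImpact` helper and the row shape are hypothetical and are not part of the project's code.

```javascript
// Three rows copied from the Analytics table, deliberately out of order.
const analyticsSample = [
  { name: "Lucky Orange", usage: 6113, averageMs: 834 },
  { name: "Alexa", usage: 1265, averageMs: 50 },
  { name: "Hotjar", usage: 91036, averageMs: 92 },
];

// Rank entities from lowest to highest average impact (rank 1 = lightest).
function rankByAverageImpact(entities) {
  return [...entities]
    .sort((a, b) => a.averageMs - b.averageMs)
    .map((entity, index) => ({ rank: index + 1, ...entity }));
}

for (const row of rankByAverageImpact(analyticsSample)) {
  console.log(`${row.rank} | ${row.name} | ${row.usage} | ${row.averageMs} ms`);
}
```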
Overall Breakdown
Unsurprisingly, ads account for the largest identifiable chunk of third party script execution. The "Mixed / Other" category balloons primarily due to Google Tag Manager, which is used to deliver scripts in multiple categories. Google Tag Manager script execution alone is responsible for more than half of the "Mixed / Other" category.
Ads
These scripts are part of advertising networks, either serving or measuring.
Rank | Name | Usage | Average Impact |
--- | --- | --- | --- |
1 | Media Math | 662 | 68 ms |
2 | Adroll | 3,198 | 94 ms |
3 | Amazon Ads | 22,090 | 94 ms |
4 | Scorecard Research | 3,578 | 103 ms |
5 | Rubicon Project | 3,905 | 106 ms |
6 | MGID | 10,317 | 114 ms |
7 | Criteo | 64,547 | 116 ms |
8 | Market GID | 3,873 | 153 ms |
9 | Taboola | 23,853 | 176 ms |
10 | WordAds | 32,295 | 212 ms |
11 | Google/Doubleclick Ads | 1,206,843 | 215 ms |
12 | Pubmatic | 3,140 | 225 ms |
13 | Yahoo Ads | 9,578 | 225 ms |
14 | AppNexus | 14,694 | 265 ms |
15 | Yandex Ads | 39,330 | 272 ms |
16 | Integral Ads | 24,532 | 305 ms |
17 | Sizmek | 4,011 | 374 ms |
18 | DoubleVerify | 1,988 | 600 ms |
19 | MediaVine | 9,801 | 706 ms |
20 | Moat | 14,337 | 708 ms |
21 | OpenX | 10,729 | 836 ms |
22 | 33 Across | 20,137 | 863 ms |
23 | Popads | 5,009 | 1288 ms |
Analytics
These scripts measure or track users and their actions. There's a wide range in impact here depending on what's being tracked.
Rank | Name | Usage | Average Impact |
--- | --- | --- | --- |
1 | Alexa | 1,265 | 50 ms |
2 | Google Analytics | 1,163,249 | 77 ms |
3 | Mixpanel | 5,462 | 77 ms |
4 | Snowplow | 2,492 | 77 ms |
5 | Baidu Analytics | 7,041 | 78 ms |
6 | Crazy Egg | 455 | 89 ms |
7 | Hotjar | 91,036 | 92 ms |
8 | Adobe Analytics | 32,173 | 183 ms |
9 | Segment | 6,998 | 201 ms |
10 | Tealium | 14,422 | 207 ms |
11 | Optimizely | 13,482 | 232 ms |
12 | Salesforce | 40,868 | 270 ms |
13 | Yandex Metrica | 221,577 | 356 ms |
14 | Histats | 14,706 | 390 ms |
15 | Lucky Orange | 6,113 | 834 ms |
Social
These scripts enable social features.
Rank | Name | Usage | Average Impact |
--- | --- | --- | --- |
1 | VK | 6,342 | 65 ms |
2 | Pinterest | 14,331 | 87 ms |
3 | Facebook | 1,107,461 | 116 ms |
4 | Yandex Share | 29,555 | 128 ms |
5 | LinkedIn | 12,260 | 130 ms |
6 | Twitter | 274,753 | 146 ms |
7 | ShareThis | 32,318 | 229 ms |
8 | Shareaholic | 13,268 | 236 ms |
9 | AddThis | 170,036 | 245 ms |
10 | Tumblr | 40,855 | 312 ms |
11 | Disqus | 741 | 504 ms |
12 | PIXNET | 54,969 | 605 ms |
Video
These scripts enable video player and streaming functionality.
Developer Utilities
These scripts are developer utilities (API clients, site monitoring, fraud detection, etc).
Rank | Name | Usage | Average Impact |
--- | --- | --- | --- |
1 | New Relic | 2,334 | 54 ms |
2 | Stripe | 4,751 | 70 ms |
3 | OneSignal | 37,165 | 83 ms |
4 | Google APIs/SDK | 829,509 | 114 ms |
5 | App Dynamics | 1,929 | 124 ms |
6 | Cloudflare | 5,190 | 191 ms |
7 | PayPal | 6,467 | 229 ms |
8 | Yandex APIs | 57,870 | 362 ms |
9 | Distil Networks | 11,313 | 376 ms |
10 | Sentry | 15,981 | 686 ms |
Hosting Platforms
These scripts are from web hosting platforms (WordPress, Wix, Squarespace, etc). Note that in this category, these scripts can sometimes be the entirety of the script on the page, so the "impact" rank might be misleading. In the case of WordPress, this only covers the libraries hosted and served by WordPress, not all sites using self-hosted WordPress.
Marketing
These scripts are from marketing tools that add popups/newsletters/etc.
Customer Success
These scripts are from customer support/marketing providers that offer chat and contact solutions. These scripts are generally heavier in weight.
Content & Publishing
These scripts are from content providers or publishing-specific affiliate tracking.
Libraries
These are mostly open source libraries (e.g. jQuery) served over different public CDNs. This category is unique in that the origin may have no responsibility for the performance of what's being served. Note that rank here does not imply one CDN is better than another. It simply indicates that the libraries being served from that origin are lighter or heavier than the ones served by another.
Mixed / Other
These are miscellaneous scripts delivered via a shared origin with no precise category or attribution. Help us out by identifying more origins!
Third Parties by Total Impact
This section highlights the entities responsible for the most script execution across the web. This helps inform which improvements would have the largest total impact.
Name | Popularity | Total Impact | Average Impact |
--- | --- | --- | --- |
Google Tag Manager | 1,098,396 | 473,333 s | 431 ms |
All Other 3rd Parties | 1,344,782 | 274,947 s | 204 ms |
Google/Doubleclick Ads | 1,206,843 | 259,963 s | 215 ms |
Wix | 192,121 | 199,834 s | 1040 ms |
Google CDN | 744,534 | 131,849 s | 177 ms |
Facebook | 1,107,461 | 128,923 s | 116 ms |
Google APIs/SDK | 829,509 | 94,149 s | 114 ms |
Google Analytics | 1,163,249 | 89,009 s | 77 ms |
Yandex Metrica | 221,577 | 78,814 s | 356 ms |
Squarespace | 87,878 | 43,179 s | 491 ms |
AddThis | 170,036 | 41,730 s | 245 ms |
Twitter | 274,753 | 40,120 s | 146 ms |
Shopify | 220,676 | 34,854 s | 158 ms |
PIXNET | 54,969 | 33,257 s | 605 ms |
Zopim | 53,503 | 32,501 s | 607 ms |
Hatena Blog | 51,333 | 24,848 s | 484 ms |
jQuery CDN | 142,889 | 24,222 s | 170 ms |
Yandex APIs | 57,870 | 20,926 s | 362 ms |
Cloudflare CDN | 101,203 | 19,548 s | 193 ms |
33 Across | 20,137 | 17,375 s | 863 ms |
WordPress | 126,052 | 15,390 s | 122 ms |
Tawk.to | 40,598 | 14,007 s | 345 ms |
ZenDesk | 32,852 | 13,839 s | 421 ms |
Sumo | 35,677 | 13,749 s | 385 ms |
Tumblr | 40,855 | 12,755 s | 312 ms |
AMP | 61,086 | 12,136 s | 199 ms |
Salesforce | 40,868 | 11,025 s | 270 ms |
Sentry | 15,981 | 10,966 s | 686 ms |
Yandex Ads | 39,330 | 10,689 s | 272 ms |
Moat | 14,337 | 10,154 s | 708 ms |
OpenX | 10,729 | 8,974 s | 836 ms |
Beeketing | 61,179 | 8,473 s | 138 ms |
Hotjar | 91,036 | 8,395 s | 92 ms |
Weebly | 35,097 | 8,062 s | 230 ms |
Criteo | 64,547 | 7,496 s | 116 ms |
Integral Ads | 24,532 | 7,477 s | 305 ms |
ShareThis | 32,318 | 7,405 s | 229 ms |
JSDelivr CDN | 24,627 | 7,007 s | 285 ms |
MediaVine | 9,801 | 6,915 s | 706 ms |
WordAds | 32,295 | 6,844 s | 212 ms |
Popads | 5,009 | 6,451 s | 1288 ms |
Adobe Analytics | 32,173 | 5,885 s | 183 ms |
Histats | 14,706 | 5,739 s | 390 ms |
Intercom | 16,809 | 5,614 s | 334 ms |
CreateJS CDN | 1,757 | 5,370 s | 3056 ms |
Wistia | 20,633 | 5,294 s | 257 ms |
Lucky Orange | 6,113 | 5,098 s | 834 ms |
Jivochat | 23,628 | 5,084 s | 215 ms |
Amazon S3 | 32,205 | 5,008 s | 156 ms |
Distil Networks | 11,313 | 4,254 s | 376 ms |
Taboola | 23,853 | 4,190 s | 176 ms |
Olark | 12,258 | 3,902 s | 318 ms |
AppNexus | 14,694 | 3,888 s | 265 ms |
Yandex Share | 29,555 | 3,772 s | 128 ms |
Mailchimp | 22,992 | 3,357 s | 146 ms |
Shareaholic | 13,268 | 3,135 s | 236 ms |
Optimizely | 13,482 | 3,135 s | 232 ms |
OneSignal | 37,165 | 3,075 s | 83 ms |
Tealium | 14,422 | 2,990 s | 207 ms |
YouTube | 22,093 | 2,370 s | 107 ms |
Brightcove | 4,933 | 2,173 s | 441 ms |
Yahoo Ads | 9,578 | 2,158 s | 225 ms |
Dealer | 23,885 | 2,158 s | 90 ms |
Parking Crew | 4,542 | 2,093 s | 461 ms |
Amazon Ads | 22,090 | 2,079 s | 94 ms |
LiveChat | 20,433 | 1,786 s | 87 ms |
FontAwesome CDN | 15,661 | 1,599 s | 102 ms |
LinkedIn | 12,260 | 1,594 s | 130 ms |
Sizmek | 4,011 | 1,501 s | 374 ms |
PayPal | 6,467 | 1,478 s | 229 ms |
Segment | 6,998 | 1,406 s | 201 ms |
Hubspot | 14,148 | 1,287 s | 91 ms |
Pinterest | 14,331 | 1,245 s | 87 ms |
DoubleVerify | 1,988 | 1,193 s | 600 ms |
MGID | 10,317 | 1,174 s | 114 ms |
Albacross | 1,382 | 1,004 s | 727 ms |
Cloudflare | 5,190 | 989 s | 191 ms |
Blogger | 17,943 | 839 s | 47 ms |
Pubmatic | 3,140 | 707 s | 225 ms |
Hotmart | 854 | 670 s | 785 ms |
Market GID | 3,873 | 592 s | 153 ms |
Adobe TypeKit | 4,519 | 590 s | 131 ms |
Drift | 4,073 | 575 s | 141 ms |
Baidu Analytics | 7,041 | 550 s | 78 ms |
Mixpanel | 5,462 | 420 s | 77 ms |
VK | 6,342 | 414 s | 65 ms |
Rubicon Project | 3,905 | 413 s | 106 ms |
Disqus | 741 | 374 s | 504 ms |
Scorecard Research | 3,578 | 369 s | 103 ms |
Stripe | 4,751 | 334 s | 70 ms |
Vox Media | 704 | 321 s | 456 ms |
Adroll | 3,198 | 301 s | 94 ms |
Yandex CDN | 2,020 | 249 s | 123 ms |
App Dynamics | 1,929 | 240 s | 124 ms |
Snowplow | 2,492 | 193 s | 77 ms |
RD Station | 2,517 | 176 s | 70 ms |
OptinMonster | 1,129 | 149 s | 132 ms |
Freshdesk | 909 | 127 s | 140 ms |
New Relic | 2,334 | 126 s | 54 ms |
Listrak | 963 | 123 s | 128 ms |
Help Scout | 627 | 103 s | 164 ms |
Bootstrap CDN | 1,383 | 67 s | 48 ms |
Alexa | 1,265 | 63 s | 50 ms |
Media Math | 662 | 45 s | 68 ms |
Crazy Egg | 455 | 41 s | 89 ms |
Future Work
- Introduce URL-level data for more fine-grained analysis, i.e. which libraries from Cloudflare/Google CDNs are most expensive.
- Expand the scope, i.e. include more third parties and have greater entity/category coverage.
FAQs
I don't see entity X in the list. What's up with that?
This can be for one of several reasons:
- The entity does not have at least 100 references to their origin in the dataset.
- The entity's origins have not yet been identified. See How can I contribute?
How is the "Average Impact" determined?
The HTTP Archive dataset includes Lighthouse reports for each URL on mobile. Lighthouse has an audit called "bootup-time" that summarizes the amount of time that each script spent on the main thread. The "Average Impact" for an entity is the total execution time of scripts whose domain matches one of the entity's domains divided by the total number of occurrences of those scripts.
Average Impact = Total Execution Time / Total Occurrences
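As a worked example of this formula, the sketch below plugs in the Google Analytics row from the "Third Parties by Total Impact" table. The `averageImpactMs` helper is a hypothetical name used only for illustration, and the inputs are the rounded figures shown in the table.

```javascript
// Average Impact (ms) = Total Execution Time (s) * 1000 / Total Occurrences
function averageImpactMs(totalExecutionSeconds, totalOccurrences) {
  return (totalExecutionSeconds * 1000) / totalOccurrences;
}

// Google Analytics row: 1,163,249 occurrences, 89,009 s total impact.
const googleAnalyticsMs = averageImpactMs(89009, 1163249);
console.log(`${Math.round(googleAnalyticsMs)} ms`); // prints "77 ms", matching the table
```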
How does Lighthouse determine the execution time of each script?
Lighthouse's bootup time audit attempts to attribute all top-level main-thread tasks to a URL. A main thread task is attributed to the first script URL found in the stack. If you're interested in helping us improve this logic, see Contributing for details.
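The "first script URL found in the stack" rule can be sketched as follows. This is not Lighthouse's actual implementation: the task shape, the ordering of `stack`, the `attributeTask` helper, and the "Unattributable" fallback are all assumptions made for the example.

```javascript
// Hypothetical task shape: `stack` lists the URL of each frame in the order
// the stack is searched; frames without a script URL are empty strings.
function attributeTask(task) {
  const scriptUrl = task.stack.find((url) => /^https?:\/\//.test(url));
  return scriptUrl || "Unattributable";
}

const sampleTask = {
  durationMs: 35,
  stack: ["", "https://a.example/loader.js", "https://b.example/widget.js"],
};

// All 35 ms go to the first script URL, even though a second script
// also appears further along the stack.
console.log(attributeTask(sampleTask));
```

One consequence of this rule is that a script that loads or calls another party's code may be charged for work done by that other party.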
The data for entity X seems wrong. How can it be corrected?
Verify that the origins in data/entities.json are correct. Most issues will simply be the result of mislabelling of shared origins. If everything checks out, there is likely no further action to take and the data is valid. If you still believe there are errors, file an issue to discuss further.
How can I contribute?
Only about 90% of the third party script execution has been assigned to an entity. We could use your help identifying the rest! See Contributing for details.
Contributing
Updating the Entities
The domain->entity mapping can be found in data/entities.json. Adding a new entity is as simple as adding a new array item with the following form.
{
"name": "Facebook",
"homepage": "https://www.facebook.com",
"categories": ["social"],
"domains": [
"www.facebook.com",
"connect.facebook.net",
"staticxx.facebook.com",
"static.xx.fbcdn.net",
"m.facebook.com"
]
}
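To show how an entry of this form can be used, the sketch below looks up the entity for a script URL by exact hostname match. The `findEntityByUrl` helper is hypothetical; the published module's `getEntity` may match domains differently (for example, by handling subdomains), so treat this as an illustration of the data shape only.

```javascript
// A one-item array in the same form as data/entities.json.
const sampleEntities = [
  {
    name: "Facebook",
    homepage: "https://www.facebook.com",
    categories: ["social"],
    domains: [
      "www.facebook.com",
      "connect.facebook.net",
      "staticxx.facebook.com",
      "static.xx.fbcdn.net",
      "m.facebook.com",
    ],
  },
];

// Return the first entity whose domain list contains the URL's hostname,
// or undefined when the hostname has not been identified yet.
function findEntityByUrl(entities, url) {
  const host = new URL(url).hostname;
  return entities.find((entity) => entity.domains.includes(host));
}

const match = findEntityByUrl(sampleEntities, "https://connect.facebook.net/en_US/sdk.js");
console.log(match ? match.name : "unknown entity");
```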
Updating Attribution Logic
The logic for attribution to individual script URLs can be found in the Lighthouse repo. File an issue over there to discuss further.
Updating the Data
The query used to compute the origin-level data is in sql/origin-execution-time-query.sql. Running this against the latest Lighthouse HTTP Archive should give you a JSON export of the latest data that can be checked in at data/YYYY-MM-DD-origin-scripting.json.
Updating this README
This README is auto-generated from the templates in lib/ and the computed data. In order to update the charts, you'll need to make sure you have cairo installed locally in addition to running yarn install.
brew install pkg-config cairo pango libpng jpeg giflib