third-party-web

Categorized data on third party entities on the web.

Version: 0.1.1
Weekly downloads: 941K

Third Party Web

Data on third party entities and their impact on the web.

This document is a summary of which third party scripts are most responsible for excessive JavaScript execution on the web today.

Table of Contents

  1. Goals
  2. Methodology
  3. NPM Module
  4. Updates
  5. Data
    1. Summary
    2. How to Interpret
    3. Third Parties by Category
      1. Ads
      2. Analytics
      3. Social
      4. Video
      5. Developer Utilities
      6. Hosting Platforms
      7. Marketing
      8. Customer Success
      9. Content & Publishing
      10. Libraries
      11. Mixed / Other
    4. Third Parties by Total Impact
  6. Future Work
  7. FAQs
  8. Contributing

Goals

  1. Quantify the impact of third party scripts on the web.
  2. Identify the third party scripts on the web that have the greatest performance cost.
  3. Give developers the information they need to make informed decisions about which third parties to include on their sites.
  4. Incentivize responsible third party script behavior.
  5. Make this information accessible and useful.

Methodology

HTTP Archive is an initiative that tracks how the web is built. Twice a month, ~4 million sites are crawled with Lighthouse on mobile. Lighthouse breaks down the total script execution time of each page and attributes the execution to a URL. Using BigQuery, this project aggregates the script execution to the origin level and assigns each origin to the responsible entity.
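
The two aggregation steps can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not the project's BigQuery query: the sample URLs, the timings, the `executionTimeMs` field, the "Unidentified" bucket, and both function names are assumptions made up for this example, and the origin is approximated by the URL's hostname.

```javascript
// Hypothetical per-script execution times (values invented for illustration).
const scriptTimings = [
  { url: "https://www.google-analytics.com/analytics.js", executionTimeMs: 80 },
  { url: "https://www.google-analytics.com/plugins/ua/ec.js", executionTimeMs: 20 },
  { url: "https://connect.facebook.net/en_US/sdk.js", executionTimeMs: 120 },
  { url: "https://cdn.example.com/widget.js", executionTimeMs: 40 },
];

// Hypothetical hostname -> entity name mapping (a stand-in for data/entities.json).
const entityByHostname = new Map([
  ["www.google-analytics.com", "Google Analytics"],
  ["connect.facebook.net", "Facebook"],
]);

// Add one observation to a running { occurrences, totalMs } entry in a Map.
function accumulate(totals, key, occurrences, totalMs) {
  const entry = totals.get(key) || { occurrences: 0, totalMs: 0 };
  entry.occurrences += occurrences;
  entry.totalMs += totalMs;
  totals.set(key, entry);
}

// Step 1: roll script-level execution time up to the hostname that served it.
function aggregateByHostname(timings) {
  const totals = new Map();
  for (const { url, executionTimeMs } of timings) {
    accumulate(totals, new URL(url).hostname, 1, executionTimeMs);
  }
  return totals;
}

// Step 2: assign each hostname to an entity and sum again.
function aggregateByEntity(hostnameTotals, mapping) {
  const totals = new Map();
  for (const [hostname, { occurrences, totalMs }] of hostnameTotals) {
    accumulate(totals, mapping.get(hostname) || "Unidentified", occurrences, totalMs);
  }
  return totals;
}

const entityTotals = aggregateByEntity(aggregateByHostname(scriptTimings), entityByHostname);
console.log(Object.fromEntries(entityTotals));
```

Hostnames missing from the mapping land in a catch-all bucket here, which loosely mirrors the "All Other 3rd Parties" row in the data below.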

NPM Module

The entity classification data is available as an NPM module.

const { getEntity } = require("third-party-web");
const entity = getEntity("https://d36mpcpuzc4ztk.cloudfront.net/js/visitor.js");
console.log(entity);
//   {
//     "name": "Freshdesk",
//     "homepage": "https://freshdesk.com/",
//     "categories": ["customer-success"],
//     "domains": ["d36mpcpuzc4ztk.cloudfront.net"]
//   }

Updates

2019-02-01 dataset

Huge props to WordAds for reducing their impact from ~2.5s to ~200ms on average! A few entities are showing considerably less data this cycle (Media Math, Crazy Egg, DoubleVerify, Bootstrap CDN). Perhaps they've added new CDNs/hostnames that we haven't identified, or the basket of sites in HTTP Archive has shifted away from their usage.

Data

Summary

Across the top ~1 million sites, ~800 origins account for ~65% of all script execution time, with the top 100 entities already accounting for ~59%. Third party scripts account for the majority of script execution on the web today, so it's important to make informed choices.

How to Interpret

Each entity has a number of data points available.

  1. Usage (Total Number of Occurrences) - how many scripts from their origins were included on pages
  2. Total Impact (Total Execution Time) - how many seconds were spent executing their scripts across the web
  3. Average Impact (Average Execution Time) - on average, how many milliseconds were spent executing each script
  4. Category - what type of script is this

Third Parties by Category

This section breaks down third parties by category. The third parties in each category are ranked from first to last based on the average impact of their scripts. Perhaps the most important comparisons lie here. You always need to pick an analytics provider, but at least you can pick the most well-behaved analytics provider.
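
As a rough sketch of how such a ranking can be produced, the snippet below sorts one category by average impact, lowest first. The rows are copied from the Video table in this document; the field names and the `rankByAverageImpact` helper are assumptions made for this example, not part of the project's tooling.

```javascript
// Rows from the Video category, deliberately out of order.
const video = [
  { name: "Wistia", usage: 20633, averageImpactMs: 257 },
  { name: "Brightcove", usage: 4933, averageImpactMs: 441 },
  { name: "YouTube", usage: 22093, averageImpactMs: 107 },
];

// Sort a copy by average impact (ascending) and attach a 1-based rank.
function rankByAverageImpact(entities) {
  return [...entities]
    .sort((a, b) => a.averageImpactMs - b.averageImpactMs)
    .map((entity, index) => ({ rank: index + 1, ...entity }));
}

const rankedVideo = rankByAverageImpact(video);
for (const { rank, name, averageImpactMs } of rankedVideo) {
  console.log(`${rank}. ${name}: ${averageImpactMs} ms`);
}
```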

Overall Breakdown

Unsurprisingly, ads account for the largest identifiable chunk of third party script execution. The "Mixed / Other" category balloons primarily because of Google Tag Manager, which is used to deliver scripts in multiple categories. Google Tag Manager script execution alone is responsible for more than half of that category.

[Figure: breakdown of third party script execution by category]

Ads

These scripts are part of advertising networks, either serving or measuring.

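| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Media Math | 662 | 68 ms |
| 2 | Adroll | 3,198 | 94 ms |
| 3 | Amazon Ads | 22,090 | 94 ms |
| 4 | Scorecard Research | 3,578 | 103 ms |
| 5 | Rubicon Project | 3,905 | 106 ms |
| 6 | MGID | 10,317 | 114 ms |
| 7 | Criteo | 64,547 | 116 ms |
| 8 | Market GID | 3,873 | 153 ms |
| 9 | Taboola | 23,853 | 176 ms |
| 10 | WordAds | 32,295 | 212 ms |
| 11 | Google/Doubleclick Ads | 1,206,843 | 215 ms |
| 12 | Pubmatic | 3,140 | 225 ms |
| 13 | Yahoo Ads | 9,578 | 225 ms |
| 14 | AppNexus | 14,694 | 265 ms |
| 15 | Yandex Ads | 39,330 | 272 ms |
| 16 | Integral Ads | 24,532 | 305 ms |
| 17 | Sizmek | 4,011 | 374 ms |
| 18 | DoubleVerify | 1,988 | 600 ms |
| 19 | MediaVine | 9,801 | 706 ms |
| 20 | Moat | 14,337 | 708 ms |
| 21 | OpenX | 10,729 | 836 ms |
| 22 | 33 Across | 20,137 | 863 ms |
| 23 | Popads | 5,009 | 1288 ms |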

Analytics

These scripts measure or track users and their actions. There's a wide range in impact here depending on what's being tracked.

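| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Alexa | 1,265 | 50 ms |
| 2 | Google Analytics | 1,163,249 | 77 ms |
| 3 | Mixpanel | 5,462 | 77 ms |
| 4 | Snowplow | 2,492 | 77 ms |
| 5 | Baidu Analytics | 7,041 | 78 ms |
| 6 | Crazy Egg | 455 | 89 ms |
| 7 | Hotjar | 91,036 | 92 ms |
| 8 | Adobe Analytics | 32,173 | 183 ms |
| 9 | Segment | 6,998 | 201 ms |
| 10 | Tealium | 14,422 | 207 ms |
| 11 | Optimizely | 13,482 | 232 ms |
| 12 | Salesforce | 40,868 | 270 ms |
| 13 | Yandex Metrica | 221,577 | 356 ms |
| 14 | Histats | 14,706 | 390 ms |
| 15 | Lucky Orange | 6,113 | 834 ms |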

Social

These scripts enable social features.

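| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | VK | 6,342 | 65 ms |
| 2 | Pinterest | 14,331 | 87 ms |
| 3 | Facebook | 1,107,461 | 116 ms |
| 4 | Yandex Share | 29,555 | 128 ms |
| 5 | LinkedIn | 12,260 | 130 ms |
| 6 | Twitter | 274,753 | 146 ms |
| 7 | ShareThis | 32,318 | 229 ms |
| 8 | Shareaholic | 13,268 | 236 ms |
| 9 | AddThis | 170,036 | 245 ms |
| 10 | Tumblr | 40,855 | 312 ms |
| 11 | Disqus | 741 | 504 ms |
| 12 | PIXNET | 54,969 | 605 ms |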

Video

These scripts enable video player and streaming functionality.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | YouTube | 22,093 | 107 ms |
| 2 | Wistia | 20,633 | 257 ms |
| 3 | Brightcove | 4,933 | 441 ms |

Developer Utilities

These scripts are developer utilities (API clients, site monitoring, fraud detection, etc).

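| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | New Relic | 2,334 | 54 ms |
| 2 | Stripe | 4,751 | 70 ms |
| 3 | OneSignal | 37,165 | 83 ms |
| 4 | Google APIs/SDK | 829,509 | 114 ms |
| 5 | App Dynamics | 1,929 | 124 ms |
| 6 | Cloudflare | 5,190 | 191 ms |
| 7 | PayPal | 6,467 | 229 ms |
| 8 | Yandex APIs | 57,870 | 362 ms |
| 9 | Distil Networks | 11,313 | 376 ms |
| 10 | Sentry | 15,981 | 686 ms |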

Hosting Platforms

These scripts are from web hosting platforms (WordPress, Wix, Squarespace, etc). Note that in this category, these scripts can sometimes be the entirety of the script on the page, so the "impact" rank might be misleading. In the case of WordPress, this only covers the libraries hosted and served by WordPress, not all sites using self-hosted WordPress.

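| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Blogger | 17,943 | 47 ms |
| 2 | Dealer | 23,885 | 90 ms |
| 3 | WordPress | 126,052 | 122 ms |
| 4 | Shopify | 220,676 | 158 ms |
| 5 | Weebly | 35,097 | 230 ms |
| 6 | Hatena Blog | 51,333 | 484 ms |
| 7 | Squarespace | 87,878 | 491 ms |
| 8 | Wix | 192,121 | 1040 ms |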

Marketing

These scripts are from marketing tools that add popups/newsletters/etc.

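| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | RD Station | 2,517 | 70 ms |
| 2 | Hubspot | 14,148 | 91 ms |
| 3 | Listrak | 963 | 128 ms |
| 4 | OptinMonster | 1,129 | 132 ms |
| 5 | Beeketing | 61,179 | 138 ms |
| 6 | Drift | 4,073 | 141 ms |
| 7 | Mailchimp | 22,992 | 146 ms |
| 8 | Sumo | 35,677 | 385 ms |
| 9 | Albacross | 1,382 | 727 ms |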

Customer Success

These scripts are from customer support/marketing providers that offer chat and contact solutions. These scripts are generally heavier in weight.

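| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | LiveChat | 20,433 | 87 ms |
| 2 | Freshdesk | 909 | 140 ms |
| 3 | Help Scout | 627 | 164 ms |
| 4 | Jivochat | 23,628 | 215 ms |
| 5 | Olark | 12,258 | 318 ms |
| 6 | Intercom | 16,809 | 334 ms |
| 7 | Tawk.to | 40,598 | 345 ms |
| 8 | ZenDesk | 32,852 | 421 ms |
| 9 | Zopim | 53,503 | 607 ms |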

Content & Publishing

These scripts are from content providers or publishing-specific affiliate tracking.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | AMP | 61,086 | 199 ms |
| 2 | Vox Media | 704 | 456 ms |
| 3 | Hotmart | 854 | 785 ms |

Libraries

These are mostly open source libraries (e.g. jQuery) served over different public CDNs. This category is unique in that the origin may have no responsibility for the performance of what's being served. Note that rank here does not imply one CDN is better than another. It simply indicates that the libraries being served from that origin are lighter or heavier than the ones served by another.

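| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Bootstrap CDN | 1,383 | 48 ms |
| 2 | FontAwesome CDN | 15,661 | 102 ms |
| 3 | Yandex CDN | 2,020 | 123 ms |
| 4 | Adobe TypeKit | 4,519 | 131 ms |
| 5 | jQuery CDN | 142,889 | 170 ms |
| 6 | Google CDN | 744,534 | 177 ms |
| 7 | Cloudflare CDN | 101,203 | 193 ms |
| 8 | JSDelivr CDN | 24,627 | 285 ms |
| 9 | CreateJS CDN | 1,757 | 3056 ms |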

Mixed / Other

These are miscellaneous scripts delivered via a shared origin with no precise category or attribution. Help us out by identifying more origins!

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Amazon S3 | 32,205 | 156 ms |
| 2 | All Other 3rd Parties | 1,344,782 | 204 ms |
| 3 | Google Tag Manager | 1,098,396 | 431 ms |
| 4 | Parking Crew | 4,542 | 461 ms |

Third Parties by Total Impact

This section highlights the entities responsible for the most script execution across the web. This helps inform which improvements would have the largest total impact.

| Name | Popularity | Total Impact | Average Impact |
| --- | --- | --- | --- |
| Google Tag Manager | 1,098,396 | 473,333 s | 431 ms |
| All Other 3rd Parties | 1,344,782 | 274,947 s | 204 ms |
| Google/Doubleclick Ads | 1,206,843 | 259,963 s | 215 ms |
| Wix | 192,121 | 199,834 s | 1040 ms |
| Google CDN | 744,534 | 131,849 s | 177 ms |
| Facebook | 1,107,461 | 128,923 s | 116 ms |
| Google APIs/SDK | 829,509 | 94,149 s | 114 ms |
| Google Analytics | 1,163,249 | 89,009 s | 77 ms |
| Yandex Metrica | 221,577 | 78,814 s | 356 ms |
| Squarespace | 87,878 | 43,179 s | 491 ms |
| AddThis | 170,036 | 41,730 s | 245 ms |
| Twitter | 274,753 | 40,120 s | 146 ms |
| Shopify | 220,676 | 34,854 s | 158 ms |
| PIXNET | 54,969 | 33,257 s | 605 ms |
| Zopim | 53,503 | 32,501 s | 607 ms |
| Hatena Blog | 51,333 | 24,848 s | 484 ms |
| jQuery CDN | 142,889 | 24,222 s | 170 ms |
| Yandex APIs | 57,870 | 20,926 s | 362 ms |
| Cloudflare CDN | 101,203 | 19,548 s | 193 ms |
| 33 Across | 20,137 | 17,375 s | 863 ms |
| WordPress | 126,052 | 15,390 s | 122 ms |
| Tawk.to | 40,598 | 14,007 s | 345 ms |
| ZenDesk | 32,852 | 13,839 s | 421 ms |
| Sumo | 35,677 | 13,749 s | 385 ms |
| Tumblr | 40,855 | 12,755 s | 312 ms |
| AMP | 61,086 | 12,136 s | 199 ms |
| Salesforce | 40,868 | 11,025 s | 270 ms |
| Sentry | 15,981 | 10,966 s | 686 ms |
| Yandex Ads | 39,330 | 10,689 s | 272 ms |
| Moat | 14,337 | 10,154 s | 708 ms |
| OpenX | 10,729 | 8,974 s | 836 ms |
| Beeketing | 61,179 | 8,473 s | 138 ms |
| Hotjar | 91,036 | 8,395 s | 92 ms |
| Weebly | 35,097 | 8,062 s | 230 ms |
| Criteo | 64,547 | 7,496 s | 116 ms |
| Integral Ads | 24,532 | 7,477 s | 305 ms |
| ShareThis | 32,318 | 7,405 s | 229 ms |
| JSDelivr CDN | 24,627 | 7,007 s | 285 ms |
| MediaVine | 9,801 | 6,915 s | 706 ms |
| WordAds | 32,295 | 6,844 s | 212 ms |
| Popads | 5,009 | 6,451 s | 1288 ms |
| Adobe Analytics | 32,173 | 5,885 s | 183 ms |
| Histats | 14,706 | 5,739 s | 390 ms |
| Intercom | 16,809 | 5,614 s | 334 ms |
| CreateJS CDN | 1,757 | 5,370 s | 3056 ms |
| Wistia | 20,633 | 5,294 s | 257 ms |
| Lucky Orange | 6,113 | 5,098 s | 834 ms |
| Jivochat | 23,628 | 5,084 s | 215 ms |
| Amazon S3 | 32,205 | 5,008 s | 156 ms |
| Distil Networks | 11,313 | 4,254 s | 376 ms |
| Taboola | 23,853 | 4,190 s | 176 ms |
| Olark | 12,258 | 3,902 s | 318 ms |
| AppNexus | 14,694 | 3,888 s | 265 ms |
| Yandex Share | 29,555 | 3,772 s | 128 ms |
| Mailchimp | 22,992 | 3,357 s | 146 ms |
| Shareaholic | 13,268 | 3,135 s | 236 ms |
| Optimizely | 13,482 | 3,135 s | 232 ms |
| OneSignal | 37,165 | 3,075 s | 83 ms |
| Tealium | 14,422 | 2,990 s | 207 ms |
| YouTube | 22,093 | 2,370 s | 107 ms |
| Brightcove | 4,933 | 2,173 s | 441 ms |
| Yahoo Ads | 9,578 | 2,158 s | 225 ms |
| Dealer | 23,885 | 2,158 s | 90 ms |
| Parking Crew | 4,542 | 2,093 s | 461 ms |
| Amazon Ads | 22,090 | 2,079 s | 94 ms |
| LiveChat | 20,433 | 1,786 s | 87 ms |
| FontAwesome CDN | 15,661 | 1,599 s | 102 ms |
| LinkedIn | 12,260 | 1,594 s | 130 ms |
| Sizmek | 4,011 | 1,501 s | 374 ms |
| PayPal | 6,467 | 1,478 s | 229 ms |
| Segment | 6,998 | 1,406 s | 201 ms |
| Hubspot | 14,148 | 1,287 s | 91 ms |
| Pinterest | 14,331 | 1,245 s | 87 ms |
| DoubleVerify | 1,988 | 1,193 s | 600 ms |
| MGID | 10,317 | 1,174 s | 114 ms |
| Albacross | 1,382 | 1,004 s | 727 ms |
| Cloudflare | 5,190 | 989 s | 191 ms |
| Blogger | 17,943 | 839 s | 47 ms |
| Pubmatic | 3,140 | 707 s | 225 ms |
| Hotmart | 854 | 670 s | 785 ms |
| Market GID | 3,873 | 592 s | 153 ms |
| Adobe TypeKit | 4,519 | 590 s | 131 ms |
| Drift | 4,073 | 575 s | 141 ms |
| Baidu Analytics | 7,041 | 550 s | 78 ms |
| Mixpanel | 5,462 | 420 s | 77 ms |
| VK | 6,342 | 414 s | 65 ms |
| Rubicon Project | 3,905 | 413 s | 106 ms |
| Disqus | 741 | 374 s | 504 ms |
| Scorecard Research | 3,578 | 369 s | 103 ms |
| Stripe | 4,751 | 334 s | 70 ms |
| Vox Media | 704 | 321 s | 456 ms |
| Adroll | 3,198 | 301 s | 94 ms |
| Yandex CDN | 2,020 | 249 s | 123 ms |
| App Dynamics | 1,929 | 240 s | 124 ms |
| Snowplow | 2,492 | 193 s | 77 ms |
| RD Station | 2,517 | 176 s | 70 ms |
| OptinMonster | 1,129 | 149 s | 132 ms |
| Freshdesk | 909 | 127 s | 140 ms |
| New Relic | 2,334 | 126 s | 54 ms |
| Listrak | 963 | 123 s | 128 ms |
| Help Scout | 627 | 103 s | 164 ms |
| Bootstrap CDN | 1,383 | 67 s | 48 ms |
| Alexa | 1,265 | 63 s | 50 ms |
| Media Math | 662 | 45 s | 68 ms |
| Crazy Egg | 455 | 41 s | 89 ms |

Future Work

  1. Introduce URL-level data for more fine-grained analysis, i.e. which libraries from Cloudflare/Google CDNs are most expensive.
  2. Expand the scope, i.e. include more third parties and have greater entity/category coverage.

FAQs

I don't see entity X in the list. What's up with that?

This can be for one of several reasons:

  1. The entity does not have at least 100 references to their origin in the dataset.
  2. The entity's origins have not yet been identified. See How can I contribute?

How is the "Average Impact" determined?

The HTTP Archive dataset includes Lighthouse reports for each URL on mobile. Lighthouse has an audit called "bootup-time" that summarizes the amount of time that each script spent on the main thread. The "Average Impact" for an entity is the total execution time of scripts whose domain matches one of the entity's domains divided by the total number of occurrences of those scripts.

Average Impact = Total Execution Time / Total Occurrences
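
As a quick worked check of the formula, the snippet below recomputes the average impact for Google Tag Manager from the figures in the "Third Parties by Total Impact" table (473,333 s over 1,098,396 occurrences). The `averageImpactMs` helper is a name invented for this example; the only extra step is converting seconds to milliseconds.

```javascript
// Average Impact = Total Execution Time / Total Occurrences,
// with the total given in seconds and the result expressed in milliseconds.
function averageImpactMs(totalExecutionSeconds, totalOccurrences) {
  return (totalExecutionSeconds * 1000) / totalOccurrences;
}

// Google Tag Manager figures from the table in this document.
const gtmAverageMs = averageImpactMs(473333, 1098396);
console.log(Math.round(gtmAverageMs)); // 431, matching the table
```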

How does Lighthouse determine the execution time of each script?

Lighthouse's bootup-time audit attempts to attribute all top-level main-thread tasks to a URL. A main-thread task is attributed to the first script URL found in the stack. If you're interested in helping us improve this logic, see Contributing for details.
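
The rule "attribute the task to the first script URL found in the stack" can be sketched as follows. This is not Lighthouse's code: the task shape (`durationMs`, `stack`, `url`), the "(unattributed)" bucket, and the `attributeTasks` name are all assumptions for illustration, and the stack is simply scanned in array order.

```javascript
// Hypothetical main-thread tasks; frames without a script URL have an empty url.
const tasks = [
  {
    durationMs: 50,
    stack: [
      { url: "" },
      { url: "https://a.example/app.js" },
      { url: "https://b.example/lib.js" },
    ],
  },
  { durationMs: 30, stack: [{ url: "https://b.example/lib.js" }] },
  { durationMs: 10, stack: [] },
];

// Sum task durations per URL, using the first frame that has a script URL.
function attributeTasks(taskList) {
  const timeByUrl = new Map();
  for (const task of taskList) {
    const frame = task.stack.find((f) => Boolean(f.url));
    const url = frame ? frame.url : "(unattributed)";
    timeByUrl.set(url, (timeByUrl.get(url) || 0) + task.durationMs);
  }
  return timeByUrl;
}

const timeByUrl = attributeTasks(tasks);
console.log(Object.fromEntries(timeByUrl));
```

Under this rule the whole 50 ms of the first task goes to app.js even though lib.js also appears in its stack, which is one reason a script's reported time can differ from the work it actually triggered.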

The data for entity X seems wrong. How can it be corrected?

Verify that the origins in data/entities.json are correct. Most issues will simply be the result of mislabelling of shared origins. If everything checks out, there is likely no further action to take and the data is valid. If you still believe there are errors, file an issue to discuss further.

How can I contribute?

Only about 90% of the third party script execution has been assigned to an entity. We could use your help identifying the rest! See Contributing for details.

Contributing

Updating the Entities

The domain->entity mapping can be found in data/entities.json. Adding a new entity is as simple as adding a new array item with the following form.

{
    "name": "Facebook",
    "homepage": "https://www.facebook.com",
    "categories": ["social"],
    "domains": [
        "www.facebook.com",
        "connect.facebook.net",
        "staticxx.facebook.com",
        "static.xx.fbcdn.net",
        "m.facebook.com"
    ]
}
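
To show how entries of this form can be used, here is a minimal sketch that builds a domain lookup from an entities array and resolves a script URL against it. It is not the implementation behind the package's `getEntity`; the `buildDomainIndex` and `lookupEntity` names are assumptions for this example, and only exact hostname matches are handled.

```javascript
// One entry in the same shape as data/entities.json (copied from the example above).
const entities = [
  {
    name: "Facebook",
    homepage: "https://www.facebook.com",
    categories: ["social"],
    domains: [
      "www.facebook.com",
      "connect.facebook.net",
      "staticxx.facebook.com",
      "static.xx.fbcdn.net",
      "m.facebook.com",
    ],
  },
];

// Index every listed domain so a hostname resolves to its entity in one lookup.
function buildDomainIndex(entityList) {
  const index = new Map();
  for (const entity of entityList) {
    for (const domain of entity.domains) {
      index.set(domain, entity);
    }
  }
  return index;
}

// Resolve a script URL to an entity, or undefined if the hostname is unknown.
function lookupEntity(index, url) {
  return index.get(new URL(url).hostname);
}

const domainIndex = buildDomainIndex(entities);
const match = lookupEntity(domainIndex, "https://connect.facebook.net/en_US/sdk.js");
console.log(match ? match.name : "unknown");
```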

Updating Attribution Logic

The logic for attribution to individual script URLs can be found in the Lighthouse repo. File an issue over there to discuss further.

Updating the Data

The query used to compute the origin-level data is in sql/origin-execution-time-query.sql. Running it against the latest Lighthouse HTTP Archive should give you a JSON export of the latest data that can be checked in at data/YYYY-MM-DD-origin-scripting.json.

Updating this README

This README is auto-generated from the templates in lib/ and the computed data. To update the charts, you'll need cairo installed locally in addition to running yarn install.

# Install `cairo` and dependencies for node-canvas
brew install pkg-config cairo pango libpng jpeg giflib

Package last updated on 14 Mar 2019
