
Engineering

Introducing Dependency Divergence GitHub Action

Socket discusses the results of using different package managers to install your packages and introduces a GitHub action to expose those differences.

Bradley Meck Farias

October 25, 2023


JavaScript has a plethora of package managers these days, and to some extent it always has, ever since package management became normal for front-end development. With this comes competition over install time, disk usage, memory usage, convenience, etc.

A newer category of tool, the pull request (PR) bot, is starting to see the same level of proliferation. Dependabot, Renovate, and similar tools currently rule this area by making surgical PRs that modify as little as possible, generally through lockfile modification.

Both kinds of tools are starting to surface a curious effect we call "dependency divergence." This effect can cause massive amounts of churn and confusion, and can be explained as follows:

Dependency divergence is when 2 different dependencies specifying the same constraint differ in what they install.

When installing dependencies, these tools may choose to update versions automatically, may keep a minimal matching version, may refer to a tool-specific lockfile, etc. The choice is theirs to make, and all could be valid given a constraint that spans multiple versions.

Dependency divergence gets more interesting because these tools often don't support the same overall features, package deduplication, or even the same lockfile format. The lack of interoperability gives each tool a reason to be chosen over another, but it can lead to lockfiles from one package manager not being respected by another, with no easy way to convert between the two.

Divergence Emerges

Dependency divergence is a nightmare for analysis that we at Socket have known about for some time. We have seen plenty of organizations with hundreds of versions of `babel` installed across their repositories. The question then came to us as to why there were so many versions. We saw the cause coming mostly from different forms of lockfiles being modified in a way that was not synchronized.

For illustration purposes we will consider a set of 2 services in separate repositories, "Web Scraper" and "Emailer", that both have a dependency on "typescript@5.x.x". This would be satisfied by any version of TypeScript with the major version matching `5`. The package managers and PR bots discussed above can choose any of those matching versions to satisfy the constraint.

Tools generally pin dependencies using lockfiles so that the resolution of a constraint like "5.x.x" is persisted across installs. A tool may choose any version that satisfies the constraint and may pin the dependency to a version like "5.0.1" or "5.1.0". During a clean install, tools then do not solve the constraint "5.x.x" again and instead install only the specific version recorded in the lockfile.
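To make this concrete, here is a minimal sketch, assuming a deliberately simplified matcher that only understands `x` wildcards (real package managers implement the full semver range grammar). The names `satisfies`, `pickLowest`, and `pickHighest`, as well as the list of available versions, are hypothetical and only for illustration.

```typescript
// Hypothetical, simplified matcher: supports only "MAJOR.MINOR.PATCH"
// constraints where any part may be the wildcard "x".
type Version = [number, number, number];

function parseVersion(v: string): Version {
  const parts = v.split(".").map((p) => parseInt(p, 10));
  return [parts[0], parts[1], parts[2]];
}

function satisfies(version: string, constraint: string): boolean {
  const v = version.split(".");
  const c = constraint.split(".");
  return c.every((part, i) => part === "x" || part === v[i]);
}

function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Two equally valid resolution strategies for the same constraint.
function pickLowest(available: string[], constraint: string): string | undefined {
  return available.filter((v) => satisfies(v, constraint)).sort(compareVersions)[0];
}

function pickHighest(available: string[], constraint: string): string | undefined {
  const matches = available.filter((v) => satisfies(v, constraint)).sort(compareVersions);
  return matches[matches.length - 1];
}

// Illustrative version list, not real registry data.
const available = ["4.9.5", "5.0.1", "5.0.2", "5.1.0"];
const lowest = pickLowest(available, "5.x.x"); // "5.0.1"
const highest = pickHighest(available, "5.x.x"); // "5.1.0"
console.log(lowest, highest);
```

Both results satisfy "5.x.x", so either could end up pinned in a lockfile depending on which strategy a tool follows.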

Tools that create PRs to update lockfiles are generally the cause of this problem. They will see a new CVE or a new version for a package and, for example, bump TypeScript from "5.0.1" to "5.0.2". However, these PRs are merged separately in each repository, so you could merge the PR for "Emailer" but not the one for "Web Scraper" for whatever reason. Doing so means that instead of your organization's module graph having 1 version of TypeScript ("5.0.1"), it may have 2 ("5.0.1" and "5.0.2")!
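The "Web Scraper" and "Emailer" scenario can be sketched as a small check over pinned versions. This is only an illustration under assumed data shapes: the `Organization` type, the `findDivergence` function, and the repository names used as keys are hypothetical, and a real check would read the pins from each repository's lockfile.

```typescript
// Hypothetical input: for each repository, the versions pinned by its lockfile.
type PinnedVersions = Record<string, string>; // package name -> pinned version
type Organization = Record<string, PinnedVersions>; // repository name -> pins

// Returns package name -> distinct versions (lexicographically sorted), only
// for packages pinned to more than one version across the organization.
function findDivergence(org: Organization): Record<string, string[]> {
  const seen: Record<string, string[]> = {};
  for (const repo of Object.keys(org)) {
    const pins = org[repo];
    for (const name of Object.keys(pins)) {
      const versions = seen[name] || (seen[name] = []);
      if (versions.indexOf(pins[name]) === -1) versions.push(pins[name]);
    }
  }
  const diverged: Record<string, string[]> = {};
  for (const name of Object.keys(seen)) {
    if (seen[name].length > 1) diverged[name] = seen[name].slice().sort();
  }
  return diverged;
}

// The PR was merged for "emailer" but not for "web-scraper".
const org: Organization = {
  "web-scraper": { typescript: "5.0.1" },
  emailer: { typescript: "5.0.2" },
};
const diverged = findDivergence(org);
console.log(diverged); // typescript -> ["5.0.1", "5.0.2"]
```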

The sheer number of dependencies in a modern organization means that you can easily have divergence across potentially hundreds of repositories and hundreds of services.

Exposing the data

Package managers have become quite popular to talk about, are the main source of this divergence, and make the concern easy to illustrate. Today we are releasing a tool that exposes a comparison of different package manager benchmarks alongside a very clear check on the dependency divergence between them. As open source maintainers ourselves, we know that if you don't check that the results are the same, you might not see that you are comparing apples to oranges.

Running our new Dependency Divergence GitHub Action will expose when installations differ in your project. This can be used to assert that moving to a shiny new package manager or to a battle-tested one won't alter your dependencies in an unexpected way or introduce problematic packages.
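The idea of checking whether two installations differ can be sketched in a few lines. This is not the Action's implementation; it is a minimal illustration that assumes each installation has already been reduced to a list of "name@version" strings. The `diffInstalls` function, the `InstallDiff` type, and the sample package lists are all hypothetical.

```typescript
// Hypothetical representation of an install: a list of "name@version" strings.
interface InstallDiff {
  onlyInFirst: string[];
  onlyInSecond: string[];
}

// Reports the packages that one installation contains and the other does not.
function diffInstalls(first: string[], second: string[]): InstallDiff {
  const onlyInFirst = first.filter((p) => second.indexOf(p) === -1).sort();
  const onlyInSecond = second.filter((p) => first.indexOf(p) === -1).sort();
  return { onlyInFirst, onlyInSecond };
}

// Illustrative data: two package managers resolving the same constraints.
const managerA = ["typescript@5.0.1", "left-pad@1.3.0"];
const managerB = ["typescript@5.1.0", "left-pad@1.3.0"];
const installDiff = diffInstalls(managerA, managerB);
console.log(installDiff.onlyInFirst); // ["typescript@5.0.1"]
console.log(installDiff.onlyInSecond); // ["typescript@5.1.0"]
```

Two empty lists would mean the installations match, which is the condition you would want to hold before comparing benchmark numbers.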

Same Repository Divergence

Same repository divergence is a kind of dependency divergence where multiple versions of a single declared dependency exist within a single repository. It can result from dependency deduplication and/or from how tools hoist packages. It most clearly occurs when a repository is switching to or from lockfiles, or between package managers.
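One way to spot this kind of divergence is to look for a package name that appears at more than one install path with different versions. The sketch below assumes an npm lockfile with `lockfileVersion` 2 or 3, where the `packages` object is keyed by install path; other package managers use different lockfile formats. The function names and the sample lockfile fragment are hypothetical.

```typescript
// Assumed shape: the subset of an npm lockfile (lockfileVersion 2 or 3) used here.
interface LockEntry {
  version?: string;
}
interface NpmLockfile {
  packages: Record<string, LockEntry>;
}

// "node_modules/a/node_modules/@scope/b" -> "@scope/b"; paths outside
// node_modules (the root "" entry, workspace folders) yield undefined.
function nameFromInstallPath(installPath: string): string | undefined {
  const marker = "node_modules/";
  const index = installPath.lastIndexOf(marker);
  return index === -1 ? undefined : installPath.slice(index + marker.length);
}

// Returns package name -> distinct versions (lexicographically sorted) for
// every package installed at more than one version in this lockfile.
function findDuplicates(lock: NpmLockfile): Record<string, string[]> {
  const versionsByName: Record<string, string[]> = {};
  for (const installPath of Object.keys(lock.packages)) {
    const name = nameFromInstallPath(installPath);
    const version = lock.packages[installPath].version;
    if (name === undefined || version === undefined) continue;
    const versions = versionsByName[name] || (versionsByName[name] = []);
    if (versions.indexOf(version) === -1) versions.push(version);
  }
  const duplicates: Record<string, string[]> = {};
  for (const name of Object.keys(versionsByName)) {
    if (versionsByName[name].length > 1) {
      duplicates[name] = versionsByName[name].slice().sort();
    }
  }
  return duplicates;
}

// Illustrative lockfile fragment; "some-tool" is a made-up package.
const exampleLock: NpmLockfile = {
  packages: {
    "": {},
    "node_modules/typescript": { version: "5.0.2" },
    "node_modules/some-tool": { version: "1.0.0" },
    "node_modules/some-tool/node_modules/typescript": { version: "5.0.1" },
  },
};
const duplicates = findDuplicates(exampleLock);
console.log(duplicates); // typescript -> ["5.0.1", "5.0.2"]
```

In practice the input would come from parsing a repository's `package-lock.json` with `JSON.parse`.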

Monorepos can both alleviate and exacerbate this divergence. If the workspaces in a monorepo share the overall installation's version constraints, they can be forced to share the same version of a dependency by default, using mechanisms like peer dependencies or a cascading path mechanism like `node_modules`. However, if the monorepo has many different lockfiles, often one for each workspace, then it falls into the same category as having multiple repositories in your organization with divergence.

Organizational Divergence

Tools like Dependabot open PRs that update the codebase of a single repository. This is great for managing a single repository but causes some chaos when you have many repositories:

  • This can be a large number of PRs all managed separately.
  • This can affect dependency divergence due to shared dependencies.
  • This increases the churn of package versions, which makes dependency divergence more likely.

Updating these dependencies 1 PR at a time means that if you have 10 repositories at different maintenance intervals, you can easily reach 10 different versions of a package. Having 10 different versions of a package means that each may sit at a different level of security auditing and has to be considered separately. This need to consider them separately can come from a few different viewpoints:

  • Each version of a package may have different transitive/indirect dependencies that are not installed by other versions causing more divergence.
  • Any given version of a package may introduce a vulnerability or threat that isn't present in other versions.
  • Each version of a package simply may have different edge cases (unexpected breakage) causing potential differences in behaviors across services.

What is next?

We are starting to build tooling that works at an organizational level rather than being limited to individual repositories. We see the challenge of supply chain security as developers ourselves, and that motivates us to build tools developers want to use.
