Travis Paul

Thoughts no one asked for and more.

Committing Dependencies into Source Control

Electronic components organizer.
Photo by Raymond Rasmusson on Unsplash

It may come as a surprise to some that there is a debate about whether you should commit your project dependencies into source control. When I say "dependencies" I'm referring to language- and project-specific dependencies such as node modules (JavaScript) or vendor libraries (PHP). In the case of Rust (and probably some other languages) this debate has largely resolved itself, as Cargo downloads dependencies outside the project's directory by default (for example, in $HOME/.cargo). Like anyone else who gives this debate any thought, I have an opinion! However, I wanted to consider both sides, so I created a pros/cons list:

Pros of committing dependencies:

  • Internet access is not required to build a usable project.
    • Useful in air-gapped and high-security environments.
    • Public repositories such as Packagist and npm can and do suffer outages.
  • Every file of every dependency, and every change to one, has to pass through code review.
  • If reviewed thoroughly, this can help mitigate malicious package updates.
  • Not susceptible to packages disappearing from public repositories (such as when left-pad was pulled from npm).
  • No additional build steps are required before deployment.
  • No additional steps are required for developers to fetch dependencies or be mindful of when they need to install new packages.
  • Higher guarantee of reproducible deployment artifacts.

Cons of committing dependencies:

  • Files in vendor or node_modules can sometimes be modified locally and checked into git. When those files are later updated, things can break unexpectedly. This should never occur if we're closely reviewing all updates to these files, right? Well, in practice that level of review often doesn't happen.
    "sometimes a codebase can be so large it might as well be a binary blob." -- Ron Minnich
  • Pull requests that add or update dependency directories can contain so many files that they're difficult to review, and they can cause code review web interfaces to lock up or even become unusable.
  • Dependency directories contain external, often third-party, code that should not be edited directly, so they're likely not a good target for source control. The extra files also make the git repo larger than necessary and slow down clones and pushes.
  • Lock files (such as package-lock.json, yarn.lock, and composer.lock) provide a snapshot of the exact versions of installed packages, improving reproducibility significantly without committing thousands of files into the project's source control repository.
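
To make the lock file point a bit more concrete, here is a rough Python sketch of what a lock file actually pins. It assumes the "packages" layout used by newer package-lock.json files (lockfileVersion 2 and 3) and a single-hash "sha512-..." integrity string. The sample lock data, the package name example-dep, the registry URL, and the helper names (pinned_versions, sri_digest, matches_integrity) are all made up for illustration; this is not how npm itself does it.

```python
import base64
import hashlib
import json

# Hypothetical, trimmed-down package-lock.json for illustration only.
SAMPLE_LOCK = json.loads("""
{
  "name": "example-app",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "example-app", "version": "1.0.0"},
    "node_modules/example-dep": {
      "version": "2.1.0",
      "resolved": "https://registry.example.com/example-dep/-/example-dep-2.1.0.tgz"
    }
  }
}
""")


def pinned_versions(lock):
    """Map each locked dependency name to the exact version recorded for it."""
    return {
        # Nested paths like "node_modules/a/node_modules/b" collapse to "b".
        path.split("node_modules/")[-1]: entry["version"]
        for path, entry in lock.get("packages", {}).items()
        # The "" key describes the project itself, not a dependency.
        if path and "version" in entry
    }


def sri_digest(data, algorithm="sha512"):
    """Build an "<algorithm>-<base64 digest>" string for some bytes."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(data, integrity):
    """Check bytes against a single-hash integrity string (an assumption:
    real integrity fields may list several space-separated hashes)."""
    algorithm = integrity.partition("-")[0]
    return sri_digest(data, algorithm) == integrity


print(pinned_versions(SAMPLE_LOCK))  # {'example-dep': '2.1.0'}
```

The idea is that the version and hash travel with the repository in one small file, so a fresh install can be checked against what was locked without the dependency tree itself ever being committed.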

So, what's my take?

It depends...

There are going to be some very specific, though uncommon, scenarios where committing dependencies into source control makes sense. For the common scenario, though, I'd say you're better off not checking them in, especially if you've got a caching repository server set up (such as Verdaccio).

That's just my 2 centavos after working in projects with both approaches.