On Repl.it, working with packages is made easy. You can simply type import flask in your Python code, and Flask will automatically be installed for you. Or, if you’re more the browsing type, you can search for packages and install them through a graphical interface.

In Repl.it tradition, once you know how to do package management in one language, you know how to do it in every language. You can use the same interface to install packages in Node.js: simply type require("express") to get Express up and running, and so on.

Today we’re excited to release several months of work on improving the package management experience. Here are the highlights:

You can try out UPM on your own computer. Check out the Installation section on GitHub for full instructions for your system.

Here is a quick demo of the CLI on Repl.it. (You can open a shell in the workspace with ctrl-shift-s, or command-shift-s on Mac.)

Let’s dive into some of the technical aspects of UPM and Repl.it’s new package management.

Different kinds of package managers

There’s more than one kind of package manager. Broadly speaking, I like to define two categories: system package managers, and project package managers.

System package managers:

Project package managers:

In other words, system package managers are meant to administer your system and install the tools that you use everywhere on your machine, whereas project package managers are meant to help develop and package new software. These are very different use cases, and so the resulting package managers are very different.

You might ask, what about tools like Pip, RubyGems, and cabal-install? These tools occupy a middle ground: by default, they install packages globally (making them unsuitable for project package management), yet they are also limited to a specific programming language (making them unsuitable for system package management as well). As package management ecosystems have evolved, using these tools directly has stopped being the recommended approach. Instead, for system package management you should use a system package manager which packages the software you want to install globally, and for project package management you should use a tool which wraps Pip (e.g. Pipenv or Poetry), RubyGems (Bundler), or cabal-install (Stack) to provide isolation and reproducibility.

How should project package managers behave?

Here’s how we visualize project package management as working in an ideal world: source → specfile → lockfile → installed packages. Let’s break that down in detail:

This one-directional information flow from source to specfile to lockfile to installed packages neatly separates the different functions of a project package manager, with each stage having less human involvement than the last.

How do project package managers actually behave?

Not well, it turns out. While building the package management infrastructure at Repl.it, we discovered a laundry list of language-specific limitations, quirks, and design mistakes. This is what inspired us to create UPM: we want to make package management as easy as it should be.

Here are some of our favorite quirks:

If you use UPM, you don’t have to worry about any of this!

UPM abstractions

The basic principle of UPM is to define a sensible internal API which can be implemented for each language, and then define the user-facing command-line interface in terms of this API. This way, all of the business logic of UPM is guaranteed to be language-independent.

Some parts of the API are simple constants: the names of the specfile and lockfile, and which filenames correspond to the language. These are used for project language autodetection. Other parts implement the core UPM operations: adding or removing packages, listing the specfile or lockfile, and searching project source code for possible dependencies to install. In addition to guaranteeing language-independence, this API/CLI split makes it easier to implement language backends. For example, upm add flask will first list the specfile and filter out Flask if it’s already been added. This means the implementation of LanguageBackend.Add for the Python backend of UPM can just invoke poetry add, without needing to worry about the fact that Poetry throws an error if you try to add the same package twice.
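
To make the API/CLI split concrete, here is a minimal Python sketch of the filtering step. It is an illustration only, not UPM's actual code: FakeBackend, list_specfile, add, and cli_add are hypothetical names, and the case-insensitive name comparison is an assumption.

```python
class FakeBackend:
    """Hypothetical in-memory stand-in for a language backend."""

    def __init__(self):
        self.specfile = {}   # package name -> version spec
        self.add_calls = []  # records each batch the backend was asked to add

    def list_specfile(self):
        return dict(self.specfile)

    def add(self, packages):
        # A real backend might shell out to its package manager here.
        self.add_calls.append(sorted(packages))
        self.specfile.update(packages)


def normalize(name):
    # Assumption: package names compare case-insensitively.
    return name.lower()


def cli_add(backend, requested):
    """Language-independent 'add': skip packages already in the specfile."""
    existing = {normalize(name) for name in backend.list_specfile()}
    new = {
        name: spec
        for name, spec in requested.items()
        if normalize(name) not in existing
    }
    if new:
        backend.add(new)  # the backend never sees a duplicate
    return sorted(new)
```

Because the duplicate check lives in cli_add, a backend's add method only ever receives packages that are not yet in the specfile, so it can stay a thin wrapper.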

One of the main challenges in designing UPM’s language backend API was the fact that different package managers act quite differently. In an ideal world, each package manager would implement three separate operations: add to or remove from the specfile, generate the lockfile from the specfile, and install packages from the lockfile. In reality, some package managers force you to do two or even three steps at once. In UPM, we deal with this by having each language backend declare a set of “quirks”, like AddRemoveAlsoLocks and LockAlsoInstalls. The implementation of upm add will run the Add backend method, and will then follow it up with the Lock backend method unless AddRemoveAlsoLocks is included in the backend’s quirks configuration (indicating that the lockfile was already generated in addition to the specfile being modified).
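
The quirk check described above could look roughly like this in Python. This is a sketch under assumptions: the Quirks flag type, RecordingBackend, and run_add are hypothetical names chosen for illustration, and only the add-then-lock behavior stated above is modeled.

```python
from enum import Flag, auto


class Quirks(Flag):
    """Hypothetical flag set mirroring the quirk names mentioned above."""
    NONE = 0
    ADD_REMOVE_ALSO_LOCKS = auto()
    LOCK_ALSO_INSTALLS = auto()


class RecordingBackend:
    """Toy backend that only records which methods were called."""

    def __init__(self, quirks):
        self.quirks = quirks
        self.calls = []

    def add(self, packages):
        self.calls.append("add")

    def lock(self):
        self.calls.append("lock")


def run_add(backend, packages):
    """Run Add, then Lock unless the backend's Add already locked."""
    backend.add(packages)
    if Quirks.ADD_REMOVE_ALSO_LOCKS not in backend.quirks:
        backend.lock()
    return backend.calls
```

With Quirks.NONE the recorded calls are add followed by lock; with Quirks.ADD_REMOVE_ALSO_LOCKS only add is recorded, since the lockfile is assumed to have been written already.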

Even worse than some package managers combining steps, some package managers don’t have any concept of a lockfile at all! For example, the standard package manager for Emacs Lisp (package.el, wrapped by Cask) has no support at all for installing a specific version of a package, so the idea of a lockfile is really a non-starter. (Aside: this annoyed me so much that I wrote my own package manager for Emacs, which was part of the reason I got hired to improve the package management infrastructure at Repl.it!)

The approach of UPM to this problem is to preserve the spirit of the specfile/lockfile abstraction as much as possible. For Emacs Lisp, UPM will install directly from the specfile, then generate a lockfile from what is installed (listing exact versions and transitive dependencies, of course).

Caching and dependency guessing

At Repl.it, we care about performance, because nobody wants to wait for their code to run. That means our package management needs to be as fast as possible, especially when there isn’t actually anything that needs to be installed. Since we want UPM to be as useful a standalone tool as possible, we opted to implement all of the performance optimizations directly in UPM. All of the package management code in Repl.it is essentially just a wrapper around UPM:

You might ask how it isn’t incredibly slow to do a code search on every run. (Not to mention making sure the lockfile and installed packages are up to date, since you’re allowed to edit the specfile directly at any time if you want to!)

The answer is that UPM transparently keeps track of some information in a hidden JSON file behind the scenes. It looks something like this:

{
  "version": 2,
  "languages": {
    "python-python3-poetry": {
      "specfileHash": "361e6bddc6a34f1696e71227be88b4b4",
      "lockfileHash": "f208ad0efc93d51f52e04326406816cf",
      "guessedImports": [
        "Flask",
        "selenium"
      ],
      "guessedImportsHash": "8952e87cf73e21ef3313c4e9c98718a7"
    }
  }
}

After a successful operation, UPM will automatically record hashes of the specfile and lockfile. That way, it can tell if the specfile has changed since the last time the lockfile was generated. If it hasn’t, then upm lock is a very fast no-op. Similarly, if the lockfile hasn’t been changed since the last time packages were installed, then upm install can be optimized away.
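
As a rough illustration of the hash check for upm lock, here is a Python sketch. The helper names (file_hash, load_store, lock_if_needed) and the flat store layout are hypothetical, and MD5 is an assumption: the 32-character digests in the example are consistent with it, but the hash function is not named here.

```python
import hashlib
import json
import os


def file_hash(path):
    """Return a hex digest of the file, or "" if it does not exist."""
    if not os.path.exists(path):
        return ""
    with open(path, "rb") as f:
        # Assumption: MD5; used only for change detection, not security.
        return hashlib.md5(f.read()).hexdigest()


def load_store(store_path):
    """Load the recorded hashes, or start empty if nothing was recorded."""
    if os.path.exists(store_path):
        with open(store_path) as f:
            return json.load(f)
    return {}


def lock_if_needed(specfile, store_path, do_lock):
    """Call do_lock() only if the specfile changed since the last lock."""
    store = load_store(store_path)
    current = file_hash(specfile)
    if current and store.get("specfileHash") == current:
        return False  # nothing changed: fast no-op
    do_lock()
    # Re-hash after locking in case the lock step rewrote the specfile.
    store["specfileHash"] = file_hash(specfile)
    with open(store_path, "w") as f:
        json.dump(store, f)
    return True
```

The same pattern would apply to skipping upm install by comparing a recorded lockfileHash. A fuller version would presumably also confirm that the lockfile still exists before skipping the lock.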

UPM also optimizes dependency guessing by means of a two-step search. First, it uses a fast regexp match to heuristically find things that might be import or require statements. Then it converts the deterministically generated sequence of matches into a hash. If this hash matches what was recorded in the JSON file last time a search was done, then the list of guessed packages from last time (also in the JSON file) can be reused. This is very fast. Otherwise, the language backend is asked to do a more advanced search, usually involving AST parsing.
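
The two-step search can be sketched in Python as follows, for Python source only. This is an illustration under assumptions, not UPM's implementation: the regular expression, the use of MD5, and the function names are hypothetical, and mapping module names to package names (for example flask to Flask) is left out.

```python
import ast
import hashlib
import re

# Step 1 heuristic: lines that look like "import x" or "from x import y".
IMPORT_RE = re.compile(r"^[ \t]*(?:import|from)[ \t]+[\w.]+.*$", re.MULTILINE)


def quick_hash(source):
    """Hash the ordered sequence of regexp matches (cheap)."""
    matches = IMPORT_RE.findall(source)
    return hashlib.md5("\n".join(matches).encode("utf-8")).hexdigest()


def parse_imports(source):
    """Step 2: walk the AST and collect top-level module names (slower)."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return sorted(found)


def guess_imports(source, cache):
    """Return (imports, used_cache); cache mirrors the JSON fields above."""
    h = quick_hash(source)
    if cache.get("guessedImportsHash") == h:
        return cache["guessedImports"], True  # reuse last result
    imports = parse_imports(source)
    cache["guessedImportsHash"] = h
    cache["guessedImports"] = imports
    return imports, False
```

Editing code that contains no import-like lines leaves the hash unchanged, so the AST pass is skipped; adding or removing an import changes the hash and triggers a fresh parse.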

Closing

We hope you enjoy faster, more modern, and more open package management support on Repl.it. Now that we’ve aggregated all of the language-specific code into a single place, we hope it will be much easier to add package management support for new languages, like Emacs Lisp. Check out UPM on GitHub and see what it would take to add your favorite package manager to Repl.it! (Or, if Repl.it doesn’t have your favorite programming language yet, check out our other open-source projects, Polygott and Prybar, to help us add it.)