In software development, build systems are the unsung heroes, quietly orchestrating the complex process of transforming source code into executable programs. Among these, Nix could stand out as a unique and powerful tool. With its ability to decompose derivations into DAGs, its robust caching system, and the elegance of the Nix language, it could promise a world of efficient and reproducible builds.

I believe it is able to deliver - but it doesn't really try. It makes no claim of being a build tool, and hardly anyone uses it as such anyway. The main use case of Nix seems to be orchestrating other build tools, rather than replacing them. I hope that will change.

Nix Basics

Nix is a language, a package manager, in many cases even a package manager manager, an operating system, and a community. I discovered Nix in 2021, and I was blown away by all of it. The language is beautiful: lazily evaluated, declarative, and functional. The package repository is vast - larger than the AUR, Debian's repositories, and others. The community is fantastic, supportive, and enthusiastic. The operating system, which I switched to when Jupiter Broadcasting's Linux Unplugged podcast ran a NixOS Challenge, is now my daily driver.

You can use the Nix language for various things, but the main use by far is to construct environments. With it, you can describe "derivations", which could be programs, libraries, text files, directories, you name it. These derivations are wholly defined by their inputs: source files, dependencies (themselves derivations), the build environment, and instructions on piecing it all together. Under the hood, source files and build environments are also derivations.

A derivation cannot depend on something that it isn't told to depend on. If I don't specify that I depend on GCC13 and the fmt library, I can't use them, even if they exist on my computer. After all, in most operating systems or build environments, having GCC "installed" simply means that there is a directory in your PATH environment variable that a binary called "gcc" resides in: this might be in /bin, /usr/bin, a virtual environment, or whatever. Typically your PATH will have a few directories in it. In Nix, GCC is stored in a special store directory, at a path like /nix/store/[hash]-gcc-x.y.z/bin/gcc, and this is only added to your $PATH when you tell it to be - Nix is, after all, an environment manager.

The same holds for dependencies. C++ packages, Python packages, Rust packages, you name it - they all get lumped into /nix/store/[hash]-[package]-[version]. Nix knows which one you want by the way that you specify it.

When you build projects that depend on, e.g., a certain version of gcc and a certain version of fmt, Nix first checks the store. If the one you're after isn't there, it checks online binary caches and downloads it. If it isn't available there either, Nix builds it from source according to the instructions that came with the package.
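The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the fallback logic, not how Nix is implemented: the function name `realise`, the dictionaries standing in for the local store and the binary cache, and the `build` callback are all made up for the example.

```python
from typing import Callable, Dict


def realise(store_path: str,
            local_store: Dict[str, bytes],
            binary_cache: Dict[str, bytes],
            build: Callable[[str], bytes]) -> str:
    """Make sure store_path exists locally; report where it came from.

    Returns "store", "cache" or "built". All names here are hypothetical
    stand-ins used to illustrate the order of the checks.
    """
    # 1. Already in the local store: nothing to do.
    if store_path in local_store:
        return "store"
    # 2. Available in a binary cache: download (here: copy) it.
    if store_path in binary_cache:
        local_store[store_path] = binary_cache[store_path]
        return "cache"
    # 3. Last resort: build it from the package's instructions.
    local_store[store_path] = build(store_path)
    return "built"
```

The expensive third step is only reached when both cheaper options have failed, which is why the rest of this post cares so much about what makes a store path change.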

There's a name for this kind of structure - where one thing depends on other things, which may in turn depend on other things. It's called a DAG, or a Directed Acyclic Graph. "Directed" because the order of dependencies flows logically from depended-on to dependent. "Acyclic" because A can't depend on B that depends on A. "Graph" because you can represent the structure with dots and lines.
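To make the idea concrete, here is a small sketch using Python's standard `graphlib` module (Python 3.9+). The dependency edges are invented for illustration and are not the real dependencies of these packages.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on the names in its set (illustrative edges only).
deps = {
    "my-app": {"gcc", "fmt"},
    "fmt": {"gcc", "cmake"},
    "cmake": {"gcc"},
    "gcc": set(),
}


def build_order(graph):
    """Return the nodes so that every dependency precedes its dependents."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(deps)

# "Acyclic" is enforced: a graph where A needs B and B needs A is rejected.
try:
    build_order({"A": {"B"}, "B": {"A"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
```

In `order`, "gcc" always comes before "cmake" and "fmt", and "my-app" comes last, which is exactly the order a build has to follow.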

Everything is already there

DAGs are the bread and butter of build systems. In a Python project, you always depend on a specific Python version. You may depend on a module, which depends on another module, and your chosen package manager (pip, conda, etc.) will ensure they're all in place. In a C++ project, you may depend on a specific compiler and certain dependencies, and you might want to build object files as intermediates, then link them together.

Caches are also the bread and butter of build systems. If you had to rebuild the universe every time you changed the last step of the build, you wouldn't be happy.

Files are also the bread and butter of build systems. Knowing when one has changed, and how it affects other things down the line, is key in order to work effectively with the cache.

If I were to build a build system from scratch, these would be my main concerns. It wouldn't be GCC invocations, it would be working with caches, files, and DAGs. That's messy business - but Nix already has all of that sorted.

Working with other build systems

Unfortunately, Nix doesn't have a build system of its own. Most packages are simply GitHub pulls (some with patches, extra configuration, or wrapper scripts), and the source repositories already come with the tooling required to build them in whatever build system the author decided was best. For example, the remarkable fmt library ships with a CMakeLists.txt, and this is precisely what the corresponding nixpkgs derivation relies upon. This is DRY in practice, and in general it is a good thing.

Another common build system is Bazel. I've used Bazel for several years, and it is a decent approach to building software - it isn't perfect, and I have a big list of things I'd change, but overall it is good at what it is designed for.

The typical approach to building with these systems in Nix is the same as anything with Nix. Simplified:

  • A minimal, sandbox environment is set up, with environment variables such as PATH set so that the build dependencies can be found (and nothing else)
  • The source of the project you're building is unpacked into this environment
  • The build is run - in one or many steps, depending on what you're doing.
  • You rip out the build artifacts you want, converting any hardcoded paths (e.g. if a shell script has #!/bin/bash shebang, this gets replaced with #!/nix/store/[some-hash]-bash-[version]/bin/bash, because there is no /bin/bash in NixOS) along the way, and place them into the Nix store.
  • The next time the build is run with identical inputs, dependencies and build tools, the cached version is used instead.
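The shebang rewriting in the fourth step is handled in nixpkgs by a shell function called patchShebangs. The Python below is not that implementation, just a minimal sketch of the idea; the function name `patch_shebang` and the `interpreters` mapping are hypothetical, and the store path is left as the same placeholder used above.

```python
from typing import Dict


def patch_shebang(script: str, interpreters: Dict[str, str]) -> str:
    """Rewrite the shebang line if it names an interpreter we can map.

    `interpreters` maps hardcoded paths (e.g. "/bin/bash") to the
    corresponding store paths. Scripts without a shebang, or with an
    unknown interpreter, are returned unchanged.
    """
    first, newline, rest = script.partition("\n")
    if not first.startswith("#!"):
        return script
    # Split "#!/bin/bash -e" into the interpreter and its arguments.
    parts = first[2:].strip().split(None, 1)
    if not parts or parts[0] not in interpreters:
        return script
    args = " " + parts[1] if len(parts) > 1 else ""
    return "#!" + interpreters[parts[0]] + args + newline + rest


mapping = {"/bin/bash": "/nix/store/[some-hash]-bash-[version]/bin/bash"}
patched = patch_shebang("#!/bin/bash -e\necho hello\n", mapping)
```

Here `patched` starts with the store path instead of /bin/bash, and the `-e` argument and the rest of the script are preserved.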

Herein lies the rub. If you change one measly character in an insignificant comment in a file that doesn't even get used in the build, that's a new source. That's a new derivation. If you add a library to the build environment (for later use) and it doesn't get used, it's still a new derivation. After all, Nix doesn't know what CMake, Bazel etc are going to do with the tools, with the libraries, and with the source code. It doesn't know that the derivation is the same.
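A rough model of why this happens: treat the identity of a derivation as a hash over everything it was given. The sketch below is a simplification written for this post, not Nix's actual hashing scheme; `derivation_key` and the file names are invented for the example.

```python
import hashlib
from typing import Dict, List


def derivation_key(sources: Dict[str, str],
                   build_inputs: List[str],
                   command: str) -> str:
    """Hash every source file, every build input and the build command."""
    h = hashlib.sha256()
    for name in sorted(sources):
        h.update(name.encode() + b"\0" + sources[name].encode() + b"\0")
    for dep in sorted(build_inputs):
        h.update(dep.encode() + b"\0")
    h.update(command.encode())
    return h.hexdigest()


sources = {"main.cpp": "int main() { return 0; }\n", "notes.txt": "unused\n"}
base = derivation_key(sources, ["gcc", "fmt"], "cmake --build .")

# One character changes in a file the build never reads.
touched = dict(sources, **{"notes.txt": "unused!\n"})
comment_changed = derivation_key(touched, ["gcc", "fmt"], "cmake --build .")

# A library is added to the environment but never used.
extra_lib = derivation_key(sources, ["gcc", "fmt", "zlib"], "cmake --build .")
```

All three keys differ, so from the outside these are three unrelated derivations, even though the compiler would have produced the same binary each time.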

The end result? One insignificant change will trigger a rebuild of the affected project, and anything that depends on it, when you want to use it. I'm sitting here writing this two hours into a full rebuild of a large third-party ML library, pulled from nixpkgs, due to an update which I know has only impacted a couple of build artifacts. But because the dependency in nixpkgs is defined as "here's the link to the source, here's the sha256, here's a command to build it with", it doesn't know that it's only a couple of build artifacts that have changed. Two hours. And this is on a Threadripper - I feel sorry for anyone who has fewer threads to rip.

The only way for Nix to know which dependencies are important, which files are important, which tools are important, is if you declare them upfront, restricting them to the bare minimum set. Then, if nothing from that set changes, the derivation is unchanged and no rebuilding gets done.
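Here is a sketch of what declaring inputs upfront buys you, under the assumption that each build target lists exactly the files it needs. The target layout, `plan` and `stale` are hypothetical names for illustration; this is not Nix code and not how any particular tool does it.

```python
import hashlib
from typing import Dict, List


def key(*parts: str) -> str:
    """Hash a sequence of strings into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode() + b"\0")
    return h.hexdigest()


def plan(tree: Dict[str, str],
         targets: Dict[str, List[str]],
         compiler: str) -> Dict[str, str]:
    """One key per target, computed only from the files it declares."""
    return {
        target: key(compiler, *(f + "=" + tree[f] for f in sorted(files)))
        for target, files in targets.items()
    }


def stale(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Targets whose key changed and therefore need rebuilding."""
    return sorted(t for t in new if old.get(t) != new[t])


tree = {"main.cpp": "m1", "util.cpp": "u1", "util.h": "h1", "README.md": "r1"}
targets = {"main.o": ["main.cpp", "util.h"], "util.o": ["util.cpp", "util.h"]}
before = plan(tree, targets, "gcc")

readme_edit = stale(before, plan(dict(tree, **{"README.md": "r2"}), targets, "gcc"))
util_edit = stale(before, plan(dict(tree, **{"util.cpp": "u2"}), targets, "gcc"))
header_edit = stale(before, plan(dict(tree, **{"util.h": "h2"}), targets, "gcc"))
```

Editing the README leaves nothing stale, editing util.cpp invalidates only util.o, and editing the shared header invalidates both objects. The price is that somebody has to write down, and keep correct, the `targets` table.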

I've done this, and let me save you some time: you basically start to create your own build system. The invocation of Bazel or CMake becomes a mere formality: it can't (...shouldn't) do any downloading anyway, it already knows where all the dependencies are, and all you're doing is using a beautiful system to configure the shortest path for an inferior system (IMO) to do the least possible work.

This is the very reason I created Nozzle (youtube demo/rant). It's not a fantastic project, but I've found it quite handy.

Discussion

In the long term, I think Nix is going to need to come to terms with this issue. The cost of rebuilds can be high. Online caches can be handy, but they find themselves rebuilding from the ground up too, and this is likely at considerable expense.

Perhaps the solution is to have a Nix-native build system, and (optionally, but with good incentive) define nixpkgs using that. Perhaps the solution is to hijack the CMake and Bazel tools, get them to express a DAG, then rip out the details and build via Nix. Perhaps the solution is to stick with the current course and simply put up with the time, power and storage requirements of rebuilding the universe. I don't have the answers, or the community sway to push them through even if I did.

What I do know is that some people are excited by this idea, and some people hate it. I've had at least one person tell me "Nix isn't a build system, it's a package manager, and it wouldn't work as a very good build system anyway". I've proven them wrong with Nozzle - Nozzle is a bit naff, but it works, and Nix makes it work well. Imagine what would happen if a community took on such a project.  

Nix needs a native build system