Nix: what are fixed-output derivations and why use them?

Guess who’s back… 😉

[Image: Jack Nicholson bursting through a door in The Shining]

It’s been a few weeks since my last post and “some” of you may have been wondering if that was it for my short-lived, but (I like to think) illustrious blogging career. Fear not, dear reader, for I have returned.

Our journey together is not yet over. I like to think we haven’t even gotten to the best part yet 💋.

In all seriousness though, as tends to happen from time to time, life got in the way these past few weeks. Between trips to visit friends, illness and work, I failed to maintain my weekly posting schedule. For that you have my “sincerest” apologies.

With that out of the way, I would like to turn your attention to the chosen topic for today’s post: fixed-output derivations.

Capturing Dependencies

There is a perennial problem that every Nix developer is faced with when packaging an application: nix’ifying the application dependencies.

Those dependencies can fall into many categories such as source code, build tools and compilers, native libraries and so on.

But the category I want to focus on today is what I think of as build dependencies. It’s a terrible term I know, which overlaps with everything else, but I couldn’t think of a better name.

So when I say build dependencies, what I’m talking about specifically in this context are dependencies that get brought in by a language-specific build tool:

Node.js has npm or yarn
Rust has Cargo
Golang has Go Modules
Java has Maven or Gradle and so on

Irrespective of the build tool being used, when constructing a derivation the problem facing Nix developers is always the same: how do we capture the build dependencies in a reproducible fashion?

This is where fixed-output derivations come to the rescue.

In the rest of this post we are going to explore what those are and how they can be managed. I’ll be focusing on Gradle-based applications since it’s what I’ve had my head in recently.

Much of what I’ll be talking about however is language-agnostic.

Network Not Found

So we have a Gradle-based application, and we want to package it with Nix. The derivation for that might look something like this:

{
  stdenv,
  makeWrapper,
  jdk,
  gradle,
}: stdenv.mkDerivation {
    pname = "my-app";
    version = "1.0.0";
    src = ./.;

    nativeBuildInputs = [gradle makeWrapper];

    buildPhase = ''
        export GRADLE_USER_HOME=$(mktemp -d)
        gradle --no-daemon installDist
    '';

    installPhase = ''
        mkdir -p $out/share/my-app
        cp -r app/build/install/app/* $out/share/my-app
        makeWrapper $out/share/my-app/bin/app $out/bin/app \
            --set JAVA_HOME ${jdk}
    '';
}

We build the application with Gradle, copy its output and create a wrapper for the binary. Job done.

Well except for one problem: derivations don’t have network access.

If you were to try and build this derivation you would have Gradle complaining that it can’t resolve any dependencies. We have to work around this limitation, and we do that with a fixed-output derivation.

Whilst normal derivations are not allowed network access, fixed-output derivations are a compromise for situations like this. They are given network access, but in return they must declare a hash of their output in advance. Because the hash is declared up front, Nix can check that the build output matches what was expected the first time it builds, and throw an error if it does not.

Once the output is cached, Nix will not try to build a fixed-output derivation again unless the output hash is changed, even if the derivation's source or build steps have changed.

If the source has been modified and you want to force a new build, I suggest ‘breaking’ the output hash. This forces a fresh build, which fails with a hash mismatch error telling you what the new hash should be.
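
As a rough sketch of that ‘breaking’ trick: nixpkgs ships a placeholder, lib.fakeHash, which is a well-formed hash that will never match a real output. The fragment below is illustrative only. It assumes lib is passed in from nixpkgs, and it shows just the output-hash attributes rather than a full derivation.

```nix
# Illustrative fragment: only the output-hash attributes of a
# fixed-output derivation are shown, everything else is left out.
{ lib }:
{
  outputHashAlgo = "sha256";
  outputHashMode = "recursive";

  # Deliberately wrong placeholder from nixpkgs. The build runs, the hash
  # check fails, and the error message reports the hash Nix actually got.
  outputHash = lib.fakeHash;
}
```

Once the mismatch error has given you the real hash, paste it in place of lib.fakeHash and build again.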

So what does a fixed-output derivation look like? Going back to our example:

{
  stdenv,
  makeWrapper,
  jdk,
  gradle,
  perl,
}: let
    # create a fixed-output derivation which captures our dependencies
    deps = stdenv.mkDerivation {
        pname = "my-app-deps";
        version = "1.0.0";
        src = ./.;

        # gradle runs the build; perl is used by the installPhase below
        nativeBuildInputs = [ gradle perl ];

        # run the same build as our main derivation to ensure we capture the correct set of dependencies
        buildPhase = ''
            export GRADLE_USER_HOME=$(mktemp -d)
            gradle --no-daemon installDist
        '';

        # take the cached dependencies and convert them into a maven repo structure
        installPhase = ''
            find $GRADLE_USER_HOME/caches/modules-2 -type f -regex '.*\.\(jar\|pom\)' \
                | LC_ALL=C sort \
                | perl -pe 's#(.*/([^/]+)/([^/]+)/([^/]+)/[0-9a-f]{30,40}/([^/\s]+))$# ($x = $2) =~ tr|\.|/|; "install -Dm444 $1 \$out/$x/$3/$4/$5" #e' \
                | sh
        '';

        # specify the content hash of this derivation's output
        outputHashAlgo = "sha256";
        outputHashMode = "recursive";
        outputHash = "sha256-Om4BcXK76QrExnKcDzw574l+h75C8yK/EbccpbcvLsQ=";
    };
in stdenv.mkDerivation {
    pname = "my-app";
    version = "1.0.0";

    # elided for brevity

    # bring in the fixed-output derivation as a dependency of our main build
    nativeBuildInputs = [ gradle makeWrapper deps ];

}

Essentially what we have done here is to create an offline Maven repository. This allows us to run Gradle in offline mode by overriding its build settings with an init script:

  # elided for brevity

in stdenv.mkDerivation rec {  # rec lets buildPhase refer to gradleInit

    # elided for brevity

    # Point to our local deps repo
    # (assumes pkgs is in scope, e.g. passed in as a function argument)
    gradleInit = pkgs.writeText "init.gradle" ''
      logger.lifecycle 'Replacing Maven repositories with ${deps}...'
      gradle.projectsLoaded {
        rootProject.allprojects {
          buildscript {
            repositories {
              clear()
              maven { url '${deps}' }
            }
          }
          repositories {
            clear()
            maven { url '${deps}' }
          }
        }
      }
      settingsEvaluated { settings ->
        settings.pluginManagement {
          repositories {
            maven { url '${deps}' }
          }
        }
      }
    '';

    buildPhase = ''
      export GRADLE_USER_HOME=$(mktemp -d)
      gradle --offline --init-script ${gradleInit} --no-daemon installDist
    '';
}

Now when we run our build, Nix is happy. The set of captured dependencies cannot change without the dependency hash needing to be updated.

Helpers

At this point you may be thinking that this is a lot of boilerplate. And you would be right in thinking that. Which is why you will often find one or more tools for managing the creation and updating of fixed-output derivations for each language:

Node.js has yarn2nix, node2nix, npmlock2nix and more
Rust has crate2nix, cargo2nix and more
Golang has gomod2nix
Java has mvn2nix and gradle2nix

These tools make it easier to manage the steps mentioned above with varying degrees of success. I say varying, because it is not always possible to capture the dependencies accurately. It all depends on the underlying build tool.

Take gradle2nix for example.

A Moving Target

Due to limitations in Gradle’s Artifact Resolution Query API, it is difficult to properly resolve all the dependencies for a given build. The author of gradle2nix talks more about it here.

Other edge cases exist such as handling version ranges.

If you have a dependency which can be between versions 1.2 and 1.7, you will find that, after capturing the dependencies as part of a fixed-output derivation, your offline Maven repository will only contain version 1.7.

But when Gradle runs the build in offline mode, it will make another attempt to determine what the correct version should be. In doing so it will request metadata for version 1.2, which will not be there.

Ultimately it is a moving target, and depending on your project you will find yourself spending time coaxing your underlying build tools into correctly outputting all of your compile-time and runtime dependencies, so they can then be captured.

But what if you can’t?

The Heretic’s Way

Sometimes you just need Nix to get out of your way.

And whilst it shouldn’t be your first choice, nor your second or even your third choice, there is an escape route: __noChroot = true;.

When combined with nixConfig.sandbox = "relaxed";, this attribute allows a specific derivation access to the internet.
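
To make that concrete, here is a minimal sketch of the first derivation from this post with the sandbox opt-out added. Treat it as an illustration under assumptions rather than a recommendation: it assumes the machine doing the build has its sandbox setting at relaxed, and it reuses the my-app names from earlier.

```nix
{
  stdenv,
  makeWrapper,
  jdk,
  gradle,
}: stdenv.mkDerivation {
    pname = "my-app";
    version = "1.0.0";
    src = ./.;

    nativeBuildInputs = [ gradle makeWrapper ];

    # Ask Nix to build this one derivation outside the sandbox.
    # Only honoured when the sandbox setting is "relaxed".
    __noChroot = true;

    # With the sandbox out of the way, Gradle can fetch dependencies directly
    buildPhase = ''
        export GRADLE_USER_HOME=$(mktemp -d)
        gradle --no-daemon installDist
    '';

    installPhase = ''
        mkdir -p $out/share/my-app
        cp -r app/build/install/app/* $out/share/my-app
        makeWrapper $out/share/my-app/bin/app $out/bin/app \
            --set JAVA_HOME ${jdk}
    '';
}
```

Note that there is no output hash here, so nothing checks that two builds fetched the same dependencies.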

🚨 Use with caution. 🚨

I think it goes without saying that relaxing the sandbox goes against the core tenets of Nix and its aspiration to keep everything reproducible.

But if you would like to know more about packaging the heretical way I suggest you read zimbatm’s article.

Summary

In this post I’ve touched on what fixed-output derivations are and why they are useful. I’ve also provided a brief summary of the different tools that are out there for managing them when working with various languages.

In particular, we’ve looked at how we can use fixed-output derivations when packaging Gradle apps. And what to do when all else fails.

All the code samples featured can be found in this GitHub repository.

P.S. gradle2nix seems to be unmaintained at the moment, so I pulled various fixes from some open PRs into this fork: numtide/gradle2nix.

The original author has mentioned that there is a better way to implement gradle2nix and is looking for people who are interested in helping.

— Edit: 2023-10-31 11:24 —

Removed some incorrect information about how output hashes work and clarified their behaviour with respect to caching.