The Nix and NATS logos together

For the past month or so, I’ve been experimenting with Nits, a different approach to NixOS deployments that is pull-based rather than push-based. And as part of that effort, I needed to address how exactly I would push NixOS system closures to the machines under management.

Typically, in a push-based deployment approach, you can copy the system closure directly via SSH whenever you’re deploying. But with Nits, the agent process running on the target machine needs to be able to connect and download the latest closure on demand, sometimes days or weeks after the deployment was triggered.

“Use a Binary Cache”, I hear you say. And yes, I did. But instead of spinning up an instance of Harmonia, configuring an S3 bucket or hitting up Domen over at Cachix, I decided to roll my own.

Keep It Simple (Stupid)

Ideally, I want to keep the number of moving parts and things needing to be configured to a minimum. And the idea of running a separate standalone binary cache or adding a third-party dependency at this stage doesn’t appeal to me.

Initially, I want to keep everything within NATS as much as possible. I’m not ruling out running a standalone binary cache in the future, but I only want to do that when it feels necessary, not convenient.

In a perfect world, the agent would be configured on the target machine, make an outbound connection to NATS, and that’s it. Users who want to update a target machine copy the closure into NATS first and then trigger the deployment.

This architecture is simple and easy to reason about, and the security boundary is quite clear.

In a previous post, I mentioned that NATS is a one-stop shop for all your architectural needs. So, when I decided to implement a binary cache, I naturally started giving the KV and Object Store a closer look.

What is a Binary Cache?

In the simplest of terms, a binary cache is just a lookup table for nix store paths. You have a Nar Info file which describes the contents of a store path and its dependencies, and then you have a Nar file, which is an archive with the contents of the store path.

For example, here is the Nar Info file for /nix/store/s549276qyxagylpjvzpcw7zbjqy3swj6-hello-2.12.1:

StorePath: /nix/store/s549276qyxagylpjvzpcw7zbjqy3swj6-hello-2.12.1
URL: nar/1490jfgjsjncd20p6ppzinj33xfpc22barvm1lqbl0bqshw8m5cs.nar.xz
Compression: xz
FileHash: sha256:1490jfgjsjncd20p6ppzinj33xfpc22barvm1lqbl0bqshw8m5cs
FileSize: 50096
NarHash: sha256:18xzh7bjnigcjqbmsq61qakcw1yai9b4cm4qfs5d50hjhyfpw8gz
NarSize: 226488
References: s549276qyxagylpjvzpcw7zbjqy3swj6-hello-2.12.1 yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8
Deriver: zjh5kllay6a2ws4w46267i97lrnyya9l-hello-2.12.1.drv
Sig: cache.nixos.org-1:irA3FgcTZxqKvT+L4ibE9ZE6Mylzf6NuDKmTkwg6IcFajm5cf/5vvjBP+bTpA+whJr3mLc6fGHkW67ZeFjuWAw==

Some things worth pointing out:

  • URL is a relative path to the corresponding Nar archive.
  • References indicates dependencies of this store path.
  • Deriver points to the derivation used to generate this store path.
  • Sig is a list of domain:signature entries used for verifying the origin of Nar Info files.

As you can see from the single Sig entry in the example above, this Nar Info has been taken from https://cache.nixos.org, a binary cache run by the NixOS Foundation. To check this for yourself, you can fetch https://cache.nixos.org/s549276qyxagylpjvzpcw7zbjqy3swj6.narinfo.

Using a binary cache allows Nix to substitute store paths rather than building them. By default, https://cache.nixos.org is enabled and provides cached builds for Nixpkgs. At the time of writing, it uses 425 TiB of storage and serves ~1500 TiB of requests a month!

In addition, companies, communities or individuals will operate their own binary caches for private or custom builds. The Nix Community, for example, maintains https://nix-community.cachix.org/.

To use a binary cache, its public key must first be added to the trusted-public-keys setting in your nix.conf, and the cache’s URL to substituters.
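
For example, using the public cache above plus a hypothetical private one, the relevant nix.conf lines might look like this (the cache.nixos.org key is its published key; my-cache.example.com and its key are placeholders):

# /etc/nix/nix.conf
substituters = https://cache.nixos.org https://my-cache.example.com
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= my-cache.example.com-1:<public-key>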

HTTP Proxy

Whilst the examples above all use HTTP, you can also use an S3-compatible bucket as the backing store for your binary cache, a local store path or even a remote store via SSH.

However, I decided to stick with HTTP for my use case. It’s straightforward to write HTTP services in Go, and after looking into it, there are only a handful of routes which need to be supported:

  • GET /nix-cache-info
  • GET /nar/{hash:[a-z0-9]+}.nar.{compression:*}
  • HEAD /nar/{hash:[a-z0-9]+}.nar.{compression:*}
  • PUT /nar/{hash:[a-z0-9]+}.nar.{compression:*}
  • GET /{hash:[a-z0-9]+}.narinfo
  • HEAD /{hash:[a-z0-9]+}.narinfo
  • PUT /{hash:[a-z0-9]+}.narinfo

GET /nix-cache-info returns some metadata about the cache in question. Here is one taken from https://cache.nixos.org:

StoreDir: /nix/store
WantMassQuery: 1
Priority: 40

All the other routes are for reading, writing and checking for the existence of Nar Info and Nar files, which in our case means reading and writing into NATS KV and Object stores.

Specifically, we will use a KV store for our Nar Info files and an Object store for our Nar archives for reasons which will soon become apparent.

KV Store

To create a new Key-Value (KV) store in NATS from the terminal, you can run nats kv add nar-info:

❯ nats kv add nar-info
Information for Key-Value Store Bucket nar-info created 2023-06-16T15:25:52+01:00

Configuration:

          Bucket Name: nar-info
         History Kept: 1
        Values Stored: 0
   Backing Store Kind: JetStream
          Bucket Size: 0 B
  Maximum Bucket Size: unlimited
   Maximum Value Size: unlimited
     JetStream Stream: KV_nar-info
              Storage: File

  Cluster Information:

                Name: 
              Leader: ND7LZI2ZOSXPZLWV532IDY77MTC2LMLCP6OVNDK5RA3ZGUSJOVTOTHYY

Note: you must enable JetStream, the NATS persistence layer, before creating any persistent store or stream.

By default, the newly created store is backed by file storage, keeps only the latest value for each key, has no replicas or Time-To-Live (TTL), and places no restrictions on the overall store size. If desired, however, you can configure the store to be kept in memory, with each entry replicated across multiple servers and expiring after a given time.
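
As a rough sketch of how that looks programmatically with the nats.go client (the bucket name, TTL and other values below are purely illustrative, not what Nits uses):

package main

import (
	"fmt"
	"log"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Assumes a local NATS server with JetStream enabled.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// An in-memory bucket whose entries expire after 24 hours. Replicas
	// could be raised to 3 on a clustered JetStream deployment.
	kv, err := js.CreateKeyValue(&nats.KeyValueConfig{
		Bucket:   "example",
		History:  1,
		TTL:      24 * time.Hour,
		Replicas: 1,
		Storage:  nats.MemoryStorage,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Reads and writes are simple byte-slice operations keyed by string.
	if _, err := kv.Put("greeting", []byte("hello")); err != nil {
		log.Fatal(err)
	}
	entry, err := kv.Get("greeting")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(entry.Value()))
}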

These are only a few of the configuration options available, but even from this, you can see that NATS KV stores are pretty flexible. What I found most interesting after digging into the internals is that KV stores are nothing more than some API sugar on top of the Streams functionality.

When creating a Stream, for example, you can configure JetStream to allow rollups (AllowRollup). A message published with the Nats-Rollup header then replaces what came before it, either on its own subject or across the entire Stream.

If applied at the subject level on a hierarchy of foo.bar.>, whenever you publish a rollup message on foo.bar.baz or foo.bar.hello, JetStream discards the older messages on that subject and keeps track of only the latest value for each.

When you then want to retrieve the latest value for foo.bar.baz, you can use the JetStream API to fetch it directly.
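
Here is a minimal sketch of that mechanism using the nats.go client (the stream name and subjects are made up for the example):

package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Assumes a local NATS server with JetStream enabled.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// A stream covering the foo.bar.> hierarchy with rollups enabled.
	if _, err := js.AddStream(&nats.StreamConfig{
		Name:        "FOO",
		Subjects:    []string{"foo.bar.>"},
		AllowRollup: true,
	}); err != nil {
		log.Fatal(err)
	}

	// Publishing with the Nats-Rollup header set to "sub" replaces all
	// previous messages on this subject with this one.
	msg := nats.NewMsg("foo.bar.baz")
	msg.Data = []byte("latest value")
	msg.Header.Set("Nats-Rollup", "sub")
	if _, err := js.PublishMsg(msg); err != nil {
		log.Fatal(err)
	}

	// Fetch the latest value for the subject directly.
	raw, err := js.GetLastMsg("FOO", "foo.bar.baz")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(raw.Data))
}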

To see for yourself, use nats stream ls with the --all flag to include system streams in the output. Any streams related to a KV store will be prefixed with KV_:

❯ nats stream ls --all
╭───────────────────────────────────────────────────────────────────────────────────────────╮
│                                          Streams                                          │
├──────────────────────┬─────────────┬─────────────────────┬──────────┬──────┬──────────────┤
│ Name                 │ Description │ Created             │ Messages │ Size │ Last Message │
├──────────────────────┼─────────────┼─────────────────────┼──────────┼──────┼──────────────┤
│ KV_nar-info          │             │ 2023-06-16 16:04:45 │ 0        │ 0 B  │ never        │
╰──────────────────────┴─────────────┴─────────────────────┴──────────┴──────┴──────────────╯

Object Store

Message sizes within NATS are limited to 1 MB by default. If desired, this can be increased to 64 MB, but keeping it to 8 MB or less is highly recommended.

This introduces a fundamental limit on the size of values that can be placed into a KV store since, ultimately, each value is just a message in a Stream. While that is fine for Nar Info files, it is unsuitable for our Nar archives, which can be much larger than 64 MB.

To overcome this limitation, we can use an Object store. This builds on the pattern we’ve seen for KV stores but uses two subjects instead of one.

❯ nats stream ls --all        
╭───────────────────────────────────────────────────────────────────────────────────────────╮
│                                          Streams                                          │
├──────────────────────┬─────────────┬─────────────────────┬──────────┬──────┬──────────────┤
│ Name                 │ Description │ Created             │ Messages │ Size │ Last Message │
├──────────────────────┼─────────────┼─────────────────────┼──────────┼──────┼──────────────┤
│ OBJ_nar              │             │ 2023-06-16 16:04:45 │ 0        │ 0 B  │ never        │
╰──────────────────────┴─────────────┴─────────────────────┴──────────┴──────┴──────────────╯
❯ nats stream info OBJ_nar    
Information for Stream OBJ_nar created 2023-06-16 16:04:45

             Subjects: $O.nar.C.>, $O.nar.M.>
             Replicas: 1
              Storage: File
              
Options:

            Retention: Limits
     Acknowledgements: true
       Discard Policy: New
     Duplicate Window: 2m0s
           Direct Get: true
    Allows Msg Delete: true
         Allows Purge: true
       Allows Rollups: true

Limits:

     Maximum Messages: unlimited
  Maximum Per Subject: unlimited
        Maximum Bytes: unlimited
          Maximum Age: unlimited
 Maximum Message Size: unlimited
    Maximum Consumers: unlimited


State:

             Messages: 0
                Bytes: 0 B
             FirstSeq: 0
              LastSeq: 0
     Active Consumers: 0              

The first subject, $O.nar.C.>, is a hierarchy for storing a series of chunks representing files of arbitrary size. A unique id is appended to this subject when uploading a file to the Object store. Then a stream of messages is published to it, each representing a chunk of the uploaded file.

Once complete, a message is sent to the second subject, $O.nar.M.>, a metadata hierarchy that operates like a KV store. It records the subject of the uploaded chunk stream against a given key, along with other metadata such as a digest of the file contents.

This two-phase approach helps overcome the size limitation of a single message.
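
From the client’s point of view the chunking is invisible; the nats.go Object store API is a simple Put/Get over readers. A small sketch (the bucket and file names are illustrative):

package main

import (
	"io"
	"log"
	"os"

	"github.com/nats-io/nats.go"
)

func main() {
	// Assumes a local NATS server with JetStream enabled.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// Create (or bind to) an Object store bucket.
	store, err := js.CreateObjectStore(&nats.ObjectStoreConfig{Bucket: "nar"})
	if err != nil {
		log.Fatal(err)
	}

	// Stream an arbitrarily large file in; it is chunked behind the scenes.
	f, err := os.Open("example.nar.xz")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := store.Put(&nats.ObjectMeta{Name: "example.nar.xz"}, f); err != nil {
		log.Fatal(err)
	}

	// Stream it back out again; the chunks are reassembled transparently.
	obj, err := store.Get("example.nar.xz")
	if err != nil {
		log.Fatal(err)
	}
	defer obj.Close()

	if _, err := io.Copy(io.Discard, obj); err != nil {
		log.Fatal(err)
	}
}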

Wiring it all up

Having decided upon a KV store for our Nar Info files and an Object store for our Nar archives, wiring them up to some HTTP handlers was straightforward.

package cache

import (
	"fmt"
	"io"
	"net/http"
	"strconv"
	"time"

	log "github.com/inconshreveable/log15"

	"github.com/go-chi/chi/v5"
	"github.com/go-chi/chi/v5/middleware"
	"github.com/nats-io/nats.go"
	"github.com/nix-community/go-nix/pkg/narinfo"
)

const (
	RouteNar       = "/nar/{hash:[a-z0-9]+}.nar.{compression:*}"
	RouteNarInfo   = "/{hash:[a-z0-9]+}.narinfo"
	RouteCacheInfo = "/nix-cache-info"

	ContentLength      = "Content-Length"
	ContentType        = "Content-Type"
	ContentTypeNar     = "application/x-nix-nar"
	ContentTypeNarInfo = "text/x-nix-narinfo"
)

func (c *Cache) createRouter() {
	router := chi.NewRouter()

	router.Use(middleware.RequestID)
	router.Use(middleware.RealIP)
	router.Use(middleware.Timeout(60 * time.Second))
	router.Use(requestLogger(c.log))
	router.Use(middleware.Recoverer)

	router.Get(RouteCacheInfo, c.getNixCacheInfo)

	router.Head(RouteNarInfo, c.getNarInfo(false))
	router.Get(RouteNarInfo, c.getNarInfo(true))
	router.Put(RouteNarInfo, c.putNarInfo())

	router.Head(RouteNar, c.getNar(false))
	router.Get(RouteNar, c.getNar(true))
	router.Put(RouteNar, c.putNar())

	c.router = router
}

func requestLogger(logger log.Logger) func(handler http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		fn := func(w http.ResponseWriter, r *http.Request) {
			startedAt := time.Now()
			reqId := middleware.GetReqID(r.Context())

			ww := middleware.NewWrapResponseWriter(w, r.ProtoMajor)

			defer func() {
				entries := []interface{}{
					"status", ww.Status(),
					"elapsed", time.Since(startedAt),
					"from", r.RemoteAddr,
					"reqId", reqId,
				}

				switch r.Method {
				case http.MethodHead, http.MethodGet:
					entries = append(entries, "bytes", ww.BytesWritten())
				case http.MethodPost, http.MethodPut, http.MethodPatch:
					entries = append(entries, "bytes", r.ContentLength)
				}

				logger.Info(fmt.Sprintf("%s %s", r.Method, r.RequestURI), entries...)
			}()

			next.ServeHTTP(ww, r)
		}
		return http.HandlerFunc(fn)
	}
}

func (c *Cache) getNixCacheInfo(w http.ResponseWriter, r *http.Request) {
	if err := c.Options.Info.Write(w); err != nil {
		c.log.Error("failed to write cache info response", "error", err)
	}
}

func (c *Cache) putNar() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hash := chi.URLParam(r, "hash")
		compression := chi.URLParam(r, "compression")

		name := hash + "-" + compression
		meta := &nats.ObjectMeta{Name: name}

		_, err := c.nar.Put(meta, r.Body)
		if err != nil {
			// Return early; otherwise we would also write a 204 below.
			c.log.Error("failed to put nar into object store", "error", err)
			w.WriteHeader(500)
			return
		}

		w.WriteHeader(http.StatusNoContent)
	}
}

func (c *Cache) getNar(body bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hash := chi.URLParam(r, "hash")
		compression := chi.URLParam(r, "compression")

		name := hash + "-" + compression
		obj, err := c.nar.Get(name)

		if err == nats.ErrObjectNotFound {
			w.WriteHeader(404)
			return
		}
		if err != nil {
			w.WriteHeader(500)
			return
		}

		info, err := obj.Info()
		if err != nil {
			w.WriteHeader(500)
			return
		}

		h := w.Header()
		h.Set(ContentType, ContentTypeNar)
		h.Set(ContentLength, strconv.FormatUint(info.Size, 10))

		if !body {
			w.WriteHeader(http.StatusNoContent)
			return
		}

		written, err := io.CopyN(w, obj, int64(info.Size))
		if err != nil {
			c.log.Error("failed to copy nar contents to response", "error", err)
		} else if written != int64(info.Size) {
			c.log.Error("bytes copied does not match object size", "expected", info.Size, "written", written)
		}
	}
}

func (c *Cache) putNarInfo() http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hash := chi.URLParam(r, "hash")

		info, err := narinfo.Parse(r.Body)
		if err != nil {
			// Return early; info would be nil below otherwise.
			w.WriteHeader(http.StatusBadRequest)
			_, _ = w.Write([]byte("Could not parse nar info"))
			return
		}

		sign := true
		for _, sig := range info.Signatures {
			if sig.Name == c.Options.Name {
				// no need to sign
				sign = false
				break
			}
		}

		if sign {
			sig, err := c.Options.SecretKey.Sign(nil, info.Fingerprint())
			if err != nil {
				c.log.Error("failed to generate nar info signature", "error", err)
				w.WriteHeader(500)
				return
			}
			info.Signatures = append(info.Signatures, sig)
		}

		if _, err = c.narInfo.Put(hash, []byte(info.String())); err != nil {
			// Return early; otherwise we would also write a 204 below.
			c.log.Error("failed to put nar info into kv store", "error", err)
			w.WriteHeader(500)
			return
		}

		w.WriteHeader(http.StatusNoContent)
	}
}

func (c *Cache) getNarInfo(body bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		hash := chi.URLParam(r, "hash")
		entry, err := c.narInfo.Get(hash)

		if err == nats.ErrKeyNotFound {
			w.WriteHeader(404)
			return
		}
		if err != nil {
			w.WriteHeader(500)
			return
		}

		h := w.Header()
		h.Set(ContentType, ContentTypeNarInfo)
		h.Set(ContentLength, strconv.FormatInt(int64(len(entry.Value())), 10))

		if !body {
			w.WriteHeader(http.StatusNoContent)
			return
		}

		_, err = w.Write(entry.Value())
		if err != nil {
			c.log.Error("failed to write response", "error", err)
		}
	}
}

Most of the code is boilerplate HTTP CRUD, and the API for reading from and writing to NATS is trivial. In fact, the only unusual bit of logic I needed to handle was signing the Nar Info files on PUT, but thanks to Go Nix, even that was simple.
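
One thing the snippet above glosses over is where c.narInfo and c.nar come from: they are simply handles to the KV and Object stores created earlier. A hypothetical bindStores method might look like the following sketch (the bucket names match the CLI examples above; everything else is an assumption):

func (c *Cache) bindStores(nc *nats.Conn) error {
	js, err := nc.JetStream()
	if err != nil {
		return err
	}

	// Bind to the stores created earlier via the CLI.
	if c.narInfo, err = js.KeyValue("nar-info"); err != nil {
		return err
	}
	if c.nar, err = js.ObjectStore("nar"); err != nil {
		return err
	}

	return nil
}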

And now that we have an HTTP proxy for a NATS-backed Binary Cache, we can use nix copy to copy Nix closures into and out of NATS.

In Nits, for example, deploying a closure to a target machine looks a bit like this:

  1. Build the system closure.
  2. nix copy --to http://guvnor/\?compression\=zstd <closure_path> -v --refresh to copy into NATS through the HTTP proxy exposed by the guvnor process.
  3. Trigger a deployment (details in another blog post).
  4. The agent process on the target machine starts an in-process HTTP proxy for the binary cache.
  5. The agent process triggers a nix copy --from http://localhost:14256/\?compression\=zstd <closure_path> -v --refresh to copy the system closure from NATS into its local Nix store. Once complete, the agent process stops the in-process HTTP proxy and applies the new system closure.
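
One prerequisite glossed over above: the signing in putNarInfo needs a key pair, which can be generated with nix-store (the key name and file names below are purely illustrative). The public half is what target machines then add to trusted-public-keys, as described earlier:

❯ nix-store --generate-binary-cache-key nits-cache-1 cache-key.sec cache-key.pub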

Summary

In this article, I’ve briefly introduced how a Nix Binary Cache works and how to implement one using NATS. That being said, what I have presented is a basic implementation which still needs critical features such as garbage collection.

I have some intriguing ideas for making it more robust and am keen to see how much mileage I can get from this approach. But as I said at the beginning of this article, if I feel it necessary or beneficial to transition to separate cache infrastructure at some stage, I will gladly do so.

In the meantime, I will continue to have fun exploring the problem as part of Nits’s broader remit and discover just how far I can push NATS.