NATS: re-using ssh host keys for authentication

Authenticating with a NATS server can be done in various ways. For example, you can use a token, username/password or a TLS certificate. But when I first looked at NATS, one mechanism stood out: stateless authentication and authorization using NKeys and JSON Web Tokens (JWT).

Operator > Account > User

Before going any further, let me explain how NATS breaks down its security model.

an Operator runs a server
an Operator must authorize an Account to operate on their server
an Account must authorize a User

Isolation is at the account level: each Account has its own independent subject namespace, and you control the import/export of both streams of messages and services between accounts.

Authenticating with NKeys

When a client first connects to a NATS server, the server may issue a challenge. This challenge takes the form of a nonce field, which the client must sign and return to the server before a given timeout.

❯ telnet localhost 4222
Connected to localhost
INFO {"server_id":"ND2AJHHR4BRU4E2NTGDKXHFWPK6YUQFUSIFUWSGSASKWWKLOQZTE4DEK","server_name":"ND2AJHHR4BRU4E2NTGDKXHFWPK6YUQFUSIFUWSGSASKWWKLOQZTE4DEK","version":"2.9.17","proto":1,"go":"go1.20.4","host":"127.0.0.1","port":4222,"headers":true,"auth_required":true,"max_payload":1048576,"jetstream":true,"client_id":78,"client_ip":"127.0.0.1","nonce":"CKTNF4RB6EXHkKQ"}

To sign this challenge, the NATS client relies upon a User NKey, which is essentially an annotated Ed25519 key pair with a checksum.

nk -gen user -pubout

SUACSSL3UAHUDXKFSNVUZRF5UHPMWZ6BFDTJ7M6USDXIEDNPPQYYYCU3VY  // private seed
UDXU4RCSJNZOIQHZNWXHXORDPRTGNJAHAHFRGZNEEJCPQTT2M7NLCNF4    // public key

With a valid signature, the NATS server can reliably tie a client connection to a given NKey. From here, it can move on to the next step: authorization.

Authorization with JSON Web Tokens

Alongside the signed challenge, the NATS client must present a JSON Web Token (JWT) to the server. This token describes the various permissions associated with the client’s connection and must have been signed by a trusted Account NKey.

In turn, the Account that signed the User JWT must have its own JWT registered with the server, describing Account-level resource limitations and default permissions. This Account JWT must have been signed by the server’s Operator NKey.

This approach represents a chain of trust, starting from the User and going all the way up to the Operator. Consequently, the server only needs to be aware of Account-level JWTs to validate User permissions, which is why this approach can be thought of as stateless or decentralized.
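The chain of trust can be modelled in a few lines. The sketch below is deliberately simplified and does not use the real NATS JWT format: the `token` type, `issue` and `verifyChain` are made up for illustration, and each "token" carries nothing but a subject, an issuer and a signature.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// token is a stand-in for a JWT: who it is about, who issued it, and the
// issuer's signature over the subject.
type token struct {
	subject ed25519.PublicKey
	issuer  ed25519.PublicKey
	sig     []byte
}

// issue creates a token for subject, signed with the issuer's private key.
func issue(subject ed25519.PublicKey, issuerPriv ed25519.PrivateKey) token {
	issuer := issuerPriv.Public().(ed25519.PublicKey)
	return token{subject: subject, issuer: issuer, sig: ed25519.Sign(issuerPriv, subject)}
}

// valid reports whether the token was really signed by its claimed issuer.
func (t token) valid() bool {
	return len(t.issuer) == ed25519.PublicKeySize &&
		ed25519.Verify(t.issuer, t.subject, t.sig)
}

// verifyChain walks User -> Account -> Operator: both signatures must hold,
// the User token must be issued by the Account, and the Account token by
// the Operator the server trusts.
func verifyChain(user, account token, operator ed25519.PublicKey) bool {
	return user.valid() && account.valid() &&
		bytes.Equal(user.issuer, account.subject) &&
		bytes.Equal(account.issuer, operator)
}

// newKey generates a key pair, panicking on failure to keep the sketch short.
func newKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

func main() {
	operatorPub, operatorPriv := newKey()
	accountPub, accountPriv := newKey()
	userPub, _ := newKey()
	_, roguePriv := newKey()

	accountTok := issue(accountPub, operatorPriv)
	userTok := issue(userPub, accountPriv)
	rogueTok := issue(userPub, roguePriv) // signed by an unknown account

	fmt.Println("trusted chain:", verifyChain(userTok, accountTok, operatorPub)) // trusted chain: true
	fmt.Println("rogue account:", verifyChain(rogueTok, accountTok, operatorPub)) // rogue account: false
}
```

Note that the verifier only ever needs the Operator's public key and the Account token; the User token arrives with the connection.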

(For a more in-depth understanding of the mechanism I’ve described above, I highly recommend this guide.)

Re-using SSH Host Keys

Recently I’ve been building an agent process as part of a new project called Nits. This agent process is designed to connect to a NATS cluster and listen for system closures to apply on a NixOS host.

As you can imagine, from day one, this process needs to connect securely to the NATS cluster. This means each agent needs to be issued with a User NKey and User JWT.

It didn’t take long, though, before I started to think about re-using an existing NKey, which should already be available on a server: the ssh Ed25519 host key.

After all, it’s just an NKey lacking a bit of annotation and a checksum.

func (a *Agent) connectNats() error {
	nc := a.Options.NatsConfig

	var natsOpts []nats.Option

	if nc.Seed != "" {
		// use the provided NKey seed
		natsOpts = append(natsOpts, nats.UserJWTAndSeed(nc.Jwt, nc.Seed))
	}

	if nc.HostKeyFile != nil {
		// use the host ssh key instead
		b, err := io.ReadAll(nc.HostKeyFile)
		if err != nil {
			return errors.Annotate(err, "failed to read host key file")
		}

		// parse the private key
		signer, err := ssh.ParsePrivateKey(b)
		if err != nil {
			return errors.Annotate(err, "failed to parse host key")
		}

		// add a custom signer to the NATS connection options
		natsOpts = append(natsOpts, nats.UserJWT(
			func() (string, error) {
				return nc.Jwt, nil
			}, func(nonce []byte) ([]byte, error) {
				// sign the server's nonce; for an Ed25519 key, sig.Blob is the raw signature
				sig, err := signer.Sign(rand.Reader, nonce)
				if err != nil {
					return nil, err
				}
				return sig.Blob, nil
			}))
	}

	conn, err := nats.Connect(nc.Url, natsOpts...)
	if err != nil {
		return errors.Annotate(err, "failed to connect to NATS")
	}

	js, err := conn.JetStream()
	if err != nil {
		return errors.Annotate(err, "failed to create a jet stream context")
	}

	a.conn = conn
	a.js = js

	return nil
}

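// PublicKeyForSigner converts an ssh Ed25519 public key into a public User NKey for use when generating JWTs
func PublicKeyForSigner(signer ssh.Signer) (string, error) {
	key := signer.PublicKey()
	if key.Type() != ssh.KeyAlgoED25519 {
		// only Ed25519 keys can be re-used as NKeys
		return "", errors.Errorf("unsupported host key type %q", key.Type())
	}

	// the ssh wire format ends with the raw 32-byte Ed25519 public key
	marshalled := key.Marshal()
	pub := marshalled[len(marshalled)-32:]

	encoded, err := nkeys.Encode(nkeys.PrefixByteUser, pub)
	if err != nil {
		return "", err
	}

	return string(encoded), nil
}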

As you can see, the NATS Go client has been designed with custom signature methods in mind when connecting to a NATS server. And since the underlying signature mechanism is Ed25519-based, we can use a host ssh key to sign the nonce instead of a generated NKey.

For my particular use case, this helps create a more natural association between the agent process and the server it is running on and has the happy side-effect of being one less NKey to manage 😎.

— Edit: 2023-06-12 20:00 —

Added an example of how to convert the ssh public key into a User NKey for use in JWT generation.