TLS Docker jobs with Forgejo Actions (systemd)

While Forgejo (FJ) runner runs jobs and actions in a container in most cases, running container-based jobs inside such a container is quite tricky, especially when aiming for TLS support in Docker¹ combined with FJ runner running via systemd on the host. This combination requires mounting the DIND (Docker-in-Docker) TLS certs into several places across nested environments.

A dedicated Forgejo documentation page about using Docker with Forgejo runner exists, though at the time of writing it did not yet cover TLS support. Another example showcasing a TLS-enabled setup with FJ runner running in a container exists here. While both were helpful throughout the configuration process (as was a conversation with the always helpful @viceice), they did not cover all the details needed to make this setup work.

This post aims to close this gap and save others some time when going down the same path.

Tip

To install Forgejo runner, we use our forgejo_runner role in the devxy.cicd Ansible Collection.

1. Create a DIND container

There are two important points here:

The DIND container must run with TLS enabled on port 2376
A valid hostname must be set which can be referenced later within the job container

Here’s an Ansible task definition (using the community.docker.docker_container module) for the DIND container:

community.docker.docker_container:
  name: docker
  # this triggers dind to generate certs with this SAN entry
  hostname: dind_container
  image: docker:dind
  state: started
  restart_policy: always
  privileged: true
  env:
    DOCKER_TLS_CERTDIR: /certs
  volumes:
    - /mnt/docker_certs:/certs
  ports:
    - "127.0.0.1:2376:2376"

Note

Regarding the use of a bind mount instead of a named volume for the /certs directory: a named volume caused access issues when starting the FJ runner, because the runner runs as the forgejo_runner user while the volume is owned by root.

2. Connect `systemd` FJ runner to DIND daemon

The idea is to run all jobs via the DIND container and not via the host’s Docker daemon, which isolates the jobs from the host. If the host daemon were used instead, container jobs could potentially access the host filesystem or network.

We need to set some env vars in the service file to tell Forgejo runner to talk to the DIND daemon using TLS and where to find the TLS certs.

Environment="DOCKER_TLS_VERIFY=1"
Environment="DOCKER_CERT_PATH=/mnt/docker_certs/client"

The /mnt/docker_certs/client mount contains the TLS certs from the DIND container which are needed to connect to the DIND daemon (see also the diagram below for a graphical representation).

With this setting, all containers will be started within the DIND container - hence you will not see them on the host anymore via docker ps.
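If the service file is managed by hand rather than through a role, the two variables can live in a systemd drop-in. The following Ansible tasks are a minimal sketch of that idea. The unit name forgejo-runner.service and the drop-in file name docker-tls.conf are assumptions and will likely differ in your installation.

```yaml
# Sketch only: the unit name and the drop-in file name are assumptions.
- name: Create the drop-in directory for the runner unit
  ansible.builtin.file:
    path: /etc/systemd/system/forgejo-runner.service.d
    state: directory
    mode: "0755"

- name: Point the runner's Docker client at the DIND TLS certs
  ansible.builtin.copy:
    dest: /etc/systemd/system/forgejo-runner.service.d/docker-tls.conf
    mode: "0644"
    content: |
      [Service]
      Environment="DOCKER_TLS_VERIFY=1"
      Environment="DOCKER_CERT_PATH=/mnt/docker_certs/client"

# Restarts on every play run; a handler would be the more idiomatic choice.
- name: Reload systemd and restart the runner
  ansible.builtin.systemd:
    name: forgejo-runner.service
    state: restarted
    daemon_reload: true
```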

3. Mounting DIND TLS certs into the job containers

When a container job starts, the runner spawns a child container for it. Each of these containers must also have the DIND TLS certs available.

In addition, the DOCKER_HOST env var must be set so the job containers know where to find the docker daemon. But now comes the tricky part: while it was possible to use a (direct) bind mount for the DIND container and FJ runner via the host filesystem, care must be taken now with respect to source and destination as the job containers are nested inside the DIND container!

This means that from the point of view of the job containers, the “host” is the DIND container itself. Our goal is to mount the DIND TLS certs, which live under /certs inside the DIND container. The solution is a bind mount between the DIND container and the job container using /certs:/certs. This can be achieved by adding "--volume /certs:/certs" to the container.options setting in the FJ runner config file.
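As a rough sketch, the relevant excerpt of the FJ runner config file could look like this, assuming the usual YAML layout in which options is a single string under the container key (all other keys are omitted):

```yaml
# Excerpt of the FJ runner config file; other keys omitted.
container:
  # The source path (left of the colon) is resolved inside the DIND
  # container, not on the host, because DIND is the daemon creating the job.
  options: "--volume /certs:/certs"
```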

To verify this approach, we can start a build, let it “sleep”, exec into the DIND container and from there exec into the job container. There we should find the following in /certs:

tree certs/
 
certs/
├── ca
│   ├── cert.pem
│   ├── cert.srl
│   └── key.pem
├── client
│   ├── ca.pem
│   ├── cert.pem
│   ├── csr.pem
│   ├── key.pem
│   └── openssl.cnf
└── server
    ├── ca.pem
    ├── cert.pem
    ├── csr.pem
    ├── key.pem
    └── openssl.cnf

🎉️

4. Telling the job containers about the Docker daemon

Last step: after mounting the TLS certs, we need to tell the job containers where to find the docker daemon. We can do so via an env var DOCKER_HOST=tcp://dind_container:2376 coupled with DOCKER_TLS_VERIFY=1 and DOCKER_CERT_PATH=/certs/client. These can be set in the runner.envs section in the FJ runner config.
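A minimal sketch of the corresponding runner.envs section, assuming it is a plain key/value map in the YAML config file:

```yaml
# Excerpt of the FJ runner config file; other keys omitted.
runner:
  envs:
    # Hostname matches the `hostname` given to the DIND container in step 1.
    DOCKER_HOST: tcp://dind_container:2376
    # Quoted so YAML treats the value as a string rather than an integer.
    DOCKER_TLS_VERIFY: "1"
    # Path inside the job container, provided by the /certs:/certs mount.
    DOCKER_CERT_PATH: /certs/client
```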

And now the last bit: by default, the job containers are not able to resolve this hostname properly. To enable this, we need to add --add-host=dind_container:host-gateway to container.options.
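Combined with the volume flag from step 3, the container.options value might then look as follows. This is a sketch that assumes both flags are passed in one space-separated string:

```yaml
# Excerpt of the FJ runner config file; other keys omitted.
container:
  # --volume:   mounts the certs from the DIND container into the job.
  # --add-host: maps dind_container to the gateway as seen from the job,
  #             which is the DIND container itself.
  options: "--volume /certs:/certs --add-host=dind_container:host-gateway"
```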

That’s it! 😅️
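To check the whole chain end to end, a small workflow along these lines can help. It is only a sketch: the file name, the runs-on label docker, and the presence of the docker CLI in the job image are all assumptions about your setup.

```yaml
# .forgejo/workflows/dind-tls-check.yml (hypothetical file name)
on: [push]

jobs:
  dind-tls-check:
    # Assumption: "docker" is a label registered for this runner and maps
    # to an image that ships the docker CLI.
    runs-on: docker
    steps:
      - name: List the mounted client certs
        run: ls -l /certs/client
      - name: Talk to the DIND daemon over TLS
        run: docker version
```

If the second step prints both a client and a server section, the job container reached the DIND daemon using the mounted certs.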

Architecture

As the wiring got somewhat complex, I’ve put together a diagram to illustrate the architecture:

Architecture Diagram

¹ Docker has been pushing for TLS-backed connections for some time and will remove non-TLS connections soon. Although this was announced several major versions ago, it will (really) happen at some point, so putting the effort in is worth it. TLS-backed connections mean that clients connecting to the Docker daemon need to authenticate using TLS certificates. These certificates are generated by the DIND daemon during startup and must be presented by clients when connecting.
