How to use Docker layer caching in GitHub Actions

Kyle Galbraith - Jun 28 '22 - Dev Community

As we went through in our recent post Fast Dockerfiles: theory and practice, it's important to write a Dockerfile that builds quickly. Once you have one, the next step is to build it in a CI environment such as GitHub Actions, CircleCI, or Travis CI.

In this post, we will focus on how to build a Docker image as quickly as possible in GitHub Actions by leveraging layer caching. We will touch on what layer caching is, why it's important, and how we can leverage it in GitHub Actions to achieve faster builds.

Docker layer cache

A Docker layer is the output of running a step defined in your Dockerfile. It is built on top of the previous layer (its parent) and contains the filesystem changes your step made: files added, modified, or deleted. A final Docker image is just a series of Docker layers stacked one after another, plus some associated metadata, as the build moves from the top of your Dockerfile to the bottom.

The benefit of caching the layers that make up a final image is that, rather than building them again, you can reuse layers that have not changed from previous builds. Needing to do less work in a build makes the build faster.

For a deeper dive into how layers get stacked up based on the contents of your Dockerfile, see our fast Dockerfiles theory and practice post.

Building Docker images in GitHub Actions

If you have never built a Docker image via GitHub Actions, this section is for you. If you already know how to build images with Actions, feel free to jump to the next section, where we discuss caching layers in Actions.

Building a Docker image via GitHub Actions requires creating a new folder and file in your repository. You need to create a .github directory at the root of your repo, followed by a workflows directory inside it. Then you are going to add a YAML file called ci.yml.

mkdir .github
mkdir .github/workflows
touch .github/workflows/ci.yml

Inside the new YAML file, we'll define a job to build your Docker image.

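name: Build Docker image

on:
  push: {}

jobs:
  build-with-docker:
    name: Build with Docker
    runs-on: ubuntu-20.04
    steps:
      # Check out the repository so the Dockerfile and build context are available
      - uses: actions/checkout@v3
      # Create a Buildx builder instance for the build
      - uses: docker/setup-buildx-action@v3
      # Build the image from the Dockerfile at the repository root
      - uses: docker/build-push-action@v5
        with:
          context: .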

This is a complete GitHub Actions workflow with a single job, Build with Docker, that builds your image from the Dockerfile at the root of your repository. If your Dockerfile lives in a subdirectory, specify its path with the additional file field of build-push-action.
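As a rough sketch of the subdirectory case, the step below points the build at a Dockerfile outside the repository root. The file input belongs to build-push-action; the path ./docker/Dockerfile is a hypothetical location used only for illustration.

```yaml
# Sketch only: ./docker/Dockerfile is a hypothetical path, adjust it to your repo
- uses: docker/build-push-action@v5
  with:
    # The build context stays at the repository root
    context: .
    # Path to the Dockerfile, relative to the repository root
    file: ./docker/Dockerfile
```

Keeping context set to the repository root means COPY instructions in the Dockerfile still resolve against the whole repo rather than the subdirectory.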

The setup-buildx-action step configures Docker Buildx and creates a builder instance for running the image build. The following step, build-push-action, uses that instance to build your Docker image. The build-push-action supports all of the features provided by BuildKit out of the box. In our simple example, we are only specifying the Docker context, but more advanced features like SSH, secrets, and build args are supported.
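To make the more advanced inputs concrete, here is a minimal sketch of a step that passes a build arg and a secret. The build-args and secrets inputs are part of build-push-action, but the names APP_VERSION, npm_token, and NPM_TOKEN are assumptions made up for this example. SSH forwarding is left out because it depends on how an SSH agent is set up in the job.

```yaml
- uses: docker/build-push-action@v5
  with:
    context: .
    # Passed to the build as a build arg; APP_VERSION is a hypothetical ARG name
    build-args: |
      APP_VERSION=${{ github.sha }}
    # Exposed to the build as a BuildKit secret; the id npm_token and the
    # NPM_TOKEN repository secret are assumed to exist for this sketch
    secrets: |
      "npm_token=${{ secrets.NPM_TOKEN }}"
```

Inside the Dockerfile, a step such as RUN --mount=type=secret,id=npm_token can then read the value from /run/secrets/npm_token without it being written into a layer.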

If you commit the new ci.yml file, you should see a Docker build completed via GitHub Actions.

[Image: build with docker]

This is functional: with this simple workflow, you can build images in GitHub Actions as well as on your local machine. But if you run the build above more than once without making any code changes, you may notice a problem: every step in the Dockerfile is re-run each time. This basic workflow does not cache any Docker layers, so each layer must be recomputed for every build.

Let's take a look at how we can add layer caching.

Docker layer caching in GitHub Actions

To cache the layers produced by a docker build in GitHub Actions, we need to add a few more arguments to our build-push-action step. We will add the cache-from and cache-to arguments.

name: Build Docker image

on:
  push: {}

jobs:
  build-with-docker:
    name: Build with Docker
    runs-on: ubuntu-20.04
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          context: .
          cache-from: type=gha
          cache-to: type=gha,mode=max

Below our context, there are now cache-from and cache-to arguments; both are configured to use the cache type gha. This is an experimental cache exporter for GitHub Actions provided by buildx and BuildKit. It uses the GitHub Cache API to save and load the Docker layer cache blobs across builds.
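The gha cache type takes comma-separated parameters, of which mode is one. As a minimal sketch, assuming a repository that builds more than one image, the scope parameter can keep each image's cache separate. The scope name api-image is hypothetical.

```yaml
- uses: docker/build-push-action@v5
  with:
    context: .
    # scope namespaces the cache entries; "api-image" is an illustrative name
    cache-from: type=gha,scope=api-image
    # mode=max also exports layers from intermediate stages,
    # not only the layers of the final image
    cache-to: type=gha,mode=max,scope=api-image
```

Without distinct scopes, builds of different images that write to the same scope can overwrite each other's cache, so later builds may see fewer cache hits than expected.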

[Image: docker cache]

By adding remote cache loading and saving, you can reuse your Docker layers from previous builds — as we see with the CACHED hits above. This is a nice improvement if you're building images in GitHub Actions, but it does come with limitations:

  1. The GitHub Cache API only supports a maximum size of 10 GB for the entire repository
  2. Loading and saving cache is network bound, meaning the loading and saving could negate any performance benefits of using the cached layers for simple image builds
  3. The cache is locked to GitHub Actions and can't be used in other systems or on local machines

A managed solution

We built Depot to eliminate the limitations above, not only in GitHub Actions but in all CI providers.

We manage a fleet of remote builders, supporting x86 and Arm architectures, tied to a project you provision in Depot. These remote builders come with higher specs than traditional CI provider VMs, with 16 CPUs, 32 GB memory, and a persistent 50 GB NVMe cache disk that can be expanded up to 500 GB.

You don't need to think about saving and loading the Docker layer cache, as we persist it for you across builds automatically via a local SSD. As it's saved to a local disk, it's available instantly during builds, with no need to save or load cached layers from the network. It's even shared with anyone who has access to the project, so a developer who runs a build locally can just reuse the cached layers that CI computed.

If you are interested in trying out Depot in your GitHub Actions workflow, check out our GitHub Actions integration guide.
