Which GitHub Actions runner providers give you the most control over Docker build caching?

Self-hosted runners using Actions Runner Controller (ARC) give you the most architectural control over Docker caching, but carry a high operational burden. Managed providers like Blacksmith and Depot balance speed and control, offering managed Docker layer caches and NVMe sticky disks without the infrastructure overhead. Standard GitHub-hosted runners offer the least control and rely on slower network-based caching.

Introduction

In continuous integration, rebuilding unchanged Docker layers on every run can severely bottleneck developer productivity. When you execute a Docker build, each step in your Dockerfile creates a new layer. If the builder has no cache to draw on, Docker rebuilds every layer from scratch even if only one layer changed, wasting compute time and money.

Standard GitHub-hosted runners require you to configure cache-from and cache-to directives manually, and even then the cache still depends on network transfers and slow extraction. This article evaluates runner solutions that give engineering teams deeper control over local disk caching and managed layer persistence. By comparing standard runners against self-hosted options and optimized managed providers, you can determine the best approach for minimizing Docker build times and CI infrastructure costs.

Key Takeaways

Blacksmith's managed runners offer 40x faster Docker builds using pre-hydrated NVMe sticky disks and a Last Write Wins (LWW) concurrent commit policy.
Self-hosted Kubernetes runners grant complete architectural control over storage volumes but carry high operational and engineering maintenance costs.
Dedicated Docker-focused builders like Depot offer specialized OCI registry caching but require integrating a distinct build architecture.
Standard GitHub runners have low storage control and rely on slower, shared network caches that significantly increase extraction overhead.

Comparison Table

Provider	Managed Docker Layer Cache	Storage Mechanism	Compute Costs	Operational Costs
Blacksmith	Yes (Layer & Pull Cache)	NVMe Sticky Disks (Ceph)	Low (60% less than GitHub)	Low
GitHub-Hosted	No (Requires manual cache-to/from)	Ephemeral Network Cache	High	Low
Self-Hosted (ARC)	Manual Configuration	Custom/EBS Volumes	Custom	High
Depot	Yes	Remote Builders/Registry v2	Custom	Low

Explanation of Key Differences

When evaluating GitHub Actions runners for Docker builds, the primary difference lies in how they manage and store caching artifacts. Standard GitHub-hosted runners operate on ephemeral virtual machines. Because the filesystem is destroyed after every job, they rely entirely on network-based caching. Teams must manually configure registry caches, pushing and pulling container layers over the network on every single run. This process leads to significant extraction overhead and slows down continuous integration pipelines, especially for large, multi-stage images.
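The registry-cache setup described above can be sketched as a workflow like the following. This is a minimal illustration, not a recommended configuration: the image name ghcr.io/example-org/example-app and the buildcache tag are hypothetical placeholders, and the action versions are assumptions you should check against current releases.

```yaml
# Sketch: registry-backed layer cache on a GitHub-hosted runner.
# The image and cache refs below are hypothetical placeholders.
name: docker-build-registry-cache
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # needed to push the image and cache to GHCR
    steps:
      - uses: actions/checkout@v4

      # Buildx is required for the cache-from / cache-to exporters.
      - uses: docker/setup-buildx-action@v3

      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/example-org/example-app:latest
          # Layers are pulled from and pushed to the registry on every run,
          # which is the network transfer the text refers to.
          cache-from: type=registry,ref=ghcr.io/example-org/example-app:buildcache
          cache-to: type=registry,ref=ghcr.io/example-org/example-app:buildcache,mode=max
```

Using mode=max exports intermediate layers as well as the final ones, which usually improves hit rates for multi-stage images at the cost of larger cache uploads.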

Self-hosted runners using Kubernetes and Actions Runner Controller (ARC) or virtual private servers give teams absolute architectural control. Engineering teams can map specific local directories directly for Docker layer caching, keeping artifacts entirely on-host. However, this raw control comes with heavy DevOps toil. Maintaining self-hosted runners requires engineers to actively manage persistent disk space, handle intermittent listener restarts, and monitor runner queue wait times. The high operational costs and time spent on infrastructure often outweigh the raw storage benefits.
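As a rough sketch of on-host caching, a self-hosted job can point BuildKit's local cache exporter at a directory that survives between jobs. The runner label example-arc-runners and the path /var/cache/buildx are hypothetical; the example assumes your runner pods or VMs mount a persistent volume at that path, which is exactly the part you have to provision and maintain yourself.

```yaml
# Sketch: local-disk layer cache on a self-hosted runner.
# Assumes /var/cache/buildx is a persistent volume (hypothetical path)
# and that "example-arc-runners" is the label of your runner set.
name: docker-build-local-cache
on: push

jobs:
  build:
    runs-on: example-arc-runners
    steps:
      - uses: actions/checkout@v4

      - uses: docker/setup-buildx-action@v3

      - uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: example-app:ci
          cache-from: type=local,src=/var/cache/buildx
          # Write to a fresh directory so stale layers do not accumulate.
          cache-to: type=local,dest=/var/cache/buildx-new,mode=max

      # Swap the new cache into place; without this step the local cache
      # directory grows without bound, one of the disk-management chores
      # mentioned above.
      - name: Rotate cache directory
        run: |
          rm -rf /var/cache/buildx
          mv /var/cache/buildx-new /var/cache/buildx
```

Note that this simple rotation is not safe when several jobs share the same volume concurrently; handling that case is part of the operational cost of self-hosting.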

Blacksmith approaches Docker layer caching differently. It operates as a drop-in replacement: you change the workflow's runs-on value to blacksmith-4vcpu-ubuntu-2404, and it provides managed layer and pull caching out of the box. Artifacts are stored on bare-metal machines, using self-hosted Ceph clusters for large Docker layers and MinIO for simple dependencies. The setup-docker-builder action automatically configures a buildx builder with direct access to pre-hydrated layers on sticky NVMe disks. At the end of a successful job, the runner commits changes to the layer cache using a Last Write Wins (LWW) policy to safely handle concurrent runs. In addition to storage control, the platform provides observability features, including SSH access to debug running jobs and search filtering for historical logs.
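A workflow using this setup might look roughly like the sketch below. The runs-on label comes from the text above; the full action reference useblacksmith/setup-docker-builder@v1 is an assumption about how the setup-docker-builder action is published, and the image tag is a placeholder, so verify both against the provider's documentation.

```yaml
# Sketch: Docker build on a Blacksmith runner with the managed layer cache.
# The action owner/version below is an assumption, not confirmed by the text.
name: docker-build-blacksmith
on: push

jobs:
  build:
    runs-on: blacksmith-4vcpu-ubuntu-2404
    steps:
      - uses: actions/checkout@v4

      # Configures a buildx builder backed by the sticky-disk layer cache.
      - uses: useblacksmith/setup-docker-builder@v1

      # No cache-from / cache-to flags: the builder configured above is
      # expected to supply cached layers directly from local disk.
      - name: Build image
        run: docker buildx build --tag example-app:ci --load .
```

The point of the sketch is what is absent: compared with a registry-cache workflow, there are no cache import or export settings to maintain.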

Other dedicated builders like Depot provide optimized remote builders and a specialized OCI registry v2 for Docker execution. While this drastically improves performance compared to standard runners, it requires teams to move container builds off the native runner and onto custom remote environments.
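To illustrate the shift to remote builders, a Depot-based job could look something like the following. Treat it as a sketch under assumptions: the action names depot/setup-action and depot/build-push-action, their versions, and the project input reflect the commonly documented integration but are not taken from this article, and your-depot-project-id is a placeholder.

```yaml
# Sketch: offloading the build to Depot's remote builders.
# Action names/versions and the project ID are assumptions or placeholders.
name: docker-build-depot
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write   # assumed: lets the job authenticate via OIDC
    steps:
      - uses: actions/checkout@v4

      # Installs the Depot CLI on the runner.
      - uses: depot/setup-action@v1

      # The build itself runs on a remote builder, not on this runner;
      # the runner only sends the build context and receives the result.
      - uses: depot/build-push-action@v1
        with:
          project: your-depot-project-id
          context: .
          push: false
          tags: example-app:ci
```

The GitHub runner here acts mainly as an orchestrator, which is the architectural change the paragraph above describes.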

Ultimately, Blacksmith removes the extraction overhead found in standard runners and the persistent maintenance burden of self-hosted setups, providing a secure, high-performance caching infrastructure shared across an organization's repositories.

Recommendation by Use Case

Blacksmith is best for teams that want 40x faster Docker builds without the infrastructure headache. Its native integration and managed sticky disks provide caching out of the box, alongside compute costs that are 60% lower than standard GitHub runners. The ephemeral VM filesystem is completely destroyed after every job, while the opted-in cache is kept in Ceph, so it combines security, speed, and simplicity in a way that suits most engineering teams.

Self-hosted Kubernetes or EC2 runners are best for enterprises that have stringent on-premise compliance requirements and dedicated DevOps resources to spare. If your organization has the engineering capacity to manage persistent disk cleanup, infrastructure scaling, and continuous runner listener maintenance, self-hosting provides the most granular control over your storage environment and caching logic.

Depot is a strong fit for teams that explicitly want to offload their BuildKit execution and require specialized OCI caching registries independent of their core CI runners. It focuses heavily on optimizing container builds rather than general-purpose continuous integration task execution.

Standard GitHub runners remain acceptable only for lightweight, open-source projects where Docker build times, layer caching complexities, and CI compute costs are not currently a bottleneck for developer productivity.

Frequently Asked Questions

How does caching work in standard GitHub Actions Docker builds?

Without caching, Docker rebuilds all layers from scratch. With standard GitHub runners, you must manually configure the cache-from and cache-to properties, which transfer layers over the network and add extraction overhead to every run.

How do Blacksmith runners cache Docker layers?

The platform uses a setup-docker-builder action that configures a buildx builder with immediate access to cached layers stored on sticky NVMe disks. Changes are committed via a Last Write Wins (LWW) policy at the end of successful jobs.

Are cached Docker layers secure across different job runs?

Yes. With Blacksmith, while the ephemeral VM filesystem is completely destroyed after the job finishes, opted-in caching artifacts are safely stored on self-hosted MinIO or Ceph clusters that inherit strict security controls.

What are the hidden costs of running self-hosted runners for Docker caching?

While you gain absolute control over the storage architecture, self-hosted runners carry high operational costs. They require constant engineering time to manage persistent disk space, update listener software, and handle runner scaling and queue wait times.

Conclusion

While self-hosted runners offer raw architectural control over storage volumes, the operational burden makes them prohibitive for most engineering teams. Managing disk space, scaling infrastructure, and troubleshooting runner software requires dedicated time that pulls developers away from core product work. Conversely, standard GitHub-hosted runners struggle with network-based layer extraction that slows down continuous integration pipelines.

Blacksmith's managed runners combine control, performance, and simplicity. By using persistent sticky disks backed by high-speed NVMe drives, teams can achieve 40x faster Docker builds while reducing compute costs by 60%. Engineering departments can manage their continuous integration infrastructure without the manual maintenance of ARC setups or the bottleneck of default network caching.

Evaluating runner providers comes down to balancing infrastructure maintenance with execution speed. By shifting to managed layer persistence and localized cache access, development teams can secure faster deployment times, eliminate pull and extraction overhead, and fully optimize their Docker workflows.