Use pull through cache on ECR to circumvent Docker Hub rate limits

Update November 2023: AWS now natively supports Docker Hub as an upstream registry for pull through cache, so you can use it directly. You can still use this module if you need custom Dockerfile lines for an image (for example a volume mount).
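For reference, a native Docker Hub rule could look roughly like the sketch below. The resource names, the docker-hub prefix and the two variables are placeholders I made up for illustration. As far as I know, the Secrets Manager secret name has to start with ecr-pullthroughcache/ and the credential_arn argument needs a reasonably recent AWS provider version.

```hcl
# Hypothetical variables holding your Docker Hub credentials.
variable "docker_hub_username" {
  type = string
}

variable "docker_hub_access_token" {
  type      = string
  sensitive = true
}

# ECR expects the secret name to be prefixed with "ecr-pullthroughcache/".
resource "aws_secretsmanager_secret" "docker_hub" {
  name = "ecr-pullthroughcache/docker-hub"
}

# The secret value is a JSON document with a username and an access token.
resource "aws_secretsmanager_secret_version" "docker_hub" {
  secret_id = aws_secretsmanager_secret.docker_hub.id
  secret_string = jsonencode({
    username    = var.docker_hub_username
    accessToken = var.docker_hub_access_token
  })
}

# Pull through cache rule that uses Docker Hub as the upstream registry.
resource "aws_ecr_pull_through_cache_rule" "docker_hub" {
  ecr_repository_prefix = "docker-hub"
  upstream_registry_url = "registry-1.docker.io"
  credential_arn        = aws_secretsmanager_secret.docker_hub.arn
}
```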

Since Docker Hub introduced their rate limits, if you start a lot of task containers for jobs you might run into the following error: “Too Many Requests – Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit”. You can resolve this by using an ECR feature called Pull Through Cache. It has a couple of pitfalls, though, because Docker Hub was not supported as an upstream registry when this post was written.

Using ECR pull through cache

With Terraform it is as easy as setting the following configuration:

resource "aws_ecr_pull_through_cache_rule" "example" {
  ecr_repository_prefix = "ecr-public"
  upstream_registry_url = "public.ecr.aws"
}

You now have a namespace (the ecr-public prefix) that you can use for pulling public containers. To try it out, first log in to ECR:

aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin ACCOUNTID.dkr.ecr.eu-west-1.amazonaws.com

Then pull an image:

docker pull ACCOUNTID.dkr.ecr.eu-west-1.amazonaws.com/ecr-public/bitnami/mongodb:latest

The first time you pull the image, ECR caches it. Subsequent pulls are served from your own registry, so you benefit from the speed improvement.

Note that if you use ECR Public you need to look up the image in the ECR Public Gallery (and not on Docker Hub), because the namespaces might be different.

Please note that only a few public registries are supported. When this post was written, Docker Hub was not one of them (see the update at the top).

However, if you can find the image in the ECR Public registry, you can safely use it from there. Official Docker Hub images all start with docker/library on ECR Public and are automatically kept in sync with Docker Hub.
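As a sketch of how the resulting image reference is put together, the locals below build the URI for the official nginx image through the ecr-public prefix configured earlier. The region variable, the local names and the choice of nginx are only there for illustration.

```hcl
# Assumed region; adjust to wherever your registry lives.
variable "region" {
  type    = string
  default = "eu-west-1"
}

# Looks up the account ID of the credentials Terraform runs with.
data "aws_caller_identity" "current" {}

locals {
  # Hostname of your private ECR registry.
  ecr_registry = "${data.aws_caller_identity.current.account_id}.dkr.ecr.${var.region}.amazonaws.com"

  # Must match ecr_repository_prefix of the pull through cache rule.
  cache_prefix = "ecr-public"

  # Official Docker Hub images live under docker/library on ECR Public.
  nginx_image = "${local.ecr_registry}/${local.cache_prefix}/docker/library/nginx:latest"
}

output "nginx_image" {
  value = local.nginx_image
}
```

You can then use the nginx_image value wherever you would otherwise reference nginx:latest, for example in a task definition.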

Supported registries and where to find the image namespaces:

  1. Amazon ECR Public
  2. Quay
  3. Kubernetes container image registry
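The other two registries work the same way as the ECR Public rule. A minimal sketch, where the resource names and the repository prefixes are arbitrary choices of mine:

```hcl
# Cache images from Quay under the "quay" prefix.
resource "aws_ecr_pull_through_cache_rule" "quay" {
  ecr_repository_prefix = "quay"
  upstream_registry_url = "quay.io"
}

# Cache images from the Kubernetes registry under the "kubernetes" prefix.
resource "aws_ecr_pull_through_cache_rule" "kubernetes" {
  ecr_repository_prefix = "kubernetes"
  upstream_registry_url = "registry.k8s.io"
}
```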

This approach also reduces latency, especially if you set up a VPC endpoint for ECR so that the container pull does not have to traverse the public internet. This is not guaranteed, though, as AWS does not advertise latency improvements as a benefit of VPC endpoints. In practice, however, I’ve seen some great results using this approach.
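A rough sketch of the endpoints involved is shown below, assuming eu-west-1 and a VPC you already manage. All four variables are hypothetical inputs you would have to supply yourself. ECR needs two interface endpoints (API and Docker registry), and because image layers are stored in S3 you will usually want an S3 gateway endpoint as well.

```hcl
# Hypothetical inputs describing your existing network.
variable "vpc_id" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

variable "endpoint_security_group_ids" {
  type = list(string)
}

variable "private_route_table_ids" {
  type = list(string)
}

# Interface endpoint for ECR API calls (for example authentication).
resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.eu-west-1.ecr.api"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = var.endpoint_security_group_ids
  private_dns_enabled = true
}

# Interface endpoint for the Docker registry traffic (docker pull).
resource "aws_vpc_endpoint" "ecr_dkr" {
  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.eu-west-1.ecr.dkr"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = var.endpoint_security_group_ids
  private_dns_enabled = true
}

# Gateway endpoint for S3, where the image layers are stored.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.eu-west-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
```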

If you cannot find your container on these registries, you can use my Terraform module.

Terraform module to use Docker Hub in pull through cache on ECR

I’ve also made a simple Terraform module that periodically syncs public repositories from Docker Hub to your local ECR private registries.

This is because some actively maintained images (for example Cloudflared) are only hosted on Docker Hub. ECR Public has a couple of cloudflared images, but they are managed by unofficial contributors (and might not even be Cloudflared!).

In addition, this module strengthens your security posture. One of the big risks of allowing public registries like Docker Hub is that you are only a spelling error away from a supply chain attack, as mentioned here. By using only whitelisted images that are pulled through an ECR VPC endpoint of your own registry, you can make sure your developers can only use approved containers in your infrastructure.

The module can also append lines to the Dockerfile of an image, for instance to declare VOLUMEs for bind mounts automatically.

You can pass in the following additional build commands:

build_commands = {
  "hashicorp/vault:1.14" = [
    "RUN mkdir /etc/vault",
    "RUN chmod 777 /etc/vault",
    "VOLUME [\"/etc/vault\"]"
  ]
}

This results in the following Dockerfile:

FROM hashicorp/vault:1.14
RUN mkdir /etc/vault
RUN chmod 777 /etc/vault
VOLUME ["/etc/vault"]
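To make the mapping concrete, here is one way such a Dockerfile could be rendered from the build_commands map in plain Terraform. This is only a sketch to show the idea, not necessarily how the module does it internally, and the local and output names are made up.

```hcl
locals {
  # Same input as shown above: image name => extra Dockerfile lines.
  build_commands = {
    "hashicorp/vault:1.14" = [
      "RUN mkdir /etc/vault",
      "RUN chmod 777 /etc/vault",
      "VOLUME [\"/etc/vault\"]"
    ]
  }

  # Prepend a FROM line for each image and join everything with newlines.
  dockerfiles = {
    for image, commands in local.build_commands :
    image => join("\n", concat(["FROM ${image}"], commands))
  }
}

output "dockerfiles" {
  value = local.dockerfiles
}
```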

This allows you to create volume mounts in stock Docker Hub images as well, or to set environment options that customise the image at runtime.