Why Your AWS ECS Task is Stuck in Pending—And What to Do About It
When an AWS ECS task is stuck in the pending state, the first instinct is often to blame ECS itself. However, more often than not, the real culprit lies outside ECS—from networking misconfigurations to image pull failures. If you find your ECS tasks hanging indefinitely in pending, the key is to dig into the underlying infrastructure. Let’s break down the most common reasons why your ECS task won’t transition to RUNNING and how to fix them.
It’s Not ECS—It’s Your Infrastructure
The biggest misconception when troubleshooting ECS issues is assuming that ECS is at fault. In reality, ECS is just a scheduler—it orchestrates containers but relies heavily on other AWS components to function. The most common reasons for an ECS task getting stuck in pending are:
- A large container image that takes too long to pull
- ECS cannot access Amazon ECR (or Docker Hub) due to networking issues
- Your cluster has insufficient resources (CPU, memory, or Fargate quotas)
- IAM roles are misconfigured, preventing ECS from assuming the task execution role
If your ECS tasks are stuck, the first thing you should check is not ECS itself, but your infrastructure setup.
1. Your Container Image is Too Large
One of the most overlooked issues is a large container image. ECS needs to pull the image before launching a container, and if the image is too big, this process can take time—or even fail.
How to Check
Run the following command to see your image size:
docker images
If your image is over 1GB, that’s a red flag. Large images significantly slow down the deployment process, especially when pulled over a slow network.
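If you want to zero in on one image (the image name below is a placeholder), docker supports a Go-template format flag:
docker images your-image-name --format '{{.Repository}}:{{.Tag}} {{.Size}}'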
Solution
- Optimize your Docker image by using multi-stage builds or lightweight base images like alpine (see the sketch after this list).
- Use the ECS agent's image cache on EC2 container instances to avoid unnecessary pulls by setting ECS_IMAGE_PULL_BEHAVIOR=prefer-cached in /etc/ecs/ecs.config.
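As an illustration, here is a minimal multi-stage Dockerfile sketch for a hypothetical Node.js service with a build script; the first stage keeps the full toolchain, while the runtime stage ships only production dependencies and the built output on an alpine-based image:

# Build stage: full toolchain and dev dependencies
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: slim base, production dependencies, built output only
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
CMD ["node", "dist/index.js"]

The same pattern applies to Go, Java, or Python images: build-time dependencies never reach the final layer, which is what keeps the pull small.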
2. ECS Can’t Access Your Image Repository
ECS needs to pull container images from Amazon ECR, Docker Hub, or another registry. If it lacks network access, your task will stay stuck in pending indefinitely.
How to Check
If you're using Fargate, check if your subnet has internet access:
- Public subnets: Must have an Internet Gateway.
- Private subnets: Must have a NAT Gateway or AWS PrivateLink configured.
For EC2-based ECS clusters, SSH into an instance and try pulling an image manually:
docker pull <your-image-url>
If this fails, ECS can’t reach the registry.
Solution
- For Fargate, ensure your task is in a subnet with a NAT Gateway or AWS PrivateLink.
- For EC2, update your security groups and route tables to allow outbound internet access, and make sure the instance's IAM role includes the ECR pull permissions.
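If you block outbound internet and rely on PrivateLink instead, you can list which endpoints the VPC already has (the VPC ID is a placeholder). For ECR you need both the ecr.api and ecr.dkr interface endpoints plus an S3 gateway endpoint, because image layers are served from S3:
aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<your-vpc-id> --query 'VpcEndpoints[].ServiceName'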
3. Your Cluster is Out of Resources
If your cluster is running low on CPU or memory, ECS might not be able to schedule new tasks.
How to Check
For Fargate, check your account limits:
aws service-quotas list-service-quotas --service-code fargate
For EC2-based clusters, describe your instances:
aws ecs list-container-instances --cluster your-cluster-name
aws ecs describe-container-instances --cluster your-cluster-name --container-instances <instance-id>
If your instance doesn’t have enough memory or CPU available, ECS won’t schedule the task.
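The field to look at in that output is remainingResources. As a sketch, this JMESPath query narrows the response to the spare CPU and memory on a single instance:
aws ecs describe-container-instances --cluster your-cluster-name --container-instances <instance-id> --query 'containerInstances[0].remainingResources[?name==`CPU` || name==`MEMORY`]'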
Solution
- For Fargate, request a quota increase in AWS Service Quotas.
- For EC2, scale up your cluster by adding more instances.
4. IAM Role Misconfigurations
ECS tasks need an execution role to pull images and launch containers. If this role is missing permissions, the task stays in pending.
How to Check
Run:
aws ecs describe-tasks --cluster your-cluster-name --tasks <task-id>
If you see an error related to IAM permissions, your execution role is likely misconfigured.
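To surface the failure reason without scrolling through the full JSON, you can filter the same call down to the stopped reason and any per-container errors:
aws ecs describe-tasks --cluster your-cluster-name --tasks <task-id> --query 'tasks[0].[stoppedReason, containers[].reason]'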
Solution
Ensure your ECS task execution role has this managed policy attached:
arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
Update your IAM role with:
aws iam attach-role-policy --role-name ecsTaskExecutionRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
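Attaching the managed policy is not enough if the role's trust policy doesn't let ECS assume it. The trust relationship on ecsTaskExecutionRole should allow the ecs-tasks.amazonaws.com service principal, along these lines:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}

You can inspect the current trust policy with:
aws iam get-role --role-name ecsTaskExecutionRole --query 'Role.AssumeRolePolicyDocument'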
5. Your Essential Containers are Failing Health Checks
ECS won’t move a task to RUNNING if an essential container is unhealthy. And if an essential container depends on another container (via a dependsOn condition such as HEALTHY) that never passes its health check, the whole task waits indefinitely in pending.
How to Check
Run:
aws ecs describe-tasks --cluster your-cluster-name --tasks <task-id>
If your task is waiting on a dependency container that is not HEALTHY, that’s your issue.
Solution
- Define a startPeriod in your container health checks (and a health check grace period on the service) so slow-starting containers aren't marked unhealthy prematurely.
- Ensure your non-essential containers don’t block essential ones, for example by loosening overly strict dependsOn conditions. A sketch of the relevant fields follows below.
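For illustration, here is a minimal sketch of those task definition fields, assuming a hypothetical app container that waits for a redis sidecar to report healthy:

"containerDefinitions": [
  {
    "name": "redis",
    "image": "redis:7-alpine",
    "essential": false,
    "healthCheck": {
      "command": ["CMD-SHELL", "redis-cli ping || exit 1"],
      "interval": 10,
      "timeout": 5,
      "retries": 3,
      "startPeriod": 15
    }
  },
  {
    "name": "app",
    "image": "<your-image-url>",
    "essential": true,
    "dependsOn": [{ "containerName": "redis", "condition": "HEALTHY" }]
  }
]

If redis never reports healthy, app never starts; generous startPeriod values give slow dependencies room to come up.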
Conclusion
If your AWS ECS task is stuck in pending, the issue is rarely ECS itself. Instead, the most common problems stem from network misconfigurations, large images, resource shortages, or IAM role issues. The key to resolving this quickly is focusing on underlying AWS services rather than ECS itself.
Next time you’re debugging a stuck task, remember: ECS isn’t broken—your infrastructure is.
Fargate: Debugging Pending Tasks Without a Shell
Troubleshooting AWS Fargate tasks stuck in pending is fundamentally different from debugging ECS on EC2 instances. With EC2-based clusters, you can SSH into the instance, check logs, and manually pull images. With Fargate, you have no direct access to the underlying infrastructure—which means you must diagnose issues using AWS logs, task metadata, and network configurations.
If your Fargate task is stuck in pending, the two most common culprits are:
- Large container images taking too long to pull.
- Network misconfigurations preventing Fargate from reaching Amazon ECR (or Docker Hub).
Let’s go through how to diagnose and resolve both issues.
1. Large Container Images Cause Delays
Unlike EC2 instances, which often cache container images, every Fargate task must pull the image from scratch. If your container image is over 1GB, the pull process takes longer—sometimes long enough for ECS to time out and leave the task stuck in pending.
How to Check Image Size
If your task definition uses Amazon ECR, check your image size:
aws ecr describe-images --repository-name your-repo-name --image-ids imageTag=latest
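That call returns the full image metadata; the response includes imageSizeInBytes (the compressed size), which you can pull out directly:
aws ecr describe-images --repository-name your-repo-name --image-ids imageTag=latest --query 'imageDetails[0].imageSizeInBytes'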
If you're using Docker Hub or another registry, inspect your local image size:
docker images | grep your-image-name
How to Fix It
- Optimize your Docker image by using a lightweight base image (e.g., alpine instead of ubuntu).
- Use multi-stage builds to eliminate unnecessary dependencies.
- Enable Seekable OCI (SOCI) indexes so Fargate can lazily load large images and start tasks before the full image is downloaded - see our blog article here
2. Network Issues: Fargate Can’t Pull Images
For Fargate to run a task, it must pull the container image from Amazon ECR, Docker Hub, or another registry. If your VPC configuration is incorrect, the task will never reach the registry—keeping it in pending forever.
How to Check Networking Issues
- Verify your task is in the right subnet
- Fargate tasks need a network path to the registry: either a public subnet with a public IP assigned, or a private subnet with outbound access via a NAT Gateway or AWS PrivateLink.
- Run:
aws ecs describe-tasks --cluster your-cluster-name --tasks <task-id>
If networking is the issue, the stoppedReason field might reference network failures.
- Check VPC route tables
- If your subnet doesn’t have a NAT Gateway or VPC endpoint for Amazon ECR, Fargate can’t pull the image.
- Run:
aws ec2 describe-route-tables --filters Name=vpc-id,Values=<your-vpc-id>
- Ensure there's an outbound route to either:
- An Internet Gateway (for public subnets).
- A NAT Gateway (for private subnets).
- A VPC Endpoint for Amazon ECR (if you’re blocking outbound internet; see the example command after this list).
- Confirm security groups and IAM permissions
- Your task execution role must include:

{
  "Effect": "Allow",
  "Action": [
    "ecr:GetAuthorizationToken",
    "ecr:BatchCheckLayerAvailability",
    "ecr:GetDownloadUrlForLayer",
    "logs:CreateLogStream",
    "logs:PutLogEvents"
  ],
  "Resource": "*"
}

- Ensure security groups allow outbound HTTPS traffic to ECR or Docker Hub.
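If the route-table check shows no path to ECR and you want to keep traffic off the public internet, interface endpoints can be created from the CLI. A sketch, assuming us-east-1 and placeholder IDs; remember that ECR needs both the ecr.api and ecr.dkr endpoints, plus an S3 gateway endpoint, because image layers are downloaded from S3:

aws ec2 create-vpc-endpoint \
  --vpc-id <your-vpc-id> \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.ecr.dkr \
  --subnet-ids <subnet-id> \
  --security-group-ids <security-group-id>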
3. Debugging with Fargate-Specific Logs
Since you can’t SSH into a Fargate task, CloudWatch Logs and Task Events are your only debugging tools.
Check CloudWatch Logs
- Enable logging in your task definition:
"logConfiguration": { "logDriver": "awslogs", "options": { "awslogs-group": "/ecs/your-task-name", "awslogs-region": "us-east-1", "awslogs-stream-prefix": "ecs" } }
- View logs in CloudWatch:
aws logs tail /ecs/your-task-name --follow
If your task never writes logs, it likely failed before starting—meaning networking or IAM permissions are the issue.
Check Task Events
Run:
aws ecs describe-services --cluster your-cluster-name --services your-service-name
Look for the events[] array; it often contains error messages like:
- "Task failed to start due to image pull failure" → Check ECR or network access.
- "Task stopped because essential container exited" → Check image compatibility or environment variables.
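To skim only the most recent messages instead of the full response, a small query helps:
aws ecs describe-services --cluster your-cluster-name --services your-service-name --query 'services[0].events[:5].message' --output text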
Final Thoughts: Debugging Fargate is Different
The biggest challenge with Fargate troubleshooting is that you can’t SSH into the underlying infrastructure. Instead, you must rely on:
- CloudWatch logs to check if your container started.
- ECS task events for failures in networking or execution.
- AWS CLI commands to validate image size, IAM roles, and VPC configurations.
If your Fargate task is stuck in pending, chances are it’s either:
✅ Too big (image pull issues) → Optimize your container size.
✅ Misconfigured (networking issues) → Check your NAT Gateway or VPC Endpoints.
✅ Lacking permissions → Ensure ECS can assume the execution role.
By focusing on these areas, you can fix pending tasks faster—without needing direct shell access.
