
Kubernetes ImagePullBackOff

Encountering ImagePullBackOff means Kubernetes cannot pull the container image from the registry; this guide explains how to fix it.

What This Error Means

When you see ImagePullBackOff in your Kubernetes cluster, it's a clear signal that the kubelet on a node has repeatedly failed to download the container image specified for a pod. The pod typically cycles through Pending or ContainerCreating (often showing ErrImagePull first) before settling into ImagePullBackOff. Essentially, your application cannot start because Kubernetes can't get its necessary building blocks, the container images, from where they're supposed to live.

As a platform engineer, this is one of the most common initial hurdles I encounter when deploying new services or when existing images are moved or updated. It indicates a fundamental issue in the image retrieval process, which could range from a simple typo to complex networking or authentication failures. Understanding what this error truly represents is the first step to a swift resolution: it's not that your application code is bad, but rather that the environment isn't set up to provide it.

Why It Happens

The ImagePullBackOff error occurs during the image pulling phase of a pod's lifecycle. When a pod is scheduled to a node, the kubelet on that node is responsible for ensuring all specified containers are running. Part of this involves downloading the necessary container images. The process generally involves:

  1. Resolving the Image Name: Interpreting the image name (e.g., myregistry.com/myrepo/myimage:tag) to find the correct registry.
  2. Authenticating with the Registry: Providing credentials if the image is private.
  3. Downloading Image Layers: Fetching all the individual layers that make up the container image.

If any of these steps fail, the kubelet retries. After several failed attempts, it gives up for a period, marking the pod with ImagePullBackOff. This "back-off" period gradually increases with each subsequent failure, meaning the pod will wait longer and longer between pull attempts. In my experience, the "why" often points to one of three main categories: connectivity, authentication, or image availability.
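As a rough sketch of that back-off schedule (assuming the kubelet defaults of a 10-second initial delay doubling up to a 300-second cap; the exact values are a kubelet implementation detail and not configurable per pod):

```bash
# Sketch of the kubelet's doubling image-pull back-off
# (10s initial delay, 300s cap are assumed kubelet defaults)
delay=10
for attempt in 1 2 3 4 5 6; do
  echo "attempt ${attempt}: next retry in ${delay}s"
  delay=$(( delay * 2 ))
  [ "$delay" -gt 300 ] && delay=300
done
```

This is why a pod that has been failing for a while can sit for five minutes between pull attempts even after you fix the underlying problem; deleting the pod (or letting the Deployment replace it) resets the back-off.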

Common Causes

Debugging ImagePullBackOff often feels like detective work, starting broad and narrowing down the possibilities. Here are the most common culprits I've encountered:

  • Incorrect Image Name or Tag: This is by far the simplest and most frequent cause. A typo in the image name, an incorrect repository path, or a non-existent tag (e.g., latest not actually pushed, or an old tag deleted) will prevent the image from being found. Always double-check your deployment.yaml or pod.yaml for exact matches.
  • Private Registry Authentication Failure: If you're pulling from a private registry (like Docker Hub private repos, AWS ECR, GCP GCR, Azure ACR, or a self-hosted Harbor), Kubernetes needs credentials.
    • Missing imagePullSecrets: Your pod specification might not include imagePullSecrets to reference a Kubernetes Secret containing registry credentials.
    • Incorrect imagePullSecrets: The secret itself might be malformed, expired, or contain incorrect username/password/token.
    • Wrong Secret Scope: The secret might not exist in the same namespace as the pod.
    • Service Account Permissions: The service account used by the pod might not have permission to read the imagePullSecrets secret.
  • Network Connectivity Issues:
    • Firewall Rules: The node might be unable to reach the image registry due to outbound firewall rules.
    • Proxy Configuration: If your cluster operates behind a corporate proxy, the kubelet might not be correctly configured to use it for external network access.
    • DNS Resolution: The node might be unable to resolve the registry's hostname (e.g., docker.io, myregistry.com).
    • Registry Downtime or Unreachability: The image registry itself might be temporarily down or experiencing issues, making it unreachable.
  • Image Not Found in Registry: Even if the name and tag are correct, the image might have been inadvertently deleted from the registry, or pushed to a different repository than expected.
  • Registry Rate Limiting: Public registries like Docker Hub have rate limits on anonymous and authenticated pulls. If you're hitting these limits, especially in CI/CD pipelines that pull frequently, you might see this error. Authenticating usually raises these limits significantly.
  • Corrupted Image or Registry Glitch: Less common, but sometimes a specific image push might be corrupted, or the registry might have an internal issue serving that particular image.
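When authentication is the suspect, it helps to know exactly what a registry credential looks like. This sketch builds the JSON payload that a kubernetes.io/dockerconfigjson secret stores; the registry, username, and password are hypothetical placeholders:

```bash
# All values below are hypothetical placeholders
REGISTRY="registry.example.com"
USERNAME="deploy-bot"
PASSWORD="s3cr3t-token"

# The "auth" field is base64("username:password"),
# which is what the kubelet presents to the registry
AUTH=$(printf '%s:%s' "$USERNAME" "$PASSWORD" | base64)

cat <<EOF
{"auths":{"$REGISTRY":{"username":"$USERNAME","password":"$PASSWORD","auth":"$AUTH"}}}
EOF
```

If the decoded secret in your cluster doesn't match this shape, or the auth field doesn't decode back to your expected username:password, you've found your problem.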

Step-by-Step Fix

Solving ImagePullBackOff requires a systematic approach. Here's my go-to troubleshooting guide:

  1. Identify the Affected Pods and Initial Status:
    Start by seeing which pods are having issues.
    ```bash
    kubectl get pods --all-namespaces -o wide | grep "ImagePullBackOff"
    ```
    This command will show you the pods, their namespaces, and the nodes they're scheduled on. Pay attention to the NAMESPACE and NAME columns.

  2. Inspect Pod Events for Detailed Error Messages:
    This is the most crucial step. Kubernetes events often provide the exact reason for the failure.
    ```bash
    kubectl describe pod <pod-name> -n <namespace>
    ```
    Scroll down to the Events section. Look for messages related to Failed or Error during image pulling. You'll often see specific details like "manifest unknown," "unauthorized: authentication required," or "network is unreachable." I've seen this in production when the error message directly pointed to a missing tag.

  3. Verify Image Name and Tag in Your Deployment:
    Cross-reference the image name and tag from your deployment manifest with what's actually in your registry.
    ```bash
    kubectl get deployment <deployment-name> -n <namespace> -o yaml | grep "image:"
    ```
    Then, confirm this image and tag exist in your chosen container registry. For Docker Hub, you can browse its website. For private registries, you might use their UI or CLI tools (e.g., aws ecr describe-images). A simple docker pull <image-name>:<tag> from a machine with access to the registry can confirm if the image actually exists and is pullable outside of Kubernetes.

  4. Check imagePullSecrets (if using a private registry):
    If your kubectl describe pod output mentions "unauthorized" or "authentication required," you likely have a private registry issue.

    • Verify Secret Existence and Name: Ensure the imagePullSecrets name in your pod/deployment YAML matches an existing secret in the same namespace.
      ```bash
      kubectl get secret <secret-name> -n <namespace> -o yaml
      ```
      Look for a secret of type kubernetes.io/dockerconfigjson.
    • Verify Secret Content: The secret's data should be a base64 encoded ~/.docker/config.json entry. Decode it to ensure credentials are correct.
      ```bash
      # Get the secret data, extract .dockerconfigjson, and base64-decode it
      kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
      ```
      This should output a JSON like {"auths":{"myregistry.com":{"username":"...", "password":"..."}}}. Make sure the registry URL and credentials are correct.
    • Ensure imagePullSecrets is Referenced: The pod or service account must reference this secret. Note that imagePullSecrets belongs at the pod spec level, not inside the container definition.
      ```yaml
      # In your Pod/Deployment spec
      spec:
        containers:
        - name: my-container
          image: myregistry.com/myimage:mytag
        imagePullSecrets:
        - name: my-registry-secret
      ```
      Or, if you're using a Service Account for automated secret injection:
      ```bash
      kubectl get serviceaccount <service-account-name> -n <namespace> -o yaml
      ```
      It should list imagePullSecrets.
  5. Test Registry Connectivity from a Node:
    If authentication seems fine but the error persists, it could be a network issue. SSH into one of the Kubernetes nodes where the problematic pod is scheduled.
    ```bash
    # Try to pull the image directly using the container runtime CLI
    sudo crictl pull <image-name>:<tag>    # For containerd
    # OR
    sudo docker pull <image-name>:<tag>    # For Docker runtime
    ```
    This will bypass Kubernetes for a moment and tell you if the node itself can reach the registry and authenticate. If this command fails, you'll get a more direct network error (e.g., connection refused, name not resolved). Check firewall rules, proxy settings (HTTP_PROXY, HTTPS_PROXY, NO_PROXY environment variables for kubelet and Docker/containerd daemon), and DNS on the node.

  6. Check Registry Status Page:
    Sometimes, the simplest explanation is the correct one. Check the status page for your image registry (e.g., status.docker.com, AWS Health Dashboard) to see if there are any ongoing outages.
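One more quick check: if the direct pull in step 5 failed with a name-resolution error, a DNS sanity check on the node narrows it down. Docker Hub's registry host is shown here as an example; substitute your own registry hostname:

```bash
# Check that the node can resolve the registry hostname;
# registry-1.docker.io is an example, substitute your registry host
REGISTRY_HOST="registry-1.docker.io"
if getent hosts "$REGISTRY_HOST"; then
  echo "DNS OK for $REGISTRY_HOST"
else
  echo "DNS resolution FAILED for $REGISTRY_HOST"
fi
```

If resolution fails here, the fix lies in the node's resolver configuration (or upstream DNS), not in Kubernetes itself.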

Code Examples

Here are some quick, copy-paste ready code examples for common ImagePullBackOff scenarios.

1. Creating an imagePullSecrets secret for Docker Hub:
First, log in locally to Docker Hub.

```bash
docker login
```

Then create the Kubernetes secret using your local ~/.docker/config.json.

```bash
kubectl create secret generic regcred \
    --from-file=.dockerconfigjson=$HOME/.docker/config.json \
    --type=kubernetes.io/dockerconfigjson \
    -n <namespace>
```

Remember to replace regcred with your desired secret name and <namespace> with the target namespace.

2. Referencing imagePullSecrets in a Pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-private-app
  namespace: default
spec:
  containers:
  - name: my-container
    image: registry.example.com/private/my-app:1.0.0
    ports:
    - containerPort: 80
  imagePullSecrets:
  - name: regcred # Name of the secret created above
```
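Alternatively, instead of listing imagePullSecrets on every pod, you can attach the secret to the namespace's default service account so pods pick it up automatically. A sketch of that manifest, assuming the regcred secret from example 1 exists in the same namespace:

```yaml
# ServiceAccount-level pull secret: pods using this service account
# inherit imagePullSecrets without declaring them individually
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: default
imagePullSecrets:
- name: regcred
```

This is handy for namespaces where every workload pulls from the same private registry.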

3. Debugging with kubectl describe pod and kubectl logs:

```bash
# Get events for a specific pod
kubectl describe pod my-private-app -n default

# If the pod briefly started and then failed, check logs (less common for ImagePullBackOff)
kubectl logs my-private-app -n default
```

4. Testing Registry Connectivity from inside a temporary Pod:
If you suspect network issues but can't SSH into a node, you can deploy a temporary pod with network tools.

```yaml
# debug-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: debug-net
spec:
  containers:
  - name: debug-container
    image: busybox
    command: ["sh", "-c", "ping -c 3 registry.example.com && wget -T 5 registry.example.com"]
  restartPolicy: Never
```

Then, apply and check logs:

```bash
kubectl apply -f debug-pod.yaml
kubectl logs debug-net
kubectl delete pod debug-net
```

This can help isolate if the cluster's network configuration prevents reaching the registry.

Environment-Specific Notes

The nuances of ImagePullBackOff can vary slightly depending on your Kubernetes environment.

  • Cloud Providers (AWS ECR, GCP GCR, Azure ACR):

    • AWS ECR: Authentication usually involves aws ecr get-login-password to generate a temporary token (valid for 12 hours) that acts as a Docker password. This token then goes into a kubernetes.io/dockerconfigjson secret. For automated solutions, you'd typically use IAM Roles for Service Accounts (IRSA), which binds a Kubernetes service account to an IAM role through the cluster's OIDC provider so pods can authenticate without explicit secrets; older clusters sometimes use kube2iam or kiam instead. I've had issues where the IAM role existed but lacked the specific ecr:GetDownloadUrlForLayer or ecr:BatchGetImage permissions, leading to ImagePullBackOff.
    • GCP GCR: Often handled via Workload Identity where Kubernetes service accounts map to GCP service accounts, granting permissions to pull images. Your node's default service account also needs GCR access. Ensure the GCP service account associated with your node pool or Workload Identity has the "Storage Object Viewer" role or equivalent.
    • Azure ACR: Typically uses either a service principal with credentials stored in an imagePullSecrets or managed identities for Azure resources. Ensure the service principal or managed identity has AcrPull permissions.
    • Key takeaway: Cloud-specific image registries often leverage their IAM systems, so verify not just Kubernetes secrets but also the underlying cloud IAM roles and policies.
  • Docker Desktop / Minikube (Local Development):

    • For local development, Docker Desktop's Kubernetes shares your host's Docker daemon, so images you build or pull locally (after docker login) are available to it. Minikube, by contrast, runs its own Docker daemon, distinct from your host's; to pull from a private registry inside Minikube or kind, you typically still need to create a secret from your local ~/.docker/config.json.
    • Sometimes, simply building your image inside the daemon Minikube uses (eval $(minikube docker-env)) and setting imagePullPolicy: IfNotPresent is enough if you're not pushing to a remote registry at all.
  • On-Prem / Self-Hosted Kubernetes:

    • Internal DNS: Verify that your Kubernetes nodes can resolve the internal hostname of your on-premises registry. This often means correctly configuring kubelet with custom DNS resolvers or ensuring your cluster's DNS service is aware of internal domains.
    • Proxy Configuration: Explicitly configure HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables for the Docker/containerd daemon and kubelet on all nodes. This is critical for reaching external registries if your internal network requires it.
    • Firewall Rules: Ensure there are no internal firewalls blocking traffic between your Kubernetes nodes and your internal registry. I've spent hours debugging this, only to find a missing firewall rule between VLANs.

Frequently Asked Questions

Q: Can ImagePullBackOff be transient?
A: Yes, sometimes. Brief network glitches, temporary registry outages, or hitting a transient rate limit could cause ImagePullBackOff. Kubernetes has a back-off retry mechanism, so it might eventually succeed. However, if it persists for more than a few minutes, it's usually indicative of a more fundamental issue that needs intervention.

Q: How do I prevent ImagePullBackOff errors?
A: Best practices include:
* Image Tagging Strategy: Use specific, immutable tags (e.g., v1.2.3-abcd123) instead of latest to ensure consistency.
* Automated imagePullSecrets Management: Integrate secret creation into your CI/CD pipeline or use tools like External Secrets Operator for cloud-managed secrets.
* Health Checks and Monitoring: Monitor your container registries for availability and performance.
* Thorough Testing: Test image pulling as part of your application deployment tests in staging environments.
* Mirroring/Caching: For critical images or high-volume pulls, consider mirroring public images to a private registry to avoid rate limits and improve reliability.

Q: What if kubectl describe pod doesn't show enough information?
A: If describe isn't detailed enough, you can look at cluster-wide events with kubectl get events --sort-by='.metadata.creationTimestamp'. Also, check the kubelet logs on the node where the pod is scheduled (e.g., journalctl -u kubelet or /var/log/kubelet.log). This provides raw, verbose output directly from the component attempting the pull.

Q: Does ImagePullBackOff always mean the image doesn't exist?
A: No, not necessarily. While a non-existent image is a common cause, ImagePullBackOff broadly means the image could not be pulled. This includes scenarios where the image exists but Kubernetes couldn't authenticate, had no network route to it, or hit a rate limit. Always check the Events section of kubectl describe pod for the specific reason.

Q: How can I debug registry connectivity from inside the cluster if I can't SSH to a node?
A: Deploy a temporary busybox or ubuntu pod with network tools like ping, wget, curl, or nslookup. You can execute commands inside it to test connectivity to your registry. For example: kubectl exec -it <debug-pod-name> -- ping registry.example.com.