docs: add k8s image GC known issue to debug checklist

Kubernetes image garbage collection silently deletes the nanoclaw-agent
image when disk usage is high because ephemeral containers don't
protect the image from GC. Documents symptoms, cause, fix, and diagnosis.
This commit is contained in:
James Schindler
2026-03-27 08:28:58 -04:00
parent 8f91d3be57
commit 8935e4f636

View File

@@ -11,6 +11,34 @@ Both timers fire at the same time, so containers always exit via hard SIGKILL (c
### 3. Cursor advanced before agent succeeds
`processGroupMessages` advances `lastAgentTimestamp` before the agent runs. If the container times out, retries find no messages (cursor already past them). Messages are permanently lost on timeout.
### 4. Kubernetes image garbage collection deletes nanoclaw-agent image
**Symptoms**: `Container exited with code 125: pull access denied for nanoclaw-agent` — the container image disappears overnight or after a few hours, even though you just built it.
**Cause**: If your container runtime has Kubernetes enabled (Rancher Desktop enables it by default), the kubelet runs image garbage collection when disk usage exceeds 85%. NanoClaw containers are ephemeral (run and exit), so `nanoclaw-agent:latest` is never protected by a running container. The kubelet sees it as unused and deletes it — often overnight when no messages are being processed. Other images (docker-compose services) survive because they have long-running containers referencing them.
**Fix**: Disable Kubernetes if you don't need it:
```bash
# Rancher Desktop
rdctl set --kubernetes-enabled=false
# Then rebuild the container image
./container/build.sh
```
**Diagnosis**: Check the k3s log for image GC activity:
```bash
grep -i "nanoclaw" ~/Library/Logs/rancher-desktop/k3s.log
# Look for: "Removing image to free bytes" with the nanoclaw-agent image ID
```
Check NanoClaw logs for image status:
```bash
grep -E "image found|image NOT found|image missing" logs/nanoclaw.log
```
If you need Kubernetes enabled, set `CONTAINER_IMAGE` to an image stored in a registry that the kubelet won't GC, or raise the GC thresholds.
## Quick Status Check
```bash