Pod Not Hibernating
# Check idle timeout
kubectl get pod <pod-name> \
-o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/scaledown-durations}'
# Verify container is managed
kubectl get pod <pod-name> \
-o jsonpath='{.metadata.annotations.architect\.loopholelabs\.io/managed-containers}'
# Check status label
kubectl get pod <pod-name> \
-o jsonpath='{.metadata.labels.status\.architect\.loopholelabs\.io/<container-name>}'
# Review daemon logs
kubectl logs -n architect -l app.kubernetes.io/name=architectd | grep <pod-name>
The Architect Console also shows per-pod events, timings, and detailed debugging info.
Pod Not Waking
# Test wake via exec
kubectl exec -it <pod-name> -- /bin/sh -c "echo test"
# Test wake via network
kubectl port-forward <pod-name> <port>:<port>
curl localhost:<port>
# Check events
kubectl describe pod <pod-name>
# Verify daemon is running on the pod's node
kubectl get pod <pod-name> -o wide
kubectl get pods -n architect -o wide | grep <node-name>
Scale Down and Wake
Health probes wake the container
If a managed container with health-check-proxy configured still wakes
whenever kubelet probes it:
# Confirm the sidecar was added
kubectl get pod <pod-name> \
-o jsonpath='{.spec.containers[*].name}'
# Confirm probe ports target the shadow port, not the app port
kubectl get pod <pod-name> \
-o jsonpath='{.spec.containers[?(@.name=="<container>")].livenessProbe}'
# Check the admission controller didn't skip the sidecar
kubectl logs -n architect -l app=architect-admission-controller \
| grep -i 'health check proxy'
The first command lists every container in the pod; you should see
architect-health-check-proxy alongside your application container, e.g.:
my-app architect-health-check-proxy
Checklist:
- The probe's port field on each managed container must reference the shadowPort, not the appPort. Probes that still target the application port bypass the sidecar entirely.
- Both the managed-containers and network-monitor annotations must be present. If either is missing, the admission controller logs a warning and skips sidecar injection.
- The sidecar (architect-health-check-proxy) must be present in spec.containers. If it isn't, check the admission controller logs.
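The annotation requirement in the checklist can be checked from the command line. The following is a minimal sketch, not an official tool: the helper name check_annotations is hypothetical, and it assumes the network-monitor annotation uses the same architect.loopholelabs.io/ prefix as managed-containers.

```shell
# Hypothetical helper: report whether each required annotation is set on a pod.
# Assumption: both keys live under the architect.loopholelabs.io/ prefix.
check_annotations() {
  pod="$1"
  for key in managed-containers network-monitor; do
    # Empty output from jsonpath means the annotation is absent.
    value=$(kubectl get pod "$pod" \
      -o jsonpath="{.metadata.annotations.architect\.loopholelabs\.io/$key}")
    if [ -n "$value" ]; then
      echo "present: $key"
    else
      echo "missing: $key"
    fi
  done
}
```

Run it as check_annotations <pod-name>. Any line starting with "missing:" points at an annotation to add before looking further.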
Scrape traffic wakes the container
If a Prometheus scrape (or other external poller) wakes a managed container
that has shadow-ports configured:
# Confirm the shadow port is on the container spec
kubectl get pod <pod-name> \
-o jsonpath='{.spec.containers[?(@.name=="<container>")].ports}'
# Check the admission controller didn't skip the shadow ports
kubectl logs -n architect -l app=architect-admission-controller \
| grep -i 'shadow ports'
The first command lists the container's ports; the shadow port appears with
a shadow- name prefix, e.g.:
[{"containerPort":9090},{"containerPort":29090,"name":"shadow-29090","protocol":"TCP"}]
Checklist:
- The scraper must target the shadowPort, not the appPort. Verify that your ServiceMonitor, PodMonitor, or scrape_configs references the shadow port (named shadow-<port> on the container spec).
- Both the managed-containers and network-monitor annotations must be present. If either is missing, the admission controller logs a warning and skips injection.
- If you can't move the scraper to a new port, swap shadow-ports for ignore-activity-ports so the existing app port is exempted from activity tracking.
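To find the port name a scraper should reference, it can help to print only the shadow ports. This is a rough sketch that relies on the shadow- name prefix described above; the helper name list_shadow_ports is hypothetical.

```shell
# Hypothetical helper: print "<port-name><TAB><containerPort>" for every port
# whose name starts with shadow-, across all containers in the pod.
list_shadow_ports() {
  kubectl get pod "$1" \
    -o jsonpath='{range .spec.containers[*].ports[*]}{.name}{"\t"}{.containerPort}{"\n"}{end}' \
    | grep '^shadow-'
}
```

Run it as list_shadow_ports <pod-name>. If nothing is printed, no shadow port was added to the pod, and the admission controller logs are the next place to look.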
Sidecar fails to inject
If health-check-proxy is set but no sidecar appears on the pod:
kubectl logs -n architect -l app=architect-admission-controller \
| grep -i 'health check proxy\|shadow ports'
Checklist:
- The annotation JSON must parse; invalid JSON is logged and the feature is skipped.
- managed-containers must list the container referenced in each mapping.
- network-monitor must be set on the pod.
- All ports must be in the 1–65535 range; mappings outside the range are dropped with a warning.
- Duplicate shadowPort values across mappings are dropped with a warning. Only the first mapping per shadow port is used.
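The port rules in the checklist can be checked before applying an annotation. The sketch below is illustrative only: check_shadow_ports is a hypothetical helper, and it takes the shadow port numbers as plain arguments because the annotation's JSON schema is not shown on this page.

```shell
# Hypothetical helper: flag shadow ports that would be dropped.
# Usage: check_shadow_ports 29090 29091 ...
check_shadow_ports() {
  seen=" "
  status=0
  for port in "$@"; do
    case "$port" in
      # Reject empty or non-numeric values up front.
      ''|*[!0-9]*) echo "invalid: $port"; status=1; continue ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
      echo "out of range: $port"
      status=1
    elif case "$seen" in *" $port "*) true ;; *) false ;; esac; then
      # Only the first mapping per shadow port is kept.
      echo "duplicate: $port"
      status=1
    else
      seen="$seen$port "
    fi
  done
  return "$status"
}
```

The function prints one line per problem and returns non-zero if it found any, so it can gate a deploy script.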
High Wake Times
If wake times exceed 50ms:
- Check node CPU and memory availability; resource contention on the node slows restore
- Large memory footprints produce larger checkpoints, which can take longer to restore
- Check daemon logs or the Architect Console for per-pod restore timings:
kubectl logs -n architect -l app.kubernetes.io/name=architectd --tail=500 \
| grep -E "checkpoint|restore|error"
Checkpoint Failures
- GPU workloads are not supported yet
- Checkpoints use 50-200MB per pod; check node disk space:
kubectl get nodes \
-o custom-columns=NAME:.metadata.name,DISK:.status.allocatable.ephemeral-storage
- Verify runtimeClassName is set and the node has the architect.loopholelabs.io/node=true label
- Check the Architect Console for checkpoint error details
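The runtimeClassName and node label checks can be combined into one step. This is a minimal sketch using standard kubectl queries; the helper name check_runtime_and_node is hypothetical.

```shell
# Hypothetical helper: show a pod's runtime class and whether its node
# carries the architect.loopholelabs.io/node=true label.
check_runtime_and_node() {
  pod="$1"
  runtime=$(kubectl get pod "$pod" -o jsonpath='{.spec.runtimeClassName}')
  node=$(kubectl get pod "$pod" -o jsonpath='{.spec.nodeName}')
  echo "runtimeClassName: ${runtime:-<unset>}"
  # Space-separated names of all nodes that have the label.
  labeled=$(kubectl get nodes -l architect.loopholelabs.io/node=true \
    -o jsonpath='{.items[*].metadata.name}')
  case " $labeled " in
    *" $node "*) echo "node $node: labeled" ;;
    *) echo "node $node: label missing" ;;
  esac
}
```

Run it as check_runtime_and_node <pod-name>. An unset runtime class or a missing node label matches the first item in the checklist above.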
Runtime Class Errors After Uninstall
Pods still referencing runc-architect or runsc-architect will error.
Remove runtimeClassName from affected workloads. See
Uninstalling.
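To find the affected pods, one option is to list every pod together with its runtime class and filter on the two names above. This is a sketch using standard kubectl and awk; the helper name find_architect_runtime_pods is hypothetical.

```shell
# Hypothetical helper: print namespace/name for pods whose runtimeClassName
# is runc-architect or runsc-architect.
find_architect_runtime_pods() {
  kubectl get pods --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.runtimeClassName}{"\n"}{end}' \
    | awk -F '\t' '$3 == "runc-architect" || $3 == "runsc-architect" { print $1 "/" $2 }'
}
```

The output names pods, not the workloads that created them. Remove runtimeClassName from the pod template of the owning Deployment, StatefulSet, or other controller so that replacement pods do not hit the same error.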