The 3am Alert

You shipped the image pipeline months ago. It worked in staging. It worked the first week in production. Then one morning at 3am, your phone buzzes.
The container is dead. Memory usage spiked to 4 GB, Kubernetes killed the pod, and the queue backed up to 12,000 unprocessed images. You SSH in, restart the service, and go back to bed.
Two days later, it happens again. Different trigg
