I deployed Ollama on Kubernetes, and the GPU worker node locked up mid-rollout: no logs, no error, just a dead pod that wouldn’t terminate and a new one that wouldn’t schedule. It wasn’t a crash. It wasn’t a timeout. It was a deadlock I’d never seen before. I had expected a smooth rollout — Ollama is a single-container, single-GPU workload. I set up a Deployment with a single replica, used a Persistent