When I started building with LLMs, I kept running into terms I didn't fully understand. Quantization, KV cache, top-k sampling, temperature. Every time I looked one up, I got either a textbook definition or a link to a paper.

That told me what the term is. It didn't tell me what to do with it. What decision does it affect? What breaks if I ignore it?

What tradeoff am I making? So I started keeping