I owe the local LLM community an apology.
We at XDA absolutely love local LLMs. That "we" didn't really include me for the longest time, because I was perfectly happy letting cloud-based models do all the heavy lifting. Why wrestle with quantized weights and a fiddly setup when the results would always feel like a downgrade from what a cloud model hands you for free? So, after trying a local LLM once and coming away disappointed, I let that first impression be my last for a long time.