66 Tokens Make a Diffusion Language Model Look Easy
Simon Paxton · Dev.to · 1 min read

A diffusion language model generates text by starting from masked or otherwise corrupted tokens and iteratively restoring them. In this MacBook Air M2 demo, that idea appears in its smallest, most hackable form: a toy character-level model that learns to recover missing characters from Karpathy's tiny Shakespeare dataset. The GitHub project Encrux/simple_dlm really is small and direct. The author…
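To make the mask-then-restore idea concrete, here is a minimal sketch of the two halves of a character-level diffusion loop: a forward step that masks a fraction of characters, and a reverse step that iteratively fills them back in. This is not the Encrux/simple_dlm implementation (which presumably trains a neural denoiser); the bigram-frequency "model" here is a deliberately crude stand-in, and all function names are hypothetical, chosen only to illustrate the iterative unmasking mechanic.

```python
import random
from collections import Counter

MASK = "_"

def corrupt(text, mask_frac, rng):
    # Forward process: replace a random fraction of characters with MASK.
    chars = list(text)
    n_mask = max(1, int(len(chars) * mask_frac))
    for i in rng.sample(range(len(chars)), n_mask):
        chars[i] = MASK
    return "".join(chars)

def fit(corpus):
    # Toy "training": left-neighbor bigram counts plus overall character
    # counts (a real DLM would learn a neural denoiser instead).
    bigrams = {}
    for a, b in zip(corpus, corpus[1:]):
        bigrams.setdefault(a, Counter())[b] += 1
    return bigrams, Counter(corpus)

def denoise(corrupted, bigrams, unigrams):
    # Reverse process: each pass fills every masked position whose left
    # neighbor is already known; if no position has a usable neighbor,
    # fall back to the corpus's most common character so the loop
    # always terminates.
    chars = list(corrupted)
    while MASK in chars:
        filled_any = False
        for i, c in enumerate(chars):
            if c != MASK:
                continue
            left = chars[i - 1] if i > 0 else None
            if left is not None and left != MASK and left in bigrams:
                chars[i] = bigrams[left].most_common(1)[0][0]
                filled_any = True
        if not filled_any:
            i = chars.index(MASK)
            chars[i] = unigrams.most_common(1)[0][0]
    return "".join(chars)

corpus = "to be or not to be that is the question "
bigrams, unigrams = fit(corpus)
rng = random.Random(0)
noisy = corrupt(corpus, 0.4, rng)
restored = denoise(noisy, bigrams, unigrams)
```

The restored string will not match Shakespeare, but it shows the shape of the algorithm: unmasked characters are never touched, and masked ones are resolved over several passes rather than in a single left-to-right sweep, which is exactly what distinguishes diffusion-style generation from autoregressive decoding.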
Continue reading on Dev.to