Tag
architecture
2 dispatches
Google's DiffusionGemma Generates Text Sideways
Google released DiffusionGemma, a 26B open model that generates entire text blocks in parallel using diffusion, not token-by-token prediction. It's 4x faster and worse. That's the interesting part.
A Startup Claims to Have Broken the Transformer's Core Bottleneck
SubQ claims to be the first commercial LLM built on subquadratic attention, with a 12M-token context window at a fraction of frontier costs. The numbers are extraordinary. The scrutiny hasn't landed yet.
