Nvidia Eats Its Own Competition
Nvidia didn't beat Groq. Nvidia bought Groq. There's a difference, and it tells you everything about where the AI hardware race actually stands.
At GTC 2026, Jensen Huang unveiled the Groq 3 LPU — the first chip to emerge from Nvidia's $20 billion acquisition of the inference startup it couldn't outperform on latency. The messaging was pure Nvidia: seamless integration, complementary architecture, the Vera Rubin platform is now complete. What they're not saying is that they spent the GDP of a small nation because their own silicon couldn't do what a startup's could.
The specs are genuinely absurd. The Groq 3 LPU is an SRAM-based inference accelerator — 512MB of on-chip static RAM per unit, with 150 TB/s of memory bandwidth versus 22 TB/s for the HBM4 on Vera Rubin's own GPUs. Samsung 4nm process. The Groq 3 LPX rack crams 256 of these LPUs together for 128GB of combined SRAM, delivering up to 1,500 tokens per second for agentic AI workloads. Nvidia claims 35x throughput per megawatt compared to Blackwell alone.
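The rack-level numbers check out against the per-chip figures, for what that's worth. A quick back-of-envelope, using only the specs quoted above (the aggregation arithmetic is mine, not Nvidia's):

```python
# Per-LPU figures as reported for the Groq 3 LPU; HBM4 figure is per
# Vera Rubin GPU. The rack-level math below is a sanity check, nothing more.
LPUS_PER_RACK = 256
SRAM_PER_LPU_MB = 512
LPU_BW_TBPS = 150
HBM4_BW_TBPS = 22

# 256 chips x 512MB should land on the claimed 128GB combined SRAM.
rack_sram_gb = LPUS_PER_RACK * SRAM_PER_LPU_MB / 1024

# Per-chip memory bandwidth advantage of SRAM over HBM4.
bw_ratio = LPU_BW_TBPS / HBM4_BW_TBPS

print(f"Rack SRAM: {rack_sram_gb:.0f} GB")           # 128 GB, matching the LPX spec
print(f"Bandwidth ratio per chip: {bw_ratio:.1f}x")  # ~6.8x over HBM4
```

So the headline 35x-per-megawatt claim isn't coming from raw bandwidth alone; the rest has to come from SRAM's energy-per-bit advantage and the decode-phase specialization described below.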
Ships in Q3 2026, if you believe shipping timelines from a company that just absorbed an entire startup's engineering team. I'll start the timer.
The architecture play is clever. The LPU sits beside the Vera Rubin R200 GPU as a decode-phase co-processor — the GPU handles the heavy prefill computation, then hands off to the LPU for the fast token generation that agentic systems demand. Ian Buck, Nvidia's VP of AI, described it as "optimizing the decode." Jensen Huang himself said low-latency premium token generation should represent "somewhere on the order of 25 percent of the compute in an AI cluster."
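The split makes sense because the two phases stress different resources: prefill is compute-bound (the whole prompt is processed in parallel), decode is memory-bandwidth-bound (one token at a time, re-reading the cached attention state each step). A minimal sketch of that disaggregation pattern — the device assignment follows the article, but the function names and hand-off API here are entirely hypothetical, not Nvidia software:

```python
from dataclasses import dataclass

@dataclass
class KVCache:
    """Stand-in for the attention key/value state handed off between phases."""
    tokens: list

def prefill(prompt: str) -> KVCache:
    # Compute-bound phase: ingest the full prompt in one parallel pass.
    # This is the work the article assigns to the Vera Rubin R200 GPU.
    return KVCache(tokens=prompt.split())

def decode(cache: KVCache, max_new: int) -> list:
    # Bandwidth-bound phase: emit one token per step, re-reading the
    # growing cache each time. This is the job handed to the LPU,
    # where keeping the cache in SRAM pays off.
    out = []
    for i in range(max_new):
        token = f"tok{i}"  # placeholder for an actual sampled token
        out.append(token)
        cache.tokens.append(token)
    return out

cache = prefill("why did nvidia buy groq")
print(decode(cache, 4))
```

The hand-off object is the whole trick: once the KV cache transfers, the GPU is free to start prefilling the next request while the LPU streams tokens for this one.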
Twenty-five percent of the compute. From a technology Nvidia didn't build.
That's the pattern worth watching. Nvidia hired Groq founder Jonathan Ross and his team. They absorbed the architecture. They integrated the roadmap. The Rubin CPX — Nvidia's own planned answer to this problem — appears to have quietly exited the roadmap entirely. Why build when you can buy?
This is how monopolies work in 2026. You don't outcompete the disruptors. You write a check large enough to make them part of your platform. Groq was getting traction. Cerebras was getting traction. SambaNova was getting traction. So Nvidia picked the best one and folded it into the stack.
The moat was never the chip. The moat is CUDA, the ecosystem, the developer lock-in, and now — apparently — the willingness to spend $20 billion to make sure nobody else gets a foothold.
The Groq 3 LPU is a genuinely important piece of silicon. The 35x efficiency claim, if it holds in production, matters for the energy economics of AI inference at scale. But this is an acquisition dressed up as a keynote. Nvidia didn't solve the inference latency problem. They bought the company that did and put their logo on it.
Start the countdown to the next startup that gets too good at something Nvidia needs. The playbook is written now.
Sources:
- How Nvidia's $20 billion Groq 3 LPU deal reshapes the Nvidia Vera Rubin Platform — Tom's Hardware, 2026-03-19
- Nvidia Finally Admits Why It Shelled Out $20 Billion for Groq — The Next Platform, 2026-03-17
- Nvidia's Groq 3 LPU targets agentic AI inference at GTC 2026 — Techzine, 2026-03-17
Source: GTC 2026 / Nvidia Groq Acquisition