OpenAI, Broadcom unveil Jalapeño, first custom inference chip aimed at cutting AI compute costs

The two companies said the OpenAI-designed accelerator, taken from initial design to tape-out in nine months, is targeted for initial deployment by the end of 2026 alongside gigawatt-scale data centers with Microsoft.

OpenAI and Broadcom on Tuesday unveiled Jalapeño, an ASIC purpose-built for large language model inference that Broadcom Chief Executive Hock Tan pitched as delivering roughly 50 percent cost savings versus typical AI GPUs. It’s OpenAI’s first custom silicon, and the company is calling it an “Intelligence Processor.”

The framing matters as much as the part. By branding Jalapeño around inference rather than training, OpenAI is conceding what its financial profile already implied: the company’s compute bill is now dominated by serving models, not building them. Lowering the cost of serving each token is the only path to operating leverage at the scale OpenAI is targeting.

Tan said the chip will arrive alongside “the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026,” with what he described as “small prototype development” in late 2026 before scaling. Initial deployment is targeted for the end of 2026.

The development timeline is its own story. OpenAI and Broadcom say Jalapeño went from initial design to manufacturing tape-out in nine months, which they characterize as potentially the fastest ASIC cycle in high-performance semiconductors. Engineering samples are already running GPT-5.3-Codex-Spark at production target frequency and power in OpenAI’s lab. The companies claim performance per watt is “substantially better than current state-of-the-art.”

OpenAI President Greg Brockman told CNBC the company used its own models to accelerate parts of the design process. “The degree to which our models have been able to accelerate it was very surprising to us,” he said. The recursive subtext is hard to miss: the models are now optimizing the silicon that runs the models.

Jalapeño is the first entry in a multi-generation platform combining OpenAI silicon, Broadcom’s Tomahawk switches, and Celestica’s manufacturing support. A detailed performance report is expected later this year. For now, the partnership reads as Nvidia’s customers building their way around Nvidia’s margins.

