Jensen Huang Unveils Nvidia’s Upcoming AI Chips at GTC: What to Know

NVIDIA (NVDA) unveiled two upcoming GPU architectures, Blackwell Ultra and Rubin, at its annual developer conference in San Jose, California, on March 18. In his opening keynote, CEO Jensen Huang laid out the chipmaker’s ambitious vision of pushing AI into the era of industrial-scale computing. Huang called GTC “the Super Bowl of AI,” with one key difference: “In this Super Bowl, everyone won,” he said.
Blackwell Ultra is expected to launch in the second half of 2025. Its successor, Rubin, is scheduled to arrive in late 2026, followed by the more advanced Rubin Ultra in 2027.
What to know about Blackwell Ultra and Rubin
Huang said NVIDIA’s Blackwell line, now in production, is an advanced iteration of the Blackwell chips revealed at last year’s GTC and delivers up to 40 times the performance of the previous-generation Hopper chips.
Blackwell Ultra will combine eight stacks of 12-Hi HBM3E memory to provide 288GB of on-board memory. The architecture will feature NVLink 72, an upgraded high-speed interconnect technology designed to facilitate communication between GPUs and CPUs, which is critical to handling the massive data sets needed for AI training and inference.
“NVLink connects multiple GPUs and turns them into one GPU,” Huang explained. “It solves the scaling problem by enabling large-scale parallel computing.”
Additionally, the company launched the NVIDIA RTX Pro 6000 Blackwell Server Edition, a version designed for enterprise workloads such as multimodal AI inference, immersive content creation, and scientific computing. With 96GB of GDDR7 memory and support for Multi-Instance GPU (MIG) technology, the RTX Pro 6000 is designed to power advanced AI development.
Blackwell’s successor, Rubin, is named after the astronomer Vera Rubin, whose work provided key evidence for the existence of dark matter. The initial version of the Rubin chip is expected to deliver 50 petaflops while running AI models. The more powerful Rubin Ultra can deliver up to 100 petaflops, which Huang described as a leap forward in AI processing power.
Early iterations of Blackwell chips and racks reportedly faced overheating issues, leading some customers to reduce orders. The newly introduced liquid-cooled Grace Blackwell GB200 NVL72 system addresses these problems, delivering real-time inference on trillion-parameter large language models up to 30 times faster, and training up to four times faster, than NVIDIA’s previous-generation H100 GPUs. It can generate up to 12,000 tokens per second (tokens are the basic units of data an AI model processes), speeding up both training and inference.
“If you want your AI to be smarter, it has to generate more tokens. This requires a lot of bandwidth, floating point operations and memory,” Huang said. He also explained that reasoning AI models such as DeepSeek’s R1 require 20 times as many tokens and 105 times the computing power of conventional models.
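Huang’s multipliers imply something the raw figures don’t state outright: reasoning models are not just more verbose, they also burn more compute per token. A back-of-the-envelope sketch makes this explicit (the 20x and 105x multipliers are the ones quoted above; the baseline figures are invented placeholders for illustration):

```python
# Illustrative arithmetic only: baseline numbers are assumed, not NVIDIA's.
baseline_tokens = 1_000            # tokens a conventional model emits (assumed)
baseline_flops_per_token = 1e9     # compute per token (assumed)

token_multiplier = 20              # quoted: 20x more tokens
compute_multiplier = 105           # quoted: 105x more total compute

reasoning_tokens = baseline_tokens * token_multiplier
reasoning_total_flops = (baseline_tokens * baseline_flops_per_token
                         * compute_multiplier)

# The extra compute is not explained by token count alone:
per_token_increase = compute_multiplier / token_multiplier
print(f"{reasoning_tokens} tokens, {per_token_increase:.2f}x compute per token")
```

Under these numbers, a 20x jump in tokens paired with a 105x jump in compute means each token of a reasoning model costs roughly 5x the compute of a conventional one, which is why Huang ties smarter models directly to bandwidth, floating-point throughput, and memory.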
NVIDIA is also working closely with TSMC in Taiwan to develop advanced chip packaging technology for data centers, a move that could greatly improve the computing efficiency and thermal management of future GPU generations.
NVIDIA’s roadmap goes beyond Rubin
Huang also outlined NVIDIA’s roadmap beyond Rubin. “The next generation architecture after Rubin will be named Feynman,” he said, confirming that Feynman is already in development and is scheduled for release in 2028. Named after Richard Feynman, the physicist famous for his contributions to quantum mechanics, the upcoming architecture is expected to push AI performance to unprecedented levels.
The announcements follow NVIDIA’s better-than-expected quarterly revenue results, driven by a surge in demand for its GPUs. Despite growing competition from rivals such as AMD and geopolitical uncertainty, including export restrictions on semiconductors, NVIDIA currently dominates the global GPU market with an estimated 80% share.