At Google Cloud Next ’25, Google unveiled its next-gen AI accelerator chip, Ironwood. (Image Credit: Google Cloud/YouTube)
Earlier this month at Google Cloud Next ’25, Google introduced Ironwood, its most powerful and most scalable custom AI accelerator and the company’s seventh-generation Tensor Processing Unit (TPU). Google says that when deployed at full scale, Ironwood delivers more than 24 times the computing power of today's fastest supercomputer. The chip also marks a shift in the company's AI chip strategy: earlier TPU generations handled both training and inference, whereas Ironwood is designed for inference.
“Ironwood represents a significant shift in the development of AI and the infrastructure that powers its progress. It’s a move from responsive AI models that provide real-time information for people to interpret, to models that provide the proactive generation of insights and interpretation. This is what we call the ‘age of inference’ where AI agents will proactively retrieve and generate data to collaboratively deliver insights and answers, not just data,” Google wrote.
Ironwood boasts some impressive specs. It scales to 9,216 liquid-cooled chips linked by advanced Inter-Chip Interconnect (ICI) networking “spanning nearly 10 MW.” Ironwood is part of Google Cloud’s AI Hypercomputer architecture, which is meant to make hardware and software work better together on intensive AI workloads. The chip also pairs with Google’s Pathways software stack, allowing developers to “harness the combined computing power of tens of thousands of Ironwood TPUs.”
How Ironwood compares to other Cloud TPU products. (Image Credit: Google)
At its full scale of 9,216 chips, Ironwood provides 42.5 exaflops of computing power, far more than El Capitan's 1.7 exaflops. A single Ironwood chip delivers 4,614 teraflops of peak compute. It also has more memory and memory bandwidth than Trillium, Google's previous-gen TPU. Each chip features 192GB of High Bandwidth Memory (HBM), and its memory bandwidth reaches 7.2 terabytes per second, which is 4.5 times Trillium's bandwidth.
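The headline numbers can be sanity-checked with simple arithmetic. The sketch below uses only figures quoted in this article (9,216 chips, 4,614 peak teraflops per chip, and 1.7 exaflops for El Capitan); the constant and function names are illustrative, and it assumes the full-scale figure is simply the per-chip peak multiplied by the chip count.

```python
# Figures quoted in the article; names here are illustrative only.
CHIPS_AT_FULL_SCALE = 9_216      # Ironwood chips in the largest configuration
PEAK_TFLOPS_PER_CHIP = 4_614     # peak teraflops per chip
EL_CAPITAN_EXAFLOPS = 1.7        # supercomputer figure cited for comparison
TFLOPS_PER_EXAFLOP = 1_000_000   # 1 exaflop = 10^6 teraflops


def total_exaflops(chips: int, tflops_per_chip: float) -> float:
    """Aggregate peak compute, assuming it scales linearly with chip count."""
    return chips * tflops_per_chip / TFLOPS_PER_EXAFLOP


ironwood_exaflops = total_exaflops(CHIPS_AT_FULL_SCALE, PEAK_TFLOPS_PER_CHIP)
ratio = ironwood_exaflops / EL_CAPITAN_EXAFLOPS

print(f"Ironwood at full scale: {ironwood_exaflops:.1f} exaflops")
print(f"Ratio to El Capitan: {ratio:.1f}x")
```

Under that assumption, the product comes out to roughly 42.5 exaflops, about 25 times the El Capitan figure, which is consistent with Google's "over 24 times" claim.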
Ironwood comes with an improved SparseCore, a specialized accelerator designed to handle the extremely large datasets used in advanced ranking and recommendation workloads. The enhanced SparseCore also lets the chip speed up a wider range of workloads, including those in finance and science.
Ironwood is almost 30 times more power efficient than Google's first Cloud TPU. The chip also delivers twice the performance per watt of Trillium.
Have a story tip? Message me at: http://twitter.com/Cabe_Atwell