Nvidia turns to liquid cooling to reduce big tech’s energy use

Nvidia has announced its new plan for reducing the energy use of data centers crunching massive amounts of data or training AI models: liquid-cooled graphics cards. The company announced at Computex that it’s introducing a liquid-cooled version of its A100 compute card, and says that it consumes 30 percent less power than the air-cooled version. Nvidia’s also pledging that this isn’t just a one-off: it’s already got more liquid-cooled server cards on its roadmap, and it hints at bringing the tech to other applications like in-car systems that need to keep cool in enclosed spaces. Of course, Tesla’s recent recall for overheating chips shows how tricky that can be, even with liquid cooling.

According to Nvidia, reducing the energy needed to perform complex computations could make a big impact: the company says data centers use over one percent of the world’s electricity, and 40 percent of that is down to cooling. Reducing that by almost a third would be a big deal, though it is worth noting that graphics cards are only one part of the equation; CPUs, storage, and networking equipment also draw power and need cooling. Nvidia’s claim is that with liquid cooling, GPU-accelerated systems would be far more efficient than CPU-only servers on AI and other high-performance tasks.
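To put those two figures together, here is a rough back-of-the-envelope sketch in Python. It only multiplies the shares Nvidia cites; treating "over one percent" as exactly one percent is an assumption that makes the result a lower bound, and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope arithmetic using the two figures Nvidia cites.
# Assumption: "over one percent" is treated as exactly 1%, so the
# result is a lower bound rather than a precise estimate.
DATA_CENTER_SHARE = 0.01  # data centers' share of world electricity
COOLING_SHARE = 0.40      # share of data center electricity spent on cooling


def cooling_share_of_world(data_center_share: float, cooling_share: float) -> float:
    """Return the fraction of world electricity spent cooling data centers."""
    return data_center_share * cooling_share


share = cooling_share_of_world(DATA_CENTER_SHARE, COOLING_SHARE)
print(f"Data center cooling: at least {share:.1%} of world electricity")
```

By these numbers, cooling data centers accounts for roughly 0.4 percent of the world's electricity, which is the slice that more efficient cooling would be chipping away at.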

Nvidia’s roadmap for liquid-cooled appliances and cards.
Image: Nvidia

There’s a reason liquid cooling is popular in high-performance use cases, from supercomputers to custom gaming PCs and even a few phones: liquids absorb heat better than air, according to Asetek, a major manufacturer of water cooling systems. And once you have warm liquid, it’s relatively easy to move it elsewhere so it can cool off, compared to trying to cool down the air in an entire building or increase airflow to the specific components on a card that are dumping out all the heat.

Besides the energy efficiency, liquid-cooled cards have another bonus over their air-cooled counterparts — they take up significantly less room, meaning you can fit more of them in the same amount of space.

Nvidia’s push to lower energy use via liquid cooling comes at a time when a lot of companies are considering the amount of energy their servers use. While data centers are far from the only source of carbon emissions and pollution for big tech, they’re a piece of the puzzle that can’t be ignored, and critics have noted that offsetting energy use through credits isn’t as impactful as reducing consumption altogether. Companies like Microsoft have experimented with submerging servers in liquid completely and even putting whole data centers in the ocean in a bid to use less energy and water.

Of course, those solutions are rather exotic. While the type of liquid cooling Nvidia’s offering isn’t necessarily the norm for data centers, it’s not as out there as putting your servers in the ocean (though so far Microsoft’s experiments with that have been shockingly successful). Nvidia’s explicitly marketing its liquid-cooled GPUs as being for “mainstream” servers, rather than as a bleeding-edge solution.

This does raise the question of whether we could see Nvidia try to take liquid cooling even more mainstream by building it into the reference designs for its gaming-focused cards. The company doesn’t mention any plans to do that, only saying that it plans to “support liquid cooling in our high-performance data center GPUs” for the “foreseeable future.”

However, server tech trickles down to home PC tech all the time, and gaming cards that come straight from the factory with an all-in-one liquid cooler aren’t completely unheard of: AMD’s had a few reference designs that included a liquid-cooling loop, and third parties have sold liquid-cooled Nvidia cards before. As Nvidia’s cards continue to draw more and more power (a stock 3090 Ti can draw up to 450 watts), I wouldn’t be surprised if Nvidia announces an RTX 5000-series card that comes stock with a liquid cooler.

As for Nvidia’s data center-focused cards, the company says manufacturers like ASRock, Asus, and Supermicro will incorporate liquid-cooled cards into their servers “later this year,” and that slot-in PCIe A100 cards are coming in Q3 of this year. A liquid-cooled PCIe version of its just-announced H100 card (the next-gen version of the A100) is due in “early 2023.”

https://www.theverge.com/2022/5/24/23138928/nvidia-liquid-cooling-a100-server-graphics-cards-computation-ai