
Nvidia introduces the H200, an AI-crunching monster GPU that may speed up ChatGPT



The Nvidia H200 GPU covered with a whimsical blue explosion that figuratively represents raw compute power bursting forth in a glowing flurry.

Nvidia | Benj Edwards

On Monday, Nvidia announced the HGX H200 Tensor Core GPU, which uses the Hopper architecture to accelerate AI applications. It's a follow-up to the H100 GPU, released last year and previously Nvidia's most powerful AI GPU chip. If widely deployed, it could lead to even more powerful AI models in the near future, along with faster response times for existing ones like ChatGPT.

According to experts, lack of computing power (often called "compute") has been a major bottleneck to AI progress this past year, hindering deployments of existing AI models and slowing the development of new ones. Shortages of powerful GPUs that accelerate AI models are largely to blame. One way to alleviate the compute bottleneck is to make more chips, but you can also make AI chips more powerful. That second approach may make the H200 an attractive product for cloud providers.

What is the H200 good for? Despite the "G" in the "GPU" name, data center GPUs like this typically aren't for graphics. GPUs are ideal for AI applications because they perform massive numbers of parallel matrix multiplications, which are necessary for neural networks to function. They're essential in the training portion of building an AI model and the "inference" portion, where people feed inputs into an AI model and it returns results.
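To make that concrete, here's a minimal sketch in PyTorch (our own illustration, not anything from Nvidia's announcement) showing that one neural network layer's forward pass boils down to a single large matrix multiplication, which a GPU executes as thousands of parallel multiply-accumulate operations:

import torch

# Run on a GPU if one is available; fall back to the CPU otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"

inputs = torch.randn(512, 4096, device=device)    # a batch of 512 inputs, 4,096 features each
weights = torch.randn(4096, 4096, device=device)  # one layer's weight matrix

# One matrix multiplication = one layer's forward pass; the GPU
# computes all 512 x 4,096 output values in parallel.
outputs = inputs @ weights
print(outputs.shape)  # torch.Size([512, 4096])

Training and inference repeat operations like this billions of times, which is why parallel throughput and fast memory matter so much for AI workloads.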

"To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory," said Ian Buck, vice president of hyperscale and HPC at Nvidia, in a news release. "With Nvidia H200, the industry's leading end-to-end AI supercomputing platform just got faster to solve some of the world's most important challenges."

For example, OpenAI has repeatedly said it is low on GPU resources, and that causes slowdowns with ChatGPT. The company must rely on rate limiting to provide any service at all. Hypothetically, using the H200 could give the existing AI language models that run ChatGPT more breathing room to serve more customers.

4.8 terabytes/second of bandwidth

The Nvidia H200 GPU.

Nvidia

According to Nvidia, the H200 is the first GPU to offer HBM3e memory. Thanks to HBM3e, the H200 offers 141GB of memory and 4.8 terabytes per second of bandwidth, which Nvidia says is 2.4 times the memory bandwidth of the Nvidia A100, released in 2020. (Despite the A100's age, it's still in high demand due to shortages of more powerful chips.)
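As a quick sanity check on that 2.4x figure: it implies Nvidia is comparing against the 80GB A100's roughly 2 terabytes per second of HBM2e bandwidth (our assumption; the announcement doesn't specify which A100 variant), since 4.8 TB/s / 2.0 TB/s = 2.4.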

Nvidia will make the H200 available in several form factors. These include Nvidia HGX H200 server boards in four- and eight-way configurations, compatible with both the hardware and software of HGX H100 systems. It will also be available in the Nvidia GH200 Grace Hopper Superchip, which combines a CPU and GPU into one package for even more AI oomph (that's a technical term).

Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be the first cloud service providers to deploy H200-based instances starting next year, and Nvidia says the H200 will be available "from global system manufacturers and cloud service providers" starting in Q2 2024.

Meanwhile, Nvidia has been playing a cat-and-mouse game with the US government over export restrictions that limit sales of its most powerful GPUs to China. Last year, the US Department of Commerce announced restrictions intended to "keep advanced technologies out of the wrong hands" like China and Russia. Nvidia responded by creating new chips to get around those barriers, but the US recently banned those, too.

Last week, Reuters reported that Nvidia is at it again, introducing three new scaled-back AI chips (the HGX H20, L20 PCIe, and L2 PCIe) for the Chinese market, which represents a quarter of Nvidia's data center chip revenue. Two of the chips fall below US restrictions, and a third is in a "gray zone" that might be permissible with a license. Expect to see more back-and-forth moves between the US and Nvidia in the months ahead.

