NVIDIA unveils new HGX H200 for enhanced AI computing
Mon 13 Nov 2023
NVIDIA has announced the NVIDIA HGX H200, based on its Hopper architecture. This AI computing platform, featuring the NVIDIA H200 Tensor Core GPU, is designed to handle the massive data requirements of generative AI and high-performance computing workloads.
The H200 is notable for being the first GPU to incorporate HBM3e, offering faster and larger memory. It delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x the bandwidth of the older NVIDIA A100.
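As a rough illustration of why that bandwidth figure matters (not a claim from the article): in LLM inference, each generated token typically requires reading the model weights once, so memory bandwidth puts a hard floor on decode latency. The model size below is a hypothetical value for a ~70-billion-parameter model in FP16.

```python
# Back-of-the-envelope sketch: memory bandwidth as a floor on LLM decode speed.
# BW_TBPS is the H200 figure quoted above; MODEL_GB is a hypothetical
# ~70B-parameter model stored in FP16 (2 bytes per parameter).

BW_TBPS = 4.8    # H200 memory bandwidth, terabytes per second
MODEL_GB = 140   # hypothetical model size in gigabytes

# Time to stream the full weights through memory once = lower bound per token.
min_seconds_per_token = MODEL_GB / (BW_TBPS * 1000)
print(f"Lower bound: {min_seconds_per_token * 1e3:.1f} ms/token")
```

At these assumed numbers the floor works out to roughly 29 ms per token, which is why higher-bandwidth memory translates fairly directly into faster inference for large models.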
NVIDIA’s H200 will be available in various form factors, including the NVIDIA HGX H200 server boards and the NVIDIA GH200 Grace Hopper Superchip with HBM3e. This flexibility allows for deployment in diverse data centre environments, such as on-premises, cloud, hybrid-cloud, and edge.
Global cloud service providers, including Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure, are slated to deploy H200-based instances starting next year.
Ian Buck, NVIDIA’s Vice President of Hyperscale and HPC, said: “To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory. With NVIDIA H200, the industry’s leading end-to-end AI supercomputing platform just got faster to solve some of the world’s most important challenges.”
The NVIDIA Hopper architecture, which underpins the H200, has already delivered significant performance improvements over previous generations. These gains are further augmented by continuous software enhancements to the H100, including recent powerful open-source library releases such as NVIDIA TensorRT-LLM.
With the introduction of the H200, further performance leaps are anticipated, including a near doubling of inference speed on Llama 2, a 70 billion-parameter large language model (LLM), compared to the H100.
NVIDIA’s partner ecosystem comprises server makers like ASRock Rack, ASUS, Dell Technologies, and others, who are set to update their systems with H200.
The H200 platform, bolstered by NVIDIA NVLink and NVSwitch high-speed interconnects, delivers optimal performance for various workloads, including LLM training and inference for models exceeding 175 billion parameters. An eight-way HGX H200 system offers over 32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory, setting a new standard in generative AI and HPC applications.
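The aggregate figures above follow directly from the per-GPU specifications. A quick sketch to make the arithmetic explicit (per-GPU numbers are from the article; the aggregation is simple multiplication):

```python
# Back-of-the-envelope check of the eight-way HGX H200 figures quoted above.

GPUS = 8
MEM_PER_GPU_GB = 141    # HBM3e capacity per H200
BW_PER_GPU_TBPS = 4.8   # memory bandwidth per H200

aggregate_mem_tb = GPUS * MEM_PER_GPU_GB / 1000   # ~1.13 TB, i.e. the "1.1TB" above
aggregate_bw_tbps = GPUS * BW_PER_GPU_TBPS        # 38.4 TB/s across the board

print(f"Aggregate HBM3e: {aggregate_mem_tb:.2f} TB")
print(f"Aggregate bandwidth: {aggregate_bw_tbps:.1f} TB/s")
```

Note the quoted 32 petaflops of FP8 compute is likewise an eight-way aggregate of per-GPU throughput, not a single-GPU figure.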
In combination with NVIDIA Grace CPUs, the H200 forms the GH200 Grace Hopper Superchip with HBM3e, an integrated module tailored to large-scale HPC and AI applications.
NVIDIA’s accelerated computing platform is complemented by a suite of powerful software tools, including the NVIDIA AI Enterprise suite, facilitating the development and acceleration of applications ranging from AI to HPC.
The NVIDIA H200 is expected to become available from global system manufacturers and cloud service providers starting in the second quarter of 2024.
The company also plans to launch new chips for the Chinese market following export restrictions imposed by the US.