Nov 26, 2024 3 min read

NVIDIA GB200 NVL72 Liquid-to-Air Rack Solutions

AMAX LiquidMax® ALC-B4872 GB200 NVL72 AI POD

Overview

The LiquidMax® ALC-B4872 GB200 NVL72 AI POD is pre-configured with 36 Grace Blackwell Superchips featuring 72 NVIDIA Blackwell GPUs and 36 Grace CPUs, interconnected by 5th-generation NVLink. This advanced solution is designed to deliver cutting-edge performance for workloads such as:

  • Large Language Model (LLM) Inference
  • Retrieval-Augmented Generation (RAG)
  • High-speed data processing

With its scale-out, single-node NVIDIA MGX architecture, AMAX’s LiquidMax® ALC-B4872 GB200 NVL72 AI POD enables a wide variety of system designs and networking options to integrate into existing data center infrastructure.

In addition, the LiquidMax® ALC-B4872 GB200 NVL72 AI POD includes NVIDIA BlueField®-3 data processing units to enable cloud network acceleration, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds. For LLM inference workloads, the POD delivers up to 30x the performance of the same number of NVIDIA H100 Tensor Core GPUs, while reducing cost and energy consumption by up to 25x.

💡
Contact Us to learn how AMAX can power your next NVIDIA GB200 AI deployment.

POD Configuration

LiquidMax® ALC-B4872 GB200 NVL72 AI POD

Key Features

  • Up to 30x Performance Increase for LLM inference workloads
  • Up to 25x Reduction in Cost and Energy Consumption compared to NVIDIA H100
  • Scale-Out, Single-Node NVIDIA MGX Architecture for flexible deployment options
  • Efficient Liquid-to-Air (L2A) Cooling for high-density AI and HPC workloads

Per Rack

Component       Details
Compute Trays   18
Switch Trays    9
Cooling         Liquid-to-Air (L2A)

Per Tray

Component    Details
NVIDIA GPUs  4x NVIDIA Blackwell GPUs
CPUs         2x Grace CPUs
Memory       960 GB LPDDR5X + 288 GB HBM3e
Networking   2x NVIDIA ConnectX®-7 NICs (400GbE) + 2x BlueField®-3 DPU NICs (400GbE)
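The rack totals quoted elsewhere in this article follow directly from the per-rack and per-tray counts above; a minimal sketch of the arithmetic:

```python
# Rack-level totals derived from the per-tray configuration above.
COMPUTE_TRAYS_PER_RACK = 18
GPUS_PER_TRAY = 4   # NVIDIA Blackwell GPUs per compute tray
CPUS_PER_TRAY = 2   # Grace CPUs per compute tray

gpus_per_rack = COMPUTE_TRAYS_PER_RACK * GPUS_PER_TRAY   # 72 Blackwell GPUs
cpus_per_rack = COMPUTE_TRAYS_PER_RACK * CPUS_PER_TRAY   # 36 Grace CPUs
superchips_per_rack = cpus_per_rack                      # 1 Grace CPU per GB200 superchip

print(gpus_per_rack, cpus_per_rack, superchips_per_rack)  # 72 36 36
```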

Grace Blackwell Superchip Performance

  • 30x LLM Inference vs. NVIDIA H100 Tensor Core GPU
  • 4x LLM Training vs. H100
  • 25x Energy Efficiency vs. H100
  • 18x Data Processing vs. CPU

Blackwell GPU Performance

Metric                  Per Rack       Per Superchip
FP4 Tensor Core         1,440 PFLOPS   40 PFLOPS
FP8/FP6 Tensor Core     720 PFLOPS     20 PFLOPS
INT8 Tensor Core        720 POPS       20 POPS
FP16/BF16 Tensor Core   360 PFLOPS     10 PFLOPS
TF32 Tensor Core        180 PFLOPS     5 PFLOPS
FP32                    6,480 TFLOPS   180 TFLOPS
FP64                    3,240 TFLOPS   90 TFLOPS

Grace CPU Specifications

Metric             Per Rack                       Per Superchip
CPU Core Count     2,592 Arm® Neoverse V2 cores   72 cores
CPU Memory         Up to 17 TB LPDDR5X            Up to 480 GB LPDDR5X
Memory Bandwidth   Up to 18.4 TB/s                Up to 512 GB/s

Per Rack Memory and Bandwidth

  • Up to 13.5 TB HBM3e with bandwidth up to 576 TB/s
  • NVLink Bandwidth: Up to 130 TB/s
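The per-rack figures above are consistent with the per-unit columns once you scale by the 36 Grace Blackwell Superchips in a rack (each superchip pairs one Grace CPU with two Blackwell GPUs). A minimal sketch of the scaling:

```python
# Scale per-superchip figures up to the full rack (36 Grace Blackwell
# Superchips per GB200 NVL72 rack; each superchip = 1 Grace CPU + 2 GPUs).
SUPERCHIPS_PER_RACK = 36

fp4_pflops_rack = SUPERCHIPS_PER_RACK * 40        # FP4 Tensor Core: 1,440 PFLOPS
fp64_tflops_rack = SUPERCHIPS_PER_RACK * 90       # FP64: 3,240 TFLOPS
cpu_cores_rack = SUPERCHIPS_PER_RACK * 72         # 2,592 Arm Neoverse V2 cores
lpddr5x_tb_rack = SUPERCHIPS_PER_RACK * 480 / 1000   # ~17.3 TB LPDDR5X
mem_bw_tbs_rack = SUPERCHIPS_PER_RACK * 512 / 1000   # ~18.4 TB/s memory bandwidth
```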

Liquid-to-Air Cooling Solution

Liquid-to-Air Cooling Diagram

AMAX’s LiquidMax® ALC-B4872 GB200 NVL72 AI POD with Liquid-to-Air (L2A) Cooling provides an efficient thermal management solution for high-performance data centers.

The system circulates liquid coolant to absorb heat from high performance components within the rack. This heat is then transferred to the air via a sidecar cooling unit, which expels the hot air into the Computer Room Air Conditioner (CRAC) for final dissipation. This efficient design provides a cost-effective and scalable cooling solution for modern data centers.

Liquid-to-Air (L2A) CDU

The LiquidMax® ALC-B4872 GB200 NVL72 AI POD uses an innovative Liquid-to-Air (L2A) cooling solution:

  • Cooling Capacity: Up to 240 kW for 2-4 racks
  • No need for facility liquid integration
  • Scalable and cost-effective for modern data centers

Specification       CDU Option 1   CDU Option 2
Power Consumption   11.32 kVA      11.32 kVA
Racks Managed       Up to 2        Up to 4
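The CDU capacity can be sanity-checked against expected rack heat load. A rough sketch, where the ~120 kW per-rack figure is an assumption based on commonly cited GB200 NVL72 power budgets (it is not stated in this article):

```python
# Rough CDU sizing check. RACK_POWER_KW is an assumed figure (~120 kW per
# GB200 NVL72 rack, not from this article); the 240 kW capacity is from
# the CDU specification above.
CDU_CAPACITY_KW = 240
RACK_POWER_KW = 120  # assumption

def cdu_can_cool(num_racks: int) -> bool:
    """Return True if the total rack heat load fits within CDU capacity."""
    return num_racks * RACK_POWER_KW <= CDU_CAPACITY_KW

print(cdu_can_cool(2))  # True: a 240 kW load matches the 240 kW capacity
```

Under this assumption, supporting four racks on one CDU (Option 2) implies either a lower per-rack power budget or partial heat rejection elsewhere.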

Why Choose AMAX?

AMAX combines innovative engineering with industry-leading hardware such as the NVIDIA GB200 NVL72 to deliver scalable solutions tailored to your unique needs. From design to deployment, AMAX provides expert support every step of the way.

  • 10+ Years Experience with HPC Liquid Cooling
  • On-Site Installation & Cluster Bring-up
  • Network Topology 
  • Bi-directional Logistics
  • Testing & Validation
  • Maintenance & Upgrades
  • Troubleshooting & Repair Services