improvements in performance and efficiency. These workloads span the rapidly growing generative AI market, enterprise inferencing, product design, visualization, and the intelligent edge. Supermicro has built a portfolio of workload-optimized systems that deliver optimal GPU performance and efficiency across this broad spectrum of workloads.
Use Cases
• Large Language Models (LLMs)
• Autonomous Driving Training
• Recommender Systems
Opportunities and Challenges
• Continuous growth of data set size
• High-performance requirements across the board: GPUs, memory, storage, and network fabric
• A pooled GPU memory space large enough to fit large AI models, plus the interconnect bandwidth for fast training
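To make the memory-pooling point above concrete, a back-of-the-envelope sketch (illustrative numbers only, not a sizing guide) of whether a model's weights fit in the pooled memory of an 8-GPU HGX H100 system, assuming FP16 weights at 2 bytes per parameter and 80GB of HBM per H100 GPU:

```python
# Back-of-the-envelope estimate: do a model's weights fit in pooled GPU memory?
# Assumptions (illustrative): FP16 weights = 2 bytes/parameter,
# HGX H100 8-GPU system with 80 GB of HBM per GPU = 640 GB pooled.

def params_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the model weights alone, in GB."""
    return num_params * bytes_per_param / 1e9

POOLED_GB = 8 * 80  # 8 GPUs x 80 GB each

for billions in (7, 70, 175):
    need = params_memory_gb(billions * 1e9)
    fits = need <= POOLED_GB
    print(f"{billions}B params: {need:.0f} GB of weights, "
          f"fits in {POOLED_GB} GB pool: {fits}")
```

Note that this counts weights only; training also needs optimizer state, gradients, and activations, typically several times the weight footprint, which is why fast interconnect bandwidth for sharding models across GPUs and nodes matters.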
Key Technologies
• NVIDIA HGX H100 SXM 8-GPU/4-GPU
• GPU-to-GPU interconnect (NVLink and NVSwitch), up to 900GB/s, 7x the bandwidth of PCIe 5.0
• Dedicated high-performance, high-capacity GPU memory
• High-throughput networking and storage per GPU, enabling NVIDIA GPUDirect RDMA and GPUDirect Storage
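The 7x interconnect figure above can be checked with simple arithmetic. A sketch (illustrative peak numbers: NVLink at 900GB/s per GPU, PCIe 5.0 x16 at roughly 128GB/s; real effective rates are lower) of the ideal time to move a large gradient buffer between GPUs:

```python
# Illustrative transfer-time comparison between GPU interconnects.
# Assumed peak bandwidths: NVLink/NVSwitch ~900 GB/s per H100 GPU,
# PCIe 5.0 x16 ~128 GB/s. Real-world effective rates are lower.

NVLINK_GBPS = 900.0
PCIE5_GBPS = 128.0

def transfer_ms(gigabytes: float, bandwidth_gbps: float) -> float:
    """Ideal time in milliseconds to move `gigabytes` at `bandwidth_gbps` GB/s."""
    return gigabytes / bandwidth_gbps * 1000.0

# Example payload: 140 GB of FP16 weights/gradients (a 70B-parameter model)
grad_gb = 140.0
print(f"NVLink:   {transfer_ms(grad_gb, NVLINK_GBPS):.0f} ms")
print(f"PCIe 5.0: {transfer_ms(grad_gb, PCIE5_GBPS):.0f} ms")
print(f"Speedup:  {NVLINK_GBPS / PCIE5_GBPS:.1f}x")
```

At these rates the NVLink ratio works out to roughly 7x, matching the figure quoted above; this per-hop gap compounds across the repeated gradient exchanges of distributed training.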
Solution Stack
• DL Frameworks: TensorFlow, PyTorch
• Transformers: BERT, GPT, Vision Transformer
• NVIDIA AI Enterprise Frameworks (NVIDIA NeMo, Metropolis, Riva, Morpheus, Merlin)
• NVIDIA Base Command (infrastructure software libraries, workload orchestration, cluster management)
• High performance storage (NVMe) for training cache
• Scale-out storage for raw data (data lake)