Built on the 7 nm process and based on the GA100 graphics processor, the card supports DirectX 12 Ultimate.

Organizations of all kinds are incorporating AI into their research, development, product, and business processes. To help organizations overcome these challenges and succeed in a world that urgently needs the power of AI to solve big problems, NVIDIA designed the world's first family of systems purpose-built for AI: NVIDIA DGX systems. The enterprise requires a platform for AI infrastructure that improves upon traditional approaches, which historically involved slow compute architectures siloed by analytics, training, and inference workloads.

The large model size of BERT requires a huge amount of memory, and each DGX A100 provides 320 GB of high-bandwidth GPU memory. NVIDIA interconnect technologies such as NVLink, NVSwitch, and Mellanox networking bring all GPUs together to work as one on large AI models, with high-bandwidth communication for efficient scaling. The third-generation Tensor Cores in the DGX A100 use larger matrix sizes, improving efficiency and delivering twice the performance of the V100 Tensor Cores, along with improved performance for INT4 and binary data types.

The most common methods of moving data to and from the GPU leverage the on-board storage and the Mellanox ConnectX-6 network adapters through remote direct memory access (RDMA). The system's drives use CacheFS to increase the speed at which workloads access data and to reduce network data transfers. MIG can be enabled selectively on any number of GPUs in the DGX A100 system; not all GPUs need to be MIG-enabled.

NVIDIA DGXperts complement your in-house AI expertise and let you combine an enterprise-grade platform with AI-fluent talent to achieve your organization's AI project goals.
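To give a sense of why memory capacity and reduced precision matter for a model like BERT, here is a minimal Python sketch estimating the weight footprint of a BERT-large-sized model at FP32 and FP16. The ~340 M parameter count is an assumed round figure for illustration, not an official specification.

```python
# Rough, illustrative estimate of model-weight memory at different
# precisions. The ~340M parameter count for BERT-large is an assumed
# approximation used for scale.

BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2}

def weight_memory_gb(num_params: int, precision: str) -> float:
    """Gigabytes needed just to hold the model weights."""
    return num_params * BYTES_PER_ELEMENT[precision] / 1e9

bert_large_params = 340_000_000  # assumed approximate count

for precision in ("fp32", "fp16"):
    gb = weight_memory_gb(bert_large_params, precision)
    print(f"{precision}: ~{gb:.2f} GB for weights alone")
```

The weights themselves are only a small fraction of the story: training also holds gradients, optimizer state, and activations, multiplying the footprint several times over, which is why 320 GB of aggregate GPU memory (and halving per-element size with mixed precision) matters for large-batch training.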

NVIDIA DGX A100 is the universal system for all AI workloads, offering unprecedented compute density, performance, and flexibility in the world's first 5-petaFLOPS AI system.

MIG uses spatial partitioning to carve the physical resources of a single A100 GPU into as many as seven independent GPU instances. Using the MIG capability of the A100 GPU, you can assign resources that are right-sized for specific workloads.

The NVIDIA A100 GPUs are connected to the PCI switch infrastructure over x16 PCI Express Gen 4 (PCIe Gen4) buses that provide 31.5 GB/s each, for a total of 252 GB/s, doubling the bandwidth of PCIe 3.0/3.1. The Mellanox ConnectX-6 I/O cards offer flexible connectivity, as they can be configured for HDR InfiniBand or 200 Gb/s Ethernet. The DGX A100 incorporates a one-to-one relationship between the I/O cards and the GPUs, which means each GPU can communicate directly with external sources without blocking other GPUs' access to the network. This provides the maximum amount of bandwidth for communication across GPUs over the links.

The first-generation Tensor Cores used in the NVIDIA DGX-1 with V100 provided accelerated performance with mixed-precision matrix multiply-accumulate (MMA) in FP16 and FP32.
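The partitioning and bandwidth figures above can be sanity-checked with a short Python sketch. The 1g.5gb profile numbers (seven 5 GB instances on a 40 GB A100) reflect the smallest A100-40GB MIG profile and are illustrative, not a complete list of profiles.

```python
# Sketch of the resource arithmetic behind MIG partitioning and the
# aggregate PCIe Gen4 bandwidth. Profile figures are illustrative,
# based on the A100-40GB 1g.5gb MIG profile.

A100_MEMORY_GB = 40          # per-GPU HBM on an A100-40GB
MAX_MIG_INSTANCES = 7        # smallest MIG profile (1g.5gb) count per GPU
MEMORY_PER_INSTANCE_GB = 5   # memory slice of a 1g.5gb instance

GPUS_PER_SYSTEM = 8
PCIE_GEN4_X16_GBPS = 31.5    # per-GPU x16 PCIe Gen4 bandwidth

# Seven 5 GB instances consume 35 GB of the 40 GB; the remainder
# is reserved by the GPU rather than handed to instances.
assert MAX_MIG_INSTANCES * MEMORY_PER_INSTANCE_GB <= A100_MEMORY_GB

aggregate_pcie = GPUS_PER_SYSTEM * PCIE_GEN4_X16_GBPS
print(f"Aggregate PCIe Gen4 bandwidth: {aggregate_pcie:.0f} GB/s")  # prints 252 GB/s
```

Because each of the eight GPUs gets its own x16 Gen4 link, the per-link 31.5 GB/s sums to the quoted 252 GB/s system total rather than being shared across GPUs.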

However, traditional compute infrastructures aren’t suitable for AI due to slow CPU architectures and varying system requirements for different workloads and project phases.