NVIDIA’s DGX A100: Unprecedented Performance

NVIDIA’s new DGX A100 offers a substantial increase in performance over previous generations. Built on the new Ampere GPU architecture, it delivers 5 petaFLOPS of AI performance from a single eight-GPU system. Coupled with AMD Rome processors and Mellanox networking, the DGX A100 is a universal system for all AI workloads. Its eight (8) A100 Tensor Core GPUs provide a total of 320GB of GPU memory for training on AI datasets, and they are connected through NVSwitches and Mellanox ConnectX-6 networking.
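As a rough check on the headline numbers, the system totals can be rebuilt from per-GPU figures. This is a minimal sketch: the 40GB per GPU follows from the 320GB total quoted above, while the 624 TFLOPS per GPU is an assumption taken from NVIDIA's published peak FP16 Tensor Core rate with sparsity and is not stated in this article.

```python
# Per-GPU figures. The 624 TFLOPS value is an assumption (NVIDIA's published
# A100 peak FP16 Tensor Core rate with sparsity), not a number from this article.
GPUS_PER_SYSTEM = 8
MEMORY_PER_GPU_GB = 40
PEAK_AI_TFLOPS_PER_GPU = 624


def system_totals(gpus, memory_gb, tflops):
    """Return (total GPU memory in GB, total AI performance in petaFLOPS)."""
    total_memory_gb = gpus * memory_gb
    total_pflops = gpus * tflops / 1000  # 1 petaFLOPS = 1000 TFLOPS
    return total_memory_gb, total_pflops


memory_gb, pflops = system_totals(
    GPUS_PER_SYSTEM, MEMORY_PER_GPU_GB, PEAK_AI_TFLOPS_PER_GPU
)
print(f"Total GPU memory: {memory_gb} GB")
print(f"Peak AI performance: {pflops:.2f} petaFLOPS")
```

Under that assumption the total comes out just under 5 petaFLOPS, which suggests the headline figure is a rounded theoretical peak rather than a sustained rate.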


The DGX A100 is powered by AMD Rome 7742 CPUs, 1TB of system memory, and over 15TB of storage. The full specifications can be seen below.

Source: NVIDIA

Comparing A100 vs V100 GPU

Compared to the V100 GPU, the A100 GPU offers a dramatic increase in performance. The A100 delivers 19.5 TFLOPS of double-precision performance when using its FP64 Tensor Cores (9.7 TFLOPS without them), compared to 7.8 TFLOPS in the V100. It also has more CUDA cores, at 6912 vs 5120. Each A100 GPU has 40GB of GPU memory compared to 32GB for the V100, providing a significant increase in capacity for training and inference.
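To put the comparison in relative terms, the quoted figures can be turned into ratios. The sketch below only uses the numbers given above; the dictionary layout and the `ratio` helper are illustrative names, not part of any NVIDIA tool.

```python
# Figures as quoted in the comparison above (peak values, not measurements).
specs = {
    "A100": {"fp64_tflops": 19.5, "cuda_cores": 6912, "memory_gb": 40},
    "V100": {"fp64_tflops": 7.8, "cuda_cores": 5120, "memory_gb": 32},
}


def ratio(metric):
    """Return how many times larger the A100 value is than the V100 value."""
    return specs["A100"][metric] / specs["V100"][metric]


for metric in specs["A100"]:
    print(f"{metric}: {ratio(metric):.2f}x")
```

The double-precision figure grows the most (2.5x), while CUDA core count and memory grow by 1.35x and 1.25x respectively.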

DGX A100 Features

Multi-Instance GPU (MIG)

A new feature built into the A100 is GPU slicing, known as Multi-Instance GPU (MIG). It allows the user to partition each A100 into as many as seven instances, for up to fifty-six (56) instances across the eight (8) GPUs in a system. These instances can be used for training, inference, and any other task a data scientist may need.
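The instance count follows directly from the per-GPU limit: 56 instances across 8 GPUs means up to 7 per GPU. The sketch below makes that arithmetic explicit; the `gpuN-migM` labels are hypothetical names for illustration and are not the identifiers that NVIDIA's tools report.

```python
GPUS_PER_SYSTEM = 8
MAX_MIG_INSTANCES_PER_GPU = 7  # implied by 56 instances across 8 GPUs


def max_instances(gpus, per_gpu):
    """Return the maximum number of MIG instances in one system."""
    return gpus * per_gpu


def instance_names(gpus, per_gpu):
    """Return hypothetical labels, one per possible MIG instance."""
    return [f"gpu{g}-mig{i}" for g in range(gpus) for i in range(per_gpu)]


names = instance_names(GPUS_PER_SYSTEM, MAX_MIG_INSTANCES_PER_GPU)
print(max_instances(GPUS_PER_SYSTEM, MAX_MIG_INSTANCES_PER_GPU), "instances")
print(names[0], "...", names[-1])
```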

NVLink and NVSwitch

The GPUs are connected by third-generation NVLink and NVSwitch. NVLink provides direct GPU-to-GPU bandwidth of up to 600 GB/s, while the NVSwitch, which interconnects the GPUs inside the DGX A100, is twice as fast as the previous generation.


Also included in the DGX A100 are Mellanox ConnectX-6 VPI HDR InfiniBand/Ethernet adapters. These adapters provide up to 200 Gb/s across the network for large-scale AI workloads.
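The units matter when comparing the two interconnects: NVLink bandwidth is quoted in gigabytes per second, while HDR InfiniBand is rated in gigabits per second. The sketch below takes the adapter rate as 200 gigabits per second (the HDR line rate, treated here as an assumption) and estimates how long it would take to move 40GB, one GPU's worth of memory, over each link. These are theoretical peaks that ignore protocol overhead.

```python
NVLINK_GB_PER_S = 600      # gigabytes per second, GPU to GPU
NETWORK_GBIT_PER_S = 200   # gigabits per second, assumed HDR line rate per adapter
BITS_PER_BYTE = 8


def gbit_to_gbyte(gbit_per_s):
    """Convert a rate in gigabits per second to gigabytes per second."""
    return gbit_per_s / BITS_PER_BYTE


def transfer_seconds(size_gb, rate_gb_per_s):
    """Return the ideal time to move size_gb at the given rate."""
    return size_gb / rate_gb_per_s


network_gb_per_s = gbit_to_gbyte(NETWORK_GBIT_PER_S)
payload_gb = 40  # one A100's worth of GPU memory

print(f"Network adapter: {network_gb_per_s} GB/s")
print(f"40GB over NVLink: {transfer_seconds(payload_gb, NVLINK_GB_PER_S):.3f} s")
print(f"40GB over one adapter: {transfer_seconds(payload_gb, network_gb_per_s):.3f} s")
```

The gap between the two results is why traffic between GPUs in the same system stays on NVLink, with the network adapters reserved for scaling out across systems.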

New Generation of Technology


The arrival of this 6U system coincides with the end-of-life sales of the DGX-1 and DGX-2. NVIDIA has announced that the last date to order NVIDIA® DGX-1™, DGX-2™, and DGX-2H systems and Support Services SKUs is June 27, 2020. After that date, the DGX-1 and DGX-2 will continue to be supported by NVIDIA Engineering. The DGX A100, an all-in-one platform for AI development, comes with third-generation Tensor Cores, which can achieve up to 6X higher out-of-the-box performance with TF32 on AI training.

These Tensor Cores support a range of data types, including FP16, BF16, TF32, FP64, INT8, INT4, and Binary. In combination with NVIDIA GPU Cloud (NGC), there is seamless integration with training scripts, models, Helm charts, and HPC applications that can be quickly installed and run.
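Of the data types listed, TF32 is the one new to this generation. It keeps the 8-bit exponent of FP32 but only 10 mantissa bits, so it covers the same range with less precision. The sketch below emulates that precision loss in plain Python by clearing the low 13 mantissa bits of a 32-bit float. It is an illustration only: `to_tf32` is a hypothetical helper, and it truncates, which may differ from the rounding the hardware applies.

```python
import struct

FP32_MANTISSA_BITS = 23
TF32_MANTISSA_BITS = 10


def to_tf32(value):
    """Truncate a number to TF32 precision (sketch: truncation, not hardware rounding)."""
    # Reinterpret the value as the raw bits of a 32-bit float.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Clear the 13 low mantissa bits that TF32 does not keep.
    dropped = FP32_MANTISSA_BITS - TF32_MANTISSA_BITS
    mask = (0xFFFFFFFF << dropped) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


print(to_tf32(1.0 + 2**-10))  # smallest step above 1.0 that TF32 can hold
print(to_tf32(1.0 + 2**-11))  # too fine for TF32, collapses to 1.0
print(to_tf32(1e30))          # large magnitudes survive thanks to the 8-bit exponent
```

The second call shows the trade-off: differences smaller than about one part in a thousand are lost, which is typically acceptable for neural network training but not for workloads that need full FP32 or FP64 precision.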

The NVIDIA DGX A100 is now available via our DGX POC Program. Thanks to NVIDIA’s hardware and software stack, one can start training models, improving accuracy, or running any AI workload shortly after setting one up.

Ben Siegel is Groupware’s Technical Support Engineer for Data & AI.