MLCommons Releases Latest MLPerf AI Inference Benchmark Results

Today, MLCommons released the latest results of its MLPerf Inference benchmark test, which compares the speed of artificial intelligence systems from different hardware makers.

MLCommons is an industry group that develops free AI tools. As part of that work, it runs benchmark tests that compare the speed of AI-optimized hardware systems. Data center operators use these benchmarks to compare products from different suppliers before they buy new hardware.

The results released today come from the latest round of MLPerf Inference, a benchmark designed to measure how well data center systems perform inference, which is the task of running a trained AI model.

More than 20 companies took part in the latest round. They included Intel Corp., the largest maker of data center chips, and Nvidia Corp., the largest maker of data center graphics processing units.

Participants measured the speed of their AI systems by running inference with six neural networks. Each network targets a different use case: image classification, object detection, medical image segmentation, speech-to-text, language processing, and e-commerce recommendations.

Participants in this round submitted 5,300 individual performance results, 37% more than in the previous round. They also submitted 2,400 measurements of how much electricity their systems used while performing inference.
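For a rough sense of scale, the 37% increase can be turned into an implied count for the previous round. The sketch below is only back-of-the-envelope arithmetic: it assumes the 37% figure is measured against the previous round's total and treats both published numbers as exact, even though they are probably rounded. The function name is made up for this example.

```python
def implied_previous_count(current: int, pct_increase: float) -> float:
    """Return the earlier total implied by a percentage increase.

    If current = previous * (1 + pct_increase / 100), then
    previous = current / (1 + pct_increase / 100).
    """
    return current / (1 + pct_increase / 100)


# Figures reported for this round: 5,300 results, 37% more than last round.
previous = implied_previous_count(5300, 37)
print(f"Implied previous-round results: about {previous:.0f}")
```

Because the inputs are rounded, the output is an estimate of the previous round's size, not an official MLCommons figure.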

The H100, Nvidia’s flagship data center GPU, set a number of performance records in the test. The H100 can perform some inference tasks up to 30 times faster than Nvidia’s previous flagship data center GPU, the A100. It has more than 80 billion transistors and a number of machine learning optimizations that weren’t available in the company’s earlier products.

“When NVIDIA H100 Tensor Core GPUs made their debut on the MLPerf industry-standard AI benchmarks, they set world records in inference on all workloads, delivering up to 4.5 times more performance than previous-generation GPUs,” Dave Salvator, a senior product marketing manager at Nvidia, wrote in a blog post today. “The H100, also known as Hopper, set a new standard for performance per accelerator for all six neural networks in the round.”

The H100’s largest performance gain over Nvidia’s previous-generation flagship GPU came when running the BERT-large neural network. BERT-large is a neural network designed for natural language processing. It is based on the Transformer architecture, a popular approach to building AI models in that field.

The H100 includes a module designed specifically to run AI models based on the Transformer architecture. According to Nvidia, the module reduces the amount of data a neural network has to process to produce a result. The less data a neural network has to process for a calculation, the faster it can produce that result.
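Nvidia’s description above does not spell out how the module reduces the data a network processes. One widely used way to shrink that data, assumed here purely for illustration, is to store numbers at lower precision. The sketch below uses only Python’s standard library and made-up weight values. It shows the general idea, not the H100’s actual mechanism.

```python
import struct

# Hypothetical model weights, chosen only for illustration.
weights = [0.5, -1.25, 3.0, 0.125]

# Pack the same values as 32-bit floats ('f') and as 16-bit floats ('e').
as_fp32 = struct.pack(f"<{len(weights)}f", *weights)
as_fp16 = struct.pack(f"<{len(weights)}e", *weights)

# Halving the bits per value halves the bytes that must be moved and read.
print(f"32-bit: {len(as_fp32)} bytes, 16-bit: {len(as_fp16)} bytes")

# These particular values survive the round trip exactly; in general,
# lower precision trades some accuracy for less data.
restored = list(struct.unpack(f"<{len(weights)}e", as_fp16))
print(restored == weights)
```

The values here were picked so that they are exactly representable in 16 bits. Real model weights usually lose a little precision in the conversion, which is why hardware and software that use this approach have to manage accuracy carefully.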

Nvidia tested more than just the H100 in this round of MLPerf Inference. The company also measured the speed of its Jetson Orin system-on-chip, a low-power processor designed for robots. The processor was five times faster than Nvidia’s best previous-generation product while using only half as much power.
