MLPerf Client Archives - MLCommons
https://mlcommons.org/category/mlperf-client/

MLCommons Releases MLPerf Client v1.0: A New Standard for AI PC and Client LLM Benchmarking
July 30, 2025 | https://mlcommons.org/2025/07/mlperf-client-v1-0/

Standardizing AI PC Performance: MLCommons Releases MLPerf Client v1.0 with Expanded Models, Prompts, and Hardware Support

MLCommons®, the consortium behind the industry-standard MLPerf® benchmarks, today announced the release of MLPerf Client v1.0, a benchmark that sets a new standard for measuring the performance of large language models (LLMs) on PCs and other client-class systems. This release marks a major milestone in the effort to bring standardized, transparent AI performance metrics to the fast-emerging AI PC market.

MLPerf Client v1.0 introduces an expanded set of supported models, including Llama 2 7B Chat, Llama 3.1 8B Instruct, and Phi 3.5 Mini Instruct. It also adds Phi 4 Reasoning 14B as an experimental option to preview the next generation of high-reasoning-capable LLMs. These additions allow the benchmark to reflect real-world use cases across a broader range of model sizes and capabilities.

The benchmark also expands its evaluation scope with new prompt categories. These include structured prompts for code analysis and experimental long-context summarization tests using roughly 4,000- and 8,000-token inputs, representing workloads increasingly relevant to both developers and advanced users.
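
The long-context tests are defined by approximate input length in tokens rather than characters or words. As a rough illustration of how a document might fall into one of these buckets, here is a minimal sketch using the Hugging Face transformers tokenizer as a stand-in; the tokenizer choice and the exact thresholds are illustrative assumptions, not part of the benchmark:

    # Sketch: classify a prompt by approximate token length, mirroring the
    # roughly 4,000- and 8,000-token long-context buckets described above.
    # The tokenizer and thresholds are illustrative assumptions, not
    # MLPerf Client's actual pipeline.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

    def context_bucket(document: str) -> str:
        n_tokens = len(tokenizer.encode(document))
        if n_tokens >= 8000:
            return f"~8K-token summarization input ({n_tokens} tokens)"
        if n_tokens >= 4000:
            return f"~4K-token summarization input ({n_tokens} tokens)"
        return f"standard-length input ({n_tokens} tokens)"

    print(context_bucket(open("report.txt").read()))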

Hardware and platform support have also expanded significantly. MLPerf Client v1.0 now supports AMD NPUs and GPUs working together via the ONNX Runtime and the Ryzen AI SDK. Intel NPUs and GPUs are supported through OpenVINO. GPUs from AMD, Intel, and NVIDIA are supported across the board through the ONNX Runtime GenAI with DirectML, offering wide compatibility for GPU-equipped systems. Qualcomm Technologies NPUs and CPUs are supported in hybrid operation using Qualcomm Genie and the QAIRT SDK. Also, Apple Mac GPUs are supported through MLX.

Additionally, the benchmark offers early, experimental support for several other acceleration paths: Intel NPUs and GPUs via Microsoft Windows ML using the OpenVINO execution provider, NVIDIA GPUs via llama.cpp with CUDA, and Apple Mac GPUs via llama.cpp with Metal.

For users and developers, MLPerf Client v1.0 provides both command-line and graphical user interfaces. The newly developed GUI offers intuitive, cross-platform benchmarking with key usability enhancements, such as real-time readouts of compute and memory usage, persistent results history, comparison tables across test runs, and CSV exports for offline analysis. The CLI enables easy automation and scripting for regression testing or large-scale evaluations.
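
As an illustration of the kind of scripting the CLI enables, the sketch below drives repeated benchmark runs and aggregates their CSV output. Note that the executable name, flags, and CSV column used here are hypothetical placeholders, not MLPerf Client's documented interface; consult the actual documentation before adapting it.

    # Sketch: automating repeated benchmark runs for regression testing.
    # NOTE: "mlperf-client", its flags, and the "tokens_per_second" CSV
    # column are hypothetical placeholders, not the real CLI interface.
    import csv
    import subprocess

    RUNS = 3
    tps_values = []
    for i in range(RUNS):
        subprocess.run(
            ["mlperf-client", "--config", "llama2.json", "--output", f"run{i}.csv"],
            check=True,  # raise if the benchmark exits with an error
        )
        with open(f"run{i}.csv", newline="") as f:
            for row in csv.DictReader(f):
                if "tokens_per_second" in row:
                    tps_values.append(float(row["tokens_per_second"]))

    if tps_values:
        print(f"mean TPS over {RUNS} runs: {sum(tps_values) / len(tps_values):.2f}")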

MLPerf Client v1.0 is the result of collaboration among major industry stakeholders, including AMD, Intel, Microsoft, NVIDIA, Qualcomm Technologies, and leading PC OEMs. The benchmark is available now as an open and free download from mlcommons.org, and it will continue to evolve alongside the rapidly growing AI PC ecosystem.

“MLPerf Client v1.0 is a major step forward for benchmarking AI capabilities on consumer systems,” said Ramesh Jaladi, co-chair of the MLPerf Client working group at MLCommons. “It provides a reliable, vendor-neutral standard that OEMs, silicon providers, reviewers, and end users can trust.”

About MLCommons

MLCommons is an open engineering consortium with a mission to make machine learning better for everyone. The organization produces industry-leading benchmarks, datasets, and best practices that span the full range of ML applications—from massive cloud training to resource-constrained edge devices. Its MLPerf benchmark suite has become the de facto standard for evaluating AI performance.

Learn more at www.mlcommons.org.

MLCommons Releases MLPerf Client v0.6 With Expanded Hardware Support for AI PCs
April 28, 2025 | https://mlcommons.org/2025/04/mlperf-client-v0-6/

MLCommons' MLPerf Client v0.6: First Open Benchmark for NPU and GPU AI PCs

Today, MLCommons® is announcing the release of MLPerf® Client v0.6, an update to the MLPerf Client consumer AI performance benchmark. This release extends support to a broader range of hardware and platforms, including AI PCs with dedicated neural processing units (NPUs), while enhancing usability.

MLPerf Client v0.6 builds on the foundation established by version 0.5, which debuted with LLM-focused performance tests using the Llama 2 7B model from Meta. With this latest release, MLCommons continues its mission to provide a transparent, standardized, and vendor-neutral way to measure AI performance across a growing range of PC hardware.

MLPerf Client represents a collaboration among leaders in the consumer computing space, including AMD, Intel, Microsoft, NVIDIA, Qualcomm Technologies, Inc., and top PC OEMs. These stakeholders have pooled resources and expertise to create a standardized performance benchmark for key consumer AI workloads.

Key Updates in v0.6:

  • Expanded Hardware Support:
    New support for NPUs from Intel, alongside continued GPU acceleration via AMD, Intel, and NVIDIA hardware. This milestone makes MLPerf Client the first open benchmark to span both GPU and NPU acceleration on consumer platforms.
  • Improved Device Selection:
    New device enumeration options help users better target and test systems with multiple capable accelerators, such as PCs equipped with multiple GPUs (see the illustrative sketch after this list).
  • Updated Software Stack:
    Includes the latest versions of ONNX Runtime, ONNX Runtime GenAI, and Intel OpenVINO, offering performance and compatibility improvements across supported platforms.
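
For a feel of what device enumeration involves at the runtime level, the sketch below lists the ONNX Runtime execution providers available on a machine. This is an illustration using ONNX Runtime's public Python API, not MLPerf Client's own device-selection code.

    # Sketch: list the ONNX Runtime execution providers available locally.
    # Providers map to acceleration paths, e.g. "DmlExecutionProvider"
    # (DirectML), "CUDAExecutionProvider", "OpenVINOExecutionProvider",
    # and the "CPUExecutionProvider" fallback.
    import onnxruntime as ort

    for provider in ort.get_available_providers():
        print(" -", provider)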

“MLPerf Client v0.6 reflects the rapid evolution of the AI PC landscape with the inclusion of NPU evaluation,” said Yannis Minadakis, co-chair of the MLPerf Client working group. “With expanded support and more flexible testing, it’s now easier than ever for the industry and consumers alike to evaluate real-world AI performance on next-generation devices.”

MLPerf Client v0.6 is available now as a free download at mlcommons.org.

About MLCommons

MLCommons is the world’s leader in AI benchmarking. An open engineering consortium supported by over 125 members and affiliates, MLCommons has a proven record of bringing together academia, industry, and civil society to measure and improve AI. The foundation for MLCommons began with the MLPerf benchmarks in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. Since then, MLCommons has continued using collective engineering to build the benchmarks and metrics required for better AI – ultimately helping to evaluate and improve AI technologies’ accuracy, safety, speed, and efficiency.

MLPerf Client participation requires an MLCommons membership. For more information and details on becoming a member, please visit MLCommons.org or contact participation@mlcommons.org.

MLCommons Introduces MLPerf Client v0.5
December 11, 2024 | https://mlcommons.org/2024/12/mlc-mlperf-client-v0-5/

A New Benchmark for Consumer AI Performance

MLCommons®, the leading open engineering consortium dedicated to advancing machine learning (ML), is excited to announce the public release of the MLPerf® Client v0.5 benchmark. This benchmark sets a new standard for evaluating consumer AI performance, enabling users, press, and the industry to measure how effectively laptops, desktops, and workstations can run cutting-edge large language models (LLMs).

A Collaborative Effort by Industry Leaders

MLPerf Client represents a collaboration among technology leaders, including AMD, Intel, Microsoft, NVIDIA, Qualcomm Technologies, Inc., and top PC OEMs. These stakeholders have pooled resources and expertise to create a standardized benchmark, offering new insight into performance on key consumer AI workloads.

“MLPerf Client is a pivotal step forward in measuring consumer AI PC performance, bringing together industry heavyweights to set a new standard for evaluating generative AI applications on personal computers,” said David Kanter, Head of MLPerf at MLCommons.

Key Features of MLPerf Client v0.5

  • AI model: The benchmark’s tests are based on Meta’s Llama 2 7B large language model, optimized for reduced memory and computational requirements via 4-bit integer quantization.
  • Tests and metrics: Includes four AI tasks—content generation, creative writing, and text summarization at two different document lengths—evaluated using familiar metrics such as time-to-first-token (TTFT) and tokens-per-second (TPS); both metrics are illustrated in the sketch after this list.
  • Hardware optimization: Supports hardware-accelerated execution on integrated and discrete GPUs via two distinct paths: ONNX Runtime GenAI and Intel OpenVINO.
  • Platform support: This initial release supports Windows 11 on x86-64 systems, with future updates planned for Windows on Arm and macOS.
  • Freely accessible: The benchmark is freely downloadable from MLCommons.org, empowering anyone to measure AI performance on supported systems.
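
To make the two headline metrics concrete, the sketch below shows one way TTFT and TPS can be computed from per-token timestamps, together with the back-of-envelope memory arithmetic behind 4-bit quantization. It mirrors the metric definitions in spirit only; measuring decode-phase TPS after the first token is an assumption here, not MLPerf Client's published methodology.

    # Sketch: compute time-to-first-token (TTFT) and tokens-per-second (TPS)
    # from a token stream. Illustrative only, not MLPerf Client's code.
    import time

    def measure(token_stream):
        """token_stream: any iterable yielding one generated token at a time."""
        start = time.perf_counter()
        first = None
        n_tokens = 0
        for _ in token_stream:
            if first is None:
                first = time.perf_counter()  # first token has arrived
            n_tokens += 1
        end = time.perf_counter()
        if first is None or n_tokens < 2:
            raise ValueError("need at least two tokens to compute decode TPS")
        ttft = first - start
        # Assumed convention: TPS over the decode phase, after the first token.
        tps = (n_tokens - 1) / (end - first)
        return ttft, tps

    # Back-of-envelope weight footprint for a 4-bit-quantized 7B model:
    # 7e9 parameters * 0.5 bytes = 3.5 GB, versus ~14 GB at FP16
    # (weights only; KV cache and activations are extra).
    print(f"approx 4-bit weight footprint: {7e9 * 0.5 / 1e9:.1f} GB")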

Future Development

While version 0.5 marks the benchmark’s debut, MLCommons plans to expand its capabilities in future releases, including support for additional hardware acceleration paths and a broader set of test scenarios incorporating a range of AI models.

Availability

MLPerf Client v0.5 is available for download now from MLCommons.org. See the website for additional details on the benchmark’s hardware and software support requirements. 

For more information and details on becoming a member, please visit MLCommons.org or contact participation@mlcommons.org.

Announcing the New MLPerf Client Working Group
January 24, 2024 | https://mlcommons.org/2024/01/mlperfclientwg/

New MLCommons effort will build ML benchmarks for desktops, laptops, and workstations running Microsoft Windows and other operating systems.

Today we are announcing the formation of a new MLPerf™ Client working group. Its goal is to produce machine learning benchmarks for client systems such as desktops, laptops, and workstations based on Microsoft Windows and other operating systems. The MLPerf suite of benchmarks is the gold standard for AI benchmarks in the data center, and we are now bringing our collaborative, community-focused development approach and deep technical understanding of machine learning (ML) to the creation of a benchmark suite for consumer client systems.

As the impact of AI grows and offers new capabilities to everyone, it is increasingly an integral part of the computing experience. Silicon for client systems incorporates AI-specific hardware acceleration capabilities of various types, and OS and application vendors are adding AI-driven features into software to boost productivity and to unleash the creativity of millions of end users. As these hardware and software capabilities proliferate, many ML models will execute locally on client systems. The industry will require reliable, standard ways to measure the performance and efficiency of AI acceleration solutions on client systems.

The MLPerf Client benchmarks will be scenario-driven, focusing on real end-user use cases and grounded in feedback from the community. The first benchmark will focus on a large language model, specifically the Llama 2 LLM. Through incorporating Llama 2-based workloads into the MLCommons training and inference benchmark suites, the MLCommons community has already navigated many of the challenges LLMs present, such as balancing performance against output quality, licensing issues involving datasets and models, and safety concerns. That experience will help jump-start this new client work.

Initial MLPerf Client working group participants include representatives from AMD, Arm, ASUSTeK, Dell Technologies, Intel, Lenovo, Microsoft, NVIDIA, and Qualcomm Technologies, Inc., among others.

“The time is ripe to bring MLPerf to client systems, as AI is becoming an expected part of computing everywhere,” said David Kanter, Executive Director at MLCommons®. “Large language models are a natural and exciting starting point for our MLPerf Client working group. We look forward to teaming up with our members to bring the excellence of MLPerf into client systems and drive new capabilities for the broader community.”

We’re happy to announce that Ramesh Jaladi, Senior Director of Engineering in the IP Performance group at Intel; Yannis Minadakis, Partner GM, Software Development at Microsoft; and Jani Joki, Director of Performance Benchmarking at NVIDIA have agreed to serve as co-chairs of the MLPerf Client working group. Additionally, Vinesh Sukumar, Senior Director, AI/ML Product Management at Qualcomm, has agreed to lead a benchmark development task force within the working group.

“Good measurements are the key to advancing AI acceleration,” said Jaladi. “They allow us to set targets, track progress, and deliver improved end-user experiences in successive product generations. The whole industry benefits when benchmarks are well aligned with customer needs, and that’s the role we expect the MLPerf Client suite to play in consumer computing.”

“Microsoft recognizes the need for quality benchmarking tools tailored to the AI acceleration capabilities of Windows client systems, and we welcome the opportunity to collaborate with the MLCommons community to tackle this challenge,” said Minadakis.

“The MLPerf benchmarks have served as a measuring stick for substantial advances in machine learning performance and efficiency in data center solutions,” said Joki. “We look forward to contributing to the creation of benchmarks that will serve a similar role in client systems.”

“Qualcomm is proud to advance the client ecosystem and looks forward to the innovative benchmarks that this MLPerf Working Group will establish for machine learning,” said Sukumar. “Benchmarks remain an important tool in the development and fine tuning of silicon, and MLCommons’ focus on end-user use cases will be key to on-device AI testing.”

We encourage all interested parties to participate in our effort. For more information on the MLPerf Client working group, including information on how to join and contribute to the benchmarks, please visit the working group page or contact the chairs via email at client-chairs@mlcommons.org.
