Benchmark Suite Results
MLPerf Automotive
The MLPerf Automotive benchmark suite measures the performance of computers intended for automotive use, covering both Advanced Driving Assistance System/Autonomous Driving (ADAS/AD) and In-Vehicle Infotainment (IVI) embedded systems. The main KPI is latency, because automotive systems are real-time and often functionally safe systems.
The MLPerf Automotive benchmark suite is a collaboration between MLCommons and the Autonomous Vehicle Compute Consortium (AVCC). The benchmark suite is based on Technical Reports (TR003, TR004, TR007) published by AVCC and is developed by MLCommons.
Results
MLCommons results are shown in an interactive table that you can explore. You can apply filters to see just the information you want and click across the top tabs to view the results visually. To see all result details, expand the columns by clicking on the “+” icon, which appears when you hover over “System Name” and subsequent columns.
Published results are sometimes modified or invalidated for various reasons. The change log contains information about changes made to any results after their initial publication.
Scenarios & Metrics
To enable representative testing of a wide variety of automotive platforms and use cases, MLPerf has defined two different scenarios as described below. A given scenario is evaluated by…
Scenario | Query Generation | Duration | Samples/query | Latency Constraint | Tail Latency | Performance Metric |
---|---|---|---|---|---|---|
Single Stream | LoadGen sends next query as soon as SUT completes the previous query | 6636 queries | 1 | None | 99.9% | 99.9%-ile measured latency |
Constant Stream | LoadGen sends a new query at a constant rate | 100,000 queries | 1 | Benchmark specific | 99.9% | 99.9%-ile measured latency |
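Both scenarios report the 99.9th-percentile measured latency. The sketch below shows one way to compute such a tail latency from a list of per-query latencies. It uses the nearest-rank percentile definition, which is an assumption made for illustration and not necessarily the exact method LoadGen uses; the function name `tail_latency` and its `per_mille` parameter are hypothetical.

```python
def tail_latency(latencies, per_mille=999):
    """Return the nearest-rank percentile of the given latencies.

    per_mille=999 corresponds to the 99.9th percentile. Integer
    arithmetic is used so the rank is not affected by float rounding.
    Assumption: nearest-rank definition, chosen for illustration only.
    """
    if not latencies:
        raise ValueError("need at least one latency sample")
    if not 0 < per_mille <= 1000:
        raise ValueError("per_mille must be in the range 1..1000")
    ordered = sorted(latencies)
    # 1-based rank: ceil(n * per_mille / 1000)
    rank = (len(ordered) * per_mille + 999) // 1000
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic latencies 1..6636, matching the Single Stream query count.
    samples = list(range(1, 6637))
    print(tail_latency(samples))
```

With 6636 samples the nearest rank is the 6630th sorted value, so only the six slowest queries lie above the reported figure. This is why a tail metric is far more sensitive to occasional stalls than a mean latency would be.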
Benchmarks
Each benchmark is defined by a Dataset and Quality Target. The following table summarizes the benchmarks in this version of the suite (the rules remain the official source of truth):
Area | Task | Model | Dataset | QSL Size | Quality | Latest Version Available |
---|---|---|---|---|---|---|
Perception | 2D object detection | SSD | Cognata | 128 | 99.9% of FP32 | v0.5 |
Perception | 2D semantic segmentation | DeepLabv3+ | Cognata | 128 | 99.9% of FP32 | v0.5 |
Perception | 3D object detection | BEVFormer-tiny | nuScenes | 256 | 99% of FP32 | v0.5 |
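The Quality column can be read as a threshold relative to the FP32 reference accuracy. A minimal sketch of that check follows, assuming the target is a simple multiplicative fraction of the reference score. The function name, the dictionary, and the accuracy numbers are hypothetical; the official rules define the actual accuracy metrics and procedure.

```python
# Fractions taken from the Quality column of the table above.
QUALITY_TARGETS = {
    "2D object detection": 0.999,
    "2D semantic segmentation": 0.999,
    "3D object detection": 0.99,
}


def meets_quality_target(task, measured, fp32_reference):
    """Return True if `measured` reaches the task's fraction of the FP32 score.

    Assumption: the target is measured >= fraction * fp32_reference.
    """
    return measured >= QUALITY_TARGETS[task] * fp32_reference


if __name__ == "__main__":
    # Hypothetical accuracy values, for illustration only.
    print(meets_quality_target("2D object detection", 0.6995, 0.70))
    print(meets_quality_target("2D object detection", 0.6980, 0.70))
    print(meets_quality_target("3D object detection", 0.6950, 0.70))
```

Under this reading, a quantized model may lose at most 0.1% of the reference score on the two 2D tasks and at most 1% on the 3D task.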
Each benchmark requires the following scenarios:
Area | Task | Required Scenarios |
---|---|---|
Perception | 2D object detection | Single Stream, Constant Stream |
Perception | 2D semantic segmentation | Single Stream, Constant Stream |
Perception | 3D object detection | Single Stream, Constant Stream |
Divisions
MLPerf aims to encourage innovation in software as well as hardware by allowing submitters to reimplement the reference implementations. There are two Divisions that allow different levels of flexibility during reimplementation:
- The Closed division is intended to compare hardware platforms or software frameworks “apples-to-apples” and requires using the same model as the reference implementation.
- The Open division is intended to foster innovation and allows using a different model or retraining.
Category
MLPerf divides benchmark results into Categories based on availability.
- Available systems contain only components that are available for purchase or for rent in the cloud.
- Preview systems must be submittable as Available in the next submission round.
- Research, Development, or Internal (RDI) systems contain experimental, in-development, or internal-use hardware or software.
The automotive categories focus on the level of maturity of the automotive system running the benchmark suite.
Category | Explanation | Auditable |
---|---|---|
Hardened System | Known as ECU; a general term for a computer inside a production vehicle | Yes |
Development System | Known as an EVM (EValuation Model); generally what SiPs provide | Yes |
Engineering Sample | Very early silicon, internal SoC R&D | No |
Submission Information
Each row in the results table is a set of results produced by a single submitter using the same software stack and hardware platform. Each Closed and Open division row contains the following information: