Announcing the results of the inaugural AlgoPerf: Training Algorithms benchmark competition
https://mlcommons.org/2024/08/mlc-algoperf-benchmark-competition/
Thu, 01 Aug 2024

Non-diagonal preconditioning has dethroned Nesterov Adam, and our self-tuning track has crowned a new state-of-the-art for completely hyperparameter-free training algorithms

We are thrilled to announce the results and winners of the first MLCommons® AlgoPerf: Training Algorithms benchmark competition, a competition designed to find better training algorithms that speed up neural network training across a diverse set of workloads.

The AlgoPerf: Training Algorithms Competition

To make building useful neural network models less time-consuming and costly, we need better training algorithms. The MLCommons Algorithms working group has developed the open-source AlgoPerf: Training Algorithms benchmark to measure how much faster neural networks can be trained through advancements in underlying training algorithms, such as better optimizers or more effective hyperparameter choices. 

The AlgoPerf: Training Algorithms benchmark evaluates the training time required by different training algorithms across multiple realistic deep learning workloads running on a fixed hardware configuration. To encourage generally useful methods, submissions must fully specify any required workload-specific tuning. Participants could choose to submit under two separate tuning rulesets: the external tuning ruleset, designed to simulate tuning with a limited amount of parallel resources, or the self-tuning ruleset, designed to simulate fully automated tuning on a single machine.
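In both rulesets, a submission is essentially a self-contained piece of training logic that the benchmark harness runs against fixed models, datasets, and hardware. As a rough sketch of what that interface can look like (the function names loosely mirror the public AlgoPerf submission template, but the signatures and the plain-AdamW example are simplified assumptions, not the official API):

```python
# Illustrative sketch only: simplified signatures, not the official AlgoPerf API.
import torch


def init_optimizer_state(model, hyperparameters):
    """Create whatever state the training algorithm needs (here: plain AdamW)."""
    return torch.optim.AdamW(
        model.parameters(),
        lr=hyperparameters["learning_rate"],
        weight_decay=hyperparameters["weight_decay"],
    )


def update_params(model, optimizer_state, batch, loss_fn):
    """One training step; the harness calls this repeatedly and measures the
    wall-clock time until the workload's validation target is reached."""
    optimizer_state.zero_grad()
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer_state.step()
    return optimizer_state
```

A submission scores well by making each call to its update function drive the model to the workload's validation target in as little wall-clock time as possible.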

Participation

The first iteration of the AlgoPerf: Training Algorithms competition attracted 18 submissions (with 15 being scorable) from 10 different teams. Scoring involved over 4,000 individual training runs across the 14 workloads used in the benchmark. Participants included researchers from Concordia University, ELLIS Tübingen, Google, Max Planck Institute for Intelligent Systems, Meta AI, Meta Platforms, Michigan State University, Mila, Samsung AI, UCLA, UT Austin, the University of Cambridge, the University of the West Indies, and the Vector Institute.

The submissions collectively explored many interesting techniques and implementation choices, including submissions using both of our supported frameworks, JAX and PyTorch. As required by the rules, all submissions are released publicly under an Apache 2.0 open-source license.

The Winners & Results

Congratulations to Aaron Defazio (Meta), Alice Yang (Meta), and Konstantin Mishchenko (Samsung AI) who came in first place in the self-tuning ruleset with their “Schedule Free AdamW” submission (see Table 2). For the external tuning ruleset (see Table 1, below), first place goes to the “Distributed Shampoo” submission of Hao-Jun Michael Shi, Tsung-Hsien Lee, Anna Cai, Shintaro Iwasaki, Wenyin Fu, Yuchen Hao, and Mike Rabbat (all Meta).

Four submissions in the external tuning ruleset beat the challenging prize-qualification baseline, thereby improving over the state-of-the-art training algorithm. The “Distributed Shampoo” submission delivers an impressive 28% faster model training compared to the baseline. “Schedule Free AdamW” was the only submission in the self-tuning ruleset to beat its prize-qualification baseline, providing 8% faster neural network training.

Congratulations to the winners and all participants for their contributions to advancing neural network training algorithms!

Score | Submission | Submitters | Institutions | Framework
0.78 | Shampoo Submission | Hao-Jun Shi, Tsung-Hsien Lee, Anna Cai, Shintaro Iwasaki, Wenyin Fu, Yuchen Hao, Mike Rabbat | Meta Platforms | PyTorch
0.71 | Schedule Free AdamW | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch
0.64 | Generalized Adam | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX
0.63 | Cyclic LR | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch
0.59 | NadamP | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX
0.57 | Prize Qualification Baseline | | |
0.49 | Amos | Ran Tian | Google DeepMind | JAX
0.47 | Caspr Adaptive | Sai Surya Duvvuri, Inderjit Dhillon, Cho-Jui Hsieh | UT Austin, Google, UCLA | JAX
0.37 | Lawa Queue | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch
0.34 | Lawa EMA | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch
0.00 | Schedule Free Prodigy | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch
Table 1: The external tuning leaderboard. All submissions to the external tuning ruleset, ranked by their benchmark score (first column). Benchmark scores are rounded to two significant digits; higher scores indicate faster training. Note that benchmark scores cannot be compared between the two rulesets.
Score | Submission | Submitters | Institutions | Framework
0.85 | Schedule Free AdamW | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch
0.82 | Prize Qualification Baseline | | |
0.33 | NadamW Sequential | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX
0.14 | sinv6_75 | Abhinav Moudgil | Mila, Concordia University | JAX
0.09 | sinv6 | Abhinav Moudgil | Mila, Concordia University | JAX
0.00 | AdamG | Yijiang Pang | Michigan State University | PyTorch
Table 2: The self-tuning leaderboard. All submissions to the self-tuning ruleset, ranked by their benchmark score (first column). Benchmark scores are rounded to two significant digits; higher scores indicate faster training. Note that benchmark scores cannot be compared between the two rulesets.
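Neither table spells out how the per-workload results are folded into the single benchmark score. A common aggregation for time-to-result benchmarks, and the approach described in the AlgoPerf technical report, is a performance profile: each submission is scored by the fraction of workloads it solves within a factor τ of the fastest submission, averaged over a range of τ. The following is a minimal sketch under made-up training times; it is not the exact scoring code or the parameters used for these leaderboards.

```python
import math

# Hypothetical per-workload training times in seconds; math.inf marks a workload
# where the submission never reached the validation target.
times = {
    "submission_a": [1200.0, 3400.0, math.inf, 800.0],
    "submission_b": [1500.0, 2900.0, 5000.0, 950.0],
}


def performance_profile_scores(times, max_ratio=4.0, num_points=100):
    """Score each submission by the area under its performance profile: the
    fraction of workloads it solves within a factor tau of the per-workload
    fastest time, averaged over tau in [1, max_ratio]. Higher is better."""
    num_workloads = len(next(iter(times.values())))
    best = [min(t[w] for t in times.values()) for w in range(num_workloads)]
    taus = [1.0 + i * (max_ratio - 1.0) / (num_points - 1) for i in range(num_points)]
    scores = {}
    for name, t in times.items():
        ratios = [t[w] / best[w] if best[w] < math.inf else math.inf
                  for w in range(num_workloads)]
        profile = [sum(r <= tau for r in ratios) / num_workloads for tau in taus]
        scores[name] = sum(profile) / num_points  # area under the profile curve
    return scores


print(performance_profile_scores(times))
```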

To receive a cash prize, the competition rules require that at least one other submission outperform the prize qualification baseline for the relevant ruleset, AND that none of the authors of that competing submission share an affiliation with either of the two MLCommons Algorithms working group chairs. This condition was met for the external tuning ruleset, so MLCommons will award a $25,000 cash prize for the first-place submission. Despite the outstanding performance of the first-place submission in the self-tuning ruleset, the prize requirement was not met there: several competing submissions had affiliations overlapping with the working group chairs, and the prize qualification baselines were quite difficult to beat. In designing the AlgoPerf: Training Algorithms benchmark competition, the working group's goal was, first and foremost, to ensure that any submission that performed well under our rules had achieved something truly impressive, and we are delighted that the first-place submissions in both rulesets produced such exceptional results.

To view the full results of the AlgoPerf: Training Algorithms competition, including the workload-specific performances of each submission, please visit the AlgoPerf results page. We plan to release a paper with a more in-depth discussion of the results after we are done analyzing them in detail.

The next steps for AlgoPerf

The first iteration of AlgoPerf: Training Algorithms demonstrated that neural network training can be accelerated significantly by improving the underlying training algorithms. This iteration was only the first step in driving innovation in machine learning algorithms. Now that we can reliably measure progress in training algorithms, we anticipate rapid progress in the field, both in new research and in better practical methods. The working group is already hard at work planning the future of the benchmark. If you are interested in shaping that future, in developing or scoring submissions, or in collaborating on research that builds on the benchmark, please consider joining the working group.

Acknowledgments

We extend our sincere thanks to Google for their generous support in providing computational resources to score and evaluate all submissions across the workloads. Our gratitude also goes to the entire MLCommons organization for supporting the Algorithms working group and funding the $50,000 prize pool. Special thanks are due to the members of the Algorithms Working Group who developed, implemented, and managed the benchmark competition. We particularly want to thank Priya Kasimbeg, the Engineering Lead of the working group, who led the scoring process.

About MLCommons and the Algorithms Working Group

MLCommons is the world leader in building benchmarks for AI. It is an open engineering consortium with a mission to make AI better for everyone through benchmarks and data.

The AlgoPerf: Training Algorithms benchmark was developed by the MLCommons Algorithms Working Group. Researchers from a variety of academic institutions and industry labs serve on the working group. The group’s mission is to create a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements. For additional information on the Algorithms Working Group, and details on how to become a member or contribute to the benchmarks, please visit the working group website or reach out to algorithms-chairs@mlcommons.org.

Announcing the MLCommons AlgoPerf Training Algorithms Benchmark Competition
https://mlcommons.org/2023/11/mlc-algoperf-training-algorithms-competition/
Tue, 28 Nov 2023

Accelerating neural network training through algorithmic improvements


Faster training allows researchers to build more capable machine learning (ML) models, but unlocking the most valuable capabilities requires improvements in every part of the training pipeline. The MLPerf™ Training benchmark suite has been extremely successful in encouraging innovation in neural network training systems, but more work needs to be done to encourage innovation in training algorithms. Improved training algorithms could save time and computational resources and lead to better, more accurate models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. To accelerate this work, the MLCommons® Algorithms Working Group is delighted to announce the AlgoPerf: Training algorithms competition, which is designed to measure neural network training speedups due to algorithmic improvements (e.g., better optimizers or hyperparameter tuning protocols).

The AlgoPerf: Training algorithms benchmark is a competitive, time-to-result benchmark that runs on a fixed system and compares training algorithms across multiple deep learning workloads (see the table below). In contrast to MLPerf Training, where submitters typically compete on ML training systems, AlgoPerf fixes the hardware and lower-level software environment, so submitters must compete by developing more efficient algorithms.
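Concretely, time-to-result means the wall-clock time a submission needs to bring each workload to a fixed validation target. A minimal sketch of that measurement loop is shown below; the helper callables, the evaluation cadence, and the assumption of a higher-is-better validation metric are illustrative choices, not the official benchmark harness.

```python
import math
import time


def time_to_target(train_step, evaluate, target, max_steps, eval_every=100):
    """Run training until the validation metric reaches `target`; return the
    elapsed wall-clock time, or math.inf if the target is never reached."""
    start = time.monotonic()
    for step in range(1, max_steps + 1):
        train_step()                         # one optimizer update on one batch
        if step % eval_every == 0 and evaluate() >= target:
            return time.monotonic() - start  # time-to-result for this workload
    return math.inf                          # counts as "target not reached"
```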

To ensure that the benchmark incentivizes generally useful training algorithms, submissions must perform well simultaneously across multiple workloads, including some randomized ones. The benchmark spans a wide variety of workloads across many domains so that the results are broadly applicable and relevant to more ML practitioners. The competition will determine the best general-purpose method, measured by an aggregated score across all workloads. The current workloads are listed below.

Task | Dataset | Model
Clickthrough rate prediction | Criteo 1TB | DLRMsmall
MRI reconstruction | FastMRI | U-Net
Image classification | ImageNet | ResNet-50
Image classification | ImageNet | ViT
Speech recognition | LibriSpeech | Conformer
Speech recognition | LibriSpeech | DeepSpeech
Molecular property prediction | OGBG | GNN
Translation | WMT | Transformer

To further encourage generality, competition submissions must automate—and strictly account for—any workload-specific hyperparameter tuning they perform. Submissions are allowed under two separate tuning rulesets: an external tuning ruleset meant to simulate tuning with a fixed amount of parallel resources, or a self-tuning ruleset meant to simulate tuning on a single machine.
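In practice, the two rulesets differ mainly in who supplies the hyperparameters. The sketch below illustrates that distinction; the dictionary layout and parameter names are illustrative assumptions, not the official AlgoPerf submission format.

```python
# Illustrative only: the layout below is an assumption, not the official
# AlgoPerf search-space or submission format.

# External tuning: the submission declares a search space, and the benchmark
# harness draws a limited number of hyperparameter points to tune in parallel.
EXTERNAL_TUNING_SEARCH_SPACE = {
    "learning_rate": {"distribution": "log_uniform", "min": 1e-4, "max": 1e-2},
    "weight_decay": {"distribution": "log_uniform", "min": 1e-3, "max": 1e-1},
    "warmup_fraction": {"distribution": "uniform", "min": 0.0, "max": 0.1},
}

# Self-tuning: no external search is allowed, so the submission either fixes
# its hyperparameters up front or adapts them on the fly within its time budget.
SELF_TUNING_DEFAULTS = {
    "learning_rate": 3e-4,
    "weight_decay": 1e-2,
    "warmup_fraction": 0.05,
}
```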

The Competition is Open NOW!

The AlgoPerf: Training algorithms benchmark competition opens on November 28, 2023, and is scheduled to close on March 28, 2024. To enter the competition please see the instructions on the competition website. Additionally, the accompanying technical report motivates and explains the design choices of the benchmark.

Sponsorship & Prize Money

MLCommons is offering a total prize pool of $50,000, to be awarded by a committee, for the top-performing submissions in each tuning ruleset. 

We would also like to express our gratitude to Google for their generous support in providing computational resources to score the top submissions, and resources to help score promising submissions from submitters with more limited resources.

About the MLCommons Algorithms Working Group

The MLCommons AlgoPerf: Training algorithms benchmark was developed by the MLCommons Algorithms Working Group. Researchers from a variety of academic institutions and industry labs serve on the working group. The group’s mission is to create a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements. For additional information on the Algorithms Working Group and details on how to become a member or contribute to the benchmarks, please visit the working group website or reach out to algorithms-chairs@mlcommons.org.
