AI Benchmark for Windows, Linux and macOS: Let the AI Games Begin...

Although Machine Learning is already a mature field, it has long lacked a professional, accurate and lightweight tool for measuring the AI performance of the hardware used for training and inference with ML algorithms. Today we take a step towards standardizing the benchmarking of AI-related silicon and present a new standard for all-round performance evaluation of hardware platforms capable of running machine learning and deep learning models.

Measuring AI Performance of Desktop CPUs and GPUs

AI Benchmark Alpha is an open-source Python library for evaluating the AI performance of various hardware platforms, including CPUs, GPUs and TPUs. The benchmark relies on the TensorFlow machine learning library and provides a precise and lightweight solution for assessing inference and training speed for key Deep Learning models. AI Benchmark is currently distributed as a Python pip package and can be installed on any system running Windows, Linux or macOS.

In total, AI Benchmark consists of 42 tests grouped into the 19 sections listed below:

● Section 1: MobileNet-V2, Classification, [paper]
● Section 2: Inception-V3, Classification, [paper]
● Section 3: Inception-V4, Classification, [paper]
● Section 4: Inception-ResNet-V2, Classification, [paper]
● Section 5: ResNet-V2-50, Classification, [paper]
● Section 6: ResNet-V2-152, Classification, [paper]
● Section 7: VGG-16, Classification, [paper]
● Section 8: SRCNN 9-5-5, Image-to-Image Mapping, [paper]
● Section 9: VGG-19, Image-to-Image Mapping, [paper]
● Section 10: ResNet-SRGAN, Image-to-Image Mapping, [paper]
● Section 11: ResNet-DPED, Image-to-Image Mapping, [paper]
● Section 12: U-Net, Image-to-Image Mapping, [paper]
● Section 13: Nvidia-SPADE, Image-to-Image Mapping, [paper]
● Section 14: ICNet, Image Segmentation, [paper]
● Section 15: PSPNet, Image Segmentation, [paper]
● Section 16: DeepLab, Image Segmentation, [paper]
● Section 17: Pixel-RNN, Image Inpainting, [paper]
● Section 18: LSTM, Sentence Sentiment Analysis, [paper]
● Section 19: GNMT, Text Translation, [paper]

The tests cover all major Deep Learning tasks and architectures, and are therefore useful for researchers, developers, hardware vendors and end-users running AI applications on their devices. Additional information about the setup of each test (input and batch sizes, test modes) can be found on the ranking page.
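
The section list above can also be captured as a small lookup table, which may be handy when grouping per-section results by task. The following is a minimal sketch in Python; the SECTIONS table and the sections_by_task helper are illustrative names and are not part of the ai-benchmark package.

```python
from collections import defaultdict

# (section number, model, task), transcribed from the list above.
SECTIONS = [
    (1, "MobileNet-V2", "Classification"),
    (2, "Inception-V3", "Classification"),
    (3, "Inception-V4", "Classification"),
    (4, "Inception-ResNet-V2", "Classification"),
    (5, "ResNet-V2-50", "Classification"),
    (6, "ResNet-V2-152", "Classification"),
    (7, "VGG-16", "Classification"),
    (8, "SRCNN 9-5-5", "Image-to-Image Mapping"),
    (9, "VGG-19", "Image-to-Image Mapping"),
    (10, "ResNet-SRGAN", "Image-to-Image Mapping"),
    (11, "ResNet-DPED", "Image-to-Image Mapping"),
    (12, "U-Net", "Image-to-Image Mapping"),
    (13, "Nvidia-SPADE", "Image-to-Image Mapping"),
    (14, "ICNet", "Image Segmentation"),
    (15, "PSPNet", "Image Segmentation"),
    (16, "DeepLab", "Image Segmentation"),
    (17, "Pixel-RNN", "Image Inpainting"),
    (18, "LSTM", "Sentence Sentiment Analysis"),
    (19, "GNMT", "Text Translation"),
]


def sections_by_task(sections=SECTIONS):
    """Group model names by task, preserving the section order."""
    grouped = defaultdict(list)
    for _, model, task in sections:
        grouped[task].append(model)
    return dict(grouped)


if __name__ == "__main__":
    for task, models in sections_by_task().items():
        print(f"{task} ({len(models)}): {', '.join(models)}")
```

Grouping by task keeps related architectures together, so a slowdown that affects, say, only the segmentation models is easier to spot.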

Installation Instructions - Short Guide

For those who are familiar with Deep Learning and TensorFlow:

1. Install Python and TensorFlow.
2. Install AI Benchmark with pip: pip install ai-benchmark
3. Use the following Python code to run the benchmark:

from ai_benchmark import AIBenchmark
results = AIBenchmark().run()

Or, on Linux systems, you can simply type ai-benchmark in the command line to start the tests.

That's it, time to see the results! More information about library settings can be found here.
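
As an illustration of the library settings mentioned above, here is a hedged sketch of a helper that runs either the full benchmark or only one half of it. It assumes that, besides run(), the AIBenchmark object provides run_inference() and run_training() methods; please check this against the settings page for your installed version. The run_selected helper and its mode names are hypothetical and exist only for this example.

```python
def run_selected(benchmark, mode="all"):
    """Run the full benchmark, or only its inference or training tests.

    `benchmark` is expected to be an AIBenchmark instance. The method names
    run_inference() and run_training() are assumptions about the library;
    run() is the call shown in the guide above.
    """
    methods = {
        "all": "run",
        "inference": "run_inference",
        "training": "run_training",
    }
    if mode not in methods:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(methods)}")
    # Look the method up by name so a missing one fails with a clear AttributeError.
    return getattr(benchmark, methods[mode])()


# Example usage (requires TensorFlow and ai-benchmark to be installed):
#   from ai_benchmark import AIBenchmark
#   results = run_selected(AIBenchmark(), mode="inference")
```

Running only the inference tests is typically the quicker option when you just want a rough comparison between two machines.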

Installation Instructions - Detailed Guide

1. Download and install Python from python.org.
If you are running Windows, add Python to the Windows Path using these instructions.
2. Install the TensorFlow machine learning library:
2.1. If you do not have an Nvidia or AMD GPU, run pip install tensorflow from the command line.
2.2. If you want to check the performance of Nvidia graphics cards:
2.2.1. Download and install CUDA from the Nvidia website.
2.2.2. Download and install cuDNN.
2.2.3. Run pip install tensorflow-gpu from the command line.
2.3. If you want to check the performance of AMD graphics cards: follow these instructions.
3. Run pip install ai-benchmark from the command line.
4. Type python in the command line and run the following commands in the console that opens:

from ai_benchmark import AIBenchmark
results = AIBenchmark().run()

Or, on Linux systems, you can simply type ai-benchmark in the command line to start the tests.
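
Before launching the tests, it can be useful to confirm that the packages installed in steps 2 and 3 are visible to the Python interpreter you are using. The following is a minimal sketch that relies only on the standard library; the check_install helper is a hypothetical name, and tensorflow and ai_benchmark are the import names assumed for the two packages.

```python
import importlib.util
import sys


def check_install(modules=("tensorflow", "ai_benchmark")):
    """Return {module name: True/False} depending on whether it can be found.

    find_spec() only locates a top-level module; it does not import it,
    so this check is fast even when TensorFlow is installed.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print(f"Python {sys.version.split()[0]}")
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If a package is reported as missing even though pip installed it, the pip command and the python command most likely belong to different Python installations.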

Public Ranking

On any system that already has the TensorFlow framework, installing and running the benchmark takes just a couple of minutes, which makes it easy to assess the performance of various hardware configurations and software builds. By introducing a global ranking, we pursue the following goals:

1. Establishing an open standard for measuring the performance of AI-related hardware
2. Showing the relative speed of various hardware used for training / inference with Deep Learning
3. Showing the speed of training / inference for all major AI models on different hardware platforms
4. Comparing the performance of different drivers / configs, software platforms and framework builds
5. Connecting the dots between mobile and conventional Deep Learning

The current preliminary ranking is available here. The ranking will be significantly updated in the coming weeks.

Contacts

The next AI Benchmark release is now in development, and we welcome any feedback on the current version as well as suggestions for improving it. For proposals or additional information, please contact andrey@vision.ee.ethz.ch.

28 June 2019 Andrey Ignatov | AI Benchmark

ETH Zurich, Switzerland