MAI 2025 Workshop

LIVE

Join the main workshop Zoom conference for a Q&A session: https://ethz.zoom.us/j/61668005465

SCHEDULE

Deploying Deep Learning Models on Mobile NPUs and Beyond: What's New in 2025?

07:00 Pacific Time ┈ Andrey Ignatov ┈ AI Benchmark Project Lead, ETH Zurich

Abstract: In this tutorial, we will first recall all basic concepts, steps and optimizations required for efficient AI inference on mobile NPUs. Next, we will go into more detail about the latest mobile platforms from Qualcomm, MediaTek, Google, Samsung, Unisoc and Apple released during the past year, and will compare their inference speed when running common computer vision models. We will talk about power efficiency of mobile NPUs, and will analyze their energy consumption during typical AI workloads. Finally, we will go beyond Android and iOS, covering the topics of AI model deployment on NPUs in Windows, Linux and MacOS systems, analyzing available ML frameworks and their performance.

Recent Mobile AI Advances and Case Studies

09:00 Pacific Time ┈ CM Cheng, YuSyuan Xu and Haoyun Chen ┈ MediaTek Inc.

Biography: Dr. Chia-Ming Cheng is a Senior Manager at MediaTek’s Computing and Artificial Intelligence Technology Group. He is currently responsible for the technical planning and management of AI technologies across a range of products, including smartphones, AR/VR devices, smart IoT solutions, and smart TVs. Dr. Cheng also leads initiatives in advanced AI technology through industry-academia collaborations and ecosystem partnerships. He has extensive experience in the research, design, and mass production of computational photography technologies for smartphones. Dr. Cheng holds both Master and Ph.D. degree in Computer Science from National Tsing-Hua University. His research interests primarily focus on computer vision, machine learning, computational photography, 3D modeling and rendering.

Yu-Syuan Xu is an Senior AI engineer at MediaTek. His research mainly focuses on deep learning and its applications, specifically on developing and deploying the latest and trendiest AI models on edge devices equipped with MediaTek APU. He is responsible for co-designing AI accelerators from an algorithmic point of view. He received both his B.S. degree and M.S. degree in computer science from National Tsing Hua University in 2017 and 2018, respectively, under the guidance of Prof. Chun-Yi Lee. You can learn more about his work on his website at https://xusean0118.github.io/ and on his Google Scholar profile here.

Hao-Yun Chen is an Senior AI engineer at MediaTek. His research mainly focuses on deep learning and its applications, specifically on developing and deploying the latest and trendiest LLM models on edge devices equipped with MediaTek APU. He is responsible for developing LLM long context technique to be compatible with Mediatek platform, with performance and quality optimization. He received both his B.S. degree and M.S. degree in computer science from National Tsing Hua University in 2018 and 2020, respectively, under the guidance of Prof. Shih-Chieh Chang. You can learn more about his work on on his Google Scholar profile here.

Abstract: In this talk, MediaTek will provide you with an overview of their software and hardware platforms and several mobile-oriented research topics. The first part of the talk would be devoted to the discussion of the recent mobile AI advances and MediaTek AI ecosystem. The second part will focus on two recent case studies: Vision Mamba and MOE, which are related to efficient model deployment of mobile AI hardware.

Quantized Image Super-Resolution on Mobile NPUs: Results and Top Solutions

09:40 Pacific Time ┈ Andrey Ignatov, ETH Zurich

Evaluation Platform:
Google Tensor NPU

09:50 Pacific Time RepNet-VSR: Reparameterizable Architecture for High-Fidelity Video Super-Resolution

Biao Wu, Diankai Zhang, Shaoli Liu, Si Gao, Chengjian Zheng, Ning Wang

☉ ZTE Corporation

10:05 Pacific Time CDVS: Compressed Domain On Device Memory Efficient 8K Video SlowMo

Jing Li, Chengyu Wang, Hamid Sheikh, SeokJun Lee

☉ Samsung Research America & Purdue University

Learned Smartphone ISP on Mobile GPUs Challenge: Results and Top Solutions

10:20 Pacific Time ┈ Andrey Ignatov, ETH Zurich

Evaluation Platform:
Snapdragon 8 Elite (Adreno GPU) / Dimensity 9400 (Mali GPU)

10:30 Pacific Time Learned Lightweight Smartphone ISP with Unpaired Data

Andrei Arhire, Radu Timofte

☉ Alexandru Ioan Cuza University

10:45 Pacific Time Compressed Domain Multiframe Processing

Chengyu Wang, Jing Li, Saurabh Kumar, Seok-Jun Lee, Hamid Sheikh

☉ Samsung Research America & SRI-Bangalore

11:00 Pacific Time Break & Lunch

RGB Photo Enhancement on Mobile GPUs Challenge: Results and Top Solutions

12:00 Pacific Time ┈ Andrey Ignatov, ETH Zurich

Evaluation Platform:
Snapdragon 8 Elite (Adreno GPU) / Dimensity 9400 (Mali GPU)

12:15 Pacific Time PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers

Syed Shakib Sarwar, Mostafa Elhoushi, Maximilian Augusting, Yuecheng Li, Sai Zhang, Barbara De Salvo

☉ Meta Inc. & IEEE & University of Tubingen & New York University

12:30 Pacific Time ActNAS: Generating Efficient YOLO Models using Activation NAS

Sudhakar Sah, Ravish Kumar, Darshan Ganji, Ehsan Saboori

☉ Deeplite

12:45 Pacific Time FLAR-SVD: Fast and Latency-Aware Singular Value Decomposition for Model Compression

Moritz Thoma, Jorge Villasante, Emad Aghajanzadeh, Shambhavi Sampath, Pierpaolo Mori et al.

☉ BMW Group & Technical University of Munich

13:15 Pacific Time Robust 6DoF Pose Estimation Against Depth Noise and Evaluation on a Mobile Dataset

Zixun Huang, Keling Yao, Zhihao Zhao, Chuanyu Pan, Allen Yang

☉ UC Berkeley & Carnegie Mellon University & University of California

13:30 Pacific Time Cycle Training with Semi-Supervised Domain Adaptation for Real-Time Mobile Scene Detection

Huu-Phong Phan-Nguyen, Anh Dao, Tien-Huy Nguyen, Tuan Quang et al.

☉ University of Infomation Technology & Michigan State University & LPL Financial Corp et al.

13:45 Pacific Time RepFC: Universal Structural Reparametrization Block for High Performance

Shambhavi Balamuthu Sampath, Judeson Anthony Fernando, Moritz Thoma, Nael Fasfous et al.

☉ BMW Group & Technical University of Munich

14:00 Pacific Time Discussion & Wrap Up

CHALLENGES: ONGOING

4K Image Super-Resolution

Evaluation Platform:
Snapdragon 8 Elite NPU

4K Image Super-Resolution

Evaluation Platform:
Dimensity 9400 NPU

Video Super-Resolution

Evaluation Platform:
Arm Mali / Adreno GPU

Efficient LLMs

Evaluation Platform:
Raspberry Pi 8GB

Efficient Stable Diffusion

Evaluation Platform:
Apple M4 Neural Engine

Image Denoising

Evaluation Platform:
Arm Mali / Adreno GPU

Bokeh Effect Rendering

Evaluation Platform:
Arm Mali / Adreno GPU

RGB Photo Enhancement

Evaluation Platform:
Arm Mali / Adreno GPU

Learned Smartphone ISP

Evaluation Platform:
Arm Mali / Adreno GPU

MAI 2025 CHALLENGE REPORTS

Image Super-Resolution

Evaluation Platform:
Google Tensor TPU

RGB Photo Enhancement

Evaluation Platform:
Arm Mali / Adreno GPU

Learned Smartphone ISP

Evaluation Platform:
Arm Mali / Adreno GPU

MAI CHALLENGE WINNERS @ CVPR 2025

Image Super-Resolution

Learned Smartphone ISP

RGB Image Enhancement

Camera Scene Detection

Bokeh Effect Rendering

RGB Image Denoising

Video Super-Resolution

PREVIOUS CHALLENGES (2022)

Video Super-Resolution

Evaluation Platform:
MediaTek Dimensity APU

Image Super-Resolution

Evaluation Platform:
Synaptics Dolphin NPU

Learned Smartphone ISP

Evaluation Platform:
Snapdragon Adreno GPU

Bokeh Effect Rendering

Evaluation Platform:
Arm Mali GPU

Depth Estimation

Evaluation Platform:
Raspberry Pi 4

PREVIOUS CHALLENGES (2021)

Learned Smartphone ISP

Evaluation Platform:
MediaTek Dimensity APU

Image Denoising

Evaluation Platform:
Exynos Mali GPU

Image Super-Resolution

Evaluation Platform:
Synaptics Dolphin NPU

Video Super-Resolution

Evaluation Platform:
Snapdragon Adreno GPU

Depth Estimation

Evaluation Platform:
Raspberry Pi 4

Camera Scene Detection

Evaluation Platform:
Apple Bionic

CALL FOR PAPERS

Being a part of CVPR 2025, we invite the authors to submit high-quality original papers proposing various machine learning based solutions for mobile, embedded and IoT platforms. The topics of interest cover all major aspects of AI and deep learning research for mobile devices including, but not limited to:


• Efficient deep learning models for mobile devices	• Image / video super-resolution on low-power hardware
• General smartphone photo and video enhancement	• Deep learning applications for mobile camera ISPs
• Fast image classification / object detection algorithms	• Real-time semantic image segmentation
• Image or sensor based identity recognition	• Activity recognition using smartphone sensors
• Depth estimation w/o multiple cameras	• Portrait segmentation / bokeh effect rendering
• Perceptual image manipulation on mobile devices	• NLP models optimized for mobile inference
• Artifacts removal from mobile photos / videos	• RAW image and video processing
• Low-power machine learning inference	• Machine and deep learning frameworks for mobile devices
• AI performance evaluation of mobile and IoT hardware	• Industry-driven applications related to the above problems

To ensure high quality of the accepted papers, all submissions will be evaluated by research and industry experts from the corresponding fields. All accepted workshop papers will be published in the CVPR 2025 Workshop Proceedings by Computer Vision Foundation Open Access and IEEE Xplore Digital Library. The authors of the best selected papers will be invited to present their work during the actual workshop event at CVPR 2025.

The detailed submission instructions and guidelines can be found here.

SUBMISSION DETAILS @ CVPR

For AIM challenge papers submission @ ICCV, please refer to these instructions.

Format and paper length	A paper submission has to be in English, in pdf format, and at most 8 pages (excluding references) in double column. The paper format must follow the same guidelines as for all CVPR 2025 submissions: https://cvpr.thecvf.com/Conferences/2025/AuthorGuidelines
Author kit	The author kit provides a LaTeX2e template for paper submissions. Please refer to this kit for detailed formatting instructions: https://github.com/cvpr-org/author-kit/archive/refs/tags/CVPR2025-v3.1(latex).zip
Double-blind review policy	The review process is double blind. Authors do not know the names of the chair / reviewers of their papers. Reviewers do not know the names of the authors.
Dual submission policy	Dual submission is allowed with CVPR2025 main conference only. If a paper is submitted also to CVPR and accepted, the paper cannot be published both at the CVPR and the workshop.
Proceedings	Accepted and presented papers will be published after the conference in CVPR Workshops proceedings together with the CVPR 2025 main conference papers.
Submission site	https://cmt3.research.microsoft.com/MAI2025

TIMELINE @ ICCV

Workshop Event	Date [ 5pm Pacific Time, 2025 ]
Website online	January 25
Paper submission server online	February 13
Paper submission deadline [challenge papers]	July 9
Paper decision notification	July 11
Camera ready deadline	August
Workshop day	October (TBA)

TIMELINE @ CVPR

Workshop Event	Date [ 5pm Pacific Time, 2025 ]
Website online	January 25
Paper submission server online	February 13
Paper submission deadline [early submission]	March 10
Paper decision notification [early submission]	March 31
Paper submission deadline [late submission & challenge papers]	March 28
Paper decision notification [late submission]	March 31
Camera ready deadline	April 7
Workshop day	June (TBA)

DEEP LEARNING ON MOBILE DEVICES: TUTORIAL

Have some questions? Leave them on the AI Benchmark Forum

RUNTIME VALIDATION

In each MAI 2025 challenge track, the participants have a possibility to check the runtime of their solutions remotely on the target platforms. For this, the converted TensorFlow Lite models should be uploaded to a special web-server, and their runtime on the actual target devices will be returned instantaneously or withing 24 hours, depending on the track. The detailed model conversion instructions and links can be found in the corresponding challenges.

Besides that, we strongly encourage the participants to check the speed and RAM consumption of the obtained models locally on your own Android devices. This will allow you to perform model profiling and debugging faster and much more efficiently. To do this, one can use AI Benchmark application allowing you to load a custom TFLite model and run it with various acceleration options, including CPU, GPU, DSP and NPU:

1. Download AI Benchmark from the Google Play / website and run its standard tests.
2. After the end of the tests, enter the PRO Mode and select the Custom Model tab there.
3. Rename the exported TFLite model to model.tflite and put it into the Download folder of your device.
4. Select your mode type, the desired acceleration / inference options and run the model.

You can find the screenshots demonstrating these 4 steps below: