On-device machine learning (ML) enables low latency, better power efficiency, stronger security, and new use cases for the end user. There are currently several ways to run inference on mobile, and many developers wonder which one they should use.

The Android Neural Networks API (NNAPI) is designed by Google for running computationally intensive operations for ML on Android mobile devices. It provides a single set of APIs to benefit from available hardware accelerators, including GPUs, DSPs, and NPUs.

In this talk, we will share our experience running PyTorch Mobile with NNAPI on various mobile devices. We hope this will give developers a sense of how ML models are executed on mobile devices through PyTorch with NNAPI.
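As a rough illustration of the workflow the talk covers, the sketch below converts a traced PyTorch model for NNAPI execution on the host, following the prototype `torch.backends._nnapi.prepare` API. The tiny convolutional model and the output filename are placeholders, not anything from the talk, and the API is a prototype that may change between PyTorch releases.

```python
import torch
import torch.nn as nn
from torch.backends._nnapi.prepare import convert_model_to_nnapi

# Hypothetical small CNN standing in for a real mobile model.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
).eval()

# NNAPI prefers NHWC layout; use channels_last and flag the tensor
# so the converter knows the input will arrive in NHWC order.
x = torch.zeros(1, 3, 224, 224).contiguous(memory_format=torch.channels_last)
x.nnapi_nhwc = True

# Trace the model, then convert the traced graph to NNAPI calls.
traced = torch.jit.trace(model, x)
nnapi_model = convert_model_to_nnapi(traced, x)

# Save for loading with the PyTorch Android lite interpreter.
nnapi_model._save_for_lite_interpreter("model_nnapi.pt")
```

The saved artifact would then be bundled into an Android app and loaded with `LiteModuleLoader.load(...)`, at which point the NNAPI runtime dispatches supported operations to whatever accelerator (GPU, DSP, or NPU) the device driver exposes.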