Build an Android Chat App with Llama, Arm KleidiAI, ExecuTorch, and XNNPACK
In this code-along, we’ll build an AI chat application for Android by optimizing a large language model (LLM) and deploying it to run locally on a mobile device. We’ll use ExecuTorch (a PyTorch framework for running AI models on edge devices), XNNPACK (a floating-point neural network library optimized for Arm), KleidiAI (Arm-optimized kernels for neural network operations), and the Llama 3.2 1B Instruct model.
You will learn:
- How to set up an ExecuTorch development environment
- How KleidiAI kernels increase neural network performance
- How quantizing LLMs boosts inference speed
- How to build and deploy an Android application with a local LLM and inference framework
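To give a feel for the quantization topic in the list above, here is a minimal sketch in plain Python of symmetric group-wise quantization. It is only an illustration: the 4-bit width, the group size, and all function names (`quantize_group`, `quantize_groupwise`, and so on) are assumptions made for this example, not the scheme or API used in the session or by ExecuTorch.

```python
def quantize_group(weights, bits=4):
    """Symmetrically quantize one group of floats to signed integers.

    Returns the integer codes and the scale needed to reconstruct them.
    """
    qmax = 2 ** (bits - 1) - 1        # largest code, 7 for 4 bits
    qmin = -(2 ** (bits - 1))         # smallest code, -8 for 4 bits
    max_abs = max(abs(w) for w in weights)
    # An all-zero group gets an arbitrary scale of 1.0 (hypothetical choice).
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(qmin, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def quantize_groupwise(weights, group_size=32, bits=4):
    """Split a flat weight list into groups, each with its own scale."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize_groupwise(groups):
    """Reassemble approximate weights from (codes, scale) pairs."""
    restored = []
    for codes, scale in groups:
        restored.extend(dequantize_group(codes, scale))
    return restored
```

Each group stores small integer codes plus one floating-point scale, so the weights take far less memory than 32-bit floats, at the cost of a rounding error of at most half a scale step per weight. Smaller groups track the local weight range more closely but store more scales.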
Watch the on-demand session below, or start building with the Build an Android Chat App learning path and follow the same workflow at your own pace.
Host
Michael Hall
Principal SW Engineer – Developer Evangelist
@mhall119 on Discord
Michael is a technology advocate and AI innovator at Arm, dedicated to empowering developers and advancing open ecosystems. He leads efforts to make machine learning tools and frameworks more accessible and efficient across platforms.
