Run an LLM (Llama 3.1 and 3.2) chatbot with PyTorch using KleidiAI on Arm servers
Learn how to run Llama 3.1 and Llama 3.2 with PyTorch, accelerated by KleidiAI, on an Arm-based AWS EC2 instance. Use Torchchat and Streamlit to run the model through a web interface. Follow along with the Learning Path at https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/

0:00 Intro
1:00 Request access to the Meta Llama models
1:48 Create the Arm-based AWS EC2 instance
3:40 Update and install the required software
5:45 Download and quantize the Llama 3.1 model
6:14 Test that the model is working
6:40 Run the Torchchat backend and Streamlit frontend
8:25 Modify the Learning Path to run Llama 3.2
13:40 Outro