Arm Object Detection Processor 

The most efficient way to detect people and objects on mobile and embedded platforms.

The second generation of the Arm Object Detection processor enables sophisticated object detection in a new generation of smart cameras and other vision-based devices.

With a fixed-function, highly tuned computer vision pipeline, the Object Detection processor continuously scans every frame and provides a list of detected objects, along with their location within the scene.

Developed by Arm’s machine learning experts, the processor’s people model detects not only whole human forms but also faces, heads and shoulders, and can even determine the direction each person is facing. Rich and detailed metadata allow even more information to be extracted from each frame.

Key Features

  • Detects objects in real time at Full HD, 60fps.
  • Object sizes from 50x60 pixels to full screen.
  • Virtually unlimited objects detected per frame.
  • Detailed people model provides rich metadata and allows detection of direction, trajectory, pose and gesture.
  • Advanced software running on the accompanying application processor allows higher-level behaviour to be determined, including sophisticated inter-frame tracking.
  • Additional software libraries enable higher-level, on-device features, such as face recognition.

Find out more

To find out more about the Arm Object Detection processor, contact us.

To learn more about Machine Learning on Arm, visit our ML Developer community.

Key Benefits

  • Cutting-edge people detection running on mobile or embedded cameras.
  • As a pre-processor detecting regions of interest, the Object Detection processor can be combined with Arm Cortex CPUs, Arm Mali GPUs or the Arm Machine Learning processor for additional local processing, significantly reducing the overall compute requirement.
  • Enables cloud-connected cameras to limit up-streaming to when people are detected, significantly reducing bandwidth and cloud storage.
  • The Object Detection processor's data streams amount to a few kilobytes, reducing bandwidth to the cloud. A server can therefore aggregate several thousand such streams, compared with a few hundred video streams, which provides significant economies of scale.
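
The two ideas above, uploading only when people are detected and passing only regions of interest to a downstream processor, can be sketched in a few lines of Python. This is a minimal illustration, not Arm's API: the Detection record, its field names, and both helper functions are hypothetical, since the actual metadata layout is not described here.

```python
from dataclasses import dataclass
from typing import Iterable

# Full HD frame size, taken from the input resolution quoted on this page.
FRAME_W, FRAME_H = 1920, 1080


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record (assumed fields, not the real metadata format)."""
    label: str  # e.g. "person" or "face"
    x: int      # left edge of the bounding box, in pixels
    y: int      # top edge of the bounding box, in pixels
    w: int      # box width, in pixels
    h: int      # box height, in pixels


def should_upload(detections: Iterable[Detection]) -> bool:
    """Gate up-streaming: upload only if at least one person was detected."""
    return any(d.label == "person" for d in detections)


def roi_pixel_fraction(detections: Iterable[Detection]) -> float:
    """Rough share of the frame a downstream processor would need to examine.

    Overlapping boxes are counted more than once, so this is an upper
    bound on the true coverage, capped at 1.0 (the whole frame).
    """
    roi_pixels = sum(d.w * d.h for d in detections)
    return min(1.0, roi_pixels / (FRAME_W * FRAME_H))


if __name__ == "__main__":
    frame = [Detection("person", 100, 100, 50, 60)]
    print(should_upload(frame))
    print(f"{roi_pixel_fraction(frame):.4%} of the frame is of interest")
```

In this sketch a single person at the minimum detectable size covers well under one percent of a Full HD frame, which is the intuition behind running heavier processing only on the regions of interest.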



Smart camera

Small area


  • Input resolution: up to Full HD (1920x1080) at 60fps (no dropped frames).
  • Camera input: either raw from the camera or from an ISP.
  • Minimum object size detectable: 50x60 pixels.
  • Maximum object size detectable: full frame.
  • Maximum number of objects per frame: virtually unlimited.
  • Latency: four frames.
  • Software included: drivers, people model, object tracking library and sample applications.

Webinar - Project Trillium: Optimizing ML Performance for any Application

Project Trillium is a suite of Arm IP designed to deliver scalable ML and neural network functionality at any point on the performance curve, from sensors, to mobile, and beyond. 

