The possibilities of a dual Cortex-A53 SoC with a Mali GPU

This SoC design highlights the performance that is available from an SoC built using IP within the Arm Flexible Access program. Before its inclusion in the program, the Cortex-A53 had already earned a reputation as a widely used, high-performing processor with 64-bit capabilities. In addition to performing well, the Cortex-A53 has low power consumption.

There are two Cortex-A53 processors in this SoC, which substantially increases the available performance. The combination of high performance and low power consumption makes this SoC suitable for a range of scenarios; in particular, IoT and mobile devices could benefit from a system based on this design. The next two sections explore a machine learning use case and a smart device use case.

Machine Learning at the edge

Machine Learning (ML) performs computational tasks by recognizing patterns and making inferences. An inference is the process of applying models that are built from sample data to accomplish a defined task. For example, the task could be image recognition in a frame that is received from a camera. Building the models is a process called training. ML algorithms can continue to learn after the models have been built, so they can improve over time and adapt to change.
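To make the distinction concrete, the toy sketch below hardcodes the weights that a training phase would have produced and applies them to a new input. The weights, input values, and threshold are all invented for illustration; a real model such as a neural network has the same apply-learned-parameters structure, just at a much larger scale.

```cpp
#include <array>
#include <cstddef>
#include <iostream>

int main() {
    // A trivial "model": weights and a bias that training on sample data
    // would have produced. All values here are invented for illustration.
    const std::array<float, 3> weights = {0.8f, -0.4f, 0.3f};
    const float bias = 0.1f;

    // A new input that the model has not seen before, for example features
    // extracted from a camera frame.
    const std::array<float, 3> input = {0.5f, 0.2f, 0.9f};

    // Inference: apply the trained model to the input.
    float score = bias;
    for (std::size_t i = 0; i < input.size(); ++i) {
        score += weights[i] * input[i];
    }

    // The defined task: a binary decision based on the score.
    std::cout << (score > 0.5f ? "match" : "no match") << '\n';
    return 0;
}
```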

ML is moving out of the cloud and into the devices that gather the data. This trend is called ML moving to the edge. The reasons for this trend include efficiency, speed, privacy, and security. The emergence of connected devices in new areas, such as autonomous cars, is also accelerating the process.

This SoC can support Machine Learning at the edge, meaning that the analysis is done in the same place that the data is collected. This approach is a clear alternative to sending the data to the cloud for analysis. Eliminating the round trip to the cloud helps deliver the real-time responses that an end user requires, and the solution keeps working when the cloud is unreachable.

Software support

Arm provides software platforms to complement a system that has the hardware capabilities to run a neural network (NN). The following sections briefly describe two NN software platforms that Arm provides.

Arm NN

An inference engine that bridges the gap between existing NN frameworks and the underlying Arm IP, including the Cortex-A53 and Mali-G52. Arm NN works with models trained in existing neural network frameworks such as:

  • Caffe
  • TensorFlow Lite
  • ONNX (including models exported from PyTorch)

The most recent version of Android is supported through the Android Neural Networks API (NNAPI). This API enables performance acceleration through Mali GPUs, Ethos-N NPUs, and Cortex-A CPUs.

Importantly, Arm NN abstracts the details of the underlying Arm processor IP. This abstraction allows NN frameworks to use the latest hardware features without the need to port between platforms and generations. Execution of ML algorithms is optimized and can be distributed across multiple processors, such as the two Cortex-A53 CPUs in this design.

The Arm NN SDK is supplied as open-source software and enables ML workloads on Android and Linux edge devices.
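The following is a minimal sketch of this workflow using the Arm NN C++ API: a model trained in TensorFlow Lite is parsed, optimized for a preferred list of backends, and loaded into the runtime. The model file name is a placeholder and error handling is omitted; consult the Arm NN SDK documentation for the authoritative API.

```cpp
#include <armnn/ArmNN.hpp>
#include <armnnTfLiteParser/ITfLiteParser.hpp>

#include <utility>

int main() {
    // Parse a model trained in an existing framework (here, TensorFlow Lite).
    auto parser = armnnTfLiteParser::ITfLiteParser::Create();
    armnn::INetworkPtr network =
        parser->CreateNetworkFromBinaryFile("model.tflite"); // placeholder path

    // Create the runtime that owns the backends.
    armnn::IRuntime::CreationOptions options;
    armnn::IRuntimePtr runtime = armnn::IRuntime::Create(options);

    // The backend list expresses the abstraction described above: prefer the
    // Mali GPU (GpuAcc), fall back to the Cortex-A CPUs (CpuAcc), and finally
    // to the portable reference backend (CpuRef).
    armnn::IOptimizedNetworkPtr optNet = armnn::Optimize(
        *network,
        {armnn::Compute::GpuAcc, armnn::Compute::CpuAcc, armnn::Compute::CpuRef},
        runtime->GetDeviceSpec());

    // Load the optimized network. It is now ready to execute inferences via
    // runtime->EnqueueWorkload() with bound input and output tensors.
    armnn::NetworkId networkId;
    runtime->LoadNetwork(networkId, std::move(optNet));
    return 0;
}
```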

Arm Compute Library

A convenient repository of low-level kernels that developers can use to accelerate their algorithms and applications. The functions have been implemented for the following (see the sketch after this list):

  • The Arm Cortex-A family of CPUs
  • The Arm Mali family of GPUs
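As a minimal sketch of the library's usage pattern, the example below configures and runs a single kernel, a ReLU activation, through the library's NEON (CPU) function interface; equivalent CL functions target Mali GPUs. The tensor shape is a placeholder, and the exact headers and signatures may vary between library versions.

```cpp
#include "arm_compute/core/Types.h"
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main() {
    // Describe the input and output tensors: a 224x224 single-plane
    // float image in, same shape out. The shape is a placeholder.
    Tensor src, dst;
    src.allocator()->init(TensorInfo(TensorShape(224U, 224U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(224U, 224U), 1, DataType::F32));

    // Configure a low-level function: a ReLU activation layer.
    NEActivationLayer relu;
    relu.configure(&src, &dst,
                   ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));

    // Allocate backing memory after configuration, then run on the
    // Cortex-A CPUs through their NEON units.
    src.allocator()->allocate();
    dst.allocator()->allocate();
    relu.run();
    return 0;
}
```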

Case study

To be effective, an inference must complete within a time constraint. These constraints mean that the performance of a system determines which kinds of inference it can complete on time. For example, keyword detection is computationally cheaper than voice or image recognition, and autonomous driving is more demanding still.

In terms of machine learning, the inclusion of a six-core Mali-G52 in the SoC gives an advantage: the system is potentially capable of image recognition. Imagine a camera on a door that grants access when it recognizes the face of a person. For a workable solution, the response must feel instant as soon as a person is detected. Suppose the system must make five inferences a second. This figure is the inference rate, also referred to as the frame rate, and it means that each inference must complete within 200ms. For face unlocking using SSD-MobileNet v1, we estimate that each inference would take about 20ms. This figure leaves ample headroom: the system could complete up to 50 inferences a second, or use the spare time in each frame to run other workloads.
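The arithmetic behind these figures is simple enough to check directly. The sketch below uses the five-inferences-per-second requirement and the estimated 20ms latency quoted above; the numbers are the text's estimates, not measurements.

```cpp
#include <iostream>

int main() {
    const double required_rate    = 5.0;   // required inferences per second
    const double budget_ms        = 1000.0 / required_rate;      // 200 ms
    const double est_inference_ms = 20.0;  // estimated SSD-MobileNet v1 latency

    std::cout << "Budget per inference: " << budget_ms << " ms\n";
    std::cout << "Maximum inference rate: " << 1000.0 / est_inference_ms
              << " per second\n";
    std::cout << "Headroom per frame: " << budget_ms - est_inference_ms
              << " ms for other workloads\n";
    return 0;
}
```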

Smart devices

A high-performance yet energy-efficient SoC is ideal for powering a sophisticated smart device. Ultimately, a smart device must run an operating system, and this SoC can support that requirement.

The inclusion of a six-core Mali-G52 GPU allows the SoC to bring premium visual experiences to the end user.

You could use this SoC design for any system that requires high-end graphics capabilities, including high-end IoT devices. For example, you could use this SoC for:

  • A smartphone
  • A fridge with a touch-screen interface
  • A printer with a touch-screen interface