Analyzing memory bandwidth

Minimizing GPU memory bandwidth is always a good optimization objective because memory accesses to external DRAM are very power intensive. A good rule of thumb is 100 mW per GB/s of bandwidth used.
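To make the rule of thumb concrete, the following minimal sketch converts per-frame memory traffic into an approximate DRAM power cost. The function name and the example workload (50 MB per frame at 60 frames per second) are hypothetical, and the sketch assumes 1 GB means 10^9 bytes. The real cost depends on the device, so treat the result as a rough estimate only.

```python
# Rule-of-thumb figure from the text: about 100 mW per GB/s of bandwidth.
MW_PER_GBPS = 100.0


def estimate_dram_power_mw(bytes_per_frame: float, frames_per_second: float) -> float:
    """Estimate external DRAM power in milliwatts from per-frame traffic.

    Assumes 1 GB = 1e9 bytes. This is an approximation, not a measurement.
    """
    gigabytes_per_second = bytes_per_frame * frames_per_second / 1e9
    return gigabytes_per_second * MW_PER_GBPS


# Hypothetical workload: 50 MB of traffic per frame at 60 FPS is 3 GB/s,
# which the rule of thumb puts at roughly 300 mW.
print(estimate_dram_power_mw(50e6, 60))
```

Working in bytes per frame is a convenient choice because it lets you compare the cost of content changes independently of the frame rate.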

Memory rule of thumb

Mali memory bandwidth

The Mali Memory Bandwidth chart shows the amount of memory traffic between the GPU and the downstream memory system. Depending on the device, these accesses may go directly to external DRAM, or may be sent through additional levels of system cache outside of the GPU.

Mali memory bandwidth chart in Streamline

Mali memory stall rate

The Mali Memory Stall Rate chart shows the memory stall rate seen by the GPU when attempting to make accesses to the downstream memory system.

Mali memory stall rate chart in Streamline

Some stalls are expected, particularly on high-end devices that have many shader cores all trying to access memory in parallel, because the GPU can issue reads and writes faster than the memory system can service them. A stall rate that stays at around 30% or higher indicates content that needs optimizing.
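The 30% guideline can be checked automatically if you have the stall rate as a series of samples. The sketch below assumes a hypothetical list of per-sample stall percentages. It is not a Streamline API, and the `min_fraction` parameter, which decides how many samples must exceed the threshold before the rate counts as sustained, is an assumption chosen for illustration.

```python
def stall_rate_needs_attention(stall_percentages, threshold=30.0, min_fraction=0.9):
    """Return True if the stall rate is at or above the threshold for most samples.

    stall_percentages: hypothetical per-sample stall rates, in percent.
    threshold: the roughly 30% guideline from the text.
    min_fraction: assumed share of samples that must reach the threshold.
    """
    if not stall_percentages:
        return False
    above = sum(1 for rate in stall_percentages if rate >= threshold)
    return above / len(stall_percentages) >= min_fraction


# A consistently high stall rate is flagged; a single spike is not.
print(stall_rate_needs_attention([35, 40, 32, 31, 38]))
print(stall_rate_needs_attention([5, 45, 8, 10, 6]))
```

Requiring most samples to reach the threshold, rather than looking at the peak, reflects the point that occasional stalls are expected and only a sustained rate is a concern.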

Mali memory read latency

The Mali Memory Read Latency chart shows the read latency seen by the GPU when it accesses the memory system. The chart splits the data so that you can see how many data beats were returned more than a certain number of cycles after the transaction started. A high proportion of reads with a latency over 256 cycles could indicate that your content is requesting more data than the memory system in the device can provide.
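One way to summarize the latency split is to compute the share of data beats that fall into the slowest buckets. The sketch below assumes a hypothetical dictionary that maps the lower bound of each latency bucket, in cycles, to the number of beats in that bucket. The bucket boundaries and counts are illustrative and may not match the series shown for your device.

```python
def fraction_of_beats_over(beats_by_min_cycles, limit_cycles=256):
    """Return the fraction of read beats in buckets starting at limit_cycles or later.

    beats_by_min_cycles: hypothetical mapping of bucket lower bound (cycles)
    to the number of data beats counted in that bucket.
    """
    total = sum(beats_by_min_cycles.values())
    if total == 0:
        return 0.0
    slow = sum(
        count
        for min_cycles, count in beats_by_min_cycles.items()
        if min_cycles >= limit_cycles
    )
    return slow / total


# Illustrative sample: 100 of 1000 beats arrive in buckets of 256 cycles or more.
sample = {0: 700, 128: 150, 192: 50, 256: 60, 320: 30, 384: 10}
print(fraction_of_beats_over(sample))
```

Tracking this fraction over time makes it easier to see whether a content change has reduced the share of slow reads.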

Mali memory read latency chart in Streamline

Memory read latency is a property of the device, so the only way to reduce it is to reduce the amount of data being requested. 

