The performance reports generated by Mali Offline Compiler are based only on the shader source code visible to the compiler. They are not aware of the actual uniform values or texture sampler configuration for any specific draw call, or any data-centric effects such as cache miss overheads.
The texture format and the filtering type used can both impact texture unit performance. Trilinear filtering (GL_LINEAR_MIPMAP_LINEAR) takes twice as long as bilinear filtering (GL_LINEAR_MIPMAP_NEAREST), and the cost of anisotropic filtering scales with both the probe type and the number of anisotropic sample probes made. Mali Offline Compiler assumes simple bilinear filtering for all samples, which is the fastest type supported by the hardware. If you know a draw call is using trilinear filtering for texture samples, you should double the cycle cost of the texture accesses reported in the performance report.
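The trilinear adjustment can be sketched as a small calculation. This is a minimal illustration, not part of Mali Offline Compiler: the function name `adjust_texture_cycles` and the `trilinear_fraction` parameter are hypothetical, and scaling by the fraction of trilinear samples is an assumption that extends the "double the cost" rule to draw calls that mix filter types.

```python
def adjust_texture_cycles(reported_cycles: float,
                          trilinear_fraction: float = 1.0) -> float:
    """Estimate texture cycles once trilinear filtering is accounted for.

    reported_cycles: texture cycle cost taken from the performance report,
        which assumes bilinear filtering for every sample.
    trilinear_fraction: assumed share of texture samples that use trilinear
        filtering (hypothetical parameter; 1.0 means all of them).
    """
    if not 0.0 <= trilinear_fraction <= 1.0:
        raise ValueError("trilinear_fraction must be between 0.0 and 1.0")
    # A trilinear sample costs twice a bilinear one, so each trilinear
    # sample adds one extra bilinear-equivalent cost to the total.
    return reported_cycles * (1.0 + trilinear_fraction)


if __name__ == "__main__":
    # Illustrative value only, not taken from a real report.
    print(adjust_texture_cycles(2.0))       # every sample trilinear
    print(adjust_texture_cycles(2.0, 0.5))  # half of the samples trilinear
```

With `trilinear_fraction` left at 1.0 the function doubles the reported cost, as the text describes. Anisotropic filtering is deliberately left out, because its cost depends on the probe type and probe count, which the report does not expose.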
Arm Streamline, also included with Arm Mobile Studio, samples performance data from the Mali GPU hardware while your application runs on your target device. You can supplement Mali Offline Compiler performance reports with this data. For example, you can measure the number of multi-cycle texture operations being performed to validate the assumption that all texture accesses in your application are bilinear.