Arm Mali GPUs include an Early-Z algorithm. The Early-Z algorithm improves performance by performing an early depth check that removes overdrawn fragments before the GPU wastes effort running the shaders for them.
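As a rough illustration of the saving, the following Python sketch models a single pixel that receives several opaque fragments. It is a toy model under stated assumptions, not a description of how Mali hardware is implemented: the function name `count_shader_runs` is hypothetical, smaller depth values are assumed to be closer to the camera, and the depth test is assumed to be "less than".

```python
def count_shader_runs(fragment_depths, early_z):
    """Count fragment shader invocations for one pixel (toy model).

    fragment_depths: depths of opaque fragments in submission order.
    Assumption: smaller depth is closer, and the depth test is 'less than'.
    """
    depth_buffer = float("inf")  # cleared to the far plane
    shader_runs = 0
    for depth in fragment_depths:
        if early_z:
            # Early-Z: test before shading and skip occluded fragments.
            if depth >= depth_buffer:
                continue
            shader_runs += 1
            depth_buffer = depth
        else:
            # Late-Z: the shader always runs, and the test happens afterwards.
            shader_runs += 1
            if depth < depth_buffer:
                depth_buffer = depth
    return shader_runs


front_to_back = [0.2, 0.5, 0.8]
print(count_shader_runs(front_to_back, early_z=True))   # 1
print(count_shader_runs(front_to_back, early_z=False))  # 3
```

In this model the benefit depends on submission order: with the same fragments submitted back to front, every fragment passes the early test and all three are shaded.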
The Arm Mali GPU executes the Early-Z algorithm on most content, but there are cases where, to preserve correctness, the algorithm is not executed. Determining where Early-Z will not be executed is difficult to control within Unity, because it depends on both the Unity engine and the code that is generated by the compiler. However, there are some signs that you can look for in your code.
When compiling your shader for mobile, check that the shader does not fall into one of the following categories. If it does, either Early-Z cannot be enabled, or the results can be incorrect:
- The shader has side effects. This means that a shader thread modifies global state during its execution, so executing the shader a second time might produce different results. Typically, a shader with side effects writes to a shared read/write memory buffer, such as a shader storage buffer object or an image. For example, a shader that increments a counter to measure performance has side effects.
  The following are not classed as side effects:
  - Read-only memory accesses
  - Writes to write-only buffers
  - Purely local memory accesses
- If the fragment shader can call discard during its execution, the Arm Mali GPU cannot enable Early-Z. This is because the fragment shader can discard the current fragment after the Early-Z test has already modified the depth value, and this modification cannot be reverted.
- If alpha-to-coverage is enabled, the fragment shader computes data that is later used to define the alpha, so the coverage of the fragment is not known until the shader has run.
  For example, the leaves of a tree are typically rendered as a plane, and the texture defines which regions of the leaf are transparent or opaque. If Early-Z is enabled, you get incorrect results, because part of the scene can be occluded by a transparent part of the plane.
- If your fragment shader writes to gl_FragDepth, the Arm Mali GPU cannot perform the Early-Z test. This is because the depth value used for depth testing then comes from the fragment shader instead of the vertex shader, so it is not known until the fragment shader has run.
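To make the discard and transparent-leaf cases more concrete, the following Python sketch models what would go wrong if an early depth test and depth write were forced on a shader that can discard. It is a simplified model under stated assumptions, not Mali's actual pipeline: the function `render_pixel`, the tuple layout, and the color names are hypothetical, smaller depth values are assumed to be closer, and the depth test is assumed to be "less than".

```python
def render_pixel(fragments, force_early_z):
    """Return the final color of one pixel (toy model).

    fragments: list of (depth, color, discards) tuples in submission order.
    'discards' is True when the fragment shader would discard the fragment,
    for example in a transparent region of a leaf texture.
    """
    depth_buffer = float("inf")  # cleared to the far plane
    color_buffer = "background"
    for depth, color, discards in fragments:
        if force_early_z:
            # Depth is tested and written before the shader runs.
            if depth >= depth_buffer:
                continue
            depth_buffer = depth
            if discards:
                # Too late: the depth write above cannot be reverted.
                continue
            color_buffer = color
        else:
            # The shader runs first, so a discarded fragment never
            # reaches the depth test or the depth write.
            if discards:
                continue
            if depth < depth_buffer:
                depth_buffer = depth
                color_buffer = color
    return color_buffer


# A transparent part of a leaf in front of an opaque wall.
fragments = [(0.3, "leaf", True), (0.7, "wall", False)]
print(render_pixel(fragments, force_early_z=False))  # wall
print(render_pixel(fragments, force_early_z=True))   # background
```

With the forced early test, the discarded leaf fragment still leaves its depth in the buffer, so the wall behind it is rejected and the pixel keeps the background color. This is the kind of incorrect result that the GPU avoids by not running Early-Z for such shaders.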