High load/store

To improve the performance of applications that are GPU-limited, and that have high shader loads dominated by load/store operations, you should improve memory access efficiency and vectorization in your shader programs.

To reduce a high load/store workload:

  1. Improve access density, by using vector loads in compute shaders, and access patterns that touch adjacent data from adjacent threads in each warp. This will enable a single cache line access to return data for multiple threads.
  2. Reduce cache pressure, by reducing data precision where the algorithm allows it, and by improving the spatial locality of accesses.
  3. Avoid using imageLoad() calls for read-only texture accesses. Use texture() calls instead.
  4. Avoid using atomic operations where possible, because they have a high per-thread cost.
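
The first and third recommendations can be sketched in a single compute shader. This is a minimal illustration only, assuming OpenGL ES 3.1 GLSL. The buffer names (InputData, OutputData), the lookupTable sampler, the binding points, and the workgroup size of 64 are all hypothetical and are not taken from any particular application. Each thread reads one vec4 rather than four separate floats, adjacent threads read adjacent elements, and the read-only table is sampled as a texture rather than read with imageLoad().

```glsl
#version 310 es
// Illustrative sketch: all names, bindings, and sizes here are assumptions.
precision highp float;
precision highp int;

layout(local_size_x = 64) in;

// Input is declared as vec4 so that each thread issues one 128-bit load
// instead of four 32-bit loads.
layout(std430, binding = 0) readonly buffer InputData {
    vec4 inValues[];
};

layout(std430, binding = 1) writeonly buffer OutputData {
    vec4 outValues[];
};

// Read-only data is bound as a sampled texture, not as a storage image.
layout(binding = 0) uniform highp sampler2D lookupTable;

void main() {
    // Adjacent threads use adjacent indices, so one cache line
    // can serve several threads.
    uint index = gl_GlobalInvocationID.x;
    if (index >= uint(inValues.length())) {
        return;
    }

    vec4 value = inValues[index];

    // Map the linear index to a texel center in normalized coordinates.
    ivec2 size = textureSize(lookupTable, 0);
    ivec2 texel = ivec2(int(index) % size.x, (int(index) / size.x) % size.y);
    vec2 uv = (vec2(texel) + 0.5) / vec2(size);

    // Use a texture()-family call rather than imageLoad(). An explicit LOD
    // is given because compute shaders have no implicit derivatives.
    vec4 scale = textureLod(lookupTable, uv, 0.0);

    outValues[index] = value * scale;
}
```

Whether the vec4 layout suits a given workload depends on how the data is produced and consumed elsewhere, so treat the structure above as a starting point to profile rather than a drop-in change.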