This guide has shown how we identified optimization opportunities within the Chromium open-source codebase. It has also described in detail a number of specific optimizations made using Neon intrinsics.
One additional notable optimization was rewriting inflate_fast() to use Neon intrinsics for long loads and stores in the byte array, which increased performance by 20%.
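As a rough illustration of what long loads and stores mean here, the following minimal sketch copies a byte array 16 bytes at a time using the `vld1q_u8` and `vst1q_u8` intrinsics. The helper name `chunk_copy` is hypothetical and this is not the actual Chromium zlib code. It assumes the source and destination do not overlap, whereas a real inflate_fast() must also handle overlapping match copies. A `memcpy` fallback is included only so the sketch builds on non-Arm hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of zlib): copies len bytes between
 * non-overlapping buffers, 16 bytes at a time where possible. */
static void chunk_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    while (len >= 16) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
        /* One 128-bit load and one 128-bit store move 16 bytes at once. */
        vst1q_u8(dst, vld1q_u8(src));
#else
        /* Portable fallback so the sketch also builds on non-Arm hosts. */
        memcpy(dst, src, 16);
#endif
        dst += 16;
        src += 16;
        len -= 16;
    }

    /* Copy the remaining tail (fewer than 16 bytes) one byte at a time. */
    while (len-- > 0) {
        *dst++ = *src++;
    }
}
```

The design point is that each loop iteration issues a single wide load and store instead of sixteen byte-sized ones. Only the short tail falls back to byte copies.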
The end result of all these optimizations was a 2.9x boost to PNG decoding performance. The following figure shows the decoding time improvement (in ms) for test images comparing vanilla (unoptimized) zlib to Neon-optimized zlib:
Optimizations were validated using representative data sets. For PNG, we used three sets of test data:
- An internal data set for Chromium developers, with 92 images.
- The public Kodak data set, with 24 images.
- The public Google doodles data set, with 154 images.
For more information about Neon intrinsics, see the Neon Intrinsics Reference.