This guide has shown how we identified optimization opportunities within the Chromium open-source codebase, and has described in detail several specific optimizations made using Neon intrinsics.
One more notable optimization was rewriting inflate_fast() to use Neon intrinsics for long loads and stores on the byte array, which increased performance by 20%.
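To make the idea of long loads and stores concrete, here is a minimal sketch of a byte-array copy that moves 16 bytes per iteration with the Neon intrinsics vld1q_u8 and vst1q_u8. It is not the code from Chromium's zlib. The names copy_chunk16, copy_bytes_chunked, CHUNK_SIZE, and CHUNK_HAS_NEON are hypothetical, and the sketch assumes that the source and destination do not overlap, which a real inflate_fast() cannot assume for short back-references. On targets without Neon, the sketch falls back to memcpy so that it still compiles.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define CHUNK_HAS_NEON 1
#else
#define CHUNK_HAS_NEON 0
#endif

/* Width of one Neon quadword register in bytes (hypothetical name). */
enum { CHUNK_SIZE = 16 };

/* Copy exactly 16 bytes with a single wide load and a single wide store.
 * Assumption: src and dst do not overlap. */
static void copy_chunk16(uint8_t *dst, const uint8_t *src) {
#if CHUNK_HAS_NEON
    vst1q_u8(dst, vld1q_u8(src));   /* one 128-bit load, one 128-bit store */
#else
    memcpy(dst, src, CHUNK_SIZE);   /* portable fallback for non-Neon builds */
#endif
}

/* Copy len bytes: 16 at a time while possible, then the tail byte by byte.
 * Assumption: the two buffers do not overlap. */
static void copy_bytes_chunked(uint8_t *dst, const uint8_t *src, size_t len) {
    while (len >= CHUNK_SIZE) {
        copy_chunk16(dst, src);
        dst += CHUNK_SIZE;
        src += CHUNK_SIZE;
        len -= CHUNK_SIZE;
    }
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}
```

Calling copy_bytes_chunked(dst, src, 37), for example, performs two 16-byte chunk copies followed by five single-byte copies. The byte-by-byte tail keeps the sketch from writing past the requested length. An implementation that can guarantee spare room at the end of the output buffer could instead finish with one more wide store.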
The result of all these optimizations was a 2.9x boost to PNG decoding performance. The following figure shows the decoding time, in milliseconds, for the test images, comparing unoptimized zlib with Neon-optimized zlib:
Optimizations were validated using representative data sets. For PNG, we used three sets of test data:
- An internal data set for Chromium developers, with 92 images
- The public Kodak data set, with 24 images
- The public Google doodles data set, with 154 images
For more information about Neon intrinsics, see the Neon Intrinsics Reference.