Choose a quantization scheme

The previous histograms show that the values are often concentrated in a small part of their full range. This means that there are several strategies you can test:

  • Use the full range of values: base the quantization on the min and max.
  • Focus only on the most likely values: base the quantization on the range of values that covers 80% of the histogram, or any other percentage you want to try.
  • Use a more complex scheme to detect and remove outliers from the distribution of values.
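The first two strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `full_range` and `central_range` are hypothetical, and "80% of the histogram" is interpreted here as the central 80% of the sorted values, with the same number of samples dropped on each side. Other interpretations are possible.

```python
def full_range(values):
    """Strategy 1: base the range on the observed min and max."""
    return min(values), max(values)


def central_range(values, coverage=0.80):
    """Strategy 2 (assumed interpretation): keep the central `coverage`
    fraction of the sorted values and ignore the tails."""
    ordered = sorted(values)
    n = len(ordered)
    # Number of samples dropped on each side of the distribution.
    drop = int(round(n * (1.0 - coverage) / 2.0))
    return ordered[drop], ordered[n - 1 - drop]


# Hypothetical values: mostly small, with one outlier on each side.
values = [-10.0, -0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 10.0]
wide = full_range(values)             # includes the outliers
narrow = central_range(values, 0.80)  # ignores the outliers
```

With these sample values, the full range is dominated by the two outliers, while the central range stays close to where most values lie.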

You need to choose one scheme and then do the quantization based on this choice. You’ll have to experiment with several schemes, because results can vary depending on the network and training patterns.

For some networks, you may want to use the full range and base the quantization on the min and max. For other networks, you may get better results by using a smaller range than the min and max and removing outliers from the distribution.

A wider range decreases the probability of internal saturations or sign inversions. However, because the statistics cover only the inputs and outputs and not the intermediate computations, these problems can still occur. A wider range also leaves fewer bits for the fractional part, which reduces accuracy. Testing is needed to find the right trade-off.
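The trade-off can be illustrated with a small fixed-point sketch. It assumes a signed fixed-point format with a given number of fractional bits, round-to-nearest, and saturation at the limits of the word. The helper names `quantize` and `dequantize` are hypothetical and are not part of CMSIS-NN.

```python
def quantize(value, frac_bits, word_size=8):
    """Convert a float to a signed fixed-point integer with saturation."""
    scaled = round(value * 2.0 ** frac_bits)
    lowest = -(1 << (word_size - 1))       # -128 for 8-bit words
    highest = (1 << (word_size - 1)) - 1   # 127 for 8-bit words
    return max(lowest, min(highest, scaled))


def dequantize(quantized, frac_bits):
    """Convert the fixed-point integer back to a float."""
    return quantized / 2.0 ** frac_bits


# Few fractional bits: a wide range is covered, but small values lose accuracy.
coarse = dequantize(quantize(0.3, frac_bits=3), frac_bits=3)

# Many fractional bits: small values are accurate, but large values saturate.
fine = dequantize(quantize(0.3, frac_bits=8), frac_bits=8)
saturated = quantize(10.0, frac_bits=8)
```

Here `coarse` is noticeably further from 0.3 than `fine`, while `saturated` is clamped to the largest 8-bit value because 10.0 does not fit when 8 fractional bits are used.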

There is also another factor to take into consideration: the chosen word size. The number of fractional bits you can represent depends on the range of values and the word size.
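One way to make this dependency concrete is to derive the number of fractional bits from the chosen range and word size. The sketch below assumes a signed fixed-point format with one sign bit, where the integer part is sized to hold the largest magnitude in the range. The function name `fractional_bits` is hypothetical, and other conventions are possible.

```python
import math


def fractional_bits(min_value, max_value, word_size=8):
    """Fractional bits left once the integer part covers the range.

    Assumes one sign bit. A largest magnitude that is an exact power
    of two sits just outside the representable range and would
    saturate by one step.
    """
    largest = max(abs(min_value), abs(max_value))
    if largest == 0:
        raise ValueError("the range must contain a non-zero value")
    int_bits = math.ceil(math.log2(largest))
    return (word_size - 1) - int_bits


narrow_q7 = fractional_bits(-0.4, 0.3, word_size=8)     # narrow range, 8-bit
wide_q7 = fractional_bits(-10.0, 10.0, word_size=8)     # wide range, 8-bit
wide_q15 = fractional_bits(-10.0, 10.0, word_size=16)   # wide range, 16-bit
```

The narrow range leaves more fractional bits than the wide range for the same word size, and moving to a 16-bit word adds 8 fractional bits for the same range.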

Most CMSIS-NN functions have an 8-bit version and a 16-bit version. Choosing one of these versions is part of the definition of the quantization scheme. This choice must be the same for all layers, or conversions between 8-bit and 16-bit would have to be introduced between layers.
