To convert any network to CMSIS-NN, the following key steps must be followed:

  • Check which layers are supported by CMSIS-NN.
    • If some layers are not supported, try to replace them with an equivalent combination of CMSIS-NN and CMSIS-DSP functions.
  • Check the data layout convention.
    • If the conventions differ, the weights will need to be reordered; the reordering should be validated with a float version of the network before quantizing.
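The CMSIS-NN convolution kernels expect a channel-last (HWC) data layout, whereas frameworks such as PyTorch store convolution weights channel-first (out_ch, in_ch, kh, kw). A minimal sketch of the required reordering, operating on a flat weight list (the function name and argument order are illustrative, not part of any API):

```python
def chw_to_hwc(w, out_ch, in_ch, kh, kw):
    """Reorder a flat channel-first (CHW) weight list into the
    channel-last (HWC) order expected by CMSIS-NN kernels."""
    hwc = []
    for o in range(out_ch):
        for y in range(kh):
            for x in range(kw):
                for c in range(in_ch):
                    # CHW flat index: channels vary slowest within a filter
                    hwc.append(w[((o * in_ch + c) * kh + y) * kw + x])
    return hwc
```

Running the float network with reordered weights and checking that its outputs are unchanged is a cheap way to catch layout mistakes before quantization adds its own error.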
  • Compute activation statistics.
    • Choose the statistics that must be computed and use enough input patterns to generate those statistics.
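A common choice of statistic is the maximum absolute value seen at each layer's output, since it bounds the dynamic range the fixed-point format must cover. A sketch, assuming activations have been captured per run as a mapping from layer name to values (the layer names and data shape here are illustrative):

```python
def absmax_stats(runs):
    """Track the largest absolute activation per layer across many runs.

    `runs` is a list of dicts mapping layer name -> list of float values
    captured for one input pattern.
    """
    stats = {}
    for run in runs:
        for layer, values in run.items():
            m = max(abs(v) for v in values)
            stats[layer] = max(stats.get(layer, 0.0), m)
    return stats
```

The more input patterns are fed through the float network, the more representative these ranges are; a single input can badly underestimate the dynamic range of a layer.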
  • Choose a quantization scheme.
    • Choose a word size.
    • Define a method to select a fixed-point format from the computed statistics and the word size.
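One minimal selection method: allocate just enough integer bits to cover the observed maximum absolute value, and give all remaining bits (minus the sign bit) to the fraction. A sketch of that scheme (other schemes exist, e.g. ones that clip outliers):

```python
import math

def q_format(abs_max, word_size=8):
    """Pick a fixed-point Qm.n format covering [-abs_max, abs_max].

    Returns (int_bits, frac_bits) for a signed word of `word_size` bits:
    enough integer bits to hold abs_max, the rest fractional.
    """
    int_bits = max(0, math.ceil(math.log2(abs_max))) if abs_max > 0 else 0
    frac_bits = word_size - 1 - int_bits  # one bit reserved for the sign
    return int_bits, frac_bits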
  • Compute the layer Q-formats.
    • Compute input and output Q-format for each layer based upon the quantization scheme and the layer constraints. Bias and out shifts for fully connected and convolutional layers should be known after this step.
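In the q7/q15 CMSIS-NN kernels, the fully connected and convolution functions take `bias_shift` and `out_shift` parameters. They follow directly from the fractional bits of each operand: the accumulator carries `in_frac + wt_frac` fractional bits, so the bias is shifted up to match it and the result shifted down to the output format. A sketch:

```python
def layer_shifts(in_frac, wt_frac, bias_frac, out_frac):
    """Bias and output shifts for a CMSIS-NN q7/q15 layer, derived from
    the fractional bits of the input, weights, bias, and output formats."""
    bias_shift = in_frac + wt_frac - bias_frac  # align bias with accumulator
    out_shift = in_frac + wt_frac - out_frac    # scale accumulator to output
    return bias_shift, out_shift
```

For example, a layer with Q2.5 input, Q0.7 weights, Q0.7 bias, and Q3.4 output gets `bias_shift = 5` and `out_shift = 8`.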
  • Generate the CMSIS-NN implementation:
    • Dump reordered and quantized coefficients for weights.
    • Dump quantized biases.
    • Dump function calls to CMSIS-NN functions.
    • Allocate needed buffers.
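The coefficient dump typically produces C arrays that the generated code includes. A sketch of the quantize-and-dump step, assuming the Q-formats computed earlier (the emitted `q7_t` array format is illustrative):

```python
def quantize(values, frac_bits, word_size=8):
    """Convert floats to saturated signed fixed-point integers."""
    lo, hi = -(1 << (word_size - 1)), (1 << (word_size - 1)) - 1
    return [min(hi, max(lo, round(v * (1 << frac_bits)))) for v in values]

def dump_c_array(name, values, frac_bits):
    """Emit quantized coefficients as a C array definition."""
    q = quantize(values, frac_bits)
    body = ", ".join(str(v) for v in q)
    return f"static const q7_t {name}[{len(q)}] = {{{body}}};"
```

The same helper serves weights (after reordering) and biases; only the fractional bit count differs per tensor.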
  • Test the final fixed-point version.
    • Ideally, test on the same set of test patterns as the original version.
    • If the final fixed-point version is not good enough, you may need to go back to a previous step, for example changing the quantization method or changing the network.
    • It is useful to be able to experiment with the network to get a feel for how it behaves; this complements the quantitative data about its performance.
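One simple quantitative check is to dequantize the fixed-point outputs and measure the worst-case deviation from the float reference on the same test patterns. A sketch (the tolerance value is an arbitrary placeholder):

```python
def compare_outputs(ref, q_out, frac_bits, tol=0.05):
    """Compare float reference outputs against dequantized fixed-point
    outputs; returns the max absolute error and whether it is within tol."""
    scale = 1 << frac_bits
    err = max(abs(r - q / scale) for r, q in zip(ref, q_out))
    return err, err <= tol
```

For a classifier, the error on raw outputs matters less than whether the predicted class matches, so checking classification agreement over the test set is often the more meaningful metric.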
  • Optimize the final CMSIS-NN code.
    • Use the most efficient version of each layer function.
    • Minimize the memory usage by reusing buffers as much as possible.
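A common buffer-reuse pattern is to allocate two scratch buffers sized for the largest tensor in the network and alternate them between layers, instead of allocating one buffer per intermediate tensor. A sketch of the sizing computation (the per-layer size list is an assumed input, not produced by CMSIS-NN itself):

```python
def shared_buffer_bytes(layer_io_sizes):
    """Size a ping-pong buffer pair large enough for every layer.

    `layer_io_sizes` lists (input_bytes, output_bytes) per layer; two
    buffers of the maximum size can be alternated across the whole network.
    """
    biggest = max(max(i, o) for i, o in layer_io_sizes)
    return 2 * biggest
```

This bounds intermediate-activation memory by twice the largest tensor rather than by the sum of all tensors, which is usually the dominant saving on small devices.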