Test the result
It is very important that you test the final result. The fixed-point implementation introduces effects, such as quantization and saturation, that are not modeled in the ML framework, so you cannot rely only on the accuracy estimates produced in the framework.
The accuracy of the final version depends on the details of the quantization. Different quantization schemes can give very different accuracy. So, it is important that you test the final implementation and that you experiment with different quantization schemes.
The testing should be done on the full set of test patterns, which must be disjoint from the training and validation patterns.
It is possible to communicate with a CMSIS-NN implementation running on a board, but the set of test patterns is usually quite large. This means that it may be faster to run a CMSIS-NN implementation on your desktop to do the final testing.
Choose a development environment in which it is easy to call C code. With such an environment, a CMSIS-NN implementation can be used directly for testing.
CMSIS-NN compiled in CM0 mode, that is with ARM_MATH_DSP undefined in the C code, only requires implementations of the __SSAT and __USAT intrinsics. C implementations of those intrinsics are available in the CMSIS library in CMSIS/Core/Include.
With a little work, you can get CMSIS-NN running on a desktop or laptop and use it directly.
Once the CMSIS-NN implementation of the network can be used directly, you can compute some metrics by using your test patterns. Then you should compare those results with the metrics for the original network.
It is also good to be able to play with the network interactively to get an intuitive understanding of how it behaves. This does not replace quantitative metrics, but it is a good complement.
In the following graphics, you can see a comparison of the reference network for keyword spotting, running in an ML framework, against a q15 CMSIS-NN implementation. In this UI, the user can try different noise levels and different words and see how the final CMSIS-NN implementation behaves compared to the original implementation in the ML framework.