Define the problem and the model
Our objective here is to find the best way to have your embedded device run image recognition, in order to make your edge device more responsive, reliable, energy efficient and secure by ensuring that your data is processed on the device, rather than in the cloud.
Neural networks (NN) are a class of machine learning (ML) algorithms that have demonstrated good accuracy on image classification, object detection, speech recognition and natural language processing applications. The most popular types of neural networks are multi-layer perceptron (MLP), convolutional neural networks (CNN) and recurrent neural networks (RNN).
In this case we have chosen to use a CNN, provided in the Caffe examples, for CIFAR-10 image classification task, where the input image passes through the CNN layers to classify it into one of the 10 output classes. Typical CNN models have different types of layers to process the input data, below you can see the ones in the chosen CNN:
The input to the model is a 32x32 pixel color image, which will be classified into 10 classes (cat, deer, dog, horse, etc.) by the CNN. The classification is obtained by having the data flow through the following layers:
- Convolution layer - responsible for extracting features from the image.
- Pooling layer - responsible to progressively reduce the spatial size of the model reducing the number of parameters and the amount of computation in the network, and hence also controlling overfitting. Read more about Pooling on this article.
- Rectified Linear Unit (ReLU) - the activation function responsible for introducing non-linearity in the model. The function returns 0 if it receives any negative input, but for any positive value x it returns that value back. Read this ReLU article for a more thorough explanation.
- Inner product (Inp) or fully connected layer.
- Softmax is used to classify the output into one of the categories, by producing a probability.
For this guide we have pretrained the NN with Caffe, and it can be found on GitHub. If you wish to learn how to train the model on your own you can follow this tutorial. Below you can see the topology of the pretrained network that we will be using for the rest of this tutorial. Note: that this model is not very accurate we have chosen it only for simplicity.
The above image has been generated using this online tool.