Define the model

Note: You are not asked to do anything in this section. It is only here for completeness.

There are different ways to implement image classification, but recent work in artificial intelligence has shown that one type of neural network, the Convolutional Neural Network (CNN), is particularly good at classifying images. For an introduction to CNNs, read this article.

To recognize smiling faces, we will use the Smile model as defined by OpenMV. The network is a three-layer CNN as shown in the figure below:

The network is composed of the following layers:

  • Convolution layer - responsible for extracting features from the image.

  • Dropout layer - responsible for reducing overfitting by randomly ignoring nodes during the training phase. For more information read this dropout article.

  • Rectified Linear Unit (ReLU) - the activation function responsible for introducing non-linearity in the model. The function returns 0 for any negative input and returns any positive input x unchanged, that is, f(x) = max(0, x). Read this ReLU article for a more thorough explanation.

  • Pooling layer - responsible for progressively reducing the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network and hence also helps control overfitting. Read more here.

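  • Inner product (Inp) or fully connected layer - responsible for combining the features extracted by the previous layers into the final classification scores.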

The above image has been generated using this online tool.
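
As a rough illustration of what each layer type listed above computes, here is a minimal NumPy sketch. It is not the OpenMV Smile model: the function names (conv2d, relu, max_pool, dropout, fully_connected), the 8x8 input, the single 3x3 kernel, the order in which the layers are chained, and the two output scores are all assumptions chosen for illustration, and the weights are random rather than trained. The dropout shown is the common "inverted" variant, which rescales the surviving values during training.

```python
import numpy as np


def conv2d(image, kernel):
    """Slide one kernel over a single-channel image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Each output value is the weighted sum of one image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return np.maximum(0, x)


def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]  # drop rows/columns that do not fit
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))


def dropout(x, rate, rng, training=True):
    """Zero a random fraction `rate` of values, but only while training."""
    if not training:
        return x
    mask = rng.random(x.shape) >= rate
    # Inverted dropout (assumed variant): rescale so the expected sum is unchanged.
    return x * mask / (1.0 - rate)


def fully_connected(x, weights, bias):
    """Inner product of the flattened input with a weight matrix, plus a bias."""
    return x.ravel() @ weights + bias


# Hypothetical forward pass on random data (inference mode, so dropout is a no-op).
rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.standard_normal((3, 3))

features = max_pool(relu(conv2d(image, kernel)))  # 8x8 -> 6x6 -> 3x3
features = dropout(features, rate=0.5, rng=rng, training=False)
weights = rng.standard_normal((features.size, 2))  # 2 scores, e.g. smile / no smile
scores = fully_connected(features, weights, np.zeros(2))

print(features.shape, scores.shape)  # (3, 3) (2,)
```

Note how the pooling step halves each spatial dimension: the fully connected layer then needs only 9 weights per output instead of 36, which is the reduction in parameters and computation described in the list above.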
