This tutorial assumes you already have a TensorFlow .pb model file with 32-bit floating-point weights. If your model is in a different format (Keras, PyTorch, Caffe, MXNet, CNTK, etc.) and you want to deploy it using TensorFlow, you'll first need to convert it to the TensorFlow format with a conversion tool.

Various projects and resources are growing up around model format conversion. Both MMdnn and Deep learning model converter are useful resources, and the ONNX format has the potential to vastly simplify conversion in the future.

The most important preparation you can do is to ensure that the size and complexity of your trained model are suitable for the device you intend to run it on. To ensure this:

  • If you are using a pre-trained model for feature extraction or transfer learning, then you should consider using mobile-optimized versions such as MobileNet, TinyYolo and so on.
  • If you designed the architecture yourself, then you should consider adapting it for faster execution and smaller size. An example of this is using depthwise separable convolutions where possible, as in MobileNet. Adapting the architecture generally gives better performance and accuracy than post-processing the model file after training.
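
To get a feel for why depthwise separable convolutions shrink a model, the weight counts of the two layer types can be compared directly. The sketch below is only an illustration: the function names and the example layer shape (3x3 kernel, 128 input channels, 256 output channels) are assumptions chosen for this example, and bias terms are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution: one k x k filter per
    (input channel, output channel) pair. Biases are ignored."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable convolution: one k x k depthwise
    filter per input channel, followed by a 1 x 1 pointwise convolution
    that mixes channels. Biases are ignored."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, picked only for illustration.
    k, c_in, c_out = 3, 128, 256
    standard = standard_conv_params(k, c_in, c_out)
    separable = separable_conv_params(k, c_in, c_out)
    print("standard conv weights: ", standard)    # 294912
    print("separable conv weights:", separable)   # 33920
    print(f"reduction factor: {standard / separable:.2f}x")
```

For this layer shape the separable version needs roughly one ninth of the weights. The number of multiply-adds scales the same way, since both counts are multiplied by the number of output positions.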

This tutorial uses TensorFlow's graph_transforms tool, which is built from the TensorFlow source with this command:

bazel build tensorflow/tools/graph_transforms:transform_graph

For more details on how to install and build TensorFlow, see the TensorFlow documentation.
