Prepare the Graph for Inference

To prepare the graph for inference with TensorFlow Lite or Arm NN, optimize it for inference and then freeze it:

  1. Add fake quantization layers to the graph. This changes how the inference graph is exported, so that it carries the quantization information in the right format. To add the fake quantization layers, call tf.contrib.quantize.create_eval_graph() on the inference-ready graph before saving it.

    For CifarNet, you can do this by modifying the file models/research/slim/ and adding tf.contrib.quantize.create_eval_graph() before graph_def = graph.as_graph_def() on line 118.

  2. Export the inference graph to a protobuf file. For CifarNet this is done using:

    python --model_name=cifarnet --dataset_name=cifar10 --output_file=/tmp/cifarnet_inf_graph.pb

    At this point, the graph does not contain your trained weights.

  3. Freeze the graph using the freeze_graph tool. You can use any checkpoint saved during training. The command to freeze the graph is:

    python -m \
    --input_graph=<your_graph_location> \
    --input_checkpoint=<your_chosen_checkpoint> \
    --input_binary=true \
    --output_graph=<output_directory> \

    For CifarNet, using the last checkpoint, you can do this with the commands:

    export LAST_CHECKPOINT=`head -n1 /tmp/cifarnet-model/checkpoint | cut -d'"' -f2`
    python -m \
    --input_graph=/tmp/cifarnet_inf_graph.pb \
    --input_checkpoint=${LAST_CHECKPOINT} \
    --input_binary=true \
    --output_graph=/tmp/frozen_cifarnet.pb \
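
The LAST_CHECKPOINT line in step 3 takes the first line of the checkpoint file and keeps the text between the first pair of double quotes. The same lookup can be sketched in Python, which may be convenient if the freeze step is driven from a script. The function name read_last_checkpoint and the sample entry model.ckpt-1000 are illustrative assumptions, not part of TensorFlow or of this guide. Unlike cut, this sketch raises an error when the first line contains no quoted value.

```python
import os
import tempfile


def read_last_checkpoint(checkpoint_file):
    """Return the quoted value on the first line of a checkpoint file.

    Mirrors: head -n1 <file> | cut -d'"' -f2
    """
    with open(checkpoint_file) as f:
        first_line = f.readline()
    # Splitting on the double quote puts the quoted value at index 1.
    fields = first_line.split('"')
    if len(fields) < 2:
        raise ValueError("no quoted checkpoint path on the first line")
    return fields[1]


if __name__ == "__main__":
    # Hypothetical checkpoint file contents, written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "checkpoint")
        with open(path, "w") as f:
            f.write('model_checkpoint_path: "model.ckpt-1000"\n')
            f.write('all_model_checkpoint_paths: "model.ckpt-1000"\n')
        print(read_last_checkpoint(path))  # prints model.ckpt-1000
```

For CifarNet, calling read_last_checkpoint("/tmp/cifarnet-model/checkpoint") should give the same value that the shell command stores in LAST_CHECKPOINT.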