Optimize Trained GPU Model

Types of Optimizations Applied for Inference

  • Remove training-only operations (checkpoint saving, dropout)
  • Strip out unused nodes
  • Remove debug operations
  • Fold batch normalization ops into the preceding layer's weights, so the extra multiply disappears at inference time (super cool)
  • Round weights
  • Quantize weights
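
Rounding is the one item in this list that has no cell of its own below. As a rough illustration only, assuming that rounding means snapping each weight to one of a fixed number of evenly spaced levels between the buffer's minimum and maximum, the plain-Python sketch below shows why this helps: the values stay floats, but there are far fewer distinct ones, so the file compresses better. The 256-level setting, the random weights, and the helper names are assumptions made for this example.

```python
import random
import struct
import zlib


def round_weights(values, num_steps=256):
    """Snap each float to one of num_steps evenly spaced levels between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)
    step = (hi - lo) / (num_steps - 1)
    return [lo + round((v - lo) / step) * step for v in values]


def packed_size(values):
    """Size in bytes after storing the values as float32 and zlib-compressing them."""
    raw = struct.pack(f"{len(values)}f", *values)
    return len(zlib.compress(raw))


# Made-up weights standing in for a real Const buffer.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]
rounded = round_weights(weights)

print("unique values:", len(set(weights)), "->", len(set(rounded)))
print("compressed bytes:", packed_size(weights), "->", packed_size(rounded))
```

The uncompressed size is unchanged, since every value is still a 32-bit float. The saving only shows up once the file is compressed.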

Optimize Models

Summarize Graph Utility


In [ ]:
%%bash

which summarize_graph

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/unoptimized_gpu.pb
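
To make the idea of a graph summary concrete, here is a small plain-Python sketch that reports likely inputs, likely outputs, and per-op counts for a toy graph. It is not the utility's implementation. The graph structure, the op types, and the `summarize` helper are hypothetical; only the node names are borrowed from the cells in this notebook.

```python
from collections import Counter

# Hypothetical toy graph: node name -> (op type, names of input nodes).
graph = {
    "x_observed": ("Placeholder", []),
    "weights": ("Const", []),
    "bias": ("Const", []),
    "mul": ("Mul", ["weights", "x_observed"]),
    "add": ("Add", ["mul", "bias"]),
}


def summarize(graph):
    """Guess inputs and outputs from the graph structure and count ops by type."""
    consumed = {src for _, inputs in graph.values() for src in inputs}
    return {
        # Placeholders are fed at run time, so they are likely inputs.
        "inputs": sorted(n for n, (op, _) in graph.items() if op == "Placeholder"),
        # Nodes that nothing else consumes are likely outputs.
        "outputs": sorted(n for n in graph if n not in consumed),
        "op_counts": Counter(op for op, _ in graph.values()),
    }


summary = summarize(graph)
print("inputs:", summary["inputs"])
print("outputs:", summary["outputs"])
print("ops:", dict(summary["op_counts"]))
```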

Strip Unused Nodes


In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_gpu.pb \
--out_graph=/root/models/optimize_me/strip_unused_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='
strip_unused_nodes'
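
Stripping unused nodes amounts to a reachability walk backwards from the requested outputs: anything the outputs never depend on can be dropped. The sketch below shows that idea on a toy graph. The graph, the two training-only node names, and the helper are hypothetical, and the real transform does more than this (for example, it treats the declared inputs as boundaries).

```python
# Hypothetical toy graph: node name -> names of the nodes it reads from.
graph = {
    "x_observed": [],
    "weights": [],
    "bias": [],
    "mul": ["weights", "x_observed"],
    "add": ["mul", "bias"],
    "loss": ["add"],                         # assumed training-only node
    "save_checkpoint": ["weights", "bias"],  # assumed training-only node
}


def strip_unused_nodes(graph, outputs):
    """Keep only the nodes that the given outputs depend on, directly or indirectly."""
    keep = set()
    stack = list(outputs)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        stack.extend(graph[name])  # walk backwards through the inputs
    return {name: inputs for name, inputs in graph.items() if name in keep}


pruned = strip_unused_nodes(graph, outputs=["add"])
print(sorted(pruned))
```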

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/strip_unused_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/strip_unused_optimized_gpu.pb --input_layer=weights,bias,x_observed --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add

Fold Constants


In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_gpu.pb \
--out_graph=/root/models/optimize_me/fold_constants_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='
fold_constants(ignore_errors=true)'
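
Constant folding evaluates, ahead of time, any sub-expression whose inputs are all constants and stores the result as a single constant. A minimal sketch on a toy graph follows. The node layout, names, and values are hypothetical, and only two binary ops are supported.

```python
import operator

OPS = {"Add": operator.add, "Mul": operator.mul}

# Hypothetical toy graph: each node has an op, its input names, and (for Const) a value.
graph = {
    "x": {"op": "Placeholder", "inputs": []},
    "scale": {"op": "Const", "inputs": [], "value": 2.0},
    "offset": {"op": "Const", "inputs": [], "value": 3.0},
    "k": {"op": "Mul", "inputs": ["scale", "offset"]},  # depends on constants only
    "y": {"op": "Mul", "inputs": ["x", "k"]},           # depends on a run-time input
}


def fold_constants(graph):
    """Replace every op whose inputs are all Const with a precomputed Const node."""
    folded = {name: dict(node) for name, node in graph.items()}
    changed = True
    while changed:  # repeat so that chains of constant ops collapse fully
        changed = False
        for node in folded.values():
            if node["op"] not in OPS:
                continue
            args = [folded[name] for name in node["inputs"]]
            if all(arg["op"] == "Const" for arg in args):
                value = OPS[node["op"]](args[0]["value"], args[1]["value"])
                node.update(op="Const", inputs=[], value=value)
                changed = True
    return folded


folded = fold_constants(graph)
print(folded["k"])
print(folded["y"])
```

After folding, `scale` and `offset` are no longer referenced by anything, which is the kind of leftover that stripping unused nodes then removes.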

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/fold_constants_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/fold_constants_optimized_gpu.pb --input_layer=x_observed,bias,weights --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add

Fold Batch Normalizations

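Run Fold Constants first! The batch-norm pattern is only recognized once the expression feeding the multiply has been collapsed into a single constant, which is why this cell reads fold_constants_optimized_gpu.pb.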


In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/fold_constants_optimized_gpu.pb \
--out_graph=/root/models/optimize_me/fold_batch_norms_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='
fold_batch_norms
fold_old_batch_norms'
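
The arithmetic behind folding can be checked by hand. For a dense layer followed by batch normalization, the per-channel scale `gamma / sqrt(var + eps)` can be multiplied into the weights and the shift absorbed into the bias, giving the same output with no separate normalization step. The sketch below verifies this in plain Python. All numbers, shapes, and function names are made up for the illustration.

```python
import math


def dense(x, W, b):
    """y[j] = sum_i x[i] * W[i][j] + b[j]"""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def batch_norm(y, gamma, beta, mean, var, eps=1e-3):
    """Inference-time batch normalization, applied per output channel."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]


def fold_batch_norm(W, b, gamma, beta, mean, var, eps=1e-3):
    """Return weights and bias that reproduce dense followed by batch_norm."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    W_folded = [[w * scale[j] for j, w in enumerate(row)] for row in W]
    b_folded = [(bj - m) * sc + bt for bj, m, sc, bt in zip(b, mean, scale, beta)]
    return W_folded, b_folded


# Made-up parameters: 3 inputs, 2 output channels.
x = [0.5, -1.0, 2.0]
W = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]
b = [0.05, -0.05]
gamma, beta = [1.5, 0.8], [0.1, -0.2]
mean, var = [0.2, -0.1], [0.9, 1.1]

reference = batch_norm(dense(x, W, b), gamma, beta, mean, var)
W_folded, b_folded = fold_batch_norm(W, b, gamma, beta, mean, var)
folded = dense(x, W_folded, b_folded)

print(reference)
print(folded)
```

The two printed vectors agree up to floating-point rounding, while the folded version performs one layer's worth of work instead of two.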

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/fold_batch_norms_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/fold_batch_norms_optimized_gpu.pb --input_layer=x_observed,bias,weights --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add

Quantize Weights

Run Fold Batch Norms first! Folding rewrites the weights, so quantize them only after the fold.


In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/fold_batch_norms_optimized_gpu.pb \
--out_graph=/root/models/optimize_me/quantized_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='quantize_weights'
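
As a rough picture of what weight quantization does, the sketch below stores each float weight as one byte plus a shared minimum and step size, then converts back to floats for computation. It assumes simple linear min/max quantization to 8 bits. The helper names and weights are made up, and the actual transform may differ in its details.

```python
import random


def quantize(values):
    """Map floats to 8-bit codes using a linear scale between min and max."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero step when all values are equal
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Made-up weights standing in for a real Const buffer.
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(1_000)]

codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("bytes as float32:", 4 * len(weights), "-> bytes as 8-bit codes:", len(codes))
print("largest round-trip error:", max_error, "(half a step is", scale / 2, ")")
```

Storage drops to a quarter of the float32 size, and the round-trip error of any weight is at most half a quantization step.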

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/quantized_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/quantized_optimized_gpu.pb --input_layer=x_observed,bias,weights --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add

Perform All Common Optimizations


In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/unoptimized_gpu.pb \
--out_graph=/root/models/optimize_me/fully_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='
add_default_attributes
remove_nodes(op=Identity, op=CheckNumerics)
fold_constants(ignore_errors=true)
fold_batch_norms
fold_old_batch_norms
quantize_weights
quantize_nodes
strip_unused_nodes
obfuscate_names'
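
When the transform list grows this long, it can be easier to assemble the command from Python than to edit a quoted multi-line string. The sketch below builds the same argument list as the cell above, using only the flags that appear in this notebook. The `transform_graph_command` helper is hypothetical, and the call that would actually run the tool is left commented out.

```python
import shlex


def transform_graph_command(in_graph, out_graph, inputs, outputs, transforms):
    """Build the argument list for transform_graph (hypothetical convenience helper)."""
    return [
        "transform_graph",
        f"--in_graph={in_graph}",
        f"--out_graph={out_graph}",
        f"--inputs={','.join(inputs)}",
        f"--outputs={','.join(outputs)}",
        # One transform per line, mirroring the quoted string in the bash cell.
        "--transforms=" + "\n".join(transforms),
    ]


cmd = transform_graph_command(
    in_graph="/root/models/optimize_me/unoptimized_gpu.pb",
    out_graph="/root/models/optimize_me/fully_optimized_gpu.pb",
    inputs=["x_observed", "weights", "bias"],
    outputs=["add"],
    transforms=[
        "add_default_attributes",
        "remove_nodes(op=Identity, op=CheckNumerics)",
        "fold_constants(ignore_errors=true)",
        "fold_batch_norms",
        "fold_old_batch_norms",
        "quantize_weights",
        "quantize_nodes",
        "strip_unused_nodes",
        "obfuscate_names",
    ],
)

print(shlex.join(cmd))
# To run it (assumes transform_graph is on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Keeping the transforms in a list makes the order explicit, which matters because the prerequisites noted in the earlier sections depend on it.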

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/fully_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/fully_optimized_gpu.pb --input_layer=weights,x_observed,bias --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add

Sort by Execution Order (DAG Topological Order)

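  • Nodes are stored so that every node's inputs appear before the node itself
  • A minimal inference engine can run the nodes in stored order with no scheduling step, which keeps inference overhead low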

In [ ]:
%%bash

transform_graph \
--in_graph=/root/models/optimize_me/fully_optimized_gpu.pb \
--out_graph=/root/models/optimize_me/sort_by_execution_order_optimized_gpu.pb \
--inputs='x_observed,weights,bias' \
--outputs='add' \
--transforms='
sort_by_execution_order'
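
Sorting by execution order is a topological sort of the graph. The sketch below uses Kahn's algorithm on a toy graph whose nodes are deliberately listed out of order. It is an illustration of the technique, not the tool's code, and the graph and function are hypothetical.

```python
from collections import deque

# Hypothetical toy graph, listed out of order: node name -> names of its inputs.
graph = {
    "add": ["mul", "bias"],
    "mul": ["weights", "x_observed"],
    "bias": [],
    "x_observed": [],
    "weights": [],
}


def sort_by_execution_order(graph):
    """Return node names ordered so that every node comes after all of its inputs."""
    pending = {name: len(set(inputs)) for name, inputs in graph.items()}
    consumers = {name: [] for name in graph}
    for name, inputs in graph.items():
        for src in set(inputs):
            consumers[src].append(name)

    ready = deque(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:  # all inputs are now scheduled
                ready.append(consumer)

    if len(order) != len(graph):
        raise ValueError("graph contains a cycle")
    return order


order = sort_by_execution_order(graph)
print(order)
```

With the nodes in this order, a simple loop can evaluate them one after another, because each node's inputs have already been computed by the time it is reached.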

In [ ]:
%%bash

ls -l /root/models/optimize_me/

In [ ]:
%%bash

summarize_graph --in_graph=/root/models/optimize_me/sort_by_execution_order_optimized_gpu.pb

In [ ]:
%%bash

benchmark_model --graph=/root/models/optimize_me/sort_by_execution_order_optimized_gpu.pb --input_layer=weights,x_observed,bias --input_layer_type=float,float,float --input_layer_shape=:: --output_layer=add
