Advanced Features
Compress Optimizations
The compress phase of LEIP Optimize supports two optional optimizations: Tensor Splitting and Bias Correction.
Enabling the Tensor Splitting Quantization
The LEIP SDK supports a quantization technique called Tensor Splitting. It decomposes a tensor into sub-tensors so that each sub-tensor can be compressed separately, with a compression ratio better suited to it.
The algorithm uses a predefined heuristic to determine automatically which layers' tensors should be split.
To try out this optimization, add --compress_optimization tensor_splitting to the leip optimize command, as in the following example:
# Create a representative dataset file, in this case with one example item
echo workspace/datasets/open-images-10-classes/eval/Apple/06e47f3aa0036947.jpg > rep_dataset.txt
# Optimize to an 8-bit model
leip optimize \
--input_path workspace/models/inceptionv3/keras-open-images-10-classes \
--output_path output_int8 \
--rep_dataset rep_dataset.txt \
--preprocessor imagenet \
--compress_optimization tensor_splitting
The Tensor Splitting optimization pass can take several minutes, depending on the size of the model and its layers.
Enabling the Bias Correction Quantization
The LEIP SDK supports a quantization technique called Bias Correction. Generally, quantization introduces a biased error in the output activations. Bias Correction will calibrate the model and adjust the biases to reduce this error. In some cases, this optimization will significantly improve the model’s performance.
To try out Bias Correction, add --compress_optimization bias_correction to the leip optimize command, as shown in the following example:
# Create a representative dataset file, in this case with 50 example items
head -50 workspace/datasets/open-images-10-classes/eval/dataset_schema.txt > rep_dataset50.txt
# Convert to an 8-bit model
leip optimize \
--input_path workspace/models/inceptionv3/keras-open-images-10-classes \
--output_path output_int8 \
--rep_dataset rep_dataset50.txt \
--preprocessor imagenet \
--compress_optimization bias_correction
The Bias Correction optimization pass can take several minutes, depending on the size of the model and its layers.
The tensor_splitting and bias_correction optimizations can be cascaded together by specifying --compress_optimization tensor_splitting,bias_correction.
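For illustration, a cascaded run might look like the sketch below. It reuses the paths, preprocessor, and 50-item representative dataset from the examples in this section. The guard around the command and the COMPRESS_OPT and DATASET_SCHEMA variable names are assumptions added for this sketch and are not part of the LEIP SDK.

```shell
# Sketch only: assumes the workspace layout used in the examples in this section.
# DATASET_SCHEMA and COMPRESS_OPT are illustrative variable names, not LEIP settings.
DATASET_SCHEMA=workspace/datasets/open-images-10-classes/eval/dataset_schema.txt

# Both optimizations, comma-separated, written as in the text above
COMPRESS_OPT="tensor_splitting,bias_correction"

# Only run when the leip CLI and the dataset schema file are available
if command -v leip >/dev/null 2>&1 && [ -f "$DATASET_SCHEMA" ]; then
  # Create a representative dataset file with 50 example items
  head -50 "$DATASET_SCHEMA" > rep_dataset50.txt

  # Optimize to an 8-bit model with both optimizations enabled
  leip optimize \
    --input_path workspace/models/inceptionv3/keras-open-images-10-classes \
    --output_path output_int8 \
    --rep_dataset rep_dataset50.txt \
    --preprocessor imagenet \
    --compress_optimization "$COMPRESS_OPT"
else
  echo "Skipping: leip or $DATASET_SCHEMA not found" >&2
fi
```

Because each optimization pass can take several minutes on its own, a cascaded run should be expected to take at least as long as the two passes run separately.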