
Advanced Features

Compress Optimizations

The compress phase of LEIP Optimize supports two optional optimizations: Tensor Splitting and Bias Correction.

Enabling the Tensor Splitting Quantization

The LEIP SDK supports a quantization technique called Tensor Splitting. It decomposes tensors into sub-tensors so that each sub-tensor can use its own, better-suited compression ratio.

The algorithm uses a predefined heuristic to determine automatically which layers' tensors should be split.

To try this optimization, add --compress_optimization tensor_splitting to the leip optimize command, as in the following example:

CODE
# Create a representative dataset file, in this case with one example item
echo workspace/datasets/open-images-10-classes/eval/Apple/06e47f3aa0036947.jpg > rep_dataset.txt

# Optimize to an 8-bit model
leip optimize \
  --input_path workspace/models/inceptionv3/keras-open-images-10-classes \
  --output_path output_int8 \
  --rep_dataset rep_dataset.txt \
  --preprocessor imagenet \
  --compress_optimization tensor_splitting

The Tensor Splitting optimization pass can take several minutes, depending on the size of the model and its layers.

Enabling the Bias Correction Quantization

The LEIP SDK supports a quantization technique called Bias Correction. Quantization generally introduces a biased error in the output activations. Bias Correction calibrates the model and adjusts the biases to reduce this error. In some cases, this optimization can significantly improve the quantized model’s performance.
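
To make the idea concrete, here is one common formulation of bias correction from the quantization literature. It is a sketch under stated assumptions and not necessarily what LEIP implements. The symbols W, \widetilde{W}, b, x, and \epsilon are introduced here for illustration and do not come from the LEIP documentation.

```latex
% A linear layer before and after its weights are quantized
y = W x + b, \qquad \widetilde{y} = \widetilde{W} x + b, \qquad \epsilon = \widetilde{W} - W

% The expected output error is generally non-zero, i.e. the error is biased
\mathbb{E}[\widetilde{y} - y] = \epsilon \, \mathbb{E}[x]

% Subtracting that expectation from the bias removes the systematic shift
b_{\text{corrected}} = b - \epsilon \, \mathbb{E}[x]
```

In this formulation, \mathbb{E}[x] is estimated from calibration inputs such as a representative dataset.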

To try Bias Correction, add --compress_optimization bias_correction to the leip optimize command, as shown in the following example:

CODE
# Create a representative dataset file, in this case with 50 example items
head -50 workspace/datasets/open-images-10-classes/eval/dataset_schema.txt > rep_dataset50.txt

# Convert to an 8-bit model
leip optimize \
  --input_path workspace/models/inceptionv3/keras-open-images-10-classes \
  --output_path output_int8 \
  --rep_dataset rep_dataset50.txt \
  --preprocessor imagenet \
  --compress_optimization bias_correction

The Bias Correction optimization pass can take several minutes, depending on the size of the model and its layers.

The tensor_splitting and bias_correction optimizations can be cascaded by specifying --compress_optimization tensor_splitting,bias_correction.
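
A cascaded run might look like the following sketch, which reuses the paths and flags from the two examples above. The variable names (MODEL_PATH, SCHEMA, COMPRESS_OPTS) and the guard around the command are illustrative additions rather than part of the LEIP SDK. The guard only keeps the script from failing on a machine where leip or the example dataset is missing.

```shell
#!/bin/sh
# Paths copied from the examples above; adjust them to your workspace.
MODEL_PATH=workspace/models/inceptionv3/keras-open-images-10-classes
SCHEMA=workspace/datasets/open-images-10-classes/eval/dataset_schema.txt

# Comma-separated value, exactly as given in the text above.
COMPRESS_OPTS=tensor_splitting,bias_correction

if command -v leip >/dev/null 2>&1 && [ -f "$SCHEMA" ]; then
  # Create a representative dataset file with 50 example items
  head -n 50 "$SCHEMA" > rep_dataset50.txt

  # Optimize to an 8-bit model with both optimizations enabled
  leip optimize \
    --input_path "$MODEL_PATH" \
    --output_path output_int8 \
    --rep_dataset rep_dataset50.txt \
    --preprocessor imagenet \
    --compress_optimization "$COMPRESS_OPTS"
else
  # Nothing is run when the tool or the example data is unavailable
  echo "leip or the example dataset was not found; nothing was run" >&2
fi
```

Since each pass can take several minutes on its own, expect the cascaded run to take longer than either single-optimization example.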
