
Advanced Features

Compress Optimizations

In the compress phase of LEIP Optimize you can apply two optional optimizations: Tensor Splitting and Bias Correction.

Enabling Tensor Splitting Quantization

The LEIP SDK supports a quantization technique called Tensor Splitting, in which tensors are decomposed into sub-tensors so that each sub-tensor can be compressed separately at a more optimal compression ratio.

The algorithm automatically determines which layers' tensors should be split, using a predefined heuristic.
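As a rough illustration of why splitting helps, the following NumPy sketch (not the LEIP implementation; the split point and the affine uint8 quantizer are simplifying assumptions) quantizes a weight tensor whose first few columns have a much wider value range than the rest, first with a single scale and then as two separately quantized sub-tensors:

CODE
# Minimal NumPy sketch of the idea behind Tensor Splitting (illustrative only,
# not the LEIP implementation).
import numpy as np

def quantize_uint8(x):
    """Affine-quantize x to uint8 and return the dequantized reconstruction."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(64, 128)).astype(np.float32)
w[:, :4] *= 50.0  # a few "outlier" columns with a much wider range

# One scale for the whole tensor: the outlier columns force a coarse scale
err_whole = np.abs(w - quantize_uint8(w)).mean()

# Split into sub-tensors, each quantized with its own scale
recon = np.concatenate([quantize_uint8(w[:, :4]), quantize_uint8(w[:, 4:])], axis=1)
err_split = np.abs(w - recon).mean()

print(f"mean abs error, single scale: {err_whole:.6f}")
print(f"mean abs error, split scales: {err_split:.6f}")  # noticeably smaller

In the LEIP SDK itself the split points are chosen for you by the heuristic described above; no manual splitting is required.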

To try out this optimization, simply add --compress_optimization tensor_splitting to the leip optimize command as in the following example:

CODE
# Create a representative dataset file, in this case with one example item
echo workspace/datasets/open-images-10-classes/eval/Apple/06e47f3aa0036947.jpg > rep_dataset.txt

# Optimize to an 8-bit model
leip optimize \
  --input_path workspace/models/inceptionv3/keras-open-images-10-classes \
  --output_path output_int8 \
  --data_type uint8 \
  --rep_dataset rep_dataset.txt \
  --preprocessor imagenet \
  --compress_optimization tensor_splitting

Depending on the size of the model and its layers, the Tensor Splitting optimization pass could take several minutes.

Enabling Bias Correction Quantization

The LEIP SDK supports a quantization technique called Bias Correction. Generally, quantization introduces a biased error in the output activations. Bias Correction will calibrate the model and adjust the biases to reduce this error. In some cases, this optimization will significantly improve the model’s performance.
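The following NumPy sketch illustrates the underlying idea (it is not the LEIP implementation; the layer, the symmetric 8-bit weight quantizer, and the calibration data are illustrative assumptions): the per-channel mean difference between the float and quantized outputs is measured on calibration data and folded back into the bias:

CODE
# Minimal NumPy sketch of the idea behind Bias Correction (illustrative only,
# not the LEIP implementation).
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(256, 128)).astype(np.float32)    # float weights
b = np.zeros(128, dtype=np.float32)                               # float bias
# Calibration activations with a non-zero mean (e.g. post-ReLU)
x_cal = np.maximum(rng.normal(0.0, 1.0, size=(512, 256)), 0).astype(np.float32)

def quantize_weights(w):
    """Symmetric 8-bit weight quantization, returned in dequantized form."""
    scale = np.abs(w).max() / 127.0
    return (np.round(w / scale) * scale).astype(np.float32)

w_q = quantize_weights(w)

y_float = x_cal @ w + b    # reference outputs
y_quant = x_cal @ w_q + b  # outputs with quantized weights

# Per-output-channel mean of the error introduced by quantization
bias_error = (y_quant - y_float).mean(axis=0)

# Fold the measured error back into the bias to cancel it
b_corrected = b - bias_error
y_corrected = x_cal @ w_q + b_corrected

print("mean |bias error| before correction:", np.abs((y_quant - y_float).mean(axis=0)).mean())
print("mean |bias error| after correction: ", np.abs((y_corrected - y_float).mean(axis=0)).mean())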

To try out Bias Correction, simply add --compress_optimization bias_correction to the leip optimize command as in the following example:

CODE
# Create a representative dataset file, in this case with 50 example items
head -50 workspace/datasets/open-images-10-classes/eval/index.txt > rep_dataset50.txt

# Convert to an 8-bit model
leip optimize \
  --input_path workspace/models/inceptionv3/keras-open-images-10-classes \
  --output_path output_int8 \
  --data_type uint8 \
  --rep_dataset rep_dataset50.txt \
  --preprocessor imagenet \
  --compress_optimization bias_correction

Please note that Bias Correction currently requires the legacy quantizer (--use_legacy_quantizer true).

Depending on the size of the model and its layers, the Bias Correction optimization pass could take several minutes.

The tensor_splitting and bias_correction optimizations can be cascaded by specifying --compress_optimization tensor_splitting,bias_correction.

Quantization Guided Training

In addition to Post Training Quantization, the LEIP SDK also supports Quantization Guided Training. Quantization Guided Training can be used by integrating your training code with the LEIP Python APIs directly, or by automating the process as much as possible with the leip train command.
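As a generic illustration of the idea (this PyTorch sketch is not the LEIP Python API; the fake-quantization module, model, and data are assumptions), quantization guided training inserts a simulated quantization step into the forward pass so that the training loss already reflects quantization error, while a straight-through estimator lets gradients flow through the rounding:

CODE
# Generic sketch of quantization-aware/guided training (illustrative only,
# not the LEIP Python API).
import torch
import torch.nn as nn

class FakeQuant(nn.Module):
    """Simulate 8-bit quantization in the forward pass, identity in the backward pass."""
    def forward(self, x):
        scale = x.detach().abs().max() / 127.0 + 1e-12
        x_q = torch.round(x / scale).clamp(-128, 127) * scale
        # Straight-through estimator: forward uses x_q, backward treats it as identity
        return x + (x_q - x).detach()

model = nn.Sequential(nn.Linear(32, 64), FakeQuant(), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on random data
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
opt.step()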

See the documentation for the leip train command for more information.

A tutorial on using the Quantization Guided Training Python API directly is also available in the documentation. The LEIP Zoo models include source code that demonstrates how to use Quantization Guided Training.
