LEIP Train
The leip train tool automates the process of performing Quantization Guided Training (QGT) on a model. It is intended for first-time to intermediate users of the LEIP SDK. More advanced users might want details on using the Python API for QGT directly; refer to this document for information on the Python API.
The leip train tool accepts an input model and a configuration JSON file, generates a training script, and automatically runs it. If you would prefer to edit the script before running it manually, you can supply the --invoke_training false option. You can specify the training dataset and control the training lambda parameters using the configuration JSON file. The specifics of these fields are detailed in the sections below.
Please note that leip train currently supports the Keras .h5 file format.
The leip train tool fits into the LEIP toolchain as shown in the following diagram:
CLI Usage
The basic command is:
leip train --input_path path/to/model/ \
--output_path output/ \
--training_config_file path/to/config_file \
--training_script_path destination_path_for_script \
--invoke_training [true|false]
For a detailed explanation of each option, refer to the CLI Reference for LEIP Train.
API Usage
Refer to the following for examples on QGT using the Python API:
Config JSON File Fields
The training_config_file command line argument is optional. When it is supplied, it is a path to a JSON document containing any of the following fields. Default values will be used for any fields that are not present in the JSON document.
bits - (int) number of bits to quantize to - defaults to 8
batchsize - (int) batch size used during training - defaults to 256
epochs - (int) number of epochs used during training - defaults to 10
epochs_pre_quantization - (int) the number of pre-quantization epochs to train on the model
quantizer - (string) quantization algorithm to use - defaults to "asymmetric"
dataset - (string or JSON) - please see the section below
regularizer_attachment_scheme - (JSON) - please see the section below
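For reference, a minimal training_config_file combining these fields might look like the following sketch (the values shown are illustrative, not recommendations):
{
    "bits": 8,
    "batchsize": 256,
    "epochs": 10,
    "quantizer": "asymmetric",
    "dataset": "mnist"
}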
The Step Option
The --step option accepts three possible values:
train - (default) this will execute the full training pipeline in the training script
list - this will list the layer names and regularizers to stdout. This can be useful when creating the regularizer_attachment_scheme (see below)
dump_model_json - this will dump the entire model to the screen as JSON
Example Usage
The first step is to create and save the LeNet model we are going to train with the MNIST dataset. Add the following Python code to a file named create_and_save_lenet_model.py.
import os
import sys

import tensorflow as tf


def construct_model():
    tf.keras.backend.clear_session()
    model = tf.keras.Sequential(name="LeNet-5")
    model.add(tf.keras.layers.InputLayer(input_shape=(28, 28)))
    model.add(tf.keras.layers.Reshape(target_shape=(28, 28, 1)))
    model.add(tf.keras.layers.Conv2D(filters=6, kernel_size=(3, 3), activation='relu'))
    model.add(tf.keras.layers.AveragePooling2D())
    model.add(tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu'))
    model.add(tf.keras.layers.AveragePooling2D())
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(units=120, activation='relu'))
    model.add(tf.keras.layers.Dense(units=84, activation='relu'))
    model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
    model.summary()
    return model


def save_model(model, path):
    model.save(
        path,
        overwrite=True,
        include_optimizer=False,
        save_format="h5")


# The destination path for the .h5 file is passed as the first command line argument.
save_model(construct_model(), sys.argv[1])
Run the script you just created. The model will be saved to the lenet_model directory.
# create the destination directory
mkdir lenet_model
# run the script we created above
python3 create_and_save_lenet_model.py ./lenet_model/lenet.h5
Now, create a config file in your current directory and name it my_config.json.
{
    "epochs_before_attach": 0,
    "epochs": 2,
    "bits": 5,
    "batchsize": 256
}
Next, using the leip zoo command, download the MNIST dataset to your local workspace.
leip zoo download --dataset_id mnist --variant_id eval
Then run leip train pointing to the model we just created and the config file, using the list step.
leip train --input_path ./lenet_model/lenet.h5 \
--training_config_file my_config.json \
--output_path ./outputTrain \
--step list
This will print out something similar to the following:
(The output has been truncated in this example.)
INFO:root:{
    "conv2d_1": {
        "class_name": "Conv2D",
        "kernel_regularizer": null,
        "bias_regularizer": null,
        "kernel_constraint": null,
        "bias_constraint": null
    },
    "dense": {
        "class_name": "Dense",
        "kernel_regularizer": null,
        "bias_regularizer": null,
        "kernel_constraint": null,
        "bias_constraint": null
    }
    ...
}
We can then edit our my_config.json config file, using the layer names displayed, to add the regularizer_attachment_scheme field. The details on how this is done are explained in the section below.
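As an illustrative sketch only (the bit width and lambda values here are arbitrary, not recommendations), an updated my_config.json that attaches regularizers to the dense layers listed above might look like this:
{
    "epochs_before_attach": 0,
    "epochs": 2,
    "bits": 5,
    "batchsize": 256,
    "regularizer_attachment_scheme": [
        {
            "pattern": "dense*",
            "regularizers": {
                "kernel_regularizer": {
                    "class_name": "QuantizationGuidedRegularizer",
                    "config": {
                        "num_bits": 2,
                        "lambda_1": 1.0,
                        "lambda_2": 1.0,
                        "lambda_3": 0.0,
                        "lambda_4": 1.0,
                        "lambda_5": 1.0,
                        "quantizer_name": "asymmetric"
                    }
                },
                "bias_regularizer": {
                    "class_name": "QuantizationGuidedRegularizer",
                    "config": {
                        "num_bits": 2,
                        "lambda_1": 1.0,
                        "lambda_2": 1.0,
                        "lambda_3": 0.0,
                        "lambda_4": 1.0,
                        "lambda_5": 1.0,
                        "quantizer_name": "asymmetric"
                    }
                },
                "kernel_constraint": null,
                "bias_constraint": null
            }
        }
    ]
}
See The Regularizer Attachment Scheme Field section below for the full syntax, including how to add a catch-all "*" pattern entry for the remaining layers.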
We can then run the training using the following command:
leip train --input_path ./lenet_model/lenet.h5 \
--training_config_file my_config.json \
--output_path ./outputTrain \
--step train
Note that --step train is the default and does not need to be explicitly specified.
Finally, we can display the entire JSON representation of the trained model with:
leip train --input_path ./outputTrain/trained/leip_trained_model.h5 \
--training_config_file my_config.json \
--output_path ./outputTrain \
--step dump_model_json
This will show the regularizers that were attached to the layers. Refer to LEIP Compile for information on how to compile the newly trained model.
The Dataset Field
The dataset field's value can be one of two types. When a string is supplied, it is the name of a prepackaged dataset from the TensorFlow library, looked up in the tf.keras.datasets Python module.
"dataset": "mnist"
When the value is a JSON subdocument, it is used to specify a custom dataset supplied by the user and will have the following fields:
path - (string) a path to the index.txt file of the dataset. Please see the LEIP Evaluate document for the format of the index.txt file
preprocessor - (string) (optional) the preprocessor that should be applied to the input
size - (int) (optional) number of items in the dataset that should be used during training/validation
feeder - (string) (optional) which FileFeeder class to use
Here is an example dataset field with a custom dataset value:
"dataset": {
"path": "mnist_dataset/index.txt",
"preprocessor": None
"size": 70000
}
An example case of when to override the feeder field is a detection model dataset that has both images and XML-based annotations (such as the pascal-voc2007 dataset from the LEIP Model Zoo). You would add the field for the appropriate class, such as: "feeder": "PascalVOCFeeder".
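For instance, a dataset value for such a detection dataset might look like the following sketch (the index.txt path here is a hypothetical placeholder):
"dataset": {
    "path": "pascal-voc2007/index.txt",
    "feeder": "PascalVOCFeeder"
}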
The Regularizer Attachment Scheme Field
This section describes the syntax for the regularizer_attachment_scheme field of the JSON configuration. It tells the QGT system how to attach regularizer JSON to the model description such that the regularizers guide the training towards the appropriate quantized weights.
The value of this field is a JSON array where each item in the array is a JSON subdocument with two fields: pattern and regularizers.
The pattern field is a simplified regular expression (regex) that matches a set of layer names in the model. For example, if you would like to match any layer beginning with the prefix dense, you would use dense*.
The regularizers field's value is a JSON subdocument with four fields:
kernel_regularizer
bias_regularizer
kernel_constraint
bias_constraint
The value of each of these fields is a JSON subdocument with implementation specific fields for that regularizer/constraint class.
To tie this all together, the following JSON snippet shows a complete example regularizer_attachment_scheme field with two entries, one with a pattern of dense* and one with a pattern of *. Note that for each layer in the model, the entries are checked for a match in order and the first matching entry will have its regularizers value attached to the layer. In this example, we have made any layers whose name starts with dense quantize to two bits and all other layers to eight bits.
"regularizer_attachment_scheme": [
{
"pattern": "dense*",
"regularizers": {
"kernel_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 2,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"bias_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 2,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"kernel_constraint": null,
"bias_constraint": null
}
},
{
"pattern": "*",
"regularizers": {
"kernel_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 8,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"bias_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 8,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"kernel_constraint": null,
"bias_constraint": null
}
}
]
Please note that when this field is left blank, the training script will implicitly attach a regularization scheme to all quantizable layers, which is equivalent to the following:
"regularizer_attachment_scheme": [
{
"pattern": "*",
"regularizers": {
"kernel_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 8,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"bias_regularizer": {
"class_name": "QuantizationGuidedRegularizer",
"config": {
"num_bits": 8,
"lambda_1": 1.0,
"lambda_2": 1.0,
"lambda_3": 0.0,
"lambda_4": 1.0,
"lambda_5": 1.0,
"quantizer_name": "asymmetric"
}
},
"kernel_constraint": null,
"bias_constraint": null
}
}
]
In addition to accepting a JSON array, the regularizer_attachment_scheme field can accept a string, which is interpreted as the path to a separate file where the JSON array is stored. This allows you to keep the actual mappings of layer name regular expressions to regularizer JSON documents in separate files for easier A/B testing.
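For example, the config could reference a hypothetical scheme file (the file itself would then contain a JSON array like the ones shown above):
"regularizer_attachment_scheme": "schemes/dense_two_bit.json"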