Classifier Recipes: Spicing Up Your Recipe

There are several possible steps you can take now that you have completed an end-to-end workflow for training, exporting, optimizing, compiling, packaging, and evaluating the timm:gernet_m classifier recipe. These steps are explained in the following sections.

Explore Different Classifier Backbones

One key benefit of adopting LEIP recipes is that you can choose from a selection of models that are guaranteed to compile and optimize for a variety of target devices. You can try out different models on your hardware to see which ones best fit your speed and size requirements, and then train a few of the selected models to see which one gives you the best accuracy. If you have a heterogeneous hardware environment, you can select the best model for your needs knowing that it is easy to compile and optimize that model across many different architectures. Our initial classifier recipe release includes 22 backbones that work across 4 flavors of hardware, providing more than 88 classifier recipe variants.

To try out another model, return to Step One and repeat the steps, replacing timm:gernet_m with another backbone. Please note that while training, Open Images will give you a good sense of a model's speed; however, the small size of the dataset does not make it a good measure of accuracy. Our list of backbones is arranged from fastest at the top to slowest at the bottom. Accuracy will generally increase as you go down the list, but be mindful that accuracy results will vary by dataset.
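For example, assuming you have already set up your dataset as described in Step One, a minimal training run with a different backbone would look like the following sketch (here using timm:efficientnet_em, one of the released backbones used in the examples later on this page; substitute any backbone from the released list):

CODE
af --config-name classifier-recipe \
  model.module.backbone=timm:efficientnet_em \
  command=train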

Bring Your Own Data

You only need to set up your dataset once to evaluate any of the models or sweep against all of them. We support a simple, straightforward ImageFolder format for the classifier recipes. We also provide instructions for preparing your data to train and evaluate with LEIP recipes.
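As a rough sketch of the ImageFolder convention, images are grouped into one subdirectory per class. The directory, split, and class names below are placeholders; refer to the data preparation instructions for the exact layout and options the recipes expect:

CODE
my_dataset/
  train/
    class_a/
      image_0001.jpg
      image_0002.jpg
    class_b/
      image_0003.jpg
  val/
    class_a/
      image_0004.jpg
    class_b/
      image_0005.jpg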

Tweak the Machine Learning Recipe

The machine learning configuration for a recipe is meant to be a starting point. You may find that the default settings allow you to train a model accurately with your data, or you may want to adjust the learning parameters to suit your purpose. For example, you may want to experiment with different schedulers or learning rates to improve the accuracy of your model, or you may wish to change settings to trade accuracy for faster training. We provide a quick example here that speeds up training of the classifier recipe. You may find this useful if you want to evaluate a number of backbones for speed on your target hardware before turning your focus to getting the most accuracy out of the chosen model. Refer to the Advanced AF Options documentation if you want to optimize for accuracy.

We will now alter the classifier-recipe configuration by overriding some default parameters. We will adjust the following parameters for faster training:

PyTorch Lightning Parameter    classifier-recipe    classifier-fast
gradient_clip_val              0.5                  0.1
gradient_clip_algorithm        default (norm)       value
max_epochs                     30                   4

We override these values to create a fast training run by passing the following settings as command line arguments to af:

The default classifier-recipe does not set trainer.gradient_clip_algorithm, so we need to prepend a + to add the parameter. In most cases, you will be overriding settings, so you should not use +.

CODE
af --config-name classifier-recipe \
  model.module.backbone=timm:efficientnet_em \
  trainer.gradient_clip_val=0.1 \
  +trainer.gradient_clip_algorithm="value" \
  trainer.max_epochs=4 \
  command=train

For your convenience, we have provided a second configuration called classifier-fast for fast training using the above settings. You can perform the same training as above by using this configuration:

CODE
af --config-name classifier-fast \
  model.module.backbone=timm:efficientnet_em \
  command=train

The important takeaway is that the provided recipes are a starting point, and advanced users can modify the recipes to find parameters that better meet their requirements, be it higher accuracy or faster training times. Refer to the Advanced AF Options documentation for more information on modifying recipes.

Tweak the Build Recipe

Quantization Options

We provide default build recipes for common ARM and x86 targets, both with and without Nvidia GPU support. The GPU pipelines target both Float32 and Int8 with per-channel quantization. The default CPU pipelines provide Float32, but the Int8 default is per-tensor quantization. The per-tensor default allows for faster optimization and is supported across all of the provided Classifier Recipes. For some models, however, per-channel quantization may significantly improve Int8 accuracy on the CPU. For example, you can add the following to /latentai/recipes/classifier-recipe/pipeline_x86_64.yaml if you want to try symmetric per-channel quantization targeting x86:

CODE
  - name: Int8pc
    model:
      path: "$input_path"
      input_shapes: [ [ 1, 3, 224, 224 ] ]
      preprocessor: imagenet_torch_nchw
      postprocessor: top1
    optimize:
      compress:
        rep_dataset: /latentai/recipes/classifiers/rep_dataset.txt
        quantize_input: false
        quantize_output: false
        quantizer: symmetricpc   # symmetric per-channel quantization
      compile:
        target: llvm
  - name: Int8pc
    model:
      path: $TASK_OUTPUT{Int8pc:optimize}   # consumes the output of the optimize task above
    package:
      format: python3.8

Note the name Int8pc in the above example. By adding this to the pipeline with a different name, the optimizer will provide the per-tensor optimized output in the Int8 subdirectory with the per-channel optimized output in the Int8pc subdirectory. This allows you to easily compare the results. Simply replace Int8 with Int8pc in the leip evaluate instructions detailed in Step Three to evaluate the resulting per-channel output. Refer to the SDK documentation for LEIP Optimize for more information on these options.

Target Hardware Optimizations

Some provided build recipes are optimized for certain hardware targets. You may need to change the compiler flags if you are targeting alternative hardware. In some cases, incorrect compiler flags will cause suboptimal performance. In other cases, incorrect compiler flags will prevent the target system from running the compiled models. The default build recipe targets are listed below:

Build Recipe                   CPU                            GPU
pipeline_x86_64.yaml           Intel Skylake processor        None
pipeline_x86_64_cuda.yaml      Intel Skylake processor        Generic CUDA (no sm_xx flag)
pipeline_aarch64.yaml          ARM7 Cortex (Raspberry Pi)     None
pipeline_aarch64_cuda.yaml     ARM8 Carmel (Xavier AGX/NX)    Volta (-arch=sm_72)

These default pipeline configuration files can be updated to match your hardware target by modifying the target: or target_host: fields. Refer to the LEIP SDK documentation for more information about the compiler settings and the LEIP Pipeline configuration files.
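For example, retargeting the CPU compile step in pipeline_x86_64.yaml might look like the sketch below. The -mcpu value shown (cascadelake) is a hypothetical substitution, and the exact target string syntax supported by your SDK version should be confirmed in the LEIP SDK compiler documentation:

CODE
      compile:
        # Hypothetical retarget: replace the default Skylake-tuned CPU target
        # with a different LLVM -mcpu value (verify supported target strings
        # in the LEIP SDK compiler documentation)
        target: "llvm -mcpu=cascadelake"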

Add Additional Pipeline Steps

Additional tasks can be added to the provided build recipes to further automate your build process. For example, you may wish to add the leip evaluate step to the pipeline file so that a single leip pipeline command completes the optimize, compile, package, and evaluate steps. Refer to the LEIP Pipeline documentation for more information on building your own build recipe.
