Classifier Recipe Step Four: Deploying Your Model

Installing Dependencies

You can run the following inference examples within the SDK Docker container without any additional setup. If you plan to run these inference examples on an embedded device, you will first need to set up the device with the proper software dependencies.

To run the inference examples on a Xavier AGX or NX device, please follow the instructions for Setting up the Runtime Environment on an AGX in the Detector Recipes documentation.

To run the inference examples on a Raspberry Pi, please follow the instructions for Setting up the Runtime Environment on a RPi.

Artifact Collection

Take note of the model you are using. The following examples assume the timm:gernet_m backbone used in the earlier steps. If you have changed the backbone, update the following commands accordingly.

The artifacts created in Step Two can be collected into a tar file by running the following commands in the SDK Docker container:

CODE
# Gather the model artifacts to copy to the target
cd /latentai/workspace/output/

# If you targeted aarch64_cuda, use the following script command:
sh /latentai/recipes/classifiers/create-artifacts-tar.sh timm-gernet_m/aarch64_cuda

# If you targeted aarch64, use the following script command:
sh /latentai/recipes/classifiers/create-artifacts-tar.sh timm-gernet_m/aarch64

# If you targeted x86_64_cuda, use the following script command:
sh /latentai/recipes/classifiers/create-artifacts-tar.sh timm-gernet_m/x86_64_cuda

# If you targeted x86_64, use the following script command:
sh /latentai/recipes/classifiers/create-artifacts-tar.sh timm-gernet_m/x86_64

Regardless of the architecture path you provided to the artifact collection script, the output will now be available at: /latentai/workspace/output/model-artifacts.tar.gz
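
To confirm the collection step succeeded, you can list the archive at that location:

CODE
# Verify that the collected artifacts archive exists
ls -lh /latentai/workspace/output/model-artifacts.tar.gz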

Artifacts in Target

On the Host:

CODE
# Copy the model artifacts from the host to the target
scp ./model-artifacts.tar.gz <username>@<target>:~/

On the Target:

CODE
# Create a new directory to install our model
mkdir ~/classifiers
cd ~/classifiers
mkdir timm-gernet_m
cd timm-gernet_m

# Unpack the model artifacts
mv ~/model-artifacts.tar.gz .
tar xvzf model-artifacts.tar.gz

You should now see three folders: models, images, and inference. The inference folder contains example inference scripts that you can use as a starting point when you are ready to integrate the model into your application. See the following sections for how to run the Python and C++ inference examples.
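
As a quick check, you can list the unpacked directory. The folder names are those described above; the exact contents may vary by backbone and target.

CODE
# List the unpacked model directory; you should see the three folders
# described above: models, images, and inference
ls ~/classifiers/timm-gernet_m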

Python Inference Example

The Python inference example provided is infer.py. An example bash script, inference_commands.sh, is provided in the same directory; it calls infer.py twice, once for FP32 and once for INT8.

Take special note of the environment variables that inference_commands.sh sets for you. When you develop your own code, you will need to set these variables appropriately:

TVM/TRT environment variables:

  • TVM_TENSORRT_CACHE_DIR can be optionally set to a directory where the TRT engines should be cached. If it is not set, the engines are not cached and will be recreated each time.

  • TVM_TENSORRT_USE_FP16=1 is set to run the FP16 model. If it is not set, the FP32 model is used.

  • TVM_TENSORRT_USE_INT8=1 is set to run the INT8 model. If it is not set, the FP32 model is used.
    If INT8 is enabled, the location of the calibration data used when creating the TRT engines for INT8 (a hidden .activations directory alongside the model library) must be specified as follows:

    • TRT_INT8_PATH=~/.latentai/LRE/.model/.activations/
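
For example, the following sketch shows how these variables might be exported before an INT8 run. The cache directory path is an arbitrary assumption for illustration; the INT8 calibration path is the one given above.

CODE
# Sketch only: example environment setup for an INT8 run
export TVM_TENSORRT_CACHE_DIR=~/trt-engine-cache            # optional: cache TRT engines (arbitrary path)
export TVM_TENSORRT_USE_INT8=1                              # select the INT8 model
export TRT_INT8_PATH=~/.latentai/LRE/.model/.activations/   # INT8 calibration data location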

Command line options for infer.py:

  • --lre_object is used to specify where to find the model package (latentai.lre file).

  • --input_image is used to specify where to find the input image.

  • --labels is used to specify the text file with the labels.
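
Putting the options together, a minimal invocation might look like the following sketch. The file paths shown are placeholders rather than the exact paths in your unpacked package; inference_commands.sh contains the calls actually used.

CODE
# Sketch only: example call to infer.py. Replace the placeholder paths
# with the actual locations of your model package, test image, and labels.
python3 infer.py \
  --lre_object <path-to>/latentai.lre \
  --input_image <path-to>/test-image.jpg \
  --labels <path-to>/labels.txt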

C++ Example

The C++ example is a C++ project containing a CMakeLists.txt that builds and links an example application, application.cpp.

If you are running the C++ inference example without a GPU (i.e., on a Raspberry Pi or an x86_64 device without CUDA), you will need to change kDLCUDA to kDLCPU in application.cpp. You can make this change by running the following command in the inference/cpp directory before building and running the example:

CODE
sed -i 's/kDLCUDA/kDLCPU/' application.cpp


Steps to build and run the application example:

CODE
# Go inside the inference C++ project example
cd inference/cpp

./run_inference.run

# Building and linking successfully is indicated by:
# [100%] Linking CXX executable ../bin/latentai.lre

# The inference results are then shown as:
------------------------------------------------------------ 
 Detections 
------------------------------------------------------------ 
 The image prediction result is: id 6 Name: Penguin Score: 3.37577
 ------------------------------------------------------------

As with the Python example, the following TVM/TRT environment variables are set for you in the provided bash script. When you are developing your own code, you will need to set these variables appropriately. Refer to the Python section above for more information.

  • TVM_TENSORRT_CACHE_DIR

  • TVM_TENSORRT_USE_FP16=1

  • TVM_TENSORRT_USE_INT8=1 and TRT_INT8_PATH=$INT8_MODEL/.activations/
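
As an illustration only, mirroring the Python sketch above, these variables could be exported before invoking the run script. $INT8_MODEL is the variable used in the provided bash script; the cache directory path is an arbitrary assumption.

CODE
# Sketch only: environment setup before running the C++ example with INT8
export TVM_TENSORRT_CACHE_DIR=~/trt-engine-cache   # optional engine cache (arbitrary path)
export TVM_TENSORRT_USE_INT8=1
export TRT_INT8_PATH=$INT8_MODEL/.activations/     # as in the provided script
./run_inference.run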

Refer to the README.md in the project for more information on the C++ example.

Now that you have completed each step of the timm:gernet_m recipe, you can go back and repeat these steps for one of the other supported backbones, learn about training the classifier recipes with your own data (BYOD), or learn about other options for spicing up your recipe.
