Detector Recipe Step Four: Deploying Your Model

The artifacts that you generated in Step Two and evaluated in Step Three are Latent AI Runtime Environment libraries that are ready to use in your application on the target device.

The files previously copied over to the target include inference examples that illustrate how to integrate the model into Python and C++ applications.

The examples are installed on the target under the directory that you created in Step Three. For example:

  • ~/yolov5L/inference/c++/

  • ~/yolov5L/inference/python/

Python Inference Example

The Python inference example is provided in the python directory listed above. An example bash script, provided in the same directory, calls this script three times, once each for FP32, FP16, and INT8.

Take special note of several environment variables that are set for you in the bash script. When you develop your own code, you will need to set these variables as appropriate:

TVM/TRT environment variables:

  • TVM_TENSORRT_CACHE_DIR can be optionally set to a directory where the TRT engines should be cached. If it is not set, the engines are not cached and will be recreated each time.

  • TVM_TENSORRT_USE_FP16=1 is set to run the FP16 model. If it is not set, the FP32 model is used.

  • TVM_TENSORRT_USE_INT8=1 is set to run the INT8 model. If it is not set, the FP32 model is used.
    If INT8 is set, you must also specify the location of the calibration data used when the TRT engines for INT8 are created (a hidden .activations directory alongside the model library), as follows:

    • TRT_INT8_PATH=$INT8_MODEL/.activations/
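When you set these variables from your own Python code rather than from a bash script, the rules above can be sketched as a small helper. This is only an illustration: the names trt_env, apply_trt_env, precision, and model_dir are hypothetical and are not part of the provided examples, and it assumes the variables need to be in place before the model library is loaded.

```python
import os
from typing import Dict, MutableMapping, Optional

# Variables that select a precision; cleared before each switch.
_PRECISION_VARS = ("TVM_TENSORRT_USE_FP16", "TVM_TENSORRT_USE_INT8", "TRT_INT8_PATH")


def trt_env(precision: str, model_dir: str, cache_dir: Optional[str] = None) -> Dict[str, str]:
    """Return the TVM/TRT variables for one precision (hypothetical helper)."""
    precision = precision.upper()
    if precision not in ("FP32", "FP16", "INT8"):
        raise ValueError(f"unsupported precision: {precision}")

    env: Dict[str, str] = {}
    if cache_dir is not None:
        # Optional: cache the TRT engines so they are not recreated each run.
        env["TVM_TENSORRT_CACHE_DIR"] = cache_dir
    if precision == "FP16":
        env["TVM_TENSORRT_USE_FP16"] = "1"
    elif precision == "INT8":
        env["TVM_TENSORRT_USE_INT8"] = "1"
        # Calibration data: hidden .activations directory alongside the model library.
        env["TRT_INT8_PATH"] = f"{model_dir.rstrip('/')}/.activations/"
    # FP32 sets neither flag; leaving both unset selects the FP32 model.
    return env


def apply_trt_env(env: Dict[str, str],
                  target: Optional[MutableMapping[str, str]] = None) -> None:
    """Remove stale precision flags, then apply env to target (default: os.environ)."""
    if target is None:
        target = os.environ
    for name in _PRECISION_VARS:
        target.pop(name, None)
    target.update(env)
```

Clearing the precision variables before applying new ones avoids leaving both the FP16 and INT8 flags set when a process switches from one model to another.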

Command line options for the Python inference example:

  • --model_path is used to tell the example where to find the model binary.

  • --input_path tells the example where to find the input image.

  • --labels coco.names tells the example to use COCO style naming labels.

  • --count tells the example how many iterations to run.
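If you write your own script and want it to accept the same options, a parser along these lines may be a useful starting point. It is a sketch only: the provided example's actual parser is not reproduced here, so the defaults, the required flags, and the sample paths below are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the four options listed above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Detector inference (sketch)")
    parser.add_argument("--model_path", required=True,
                        help="where to find the model binary")
    parser.add_argument("--input_path", required=True,
                        help="where to find the input image")
    parser.add_argument("--labels", default="coco.names",
                        help="label file, for example coco.names for COCO style labels")
    parser.add_argument("--count", type=int, default=1,
                        help="how many iterations to run")
    return parser


# Parse an explicit argument list; the paths here are placeholders.
args = build_parser().parse_args(
    ["--model_path", "model_dir", "--input_path", "image.jpg", "--count", "10"]
)
print(args.model_path, args.input_path, args.labels, args.count)
```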

Refer to the comments in the Python file for additional information on the inference example.

C++ Example

The C++ example provides a Makefile that builds and links an example application called cpp_deploy.

As with the Python example, several TVM/TRT environment variables are set for you in the provided bash script. When you develop your own code, you will need to set these variables appropriately. Refer to the Python section above for more information.



  • TVM_TENSORRT_USE_INT8=1 and TRT_INT8_PATH=$INT8_MODEL/.activations/

Refer to the comments in the source file for more information on the C++ example.

Next Steps

The LRE binaries for the pre-trained model can now be integrated into your application. Please contact Latent AI if you need additional help integrating the model into your application.

If you would like to train the model with your own dataset, please refer to the instructions for Bring Your Own Data.
