
Detector Recipe Step Four: Deploying Your Model

Example Code

A Git repository containing Python and C++ example applications is provided to help you get started deploying the models you have created with LEIP recipes. The repository also includes instructions for running the examples and for installing any required dependencies.

Collecting the Model

To deploy your model, you will need the LEIP Runtime Environment objects created by the pipeline build process in Step Two. These objects are stored in the directory you passed with the --output_path flag when you ran the leip pipeline command.

As an example, consider a YOLOv5 Large model built for an NVIDIA AGX device. You might have provided the following output path:

--output_path /latentai/workspace/output/yolov5l/aarch64_cuda

If you used the default pipeline_aarch64_cuda.yaml build recipe, that directory will now contain the following artifacts:

# The following directory will change based on your earlier pipeline command
ls /latentai/workspace/output/yolov5l/aarch64_cuda

Float32-compile  Float32-package  Int8-optimize  Int8-package  results.json

If you are using a Python application with your model, you will want the packaged latentai.lre object in either the Float32-package or the Int8-package directory, depending on whether you are building around the compiled Float32 or the optimized Int8 version of the model. The packaged version of the model includes a number of Python dependencies needed to run it.

If you are using a C++ application with your model, you will want the compiled model file found in the Float32-compile or Int8-optimize directory.
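The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the LEIP tooling: the helper name find_packaged_lre is hypothetical, and the sketch assumes the directory names match the listing shown for the default recipe. Because the exact nesting inside the package directories is not specified here, it searches them recursively.

```python
from pathlib import Path

# Assumption: these directory names match the listing produced by the
# default build recipe; adjust them if your recipe differs.
PACKAGE_DIRS = {"float32": "Float32-package", "int8": "Int8-package"}


def find_packaged_lre(output_path, precision="float32"):
    """Return the path to latentai.lre for the given precision, or None."""
    package_dir = Path(output_path) / PACKAGE_DIRS[precision]
    if not package_dir.is_dir():
        return None
    # Search recursively, since the layout inside the package directory
    # is not assumed here.
    matches = sorted(package_dir.rglob("latentai.lre"))
    return matches[0] if matches else None


if __name__ == "__main__":
    # This path will change based on your earlier pipeline command.
    output_path = "/latentai/workspace/output/yolov5l/aarch64_cuda"
    for precision in PACKAGE_DIRS:
        print(precision, "->", find_packaged_lre(output_path, precision))
```

A result of None for a given precision means the corresponding package directory is missing or does not contain a latentai.lre object, which usually indicates that the build step for that precision was not run.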

Testing Your Models on the Device

Perform the following steps to test your models on the target device:

  1. Copy the latentai.lre object or the compiled model files to the device you want to test. You may skip this step if you are testing within the SDK docker container.

  2. Clone the example application Git repository.

  3. Follow the instructions to install the necessary dependencies for your given device.

  4. Modify the provided scripts to match your device architecture and installed model location.

  5. Run the test code using the provided instructions and scripts.
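As an illustration of the copy in step 1, the following sketch builds an scp command for transferring a package directory to the device. The helper name build_copy_command, the user name, the host name, and the remote directory are all placeholders chosen for this example, and it assumes the device is reachable over SSH. The command is only printed; uncomment the last line to run it.

```python
import subprocess


def build_copy_command(artifact, user, host, remote_dir):
    """Build an scp command that copies a file or directory to the device."""
    # -r lets the same command handle a single file or a whole directory.
    return ["scp", "-r", str(artifact), f"{user}@{host}:{remote_dir}"]


if __name__ == "__main__":
    # All values below are placeholders; substitute your own.
    cmd = build_copy_command(
        "/latentai/workspace/output/yolov5l/aarch64_cuda/Float32-package",
        user="user",
        host="device.local",
        remote_dir="/home/user/models",
    )
    print(" ".join(cmd))
    # Uncomment to perform the copy (requires SSH access to the device).
    # subprocess.run(cmd, check=True)
```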

Evaluating for Speed

The C++ example applications are the best way to evaluate the timing of your model on the device. Note that some of the pre- and post-processing code in the example applications has been written to be general enough to run on different devices. You may get faster total inference times by replacing it with optimized libraries available on your platform.
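To see whether pre- and post-processing are worth optimizing, it helps to time each stage separately rather than only the end-to-end total. The sketch below shows one way to do that in Python. It does not use the LEIP runtime API: the helper name time_stages is hypothetical and the three stages are placeholder functions standing in for your own code. It measures wall-clock time on the host, so if your inference call returns before the accelerator has finished, you would need to synchronize inside that stage for the numbers to be meaningful.

```python
import time
from statistics import median


def time_stages(stages, sample, warmup=3, runs=20):
    """Return the median wall-clock time in milliseconds for each stage.

    `stages` is an ordered list of (name, callable) pairs; each callable
    receives the output of the previous stage.
    """
    timings = {name: [] for name, _ in stages}
    for i in range(warmup + runs):
        data = sample
        for name, fn in stages:
            start = time.perf_counter()
            data = fn(data)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            if i >= warmup:  # discard warm-up iterations
                timings[name].append(elapsed_ms)
    return {name: median(values) for name, values in timings.items()}


if __name__ == "__main__":
    # Placeholder stages; replace them with your real pre-processing,
    # inference, and post-processing functions.
    stages = [
        ("preprocess", lambda x: [v / 255.0 for v in x]),
        ("inference", lambda x: [v * 2.0 for v in x]),
        ("postprocess", lambda x: max(x)),
    ]
    for name, ms in time_stages(stages, list(range(10000))).items():
        print(f"{name}: {ms:.3f} ms")
```

The median is used instead of the mean so that an occasional slow iteration does not dominate the result, and the warm-up iterations are discarded because the first few calls often include one-time initialization costs.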

Next Steps

The LRE binaries can now be integrated into your own application. Please get in touch if you need additional help integrating the recipe models into your application.

Please refer to the instructions for Bring Your Own Data if you would like to train the recipe models with your own dataset.
