Evaluation of the model on the target device verifies that the model’s accuracy matches the accuracy measured on the host and provides inference speed measurements. For a GPU-capable target such as the AGX, it also uses the LEIP runtime capabilities to extract the GPU portion of the graph and create a target-specific compute engine.

Evaluating on target vs. evaluating in the SDK container:

There are two versions of Step Three depending on where you intend to evaluate the model.

If you compiled your model in Step Two with the intent of evaluating the model on an Xavier AGX or NX device, follow the steps immediately below.

If you do not have access to an Xavier device and compiled the model with the intent of evaluating within the SDK docker container, follow the instructions at the bottom of this page.

Note that our benchmarks are based on running on an Xavier AGX as described below. Inference times will vary in the SDK docker container depending on machine configuration and load.


Evaluating on an Xavier device

The first step in evaluating on the target is ensuring that the target environment is established with the necessary components. For the AGX target, this includes copying over the TRT and LEIP Optimized TVM runtime libraries, as well as building and installing various dependencies.

Setting Up The Runtime Environment on the Target Device

You should start with a clean environment on the target device. If you have previously installed any packages, you should wipe the system and start with a fresh JetPack installation. You can find instructions for installing JetPack in the Jetson AGX Developer Manual.

The dependencies needed to evaluate your model are provided inside the SDK docker container. These files were copied to the host machine as part of Step Two. Install these dependencies by copying them from the host to the target.

On the host:

# Copy the dependency file from your host to your target
scp ./agx-install.tar.gz <username>@<target>:~/

On the target:

Note: For this next step, you will need sudo access on your target device. The installation will take approximately 30 minutes, and the process will occasionally request password authentication.

# Ensure you are in the correct directory
cd ~/

# Uncompress the archive
tar xvzf agx-install.tar.gz

# Install the dependencies
cd agx-install
sh ./install.sh

# IMPORTANT: To ensure the correct environment, log out of the target
# and log back in before proceeding.
logout

Installing the Model on the Target

The dependency installation above only needs to be completed the first time you prepare to run an evaluation on the target. The model and its associated files, however, must be copied to the target every time you retrain, export, compile, and optimize your model. These files include the optimized model binary and the preprocessor. In the previous step, they were packaged on the host in the archive model-artifacts.tar.gz. You will now install the model on the target device.

In this example, the model will be installed in a directory named yolov5_L. This will allow you to later repeat the previous steps for the medium and small models and install them in different directories so that you can compare results.

On the host:

# Copy the model artifacts from the host to the target
scp ./model-artifacts.tar.gz <username>@<target>:~/

On the target:

# Create a new directory to install our model
mkdir ~/yolov5_L
cd ~/yolov5_L

# Unpack the model artifacts
mv ~/model-artifacts.tar.gz .
tar xvzf model-artifacts.tar.gz

You should now see three folders: models, evaluation, and inference. The evaluation folder contains the script that you can use to evaluate the model. The inference folder contains example inference scripts that you can use when you are ready to integrate the model into your application. Evaluating the model is covered below; for more information on the inference scripts, see Deploying Your Model.
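As a quick sanity check, you can confirm the unpacked layout with a small helper like the one below. The helper itself is illustrative and not part of the SDK; only the three folder names come from this page.

```shell
# Illustrative helper: confirm the unpacked model directory contains
# the expected models, evaluation, and inference folders.
check_artifacts() {
  dir="$1"
  for sub in models evaluation inference; do
    if [ ! -d "$dir/$sub" ]; then
      echo "missing: $dir/$sub"
      return 1
    fi
  done
  echo "ok: $dir"
}

# Example: check_artifacts ~/yolov5_L
```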

Installing the Data on the Target

The pre-trained model has been trained on the MS COCO dataset. To evaluate the model, data from this dataset must be installed on the target device:

# Install MS COCO data for validation
cd ~/yolov5_L/evaluation
sh ./download_mscoco.sh

If you have trained your model with a different dataset, instructions on installing different validation data on the target are covered in the section on evaluating and deploying your model with BYOD.

Configuring the Target

The inference speed results will depend on the clock frequency and power management settings of the target. The results we have published for the Xavier AGX were obtained with the MAXN setting, achieved by the following commands:

# Set the AGX to max performance
sudo nvpmodel -m 0

# Verify settings
sudo nvpmodel -q

Note that while most of the instructions on this page apply equally to the Xavier NX and AGX devices, the power settings differ between the two.

Evaluating Your Model

The evaluation script has been installed on the target in the evaluation folder. Run the following script to perform the evaluation:

On the target:

# Run the evaluation
cd ~/yolov5_L/evaluation
sh ./run_eval.sh

# When the evaluation completes, results will be in yolov5_L_RT-results.txt
cat ~/yolov5_L/evaluation/yolov5_L_RT-results.txt

The TRT Compute Engine generates optimized binaries the first time you run a new model, so the initial load will be slow. The system caches these binaries so that subsequent loads are much faster. Separate cache directories are provided in this example so that the FP32 and INT8 files do not conflict. If you install a new model, delete the old model's cache directories so that the TRT Compute Engine generates fresh optimized binaries.
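A cleanup step along these lines can be used when switching models. Note that the cache directory names below (cache_fp32, cache_int8) are assumptions for illustration; check the paths your run_eval.sh actually uses before deleting anything.

```shell
# Remove cached TRT engine binaries for an old model so that new,
# correct binaries are generated on the next run.
# NOTE: cache_fp32 and cache_int8 are hypothetical names used here
# for illustration only.
clear_trt_cache() {
  model_dir="$1"
  for cache in "$model_dir/cache_fp32" "$model_dir/cache_int8"; do
    if [ -d "$cache" ]; then
      rm -rf "$cache"
      echo "removed $cache"
    fi
  done
}

# Example: clear_trt_cache ~/yolov5_L/evaluation
```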

Interpreting the Results

The inference script will report the average inference time and the average mAP accuracy score for the model. Note that the accuracy on the device should match the accuracy measured on the host in Step 1.

Please note that there are several ways to measure inference time. The run_eval.sh script reports inference speed as the time it takes to copy data in, run an inference, and copy the result out again. The example script provided in the next section reports only the inference time without the memory transfers.
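To illustrate the distinction, an end-to-end wall-clock measurement (the kind that includes data transfers, like the figure run_eval.sh reports) can be sketched in the shell as follows. This is an illustrative wrapper, not the SDK's actual timing code.

```shell
# Illustrative: measure the end-to-end wall-clock time of a command
# in milliseconds. This includes everything the command does (data
# copies in and out, inference, etc.), unlike an inference-only timer.
time_ms() {
  start=$(date +%s%N)
  "$@"
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

# Example: time_ms sh ./run_eval.sh
```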

Next Steps

Once you have completed evaluating the model on the target, you can either integrate the model into your code for deployment, or retrain the model with your own data, export, compile, and evaluate again.


Evaluating within the SDK Docker Container

Installing the Model

To evaluate the model in the SDK container, copy the model and associated files that you created in Step Two to a new directory. These files include the optimized model binary and the preprocessor, and they will need to be copied over anytime you retrain, export, compile, and optimize your model. In the previous step, these files were packaged in the archive model-artifacts.tar.gz.

In this example, the model will be installed in a directory named yolov5_L. This will allow you to later repeat the previous steps for the medium and small models and install them in different directories so that you can compare results.


# Create a new directory to install our model
mkdir ~/yolov5_L
cd ~/yolov5_L

# Unpack the model artifacts
mv /latentai/recipes/yolov5_L_RT/model-artifacts.tar.gz .
tar xvzf model-artifacts.tar.gz


You should now see three folders: models, evaluation, and inference. The evaluation folder contains the script that you can use to evaluate the model. The inference folder contains example inference scripts that you can use when you are ready to integrate the model into your application. Evaluating the model is covered below; for more information on the inference scripts, see Deploying Your Model.

Installing the Data

The pre-trained model has been trained on the MS COCO dataset. To evaluate the model, data from this dataset must be installed in the container:

# Install MS COCO data for validation
cd ~/yolov5_L/evaluation
sh ./download_mscoco.sh


If you have trained your model with a different dataset, instructions on installing different validation data are covered in the section on evaluating and deploying your model with BYOD.

Evaluating Your Model

The evaluation script has been installed in the evaluation folder. Run the following script to perform the evaluation:

# Run the evaluation
cd ~/yolov5_L/evaluation
sh ./run_eval.sh x86_64

# When the evaluation completes, results will be in yolov5_L_RT-results.txt
cat ~/yolov5_L/evaluation/yolov5_L_RT-results.txt


The TRT Compute Engine generates optimized binaries the first time you run a new model, so the initial load will be slow. The system caches these binaries so that subsequent loads are much faster. Separate cache directories are provided in this example so that the FP32 and INT8 files do not conflict. If you install a new model, delete the old model's cache directories so that the TRT Compute Engine generates fresh optimized binaries.
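As in the on-target workflow, stale caches can be removed with a small helper like this one before evaluating a different model. The cache directory names are assumptions for illustration; confirm the paths your run_eval.sh writes to before deleting anything.

```shell
# Delete stale TRT engine caches in the given directory so that new
# binaries are generated for the next model.
# NOTE: cache_fp32 and cache_int8 are hypothetical names used here
# for illustration only.
clear_stale_caches() {
  for cache in "$1/cache_fp32" "$1/cache_int8"; do
    if [ -d "$cache" ]; then
      rm -rf "$cache"
    fi
  done
}

# Example: clear_stale_caches ~/yolov5_L/evaluation
```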

Interpreting the Results

The inference script will report the average inference time and the average mAP accuracy score for the model. Note that the accuracy on the device should match the accuracy measured on the host in Step 1.

Please note that there are several ways to measure inference time. The run_eval.sh script reports inference speed as the time it takes to copy data in, run an inference, and copy the result out again. The example script provided in the next section reports only the inference time without the memory transfers.

Next Steps

Once you have completed evaluating the model on the target, you can either integrate the model into your code for deployment, or retrain the model with your own data, export, compile, and evaluate again.