Evaluation of the model on the target device verifies that the model's accuracy matches the accuracy measured on the host and provides inference speed measurements. For a GPU-capable target such as the AGX, it also uses the LEIP runtime capabilities to extract the GPU portion of the graph to create a target-specific compute engine.
The first step in evaluating on the target is ensuring that the target environment is established with the necessary components. For the AGX target, this includes copying over the TRT and LEIP Optimized TVM runtime libraries, as well as building and installing various dependencies.
Setting Up The Runtime Environment on the Target Device
You should start with a clean environment on the target device. If you have previously installed any packages, you should wipe the system and start with a fresh JetPack installation. You can find instructions for installing JetPack in the Jetson AGX Developer Manual.
The dependencies needed to evaluate your model are provided inside the SDK docker container. These files were copied to the host machine as part of Step 2. Install these dependencies by copying them from the host to the target.
On the host:
# Copy the dependency file from your host to your target
scp ./agx-install.tar.gz <username>@<target>:~/
On the target:
Note: For this next step, you will need sudo access on your target device. The installation will take approximately 30 minutes, and the process will occasionally prompt for password authentication.
# Ensure you are in the correct directory
cd ~/
# Uncompress the archive
tar xvzf agx-install.tar.gz
# Install the dependencies
cd agx-install
sh ./install.sh
# IMPORTANT: To ensure the correct environment, log out of the target
# and log back in before proceeding.
logout
Installing the Model on the Target
The commands above copy and install the dependencies needed to set up the evaluation environment. That installation only needs to be completed the first time you prepare to run an evaluation on the target. The model and its associated files must also be copied to the target machine in order to run the evaluation. These files include the optimized model binary and the preprocessor, and they must be copied over anytime you retrain, export, compile, and optimize your model. In the previous step, these files were copied to the host in the archive agx-artifacts.tar.gz. You will now install the model on the target device.
On the host:
# Copy the model artifacts from the host to the target
scp ./agx-artifacts.tar.gz <username>@<target>:~/
On the target:
# Unpack the model artifacts
cd ~/
tar xvzf agx-artifacts.tar.gz
You should now see three folders. The evaluation folder contains the script that you can use to evaluate the model. The inference folder contains the inference scripts that you can use as an example when you are ready to integrate the model into your application. The evaluation of the model is covered in the next section. For more information on the inference scripts, see Deploying Your Model.
Installing the Data on the Target
The pre-trained model has been trained on the MS COCO dataset. To evaluate the model, data from this dataset must be installed on the target device:
# Install MS COCO data for validation
cd ~/evaluation
sh ./download_mscoco.sh
If you have trained your model with a different dataset, instructions for installing different validation data on the target are covered in the section on evaluating and deploying your model with BYOD.
Configuring the Target
The inference speed results will depend on the clock frequency and power management settings of the target. The results we have published for the Xavier AGX were obtained with the MAXN setting, achieved by the following commands:
# Set the AGX to max performance
sudo nvpmodel -m 0
# Verify settings
sudo nvpmodel -q
Evaluating Your Model
The evaluation script has been installed on the target in the evaluation folder. Run the following script to perform the evaluation:
On the target:
# Run the evaluation
cd ~/evaluation
sh ./run_eval.sh
# When the evaluation completes, results will be in yolov5_L_RT-results.txt
cat ~/evaluation/yolov5_L_RT-results.txt
The TRT Compute Engine generates optimized binaries the first time you run a new model, which causes the model to load slowly on its first run. The system caches these binaries so that subsequent loads are much faster. Separate cache directories are provided in this example to ensure that the FP32 and INT8 files do not conflict. Be sure to delete the old model's cache directories whenever you install a new model; this ensures the TRT Compute Engine correctly generates new optimized binaries.
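If you prefer to clear the caches by hand, a minimal sketch might look like the following. The cache directory names here are placeholders, not the actual paths created by the evaluation scripts; substitute the directories your setup creates.

```shell
# Placeholder cache locations -- adjust to the FP32 and INT8 cache
# directories your evaluation setup actually creates.
CACHE_DIRS="$HOME/evaluation/cache_fp32 $HOME/evaluation/cache_int8"
for d in $CACHE_DIRS; do
  rm -rf "$d"            # safe even if the directory does not exist
done
echo "TRT engine caches cleared"
```

After clearing, the next run of the evaluation will rebuild the optimized binaries, so expect the first load to be slow again.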
Interpreting the Results
The inference script will report the average inference time and the average mAP accuracy score for the model. Note that the accuracy on the device should match the accuracy measured on the host in Step 1.
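As a quick sanity check, you could compare the host and target mAP values within a small tolerance. This is a sketch only: the variable names, example values, and tolerance below are assumptions for illustration, not part of the evaluation scripts.

```shell
# Example values only; substitute the mAP numbers from your own runs.
host_map=0.652     # accuracy measured on the host in Step 1 (example)
target_map=0.651   # accuracy reported on the target (example)
tolerance=0.005    # acceptable difference -- an assumption for this sketch

status=$(awk -v a="$host_map" -v b="$target_map" -v t="$tolerance" 'BEGIN {
  d = a - b
  if (d < 0) d = -d
  if (d <= t) print "match"; else print "mismatch"
}')
echo "host/target accuracy: $status"
```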
Please note that there are several ways to measure inference time. The run_eval.sh script reports inference speed as the time it takes to copy data in, run an inference, and copy the result out again. The example script provided in the next section reports only the inference time, without the memory transfers.
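The difference between the two measurements can be sketched with placeholder workloads. The function names and sleep durations below are purely illustrative stand-ins for the real transfer and inference phases:

```shell
copy_in()  { sleep 0.05; }   # stand-in for host-to-device transfer
infer()    { sleep 0.10; }   # stand-in for running one inference
copy_out() { sleep 0.05; }   # stand-in for device-to-host transfer

# End-to-end timing: transfers plus inference (the run_eval.sh approach)
t0=$(date +%s%N); copy_in; infer; copy_out; t1=$(date +%s%N)
end_to_end_ms=$(( (t1 - t0) / 1000000 ))

# Compute-only timing: the inference by itself (the deployment-example approach)
t0=$(date +%s%N); infer; t1=$(date +%s%N)
inference_ms=$(( (t1 - t0) / 1000000 ))

echo "end-to-end: ${end_to_end_ms} ms, inference-only: ${inference_ms} ms"
```

Because the end-to-end figure includes the memory transfers, it will always be larger than the compute-only figure; keep this in mind when comparing numbers from the two scripts.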
Once you have completed evaluating the model on the target, you can either integrate the model into your code for deployment, or retrain the model with your own data, export, compile, and evaluate again.