Classifier Recipe Step Three: Evaluating a Model on Target Hardware
Run the leip evaluate SDK command to evaluate the model you optimized, compiled, and packaged in Step Two. The command line arguments will differ slightly depending on whether you are running inference locally in the SDK Docker container or connecting to a networked device to run inference remotely.
Running leip evaluate on a GPU-enabled device will report skewed (slow) inference numbers if you do not set up a compute engine cache and pass it by environment variable. Please see the LEIP Evaluate documentation for more details. This feature is not currently supported when running leip evaluate on a remote target device.
Evaluating Within the SDK Docker Container
Pass the compiled model directly to leip evaluate along with the test path to evaluate the model entirely within the SDK container. Use the following commands to evaluate if you followed the path and naming conventions used earlier in the tutorial.
Run the following to evaluate within an x86_64 Docker container with an NVIDIA graphics card:
# Evaluating Float32:
leip evaluate \
--input_path workspace/output/timm-gernet_m/x86_64_cuda/Float32-compile \
--test_path workspace/datasets/open-images-10-classes/eval/index.txt
# Evaluating Int8:
leip evaluate \
--input_path workspace/output/timm-gernet_m/x86_64_cuda/Int8-optimize \
--test_path workspace/datasets/open-images-10-classes/eval/index.txt
Replace x86_64_cuda in the above examples with x86_64 when evaluating in an x86_64 Docker container without a GPU.
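The target substitution can be scripted so that both precisions are evaluated with one call. The following is a minimal sketch, assuming the tutorial's path and naming conventions; the evaluate_local function and the DRY_RUN variable are hypothetical helpers introduced here for illustration and are not part of the LEIP SDK. By default the function only prints the commands it would run.

```shell
# Sketch: evaluate the Float32 and Int8 models for one local target.
# evaluate_local and DRY_RUN are hypothetical helpers, not leip options.
evaluate_local() {
  eval_target="$1"   # x86_64_cuda (with GPU) or x86_64 (without GPU)
  eval_test_path="workspace/datasets/open-images-10-classes/eval/index.txt"
  for eval_variant in Float32-compile Int8-optimize; do
    eval_input_path="workspace/output/timm-gernet_m/${eval_target}/${eval_variant}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run (default): print the command instead of executing it.
      echo "leip evaluate --input_path ${eval_input_path} --test_path ${eval_test_path}"
    else
      leip evaluate --input_path "${eval_input_path}" --test_path "${eval_test_path}"
    fi
  done
}
```

Calling evaluate_local x86_64 prints the two commands for a container without a GPU; setting DRY_RUN=0 before the call executes them instead.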
If leip evaluate fails on a GPU-targeted model with a CUDA_ERROR_NO_BINARY_FOR_GPU error, the model was optimized/compiled with the wrong arch flag.
Evaluating with Remote Inference
To evaluate on a remote target device, run leip evaluate in the SDK Docker container with inference performed on the device under test. You will first need to set up your target by installing the Latent AI Object Runner (LOR). You will then evaluate using the LRE objects created by leip pipeline in Step Two. The following examples assume you followed the naming conventions and paths from earlier in the tutorial.
Perform the following for an ARM processor without a GPU:
# Substitute the IP address of your target device for <IP_ADDR>
# The default port for LOR is 50051
# Evaluating Float32:
leip evaluate \
--input_path workspace/output/timm-gernet_m/aarch64/Float32-package \
--host <IP_ADDR> --port 50051 \
--test_path workspace/datasets/open-images-10-classes/eval/index.txt
# Evaluating Int8:
leip evaluate \
--input_path workspace/output/timm-gernet_m/aarch64/Int8-package \
--host <IP_ADDR> --port 50051 \
--test_path workspace/datasets/open-images-10-classes/eval/index.txt
For other devices under test, replace aarch64 in the above examples with aarch64_cuda (ARM processor with a GPU), x86_64 (x86_64 processor without a GPU), or x86_64_cuda (x86_64 processor with a GPU).
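The mapping from device type to directory name can be written down as a small lookup. This is a sketch under the tutorial's naming conventions; target_dir is a hypothetical helper introduced here for illustration, not an SDK command.

```shell
# Sketch: map a device description to the output directory name
# used in this tutorial. target_dir is a hypothetical helper.
target_dir() {
  td_arch="$1"      # aarch64 or x86_64
  td_has_gpu="$2"   # yes or no
  case "${td_arch}:${td_has_gpu}" in
    aarch64:no)  echo "aarch64" ;;
    aarch64:yes) echo "aarch64_cuda" ;;
    x86_64:no)   echo "x86_64" ;;
    x86_64:yes)  echo "x86_64_cuda" ;;
    *) echo "unsupported combination: ${td_arch} ${td_has_gpu}" >&2; return 1 ;;
  esac
}

# Example: build the Float32 package path for an ARM device with a GPU.
TARGET="$(target_dir aarch64 yes)"
INPUT_PATH="workspace/output/timm-gernet_m/${TARGET}/Float32-package"
echo "${INPUT_PATH}"
```

The resulting INPUT_PATH can then be passed to --input_path, together with the --host, --port, and --test_path arguments shown in the examples above.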
It is also possible to test an LRE object with leip evaluate inside the Docker container by running the LOR inside the container itself. Launch the LOR within the SDK by calling python3 -m lor.lor_server.
You will need to expose the LOR port if you want to access the LOR in one container from a leip evaluate process running in another. If you use the default port, you can enable this by adding -p 50051:50051 to the docker run command. Use the IP address of the Docker container when passing the --host flag to leip evaluate.
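The two-container setup might look like the following sketch. The image name <SDK_IMAGE>, the placeholder <YOUR_EXISTING_OPTIONS>, the container name leip-lor, and the container_ip helper are all assumptions made for illustration; substitute the values from your own docker run command. The docker inspect format string is the standard way to read a container's IP address on its attached networks.

```shell
# Sketch only: <SDK_IMAGE>, <YOUR_EXISTING_OPTIONS>, and the container name
# "leip-lor" are placeholders, not names taken from the tutorial.

# 1. Start the container that will host the LOR, publishing the default port:
#      docker run <YOUR_EXISTING_OPTIONS> --name leip-lor -p 50051:50051 <SDK_IMAGE>
#
# 2. Inside that container, start the LOR:
#      python3 -m lor.lor_server

# 3. On the Docker host, look up the IP address of the LOR container.
#    container_ip is a hypothetical helper wrapping docker inspect.
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# 4. In the second container, pass that address as --host:
#      leip evaluate \
#        --input_path workspace/output/timm-gernet_m/x86_64/Float32-package \
#        --host <CONTAINER_IP> --port 50051 \
#        --test_path workspace/datasets/open-images-10-classes/eval/index.txt
```

Running container_ip leip-lor on the Docker host prints the address to substitute for <CONTAINER_IP>.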
Next Steps
Once you have completed evaluating the model on the target, you can either integrate the model into your code for deployment, or try out different models, including training with your own datasets.