LEIP Evaluate

The LEIP Evaluate tool is provided to evaluate the accuracy of a model in a consistent way across several stages of the LEIP tool chain. It allows inference and evaluation in TensorFlow, TFLite, PyTorch, and LRE runtimes.

The leip evaluate command takes as input a path to a model folder or file and a path to a testset file. The model is loaded into the specified runtime and performs inference on the input files specified in the testset. The command outputs the accuracy metric along with the number of inferences per second.

If you are evaluating a model on a device with an NVIDIA GPU, the system will build and cache the compute engine the first time you run it. This will result in significant initial startup times and will skew your inference timing results. Run leip evaluate on a device with an NVIDIA GPU once as a dry run and then a second time to generate accurate timing. You will need to do this each time you recompile the model.

In order to enable caching of the compute engine, you need to create the cache directory and provide that path as an environment variable to the command that will use the compute engine.

Our convention is to place the cache under the model path. If your model is in the optimizedModel directory, you would set the cache by first creating the directory:

BASH
mkdir optimizedModel/trt-cache

Then you would use this path when calling leip evaluate:
TVM_TENSORRT_CACHE_DIR=optimizedModel/trt-cache leip evaluate <options>
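
The dry-run workflow described above can be scripted. The following is a minimal Python sketch, not part of the LEIP tooling: cache_env and evaluate_twice are hypothetical helper names, and the only LEIP-specific details it relies on are the TVM_TENSORRT_CACHE_DIR variable and the --input_path and --test_path options used in this document.

```python
import os
import subprocess
from pathlib import Path


def cache_env(model_dir):
    """Create <model_dir>/trt-cache and return an environment that points to it."""
    cache_dir = Path(model_dir) / "trt-cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    env = dict(os.environ)
    env["TVM_TENSORRT_CACHE_DIR"] = str(cache_dir)
    return env


def evaluate_twice(model_dir, test_path):
    """Hypothetical helper: dry run to build the cache, then a second run for timing."""
    cmd = ["leip", "evaluate",
           "--input_path", str(model_dir),
           "--test_path", str(test_path)]
    env = cache_env(model_dir)
    # First run: builds and caches the compute engine; ignore its timing.
    subprocess.run(cmd, env=env, check=True)
    # Second run: reuses the cached engine, so its reported rate is the one to keep.
    result = subprocess.run(cmd, env=env, check=True,
                            capture_output=True, text=True)
    return result.stdout
```

Calling evaluate_twice("optimizedModel", "<path to testset>") would return the output of the second run only. Remember to repeat the dry run after each recompilation of the model.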

CLI Usage

The basic command is:

BASH
leip evaluate --input_path optimizedModel/ \
              --test_path workspace/datasets/open-images-10-classes/eval/dataset_schema.json

The output of the command will be:

BASH
accuracy: top1: XXXXXX%, top5: XXXXXX%, top1: rate YY.YY inferences/s (ZZ.ZZ)

# please note
The first number (Y, usually higher) represents the inference speed without including the overhead of preprocessing.
The second number inside the parentheses (Z) represents the inference speed that includes the preprocessing overhead.
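
The two rates can be turned into an approximate per-inference preprocessing cost. The sketch below is only illustrative arithmetic: it assumes both rates cover the same set of inferences and that the whole difference is preprocessing time. The function name and the sample numbers are made up for the example, not measured results.

```python
def preprocessing_overhead_ms(rate_without, rate_with):
    """Per-inference preprocessing time in milliseconds implied by the two rates.

    rate_without: inferences/s excluding preprocessing (the Y value).
    rate_with:    inferences/s including preprocessing (the Z value).
    """
    return (1.0 / rate_with - 1.0 / rate_without) * 1000.0


# Illustrative numbers only: Y = 200 inferences/s, Z = 125 inferences/s.
overhead = preprocessing_overhead_ms(rate_without=200.0, rate_with=125.0)
print(f"{overhead:.1f} ms per inference")  # prints "3.0 ms per inference"
```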

For a detailed explanation of each option, refer to the CLI Reference for LEIP Evaluate.

Batch Size

Batch sizes are supported in LEIP Evaluate. The batch size is controlled by two parameters:

  1. The first dimension of the input shapes of the model, and

  2. The --batch_size option in the leip evaluate command.

When --batch_size is NOT explicitly passed, the batch dimension of the first input’s shape is checked. If it is None, the batch size is set to 1. If it is an integer N, the batch size is set to N.

When --batch_size is explicitly passed and the batch dimension of the first input’s shape is None, the --batch_size option is used. If the batch dimension of the first input is not None and does not match the --batch_size option, an error is raised.

Here are some examples:

CODE
--input_shapes None,224,224,3                # Batch size 1 is used
--input_shapes None,224,224,3 --batch_size 8 # Batch size 8 is used, without padding
--input_shapes 8,224,224,3                   # Batch size 8 is used, with padding
--input_shapes 8,224,224,3 --batch_size 8    # Batch size 8 is used with padding
--input_shapes 2,224,224,3 --batch_size 8    # ValueError since the batch size and batch dim in input shape are explicitly defined and do not match
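
The resolution rules above can be restated as a small function. This is a sketch of the documented behavior, not the tool's actual implementation; resolve_batch_size is a hypothetical name, and the shape is passed as a Python tuple rather than parsed from the command line.

```python
def resolve_batch_size(first_input_shape, batch_size=None):
    """Return the effective batch size following the rules described above.

    first_input_shape: tuple such as (None, 224, 224, 3) or (8, 224, 224, 3).
    batch_size: value of --batch_size, or None when the option is not passed.
    """
    batch_dim = first_input_shape[0]
    if batch_size is None:
        # No --batch_size: fall back to the shape, defaulting to 1 for None.
        return 1 if batch_dim is None else batch_dim
    if batch_dim is not None and batch_dim != batch_size:
        # Both are explicit and they disagree.
        raise ValueError(
            f"batch size {batch_size} does not match batch dimension {batch_dim}"
        )
    return batch_size


print(resolve_batch_size((None, 224, 224, 3)))     # 1
print(resolve_batch_size((None, 224, 224, 3), 8))  # 8
print(resolve_batch_size((8, 224, 224, 3)))        # 8
```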

Padding is needed for some execution frameworks whenever they do not support None as an N dimension in the input shapes and there is a remainder of test items in an incomplete batch given the --test_size.
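
As a worked example of the remainder case, the sketch below computes how many extra items would be needed to fill the last batch. It only illustrates the arithmetic: this page does not describe how the tool generates or discards padding items, and padding_needed is a hypothetical name.

```python
def padding_needed(test_size, batch_size):
    """Number of extra items required to complete the final batch."""
    remainder = test_size % batch_size
    return 0 if remainder == 0 else batch_size - remainder


# Illustrative sizes: 1001 test items with a fixed batch size of 8
# leave one item in the last batch, so 7 padding items would be needed.
print(padding_needed(1001, 8))  # 7
print(padding_needed(1000, 8))  # 0
```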

The batch size must be set as the N part of the input shapes when compiling a model. This shape must then be used while running leip evaluate. Compiled models do not accept None as a shape dimension and will not allow overriding the --batch_size from the CLI unless the input shape matches the batch size.

Testset Files

A testset file is a text file with test file information separated by newlines. Each line has a path for the test image/sound/etc., followed by a space, followed by an integer that represents the number of the correct output class.

The testset files are specified as follows:

  • A text file, typically (but not necessarily) ending in .txt, containing a list of input examples to be evaluated.

  • Each example is separated by a newline.

  • Each example is a line with two columns separated by a space.

  • The first column is a path to the input file of that example.

  • The second column is the ground truth output for that example. It is usually a number specifying the class offset into the classnames file.

  • In the case of detection models, the output column is a comma-separated list of numbers which specify the bounding box coordinates in addition to the class offset.

  • The classnames file is documented below.

Here is an example row of a testset file:

BASH
$ head -1 testset.preprocessed.1001.txt
resources/images/imagenet_images/preprocessed/ILSVRC2012_val_00000001.JPEG 66

An analogous JSON (more specifically, JSONL) file format is also supported. It carries the same information as the text format, but each example is a single-line JSON document instead of a space-separated row. Each row has the keys path and output. The value of the output key is an array, so it can handle the multiple-output case. Here is an example row:

BASH
$ head -1 testset.preprocessed.1001.json
{"path": "resources/images/imagenet_images/preprocessed/ILSVRC2012_val_00000001.JPEG", "output": [66]}
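
Converting a row from the text format to the JSON format is mechanical. The sketch below assumes a classification testset with a single integer class per row; text_row_to_json is a hypothetical helper, and splitting on the last space is a choice made here so that paths containing spaces do not break the split.

```python
import json


def text_row_to_json(line):
    """Convert one 'path class_index' row into a single-line JSON document."""
    # Split on the last space so the path itself may contain spaces.
    path, output = line.rstrip("\n").rsplit(" ", 1)
    return json.dumps({"path": path, "output": [int(output)]})


row = "resources/images/imagenet_images/preprocessed/ILSVRC2012_val_00000001.JPEG 66"
print(text_row_to_json(row))
```

Applying this to every line of a text testset and writing one result per line yields a file in the JSON format described above.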

Hardware Accelerator Optimized Model

The --inference_context option needs to be set when running inference on models optimized for hardware accelerators.

For example, if a model was compiled with:

BASH
$ leip compile --target cuda:2080ti ...

Then the evaluate/run command must be called with:

BASH
$ leip evaluate --inference_context cuda ...

