Using PyLRE to Deploy Your Compiled modelLibrary.so¶
You have compiled a model with LEIP Optimize, and now you want to deploy it in a target environment. This tutorial provides step-by-step instructions for loading an optimized artifact, creating an LRE instance, and performing inference.
Runtime Setup¶
We need two components to execute a model on your target:
- a target-compatible, model-compatible runtime (the LRE)
- a target-compatible model library (the optimized output)
import pylre
from pylre import LatentRuntimeEngine as LRE
import numpy as np
Obtain your optimized artifact using LEIP Optimize. This tutorial assumes the model is compiled for float32 on a CPU target. For details on quantizing and compiling a model, consult the LEIP Optimize tutorial.
optimized_artifact_path = "path/to/modelLibrary.so"
pylre_options = pylre.TVMOptions(precision="float32")
lre = LRE(optimized_artifact_path, options=pylre_options)
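If loading fails, a common cause is simply a wrong path. A minimal pre-check (a sketch using only the standard library; no PyLRE required, and the path below is the placeholder from above):

```python
from pathlib import Path

optimized_artifact_path = "path/to/modelLibrary.so"  # placeholder path
artifact = Path(optimized_artifact_path)

# Fail early with a clear message rather than a loader error deep inside the runtime.
if not artifact.is_file():
    print(f"artifact not found: {artifact}")
```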
With this LRE object, we can inspect the model we have optimized:
lre.get_metadata()
Creating a random tensor to do inference¶
Since the model expects a single input, we use the first entry of the runtime's input shapes and dtypes to create a random input tensor:
input_shape = lre.input_shapes[0]
input_dtype = lre.input_dtypes[0]
input_tensor = np.random.random(input_shape).astype(input_dtype)
With this input tensor, we can run inference on the LRE instance we created:
output = lre(input_tensor)
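The cast to the runtime's dtype matters: np.random.random returns float64, while this tutorial's model expects float32. A self-contained check (the shape here is an illustrative example, not necessarily what your model reports):

```python
import numpy as np

shape = (1, 3, 224, 224)       # example shape; your lre.input_shapes[0] may differ
raw = np.random.random(shape)  # NumPy's default floating dtype is float64
input_tensor = raw.astype("float32")

print(raw.dtype)           # float64
print(input_tensor.dtype)  # float32
```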
The output is returned in a device-independent format, but you may want to convert it into something more amenable to postprocessing. We will use NumPy for this; depending on your application and hardware usage, you may want to explore other formats.
numpy_output = np.from_dlpack(output[0])
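np.from_dlpack performs a zero-copy import of any tensor that exposes the DLPack protocol. A self-contained illustration, using a NumPy array as a stand-in for the runtime's output (NumPy arrays support DLPack export from version 1.22 onward):

```python
import numpy as np

# Stand-in for the runtime's device output: any object implementing
# the DLPack protocol works here.
device_output = np.arange(6, dtype=np.float32).reshape(2, 3)

host_view = np.from_dlpack(device_output)

# The import is zero-copy: the view reflects changes to the source buffer.
device_output[0, 0] = 42.0
print(host_view[0, 0])  # 42.0
```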
Verifying expected output shape¶
expected_output_shape = lre.output_shapes[0]
assert numpy_output.shape == expected_output_shape
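From here, postprocessing depends on your model. As one hypothetical example (assuming a classification head that emits raw logits, which this tutorial does not specify), a typical step is a softmax followed by an argmax:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Hypothetical logits from a 4-class classifier with batch size 1.
numpy_output = np.array([[0.5, 2.0, 1.0, 0.1]], dtype=np.float32)

probs = softmax(numpy_output)
top_class = int(np.argmax(probs, axis=-1)[0])
print(top_class)   # 1 (index of the highest logit)
print(probs.sum()) # ≈ 1.0
```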