ONNX-LRE
C++ API documentation
OnnxLre Namespace Reference

Latent Runtime Engine for ONNX models.

Classes

struct  Cryption
 Encryption parameters for protected model access.
class  LatentRuntimeEngine
 The LatentRuntimeEngine class provides a C++ interface to load and run ONNX models using ONNX Runtime.
struct  Options
 Configuration parameters for the inference engine.

Enumerations

enum class  ExecutionProvider { TensorRT, CUDA, CPU, UNSET }
 Hardware acceleration backends for ONNX model execution.
enum class  Precision { Float32, Float16, Int8, UNSET }
 Numeric precision options for model execution.

Detailed Description

Latent Runtime Engine for ONNX models.

The OnnxLre namespace contains all classes, functions, and types that form the Latent Runtime Engine for executing ONNX models with hardware acceleration. It provides abstractions for model loading, inference execution, and optimized tensor management on various compute devices.

Enumeration Type Documentation

◆ ExecutionProvider

enum class OnnxLre::ExecutionProvider

Hardware acceleration backends for ONNX model execution.

Enumerator
TensorRT
 NVIDIA TensorRT - highest performance for supported operations, with optimization passes.
CUDA
 NVIDIA CUDA - GPU acceleration without TensorRT optimizations.
CPU
 CPU execution - universal fallback with no special hardware requirements.
UNSET
 Default - the provider is selected from the model metadata.

◆ Precision

enum class OnnxLre::Precision

Numeric precision options for model execution.

Enumerator
Float32
 32-bit floating point - highest precision, largest memory footprint.
Float16
 16-bit floating point - reduced precision, roughly half the memory of Float32, faster on compatible hardware.
Int8
 8-bit integer quantization - lowest precision, smallest memory footprint; requires calibration.
UNSET
 Default - the precision is selected from the model metadata.