ONNX-LRE
C++ API documentation
OnnxLre Namespace Reference

Latent Runtime Engine for ONNX models.

Classes

struct  Cryption
 Encryption parameters for protected model access.
class  LatentRuntimeEngine
 The LatentRuntimeEngine class provides a C++ interface to load and run ONNX models using ONNX Runtime.
struct  Options
 Configuration parameters for the inference engine.

Enumerations

enum class  ExecutionProvider { TensorRT, CUDA, CPU, UNSET }
 Hardware acceleration backends for ONNX model execution.
enum class  Precision { Float32, Float16, Int8, UNSET }
 Numeric precision options for model execution.

Detailed Description

Latent Runtime Engine for ONNX models.

The OnnxLre namespace contains all classes, functions, and types that form the Latent Runtime Engine for executing ONNX models with hardware acceleration. It provides abstractions for model loading, inference execution, and optimized tensor management on various compute devices.

Enumeration Type Documentation

◆ ExecutionProvider

enum class OnnxLre::ExecutionProvider

Hardware acceleration backends for ONNX model execution.

Enumerator
TensorRT
 NVIDIA TensorRT - highest performance for supported operations, with optimization passes.
CUDA
 NVIDIA CUDA - GPU acceleration without TensorRT optimizations.
CPU
 CPU execution - universal fallback with no special hardware requirements.
UNSET
 Default - the provider is selected from the model metadata.

◆ Precision

enum class OnnxLre::Precision

Numeric precision options for model execution.

Enumerator
Float32
 32-bit floating point - highest precision, largest memory footprint.
Float16
 16-bit floating point - reduced precision, roughly half the memory of Float32, faster on compatible hardware.
Int8
 8-bit integer quantization - lowest precision, smallest memory footprint; requires calibration.
UNSET
 Default - the precision is selected from the model metadata.