ONNX-LRE
C++ API documentation
OnnxLre Namespace Reference

Latent Runtime Engine for ONNX models. More...

Classes

struct  Cryption
 Encryption parameters for protected model access. More...
 
class  LatentRuntimeEngine
 
struct  Options
 Configuration parameters for the inference engine. More...
 

Enumerations

enum  ExecutionProvider { ExecutionProvider::TensorRT, ExecutionProvider::CUDA, ExecutionProvider::CPU }
 Hardware acceleration backends for ONNX model execution. More...
 
enum  Precision { Precision::Float32, Precision::Float16, Precision::Int8 }
 Numeric precision options for model execution. More...
 

Detailed Description

Latent Runtime Engine for ONNX models.

The OnnxLre namespace contains all classes, functions, and types that form the Latent Runtime Engine for executing ONNX models with hardware acceleration. It provides abstractions for model loading, inference execution, and optimized tensor management on various compute devices.

Enumeration Type Documentation

◆ ExecutionProvider

enum OnnxLre::ExecutionProvider
strong

Hardware acceleration backends for ONNX model execution.

Enumerator
TensorRT 

NVIDIA TensorRT - highest performance for supported operations with optimization passes.

CUDA 

NVIDIA CUDA - GPU acceleration without TensorRT optimizations.

CPU 

CPU execution - universal fallback with no special hardware requirements.

◆ Precision

enum OnnxLre::Precision
strong

Numeric precision options for model execution.

Enumerator
Float32 

32-bit floating point - highest precision, largest memory footprint.

Float16 

16-bit floating point - reduced precision, ~50% memory reduction, faster on compatible hardware.

Int8 

8-bit integer quantization - lowest precision, smallest memory footprint, requires calibration.