ONNX-LRE
C++ API documentation
#include <onnx_lre.hpp>
Public Member Functions

LatentRuntimeEngine (const std::string &modelPath, const Options &config=Options())
    Constructs a high-performance inference engine for an ONNX model.
~LatentRuntimeEngine ()
    Releases all allocated resources.
size_t getNumberOfInputs () const
    Returns the number of input tensors required by the model.
size_t getNumberOfOutputs () const
    Returns the number of output tensors produced by the model.
const std::vector< const char * > & getInputNames () const
    Retrieves the names of all model input nodes.
const std::vector< const char * > & getOutputNames () const
    Retrieves the names of all model output nodes.
std::vector< std::string > getInputDTypes () const
    Gets the data types of all input tensors as strings.
std::vector< std::string > getOutputDTypes () const
    Gets the data types of all output tensors as strings.
const std::vector< std::vector< int64_t > > & getInputShapes () const
    Retrieves the dimensional shapes of all input tensors.
const std::vector< std::vector< int64_t > > & getOutputShapes () const
    Retrieves the dimensional shapes of all output tensors.
void infer (const std::vector< DLManagedTensor * > &t_input_data_vec)
    Performs inference using DLPack tensor inputs.
void infer (const std::vector< Ort::Value > &t_input_data_vec)
    Performs inference using ONNX Runtime tensor inputs.
void infer (const std::vector< void * > &t_input_data_vec, const std::vector< int64_t * > shape, const std::string device)
    Performs inference using raw memory pointers and shapes.
Ort::Value makeORTTensor (void *t_input_data_vec, const int64_t *shape, int input_index, const std::string &device)
    Creates an ONNX Runtime tensor from raw memory.
std::vector< DLManagedTensor * > getOutput ()
    Retrieves inference results as DLPack tensors.
std::vector< Ort::Value > getOutputOrt ()
    Retrieves and transfers ownership of inference results as ONNX Runtime tensors.
void setCPUOutput (bool use_cpu)
    Controls output tensor placement between device and host memory.
bool isCPUOutput ()
    Checks the current output tensor memory placement policy.
std::string getMetaValue (std::string key)
    Retrieves model metadata by key.
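A minimal usage sketch based on the listing above. The model path, the `"cpu"` device string, and the default-constructed `Options` are illustrative assumptions, not values confirmed by this reference; buffer and shape setup is left to the caller.

```cpp
#include <onnx_lre.hpp>

#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    // Construct the engine; Options() is assumed to select default
    // execution-provider settings.
    LatentRuntimeEngine engine("model.onnx");

    // Inspect the model interface before allocating buffers.
    std::cout << engine.getNumberOfInputs() << " inputs, "
              << engine.getNumberOfOutputs() << " outputs\n";

    // Raw-pointer overload: one data pointer and one shape array per
    // input, plus the device the data currently lives on.
    std::vector<void *> buffers;     // caller-owned input data, one per input
    std::vector<int64_t *> shapes;   // one shape array per input
    engine.infer(buffers, shapes, "cpu");

    // Keep results in host memory and fetch them as DLPack tensors.
    engine.setCPUOutput(true);
    std::vector<DLManagedTensor *> outputs = engine.getOutput();
    return 0;
}
```

Note that `getOutputOrt()` transfers ownership of the result tensors, so a caller using that overload is responsible for keeping the returned `Ort::Value` objects alive for as long as the output data is needed.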
Private Member Functions

void initLRE (std::vector< unsigned char > model)
    Initializes the model for inference.
void configureTensorRTProvider ()
    Configures TensorRT provider options.
void configureCUDAProvider ()
    Configures CUDA provider options.
void fetchInputNodeInfo ()
    Fetches and stores input node information.
void fetchOutputNodeInfo ()
    Fetches and stores output node information.
Private Attributes

Options config
Ort::Env env
    ONNX Runtime environment.
Ort::SessionOptions sessionOptions
    Session options for ONNX Runtime.
Ort::Session session {nullptr}
    The ONNX Runtime session for model inference.
Ort::IoBinding io_binding {nullptr}
std::string model_path
    Path to the ONNX model file.
bool isModelLoaded = false
    Flag indicating whether the model was loaded successfully.
bool gpuOutput = false
    Flag indicating whether output tensors reside on the GPU (true for CUDA and TensorRT).
bool graphQuantized = false
bool trt_calib = false
    Flag indicating whether a TensorRT calibration file is available.
Ort::MemoryInfo cpu_memory_info {nullptr}
Ort::MemoryInfo cuda_memory_info {nullptr}
OrtTensorRTProviderOptionsV2 * tensorrt_options = nullptr
    TensorRT provider options.
OrtCUDAProviderOptionsV2 * cuda_options = nullptr
    CUDA provider options.
Ort::ModelMetadata metadata {nullptr}
size_t number_inputs = 0
size_t number_outputs = 0
    Count of input and output nodes.
std::vector< const char * > input_names
std::vector< const char * > output_names
    Names of input and output nodes.
std::vector< ONNXTensorElementDataType > input_dtypes
std::vector< ONNXTensorElementDataType > output_dtypes
    Data types of input and output nodes.
std::vector< std::vector< int64_t > > input_shapes
std::vector< std::vector< int64_t > > output_shapes
    Shapes of input and output nodes.
std::vector< size_t > input_tensors_dtype_bytes
std::vector< size_t > output_tensors_dtype_bytes
    Element widths in bytes for input and output tensors.
Ort::AllocatorWithDefaultOptions allocator
    Allocator for ONNX Runtime.
std::vector< Ort::Value > input_tensors
std::vector< Ort::Value > output_tensors
ExecutionProvider executionProvider
Ort::Value dummy_tensor {nullptr}
std::string tempDirectoryPath
std::string calibration_file_path