forge.onnx.inference¶
forge.onnx.inference.get_inference_function(model: Union[ModelProto, str, bytes, os.PathLike], providers: Optional[Union[str, List[str]]] = None, opt_level: Union[int, GraphOptimizationLevel] = ort.GraphOptimizationLevel.ORT_DISABLE_ALL) -> Callable¶
Creates an ONNX Runtime inference function from the given model.
This function loads an ONNX model and returns a callable inference function that can be used to run predictions. The returned function automatically handles input and output names and shapes, and executes inference using the specified execution providers and graph optimization level.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`model` | `Union[ModelProto, str, bytes, PathLike]` | The ONNX model to be loaded. Can be a `ModelProto` object, a path to the model file, a serialized model in `bytes`, or an object that implements the `os.PathLike` interface. | required |
`providers` | `Optional[Union[str, List[str]]]` | The execution providers to use for inference. Can be a string, e.g. `"CUDAExecutionProvider"`, or a list of provider strings to try in priority order. If not provided, defaults to `"CPUExecutionProvider"`. | `None` |
`opt_level` | `Optional[Union[int, GraphOptimizationLevel]]` | The level of graph optimization to apply during model loading. Can be an integer or a `GraphOptimizationLevel`. Defaults to `ORT_DISABLE_ALL`. | `ORT_DISABLE_ALL` |
Returns:

Name | Type | Description |
---|---|---|
`Callable` | `Callable` | A callable function that takes input data as arguments and returns a dictionary mapping output names to their corresponding NumPy arrays. The function has additional metadata attributes such as `output_names`. |
Example

```python
inference_fn = get_inference_function("model.onnx")
output = inference_fn(input_data)
output_name = inference_fn.output_names[0]
print(output[output_name])
```
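The `providers` and `opt_level` arguments accept the forms listed above; a GPU-first variant might look like the following sketch (assumes a CUDA-enabled `onnxruntime` build and the `ort` alias used in the signature above):

```python
import onnxruntime as ort

# Illustrative: try CUDA first, fall back to CPU, and enable all
# graph optimizations instead of the ORT_DISABLE_ALL default.
inference_fn = get_inference_function(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    opt_level=ort.GraphOptimizationLevel.ORT_ENABLE_ALL,
)
```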
Raises:

Type | Description |
---|---|
`ValueError` | If the model, providers, or optimization level are invalid. |
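For intuition, the returned callable behaves roughly like the sketch below, built on `get_inference_session` (documented next). This is a minimal illustration under stated assumptions, not the library's actual implementation; in particular, the `input_names` attribute and the positional-argument calling convention are assumptions:

```python
import numpy as np
import onnxruntime as ort

def make_inference_fn_sketch(model_path: str):
    """Illustrative stand-in for get_inference_function (not the real code)."""
    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_names = [i.name for i in session.get_inputs()]
    output_names = [o.name for o in session.get_outputs()]

    def inference_fn(*arrays: np.ndarray) -> dict:
        # Pair positional inputs with the model's input names, run the
        # session, and map each output back to its name.
        feed = dict(zip(input_names, arrays))
        results = session.run(output_names, feed)
        return dict(zip(output_names, results))

    # Attach metadata to the callable, mirroring the documented
    # output_names attribute (input_names is an assumption here).
    inference_fn.input_names = input_names
    inference_fn.output_names = output_names
    return inference_fn
```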
forge.onnx.inference.get_inference_session(model: Union[str, bytes, os.PathLike], providers: Union[str, List[str]], opt_level: Union[int, GraphOptimizationLevel], **kwargs) -> InferenceSession¶
Creates an ONNX Runtime Inference Session.
This helper function initializes and returns an ONNX Runtime InferenceSession
with the specified model, execution providers, and optimization level. Additional
session options can be set via keyword arguments.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`model` | `Union[str, bytes, PathLike]` | The ONNX model to load. Can be a path to the model file, a serialized model in `bytes`, or an object that implements the `os.PathLike` interface. | required |
`providers` | `Union[str, List[str]]` | The execution providers to use. Can be a string (e.g., `"CPUExecutionProvider"`, `"CUDAExecutionProvider"`) or a list of providers to try in priority order. | required |
`opt_level` | `Union[int, GraphOptimizationLevel]` | The level of graph optimization to apply. Can be an integer or an instance of `GraphOptimizationLevel`. | required |
`**kwargs` | | Additional session options to customize ONNX Runtime behavior. These are dynamically set as attributes on the session options; see `onnxruntime.SessionOptions`. | `{}` |
Returns:

Name | Type | Description |
---|---|---|
`InferenceSession` | `InferenceSession` | The initialized ONNX Runtime `InferenceSession`. |
Raises:

Type | Description |
---|---|
`ValueError` | If the model, providers, or optimization level are invalid. |
Example

```python
session = get_inference_session("model.onnx", "CPUExecutionProvider", 2)
result = session.run(None, {"input": input_data})
```
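Assuming keyword arguments are forwarded as attributes on `onnxruntime.SessionOptions` as described above, standard options such as `intra_op_num_threads` can be set per call. A minimal sketch (the option name comes from ONNX Runtime itself, not from this API's documentation):

```python
import onnxruntime as ort

# intra_op_num_threads is a standard onnxruntime.SessionOptions attribute;
# here it caps the session at four intra-op threads.
session = get_inference_session(
    "model.onnx",
    "CPUExecutionProvider",
    ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED,
    intra_op_num_threads=4,
)
```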