
forge.onnx.inference

forge.onnx.inference.get_inference_function(model: Union[ModelProto, str, bytes, os.PathLike], providers: Optional[Union[str, List[str]]] = None, opt_level: Union[int, GraphOptimizationLevel] = ort.GraphOptimizationLevel.ORT_DISABLE_ALL) -> Callable

Creates an ONNX Runtime inference function from the given model.

This function loads an ONNX model and returns a callable inference function that can be used to run predictions. The returned function automatically handles input and output names and shapes, and executes inference using the specified execution providers and graph optimization level.
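
For example, a GPU-first configuration with full graph optimizations could look like the following. This is a sketch: "model.onnx" is a placeholder path, and CUDAExecutionProvider is only usable if the installed onnxruntime build supports it.

import onnxruntime as ort
from forge.onnx.inference import get_inference_function

# Try CUDA first, fall back to CPU, and enable all graph optimizations.
inference_fn = get_inference_function(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    opt_level=ort.GraphOptimizationLevel.ORT_ENABLE_ALL,
)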

Parameters:

model (Union[ModelProto, str, bytes, PathLike]), required:
    The ONNX model to be loaded. Can be a ModelProto object, a path to the model file, a serialized model in bytes, or an os.PathLike object representing a file path.

providers (Optional[Union[str, List[str]]]), default None:
    The execution providers to use for inference. Can be a single provider string, e.g. "CUDAExecutionProvider", or a list of provider strings to try in priority order. If not provided, defaults to "CPUExecutionProvider".

opt_level (Union[int, GraphOptimizationLevel]), default GraphOptimizationLevel.ORT_DISABLE_ALL:
    The level of graph optimization to apply during model loading.

Returns:

Callable:
    A callable function that takes input data as arguments and returns a dictionary mapping output names to their corresponding NumPy arrays. The returned function also carries metadata attributes such as input_names, input_shapes, output_names, output_shapes, and session.

Example
inference_fn = get_inference_function("model.onnx")
output = inference_fn(input_data)
output_name = inference_fn.output_names[0]
print(output[output_name])
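
Continuing the example, the metadata attributes described above can be used to inspect the expected inputs and outputs, or to reach the underlying onnxruntime session directly. A short sketch:

print(inference_fn.input_names, inference_fn.input_shapes)
for name in inference_fn.output_names:
    print(name, output[name].shape)
# The underlying onnxruntime.InferenceSession is also exposed:
print(inference_fn.session.get_providers())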

Raises:

ValueError:
    If the model, providers, or optimization level are invalid.

forge.onnx.inference.get_inference_session(model: Union[str, bytes, os.PathLike], providers: Union[str, List[str]], opt_level: Union[int, GraphOptimizationLevel], **kwargs) -> InferenceSession

Creates an ONNX Runtime Inference Session.

This helper function initializes and returns an ONNX Runtime InferenceSession with the specified model, execution providers, and optimization level. Additional session options can be set via keyword arguments.
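
For instance, standard onnxruntime.SessionOptions attributes such as intra_op_num_threads or log_severity_level can be forwarded as keyword arguments. A sketch, assuming "model.onnx" exists and that any attribute accepted by SessionOptions is handled the same way:

import onnxruntime as ort
from forge.onnx.inference import get_inference_session

session = get_inference_session(
    "model.onnx",
    providers="CPUExecutionProvider",
    opt_level=ort.GraphOptimizationLevel.ORT_ENABLE_BASIC,
    intra_op_num_threads=2,   # forwarded to SessionOptions
    log_severity_level=2,     # log warnings and above
)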

Parameters:

model (Union[str, bytes, PathLike]), required:
    The ONNX model to load. Can be a path to the model file, a serialized model in bytes, or an object that implements the os.PathLike interface.

providers (Union[str, List[str]]), required:
    The execution providers to use. This can be a single provider string (e.g., "CPUExecutionProvider", "CUDAExecutionProvider") or a list of providers to try in priority order.

opt_level (Union[int, GraphOptimizationLevel]), required:
    The level of graph optimization to apply. It can be an integer or an instance of onnxruntime.GraphOptimizationLevel, which controls how aggressively the graph is optimized.

**kwargs, default {}:
    Additional session options to customize ONNX Runtime behavior. These are set dynamically as attributes on the session options; see onnxruntime.SessionOptions.

Returns:

InferenceSession:
    The initialized ONNX Runtime InferenceSession object that can be used to run inference on the provided model.

Raises:

ValueError:
    If the model, providers, or optimization level are invalid.

Example
session = get_inference_session("model.onnx", "CPUExecutionProvider", 2)
result = session.run(None, {"input": input_data})
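
Because the return value is a plain onnxruntime.InferenceSession, the usual session introspection applies. Continuing the example above, input and output names can be read from the session rather than hard-coded:

input_name = session.get_inputs()[0].name
output_names = [o.name for o in session.get_outputs()]
results = dict(zip(output_names, session.run(None, {input_name: input_data})))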