Forge

Target Functions
forge.list_cuda_tags(verbose: bool = False) -> Union[List[str], List[Tuple[str, TargetHost]]]

List all tags (pre-defined aliases) of CUDA targets.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
verbose | bool | If True, return each tag paired with its corresponding target string literal for the TVM compiler. | False |

Returns:

Type | Description |
---|---|
Union[List[str], List[Tuple[str, TargetHost]]] | A list of tags, or a list of (tag, target) tuples when verbose is True |
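A minimal usage sketch, assuming forge is importable in the current environment; the exact tags returned depend on the installed Forge version. The sibling functions below (list_x86_tags, list_arm_tags, list_android_tags, list_target_tags) share the same signature, so the same pattern applies to each.

```python
import forge

# Plain tag names (verbose defaults to False).
for tag in forge.list_cuda_tags():
    print(tag)

# With verbose=True, each entry is a (tag, target) tuple, where the
# second element carries the TVM target string literal.
for tag, target in forge.list_cuda_tags(verbose=True):
    print(f"{tag} -> {target}")
```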
forge.list_x86_tags(verbose: bool = False) -> Union[List[str], List[Tuple[str, TargetHost]]]

List all tags (pre-defined aliases) of x86 targets.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
verbose | bool | If True, return each tag paired with its corresponding target string literal for the TVM compiler. | False |

Returns:

Type | Description |
---|---|
Union[List[str], List[Tuple[str, TargetHost]]] | A list of tags, or a list of (tag, target) tuples when verbose is True |
forge.list_arm_tags(verbose: bool = False) -> Union[List[str], List[Tuple[str, TargetHost]]]

List all tags (pre-defined aliases) of ARM targets.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
verbose | bool | If True, return each tag paired with its corresponding target string literal for the TVM compiler. | False |

Returns:

Type | Description |
---|---|
Union[List[str], List[Tuple[str, TargetHost]]] | A list of tags, or a list of (tag, target) tuples when verbose is True |
forge.list_android_tags(verbose: bool = False) -> Union[List[str], List[Tuple[str, TargetHost]]]

List all tags (pre-defined aliases) for Android targets.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
verbose | bool | If True, return each tag paired with its corresponding target string literal for the TVM compiler. | False |

Returns:

Type | Description |
---|---|
Union[List[str], List[Tuple[str, TargetHost]]] | A list of tags, or a list of (tag, target) tuples when verbose is True |
forge.list_target_tags(verbose: bool = False) -> Union[List[str], List[Tuple[str, TargetHost]]]

List all tags (pre-defined aliases) of all targets.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
verbose | bool | If True, return each tag paired with its corresponding target string literal for the TVM compiler. | False |

Returns:

Type | Description |
---|---|
Union[List[str], List[Tuple[str, TargetHost]]] | A list of tags, or a list of (tag, target) tuples when verbose is True |
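A small sketch of the verbose form, assuming forge is importable: since verbose=True yields (tag, target) pairs, the result converts directly into a lookup table.

```python
import forge

# Map every known tag to its TVM target; useful for resolving a
# user-supplied tag before compilation.
tag_to_target = dict(forge.list_target_tags(verbose=True))
print(f"{len(tag_to_target)} target tags available")
```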
Environment Functions

forge.is_tensorrt_compiler_enabled() -> bool

Checks if TensorRT is discoverable and available for use with Forge.

This function performs a system-level check to determine whether TensorRT is installed, properly configured, and usable for compiling models with Forge. It verifies the presence of the necessary libraries, environment settings, and other dependencies required to integrate TensorRT with Forge.

Returns:

Type | Description |
---|---|
bool | True if TensorRT is available and can be used with Forge, False otherwise. |
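A hedged sketch, assuming forge is importable: use the check to branch between a TensorRT path and a fallback. The branch bodies are illustrative placeholders, not documented API behavior.

```python
import forge

if forge.is_tensorrt_compiler_enabled():
    # TensorRT libraries and environment are set up; a TensorRT-backed
    # compilation path can be selected here.
    print("TensorRT is available for Forge compilation.")
else:
    # Missing libraries or misconfiguration; choose a non-TensorRT path.
    print("TensorRT not available; falling back.")
```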
forge.get_tensorrt_time_cache() -> str

Get the location of the TensorRT timing cache.

The TensorRT timing cache accelerates the compilation of deep learning models for inference by storing the results of performance tests for different execution algorithms. When a model is compiled, TensorRT typically tests various algorithms to find the most efficient ones for the specific hardware. By caching these test results, subsequent compilations of the same or similar models on the same hardware can bypass this time-consuming testing phase, significantly reducing compilation time since the optimal algorithms are already known and can be applied immediately.

Note: By default, within the Docker container, the cache is located at '/tensorrt_cache'. A user can set the cache location manually; see forge.set_tensorrt_time_cache().

Returns:

Type | Description |
---|---|
str | The TensorRT timing cache location |
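A minimal sketch, assuming forge is importable: inspect the current timing cache location (per the note above, '/tensorrt_cache' by default inside the Docker container).

```python
import forge

print("TensorRT timing cache:", forge.get_tensorrt_time_cache())
```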
forge.set_tensorrt_time_cache(cache_path: Union[str, Path]) -> None

Set the location of the TensorRT timing cache.

See forge.get_tensorrt_time_cache() for background on how the timing cache speeds up compilation.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
cache_path | Union[str, Path] | The directory in which to locate or store the TensorRT timing cache. | required |

Returns:

Type | Description |
---|---|
None | None |
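A hedged sketch combining the setter and getter, assuming forge is importable; the cache directory used here is a hypothetical example, not a documented default.

```python
from pathlib import Path

import forge

# Hypothetical persistent location for the timing cache.
cache_dir = Path("/data/tensorrt_cache")
cache_dir.mkdir(parents=True, exist_ok=True)

forge.set_tensorrt_time_cache(cache_dir)
print("Timing cache now at:", forge.get_tensorrt_time_cache())
```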