Golden Recipes

leip_recipe_designer.GoldenVolumes

GoldenVolumes(task: str = 'vision.detection.2d')

Represents Golden Recipe Volumes for model workflows, categorized by tasks and datasets.

This class allows access to pre-tested Golden Recipe Volumes from the LEIP Zoo for tasks such as vision-based detection and classification. It organizes datasets and provides utilities to retrieve recipes and metadata.

Parameters:

task (str, default: 'vision.detection.2d' ) –

The task for which Golden Recipe Volumes should be fetched. Supported options:

- "vision.detection.2d": For detection tasks, retrieves volumes like xval_det.
- "vision.classification.2d": For classification tasks, retrieves volumes like xval_cls.

Attributes:

datasets (Dict[str, GoldenDataset]) –

A dictionary of available datasets corresponding to the task, where the key is the dataset name and the value is the GoldenDataset object.

Examples:

Initialize for detection tasks:

from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.detection.2d")

Initialize for classification tasks:

from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.classification.2d")

Methods:

get_dataframe –

Retrieves the Golden dataframe for the provided volume.
list_volumes_from_zoo –

Retrieve available recipe volumes from LEIP Zoo.

get_dataframe

get_dataframe(key='xval_det')

Retrieves the Golden dataframe for the provided volume.

By default, it pulls the volume xval_det, which belongs to the detection task. For classification, create GoldenVolumes with task="vision.classification.2d" and pass key="xval_cls".

Parameters:

key (default: 'xval_det' ) –

The name of the volume whose Golden dataframe you want. Examples:

- Detection task: use "xval_det"
- Classification task: use "xval_cls"

Returns:

pd.DataFrame –

A Pandas DataFrame containing Golden recipes with metadata, including metrics, model family, SPPR, backbone, etc.

Examples:

For detection:

from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.detection.2d")
df = volumes.get_dataframe(key="xval_det")

For classification:

from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.classification.2d")
df = volumes.get_dataframe(key="xval_cls")
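Because the key has to match the task, a small lookup can keep the two in sync. This is a minimal sketch that assumes only the two task/volume pairs documented above; `XVAL_KEY_BY_TASK` and `xval_key_for_task` are hypothetical names and are not part of leip_recipe_designer.

```python
# Hypothetical helper, not part of leip_recipe_designer.
# The task-to-volume pairs are the two documented above.
XVAL_KEY_BY_TASK = {
    "vision.detection.2d": "xval_det",
    "vision.classification.2d": "xval_cls",
}


def xval_key_for_task(task: str) -> str:
    """Return the cross-validated volume key documented for a task."""
    try:
        return XVAL_KEY_BY_TASK[task]
    except KeyError:
        supported = ", ".join(sorted(XVAL_KEY_BY_TASK))
        raise ValueError(
            f"Unsupported task {task!r}; expected one of: {supported}"
        ) from None
```

With the library installed, the helper could feed both calls, for example `GoldenVolumes(task=task).get_dataframe(key=xval_key_for_task(task))`.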

list_volumes_from_zoo

list_volumes_from_zoo() -> Dict[str, GoldenDataset]

Retrieve available recipe volumes from LEIP Zoo.

Each volume contains recipes that were tested on one or more datasets. Volumes named xval_... are cross-validated across diverse datasets and are the recommended starting points.

Returns:

datasets ( Dict[str, GoldenDataset] ) –

A dictionary that maps each name to its GoldenDataset object.

Examples:

For detection tasks:

from leip_recipe_designer import GoldenVolumes
goldenvolumes = GoldenVolumes(task="vision.detection.2d")
goldenvolumes.list_volumes_from_zoo()

{
    'chemistrylab': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
    'bdd100k': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
    'insects': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
    'wheat': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
    'carsimple': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
    'xval_det': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>
}

For classification tasks:

from leip_recipe_designer import GoldenVolumes
goldenvolumes = GoldenVolumes(task="vision.classification.2d")
goldenvolumes.list_volumes_from_zoo()

{
    'xval_cls': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>
}
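Since the xval_... volumes are the recommended starting points, it can help to pick them out of the returned names. The sketch below works on plain strings, so it runs without the library; `recommended_volumes` is a hypothetical helper name, and the sample names are copied from the detection output above.

```python
from typing import Iterable, List


def recommended_volumes(volume_names: Iterable[str]) -> List[str]:
    """Return the cross-validated (xval_...) volume names, sorted."""
    return sorted(name for name in volume_names if name.startswith("xval_"))


# Keys copied from the detection example output above.
detection_names = [
    "chemistrylab", "bdd100k", "insects", "wheat", "carsimple", "xval_det",
]
print(recommended_volumes(detection_names))  # ['xval_det']
```

Iterating over a dictionary yields its keys, so the dictionary returned by list_volumes_from_zoo() can be passed in directly.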

GoldenDataset

GoldenDataset(resource: str, variant: str, zoo: Zoo, task: str = 'vision.detection.2d')

Methods:

anchor_boxes –

Plots the mean intersection over union (IoU) for each anchor box group (small, medium, and large).
boxes_info –

Plots a histogram of the number of bounding boxes per image in the dataset.
class_distribution –

Plots image and box class distributions.
describe_table –

Displays a table listing every column in the dataframe together with its description.
get_golden_df –

Generates a recipe DataFrame with the metrics calculated on this dataset.
get_samples –

Show a randomly selected subset of data.
get_table_description –

Returns a dictionary with the column names and their descriptions.
resolution –

Plots the resolution of the train and val images.
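Four of the methods listed above return an ImageViewer, so they can be gathered in one pass when exploring a dataset. This is a rough sketch, not a library feature: `PLOT_METHODS` and `collect_plots` are hypothetical names, and skipping missing methods is a defensive assumption, since the source does not say whether the box-oriented plots apply to classification datasets.

```python
from typing import Any, Dict

# Plot-producing methods listed above; each is documented to return an ImageViewer.
PLOT_METHODS = ("anchor_boxes", "boxes_info", "class_distribution", "resolution")


def collect_plots(dataset: Any) -> Dict[str, Any]:
    """Call each plotting method on a GoldenDataset-like object.

    Returns a dict mapping method name to the returned viewer.
    Methods that are missing on the object are skipped.
    """
    viewers = {}
    for name in PLOT_METHODS:
        method = getattr(dataset, name, None)
        if callable(method):
            viewers[name] = method()
    return viewers
```

Each collected viewer can then be displayed in whichever way the method documentation describes, for example by calling its show() method.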

anchor_boxes

anchor_boxes() -> ImageViewer

Plots the mean intersection over union (IoU) for each anchor box group (small, medium, and large).

The anchor box groupings are calculated using K-means clustering.

Returns:

ImageViewer ( ImageViewer ) –

An ImageViewer containing the anchor box info for the train and validation datasets. It can be viewed in a Jupyter notebook using IPython.display, or shown in matplotlib by invoking ImageViewer.show().

boxes_info

boxes_info() -> ImageViewer

Plots a histogram of the number of bounding boxes per image in the dataset.

Returns:

ImageViewer ( ImageViewer ) –

An ImageViewer containing the histogram for the train and validation datasets. It can be viewed in a Jupyter notebook using IPython.display, or shown in matplotlib by invoking ImageViewer.show().

class_distribution

class_distribution() -> ImageViewer

Plots image and box class distributions.

The plot shows how many boxes of each class are present in the training and validation datasets and how many images contain each class.

Returns:

ImageViewer ( ImageViewer ) –

An ImageViewer containing the plots for the train and validation datasets. It can be viewed in a Jupyter notebook using IPython.display, or shown in matplotlib by invoking ImageViewer.show().

describe_table

describe_table() -> None

Displays a table listing every column in the dataframe together with its description.

get_golden_df

get_golden_df(all_columns: bool = False) -> DataFrame

Generates a recipe DataFrame with the metrics calculated on this dataset.

Parameters:

all_columns (bool, default: False ) –

If True, all available metric columns are returned. Otherwise, only a subset of the most relevant columns is returned.

Returns:

pd.DataFrame –

A Pandas DataFrame of Golden recipes with their metadata, including metrics, model_family, sppr, backbone, etc.

get_samples

get_samples(split: str = 'train', num_samples=16, seed=42) -> ImageViewer

Show a randomly selected subset of data.

Parameters:

split (str, default: 'train' ) –

The dataset subset (either train or val) from which to sample.
num_samples (default: 16 ) –

The number of samples to return.
seed (default: 42 ) –

The seed for the random selection of samples.

Returns:

ImageViewer ( ImageViewer ) –

An ImageViewer containing a grid of the selected images. It can be viewed in a Jupyter notebook using IPython.display, or shown in matplotlib by invoking ImageViewer.show().
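How get_samples reacts to an unexpected split value is not documented here, so a thin wrapper can check the arguments before delegating. The following is a sketch under that assumption; `VALID_SPLITS` and `sample_grid` are hypothetical names, and the only library call used is get_samples with the parameters shown in its signature.

```python
from typing import Any

# The two subsets named in the split parameter description.
VALID_SPLITS = ("train", "val")


def sample_grid(
    dataset: Any, split: str = "train", num_samples: int = 16, seed: int = 42
) -> Any:
    """Validate the arguments, then delegate to dataset.get_samples()."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    if num_samples < 1:
        raise ValueError("num_samples must be a positive integer")
    return dataset.get_samples(split=split, num_samples=num_samples, seed=seed)
```

The defaults mirror the documented signature, so calling `sample_grid(dataset)` is meant to behave like `dataset.get_samples()`.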

get_table_description

get_table_description() -> Dict

Returns a dictionary with the column names and their descriptions.

Returns:

Dict ( Dict ) –

A dictionary with the column names and their descriptions.
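A dictionary of column descriptions is easy to search when looking for a particular metric. The sketch below assumes the values are plain strings, which the source does not spell out; `find_columns` is a hypothetical helper, and the sample entries are made up for illustration rather than taken from the library.

```python
from typing import Dict


def find_columns(descriptions: Dict[str, str], keyword: str) -> Dict[str, str]:
    """Return entries whose column name or description mentions keyword.

    The match is case-insensitive.
    """
    needle = keyword.lower()
    return {
        column: text
        for column, text in descriptions.items()
        if needle in column.lower() or needle in str(text).lower()
    }


# Made-up entries for illustration only; real descriptions come from the library.
example = {
    "backbone": "Feature extractor used by the model",
    "model_family": "Architecture family of the recipe",
}
print(find_columns(example, "family"))
```

With the library installed, the dictionary returned by get_table_description() could be passed in place of `example`.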

resolution

resolution() -> ImageViewer

Plots the resolution of the train and val images.

Returns:

ImageViewer ( ImageViewer ) –

An ImageViewer containing the resolution info for the train and validation datasets. It can be viewed in a Jupyter notebook using IPython.display, or shown in matplotlib by invoking ImageViewer.show().