Golden Recipes
leip_recipe_designer.GoldenVolumes
GoldenVolumes(task: str = 'vision.detection.2d')
Represents Golden Recipe Volumes for model workflows, categorized by tasks and datasets.
This class allows access to pre-tested Golden Recipe Volumes from the LEIP Zoo for tasks such as vision-based detection and classification. It organizes datasets and provides utilities to retrieve recipes and metadata.
Parameters:
- task (str, default: 'vision.detection.2d') – The task for which Golden Recipe Volumes should be fetched. Supported options:
  - "vision.detection.2d": for detection tasks; retrieves volumes such as xval_det.
  - "vision.classification.2d": for classification tasks; retrieves volumes such as xval_cls.
Attributes:
- datasets (Dict[str, GoldenDataset]) – A dictionary of the datasets available for the task, where each key is a dataset name and each value is the corresponding GoldenDataset object.
Examples:
Initialize for detection tasks:
from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.detection.2d")
Initialize for classification tasks:
from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.classification.2d")
Methods:
- get_dataframe – Retrieves the Golden dataframe for the provided volume.
- list_volumes_from_zoo – Retrieves the available recipe volumes from the LEIP Zoo.
get_dataframe
get_dataframe(key='xval_det')
Retrieves the Golden dataframe for the provided volume.
By default, it retrieves the xval_det volume, which applies to detection tasks. For classification tasks, construct GoldenVolumes with task="vision.classification.2d" and pass key="xval_cls".
Parameters:
- key (default: 'xval_det') – The name of the volume whose Golden dataframe you want. Examples:
  - Detection task: "xval_det"
  - Classification task: "xval_cls"
Returns:
- pd.DataFrame – A pandas DataFrame containing Golden recipes with metadata, including metrics, model family, SPPR, backbone, etc.
Examples:
For detection:
from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.detection.2d")
df = volumes.get_dataframe(key="xval_det")
For classification:
from leip_recipe_designer import GoldenVolumes
volumes = GoldenVolumes(task="vision.classification.2d")
df = volumes.get_dataframe(key="xval_cls")
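Because the task and the volume key have to agree, it can help to keep the pairing in one place. The following is a minimal sketch, not part of the library: `DEFAULT_VOLUME_KEYS`, `default_volume_key`, and `load_default_dataframe` are hypothetical names introduced here, and the mapping contains only the two task/key pairs listed above.

```python
# Hypothetical helper (not part of leip_recipe_designer): pairs each
# supported task with the cross-validated volume key documented above.
DEFAULT_VOLUME_KEYS = {
    "vision.detection.2d": "xval_det",
    "vision.classification.2d": "xval_cls",
}


def default_volume_key(task: str) -> str:
    """Return the documented xval volume key for a task, or raise ValueError."""
    try:
        return DEFAULT_VOLUME_KEYS[task]
    except KeyError:
        supported = ", ".join(sorted(DEFAULT_VOLUME_KEYS))
        raise ValueError(f"Unsupported task {task!r}; expected one of: {supported}") from None


def load_default_dataframe(task: str):
    """Fetch the Golden dataframe for a task using its matching volume key."""
    # Imported lazily so the helpers above work without the package installed.
    from leip_recipe_designer import GoldenVolumes

    volumes = GoldenVolumes(task=task)
    return volumes.get_dataframe(key=default_volume_key(task))
```

With this in place, `load_default_dataframe("vision.classification.2d")` would request the xval_cls volume without the caller having to remember the key.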
list_volumes_from_zoo
list_volumes_from_zoo() -> Dict[str, GoldenDataset]
Retrieve available recipe volumes from LEIP Zoo.
Each volume contains recipes that were tested on one or more datasets.
Volumes named xval_... are cross-validated across diverse datasets and are the recommended starting points.
Returns:
- datasets (Dict[str, GoldenDataset]) – A dictionary of datasets, where each key is a name and each value is the corresponding GoldenDataset object.
Examples:
For detection tasks:
from leip_recipe_designer import GoldenVolumes
goldenvolumes = GoldenVolumes(task="vision.detection.2d")
goldenvolumes.list_volumes_from_zoo()
{
'chemistrylab': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
'bdd100k': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
'insects': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
'wheat': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
'carsimple': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>,
'xval_det': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>
}
For classification tasks:
from leip_recipe_designer import GoldenVolumes
goldenvolumes = GoldenVolumes(task="vision.classification.2d")
goldenvolumes.list_volumes_from_zoo()
{
'xval_cls': <leip_recipe_designer.core.utils.golden_recipe_helpers.GoldenDataset>
}
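Since volumes whose names start with xval_ are the recommended starting points, one way to surface them is to split the returned keys by prefix. This is a sketch under stated assumptions: `split_volumes` is a hypothetical helper, and the dictionary below is a stand-in that reuses the volume names from the detection example, with None in place of the GoldenDataset objects.

```python
def split_volumes(volumes):
    """Separate cross-validated (xval_*) volume names from dataset-specific ones."""
    xval = sorted(name for name in volumes if name.startswith("xval_"))
    other = sorted(name for name in volumes if not name.startswith("xval_"))
    return xval, other


# Stand-in for the dict returned by list_volumes_from_zoo(); the values are
# placeholders here, not real GoldenDataset objects.
example_volumes = dict.fromkeys(
    ["chemistrylab", "bdd100k", "insects", "wheat", "carsimple", "xval_det"]
)

recommended, dataset_specific = split_volumes(example_volumes)
print(recommended)       # ['xval_det']
print(dataset_specific)  # ['bdd100k', 'carsimple', 'chemistrylab', 'insects', 'wheat']
```

The same function can be applied directly to the result of `goldenvolumes.list_volumes_from_zoo()`, because it only inspects the keys.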
GoldenDataset
Methods:
- anchor_boxes – Plots the mean intersection over union (IoU) for each anchor box group (small, medium, and large).
- boxes_info – Plots a histogram of the number of bounding boxes per image in the dataset.
- class_distribution – Plots image and box class distributions.
- describe_table – Displays a table listing every column in the dataframe along with its description.
- get_golden_df – Generates a recipe DataFrame with the metrics calculated on this dataset.
- get_samples – Shows a randomly selected subset of the data.
- get_table_description – Returns a dictionary mapping column names to their descriptions.
- resolution – Plots the resolution of the train and validation images.
anchor_boxes
anchor_boxes() -> ImageViewer
Plots the mean intersection over union (IoU) for each anchor box group (small, medium, and large).
The anchor box groupings are calculated using K-means clustering.
Returns:
- ImageViewer – An ImageViewer containing the anchor box information for the train and validation datasets. It can be displayed in a Jupyter notebook using IPython.display, or shown with matplotlib by calling ImageViewer.show().
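To make the idea of K-means anchor groups and their mean IoU concrete, here is a minimal, standard-library sketch of one common way to do it: cluster box (width, height) pairs using IoU as the similarity, then average each box's IoU with its assigned anchor. It is an illustration only and is not the library's implementation; the function names, the deterministic initialization, and the sample boxes are all assumptions made for this example.

```python
from statistics import mean


def iou_wh(a, b):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def assign(boxes, centroids):
    """Put each box in the group of the centroid it overlaps most (highest IoU)."""
    groups = [[] for _ in centroids]
    for box in boxes:
        best = max(range(len(centroids)), key=lambda i: iou_wh(box, centroids[i]))
        groups[best].append(box)
    return groups


def kmeans_anchors(boxes, k=3, max_iter=100):
    """Cluster (w, h) pairs with k-means, using IoU instead of Euclidean distance."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic start: pick k boxes spread evenly across the area ranking.
    step = len(ordered) / k
    centroids = [ordered[int(step * i + step / 2)] for i in range(k)]
    for _ in range(max_iter):
        groups = assign(boxes, centroids)
        # New centroid = mean width and height of the group; keep empty groups as-is.
        updated = [
            (mean(w for w, _ in g), mean(h for _, h in g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(boxes, centroids)


def mean_iou_per_group(boxes, names=("small", "medium", "large")):
    """Mean IoU between each box and its anchor, reported per size group."""
    centroids, groups = kmeans_anchors(boxes, k=len(names))
    # Rank clusters by anchor area so the names map to small/medium/large.
    order = sorted(range(len(centroids)), key=lambda i: centroids[i][0] * centroids[i][1])
    return {
        name: mean(iou_wh(box, centroids[i]) for box in groups[i])
        for name, i in zip(names, order)
        if groups[i]
    }


# Made-up (width, height) pairs in pixels, for illustration only.
sample_boxes = [
    (10, 10), (12, 12), (11, 9),
    (50, 50), (52, 48), (48, 52),
    (200, 200), (210, 190), (190, 210),
]
print(mean_iou_per_group(sample_boxes))
```

A mean IoU close to 1 for a group indicates that the boxes in that group closely match the shape of their anchor.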
boxes_info
boxes_info() -> ImageViewer
Plots a histogram of the number of bounding boxes per image in the dataset.
Returns:
- ImageViewer – An ImageViewer containing the histogram for the train and validation datasets. It can be displayed in a Jupyter notebook using IPython.display, or shown with matplotlib by calling ImageViewer.show().
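As a rough illustration of the quantity behind this histogram, the sketch below counts how many images have 0, 1, 2, ... boxes. It is not the library's code: `boxes_per_image_histogram` is a hypothetical helper, and the annotations are made-up data in which each inner list holds the boxes of one image.

```python
from collections import Counter


def boxes_per_image_histogram(annotations):
    """Map 'number of boxes in an image' to 'number of images with that count'."""
    counts = Counter(len(boxes) for boxes in annotations)
    return dict(sorted(counts.items()))


# Made-up annotations: one inner list per image, one entry per bounding box.
annotations = [["car", "person"], ["car"], [], ["dog", "dog"]]
print(boxes_per_image_histogram(annotations))  # {0: 1, 1: 1, 2: 2}
```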
class_distribution
class_distribution() -> ImageViewer
Plots image and box class distributions.
The plot shows how many boxes of each class are present in the training and validation datasets, and how many images contain each class.
Returns:
- ImageViewer – An ImageViewer containing the plots for the train and validation datasets. It can be displayed in a Jupyter notebook using IPython.display, or shown with matplotlib by calling ImageViewer.show().
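The difference between the two distributions (boxes per class versus images per class) can be sketched with plain counters. This is an illustrative example with made-up labels and a hypothetical `class_counts` helper, not the library's implementation.

```python
from collections import Counter


def class_counts(annotations):
    """Count boxes per class and images per class.

    annotations: one list of class labels per image, one label per box.
    """
    box_counts = Counter()
    image_counts = Counter()
    for labels in annotations:
        box_counts.update(labels)         # every box counts
        image_counts.update(set(labels))  # each class counts once per image
    return box_counts, image_counts


# Made-up labels for three images.
annotations = [["car", "car", "person"], ["car"], []]
box_counts, image_counts = class_counts(annotations)
print(dict(box_counts))    # {'car': 3, 'person': 1}
print(dict(image_counts))  # {'car': 2, 'person': 1}
```

Here "car" appears in three boxes but only two images, which is the gap the two plots make visible.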
describe_table
describe_table() -> None
Displays a table listing every column in the dataframe along with its description.
get_golden_df
get_golden_df(all_columns: bool = False) -> DataFrame
Generates a recipe DataFrame with the metrics calculated on this dataset.
Parameters:
- all_columns (bool, default: False) – If True, all available metric columns are returned. Otherwise, only a subset of the more relevant columns is returned.
Returns:
- pd.DataFrame – A pandas DataFrame of golden recipes, including metrics, model_family, sppr, backbone, etc.
get_samples
get_samples(split: str = 'train', num_samples=16, seed=42) -> ImageViewer
Show a randomly selected subset of data.
Parameters:
- split (str, default: 'train') – The dataset split to sample from, either train or val.
- num_samples (default: 16) – The number of samples to return.
- seed (default: 42) – The seed used for the random sampling.
Returns:
- ImageViewer – An ImageViewer containing a grid of the selected images. It can be displayed in a Jupyter notebook using IPython.display, or shown with matplotlib by calling ImageViewer.show().
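To illustrate why a seed parameter is useful, the sketch below uses Python's standard random module to show that a fixed seed selects the same subset every time. How get_samples performs its sampling internally is not documented here, so this is only an analogy; `pick_indices` is a hypothetical helper.

```python
import random


def pick_indices(n_items, num_samples=16, seed=42):
    """Pick up to num_samples distinct indices, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return rng.sample(range(n_items), min(num_samples, n_items))


first = pick_indices(1000)
second = pick_indices(1000)
print(first == second)  # True: same seed, same selection
```

Changing the seed would typically produce a different selection, which is useful for browsing more of the dataset while keeping each view reproducible.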
get_table_description
get_table_description() -> Dict
Returns a dictionary mapping column names to their descriptions.
Returns:
- Dict – A dictionary mapping column names to their descriptions.
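One possible use of this dictionary is to annotate a list of column names with their descriptions. The sketch below assumes only that the dictionary maps column names to description strings; `describe_columns` is a hypothetical helper, and the description texts are placeholders rather than real output from the library.

```python
def describe_columns(columns, descriptions):
    """Pair each column name with its description, flagging missing entries."""
    return [f"{col}: {descriptions.get(col, '(no description)')}" for col in columns]


# Placeholder descriptions, standing in for get_table_description() output.
descriptions = {
    "backbone": "placeholder description of the backbone column",
    "model_family": "placeholder description of the model_family column",
}

for line in describe_columns(["model_family", "backbone", "sppr"], descriptions):
    print(line)
```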
resolution
resolution() -> ImageViewer
Plots the resolution of the train and validation images.
Returns:
- ImageViewer – An ImageViewer containing the resolution information for the train and validation datasets. It can be displayed in a Jupyter notebook using IPython.display, or shown with matplotlib by calling ImageViewer.show().