
Offline Mode

These functions are designed to facilitate offline execution of recipes in LEIP Design. Below, you will find reference documentation for the key public methods available in the offline module. You can also review their source code by expanding the dropdowns.

To enable offline execution, ensure your pantry and recipes are properly set up as described in the examples below.

from leip_recipe_designer.helpers import offline
from leip_recipe_designer import Pantry

# Setup a pantry
pantry = Pantry.build("./local_pantry")
offline.setup(pantry)

Offline Execution Functions

Setup

leip_recipe_designer.helpers.offline.setup

setup(pantry)
Source code in leip_recipe_designer/helpers/offline.py
def setup(pantry):
    _setup(pantry)

Single Recipe Preparation

leip_recipe_designer.helpers.offline.single_recipe

single_recipe(recipe: RecipeNode, task='train')

Prepare a specific recipe for offline execution. The function does not modify the incoming recipe. To make a recipe offline-ready, a minimal training and validation run is performed, which requires some data. If you want certain data available offline, you can use this function to download that data as well. Note that this is not required if you bring the data yourself when offline; the latter can be addressed by using any of the BYOD (Bring Your Own Data) recipe components (see the LEIP Design documentation).

The function makes the following modifications to the recipe before running it (the incoming recipe is not modified):

  • Set the fast_dev_run flag to True
  • Set the use_pretrained flag to True (i.e. force the use of a pretrained model)
  • Set the batch size to 1 (to enable offlining a minimal-size model)

Parameters:

  • recipe (RecipeNode) –

    The recipe to be made available offline

Examples:

>>> import leip_recipe_designer as lrd
>>>
>>> pantry = lrd.Pantry.build('my_pantry')
>>> recipe = lrd.create.empty_detection_recipe(pantry=pantry)
>>> recipe.fill_empty_recursively({'data_generator': 'Smoke (data/sets/url/detection/smoke-pascal-like)'})
>>> lrd.helpers.offline.single_recipe(recipe)
Source code in leip_recipe_designer/helpers/offline.py
def single_recipe(recipe: "RecipeNode", task="train"):
    """Prepare a specific recipe for offline execution. The function does not modify the incoming recipe. The process
    to make a recipe offline-ready is to do a minimal training and validation run. It does require some data to run in
    preparation for offline execution. If you want to have certain data available offline, you can use this function to
    download the data as well. Note that this is not required if you bring the data yourself when offline. The latter
    can be addressed by using any of the BYOD (Bring Your Own Data) recipe components (see LEIP Design documentation).

    The function makes the following modifications to the recipe before running it (the incoming recipe is not
    modified):
    - Set the fast_dev_run flag to True
    - Set the use_pretrained flag to True (i.e. force to use a pretrained model)
    - Set the batch size to 1 (to enable offlining a minimal size model)

    Parameters
    ----------
    recipe:
        The recipe to be made available offline

    Examples
    ----------
    >>> import leip_recipe_designer as lrd
    >>>
    >>> pantry = lrd.Pantry.build('my_pantry')
    >>> recipe = lrd.create.empty_detection_recipe(pantry=pantry)
    >>> recipe.fill_empty_recursively({'data_generator': 'Smoke (data/sets/url/detection/smoke-pascal-like)'})
    >>> lrd.helpers.offline.single_recipe(recipe)

    """
    _single_recipe(recipe, task=task)

Query Pareto Optimal Recipes

leip_recipe_designer.helpers.offline.query_pareto_optimal_recipes

query_pareto_optimal_recipes(df, method='exact', metric_custom='od_inf_rate_ms', sort_by='metric_custom_best_to_worst', top_N_fuzzy=None) -> Tuple[List[int], Dict[str, Any]]

Get the pareto optimal experiments from the Golden Recipe dataframe

The function supports two methods:

  • exact: the strict pareto front, i.e. every returned experiment is locally optimal for the given two metrics in the pareto sense.

  • fuzzy: a smooth approximation of the strict front. The function estimates empirically how close each experiment is to pareto-optimality, taking the empirical density of recipes into account, i.e. regions of the pareto space that are easier to reach are biased against. The result is a ranking of experiments, from which the top N can be queried. The top N experiments are not guaranteed to be exactly the strict pareto-optimal ones; however, the fuzzy method allows traversing more experiments than the exact method by varying N.
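For intuition, the strict ("exact") front on two higher-is-better metrics can be sketched as follows. This is a minimal standalone illustration with made-up values, not the library's actual implementation:

```python
import numpy as np

def exact_pareto_indices(x, y):
    """Return indices of points not dominated by any other point.

    Point i is dominated if some j has x[j] >= x[i] and y[j] >= y[i]
    with at least one strict inequality (both metrics 'higher is better').
    """
    x, y = np.asarray(x), np.asarray(y)
    keep = []
    for i in range(len(x)):
        dominated = np.any(
            (x >= x[i]) & (y >= y[i]) & ((x > x[i]) | (y > y[i]))
        )
        if not dominated:
            keep.append(i)
    return np.array(keep)

# Three candidates: the middle one is dominated by the first
x = np.array([3.0, 2.0, 1.0])   # e.g. a normalized speed metric
y = np.array([0.9, 0.8, 0.95])  # e.g. a relative task metric
print(exact_pareto_indices(x, y))  # -> [0 2]
```

The fuzzy method instead scores every point by an estimated "paretoness" and returns the best N, so near-optimal points can also be explored.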

To ensure that there is a trade-off (and thus a meaningful pareto frontier), one metric of the pareto frontier is fixed to 'relative_task_metric'; the other can be chosen by the user.

In the following, the fixed metric is referred to as 'metric_accuracy' or the 'accuracy metric', and the user-chosen metric as 'metric_custom' or the 'custom metric'.

The function automatically converts metrics to conform to "larger value is better" for the pareto front computation. The conversion is usually performed by dividing the maximum value of the metric by each metric value. The conversion is applied to the entire set of values, not only the potentially pareto-optimal ones.

The clear text names of any normalized metrics are returned in the output dictionary, such as "x times faster" (for timings), or "x times smaller" (for sizes and energy).
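The described max-over-value conversion can be sketched like this; the helper name here is illustrative, not the library's internal normalization function:

```python
import numpy as np

def to_bigger_is_better(values):
    """Convert a 'lower is better' metric (e.g. inference time in ms)
    so that larger values are better, via max(values) / values.
    The best (smallest) raw value maps to the largest score."""
    values = np.asarray(values, dtype=float)
    return values.max() / values

times_ms = [10.0, 20.0, 40.0]
print(to_bigger_is_better(times_ms))  # -> [4. 2. 1.]
```

Under this convention, the fastest experiment reads naturally as "4 times faster" than the slowest, matching the clear-text metric names returned in the output dictionary.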

Parameters:

  • df

    The dataframe containing the experiments from the Golden Recipe Database

  • method

    The method to use to compute the pareto front. Can be either 'exact' or 'fuzzy'

  • metric_custom

    The "other" metric to use for the pareto front. The second metric is fixed to 'relative_task_metric'.

  • sort_by

    The sorting method to use for the pareto front. Can be any of the following:

      • metric_custom_best_to_worst: sort strictly by the provided metric, best to worst (default)
      • metric_accuracy_best_to_worst: sort strictly by the accuracy metric, best to worst
      • angular_custom_to_accuracy: sort by angular value from a common origin, from the custom metric to the accuracy metric. This is useful if one wants to explore the trade-off between the two metrics
      • paretoness: sort by the "estimated paretoness" value from best to worst. There is no particular order between the custom and accuracy metric.
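The angular ordering can be pictured with a small sketch (illustrative values, not library internals): after shifting both metrics so the smallest value sits at the origin, experiments are ordered by the angle arctan2(accuracy, custom), sweeping from custom-metric-dominant to accuracy-dominant points.

```python
import numpy as np

# Sketch of sort_by='angular_custom_to_accuracy' on three pareto points.
x = np.array([4.0, 2.5, 1.0])    # custom metric (bigger is better)
y = np.array([0.7, 0.85, 0.95])  # accuracy metric (bigger is better)

# Shift so angles start at 0, then order by arctan2(y, x):
# small angles favor the custom metric, large angles the accuracy metric.
xs, ys = x - x.min(), y - y.min()
order = np.argsort(np.arctan2(ys, xs))
print(order)  # -> [0 1 2]
```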

Return Values

A tuple of two values:

  • The first value is a list of the experiment names that are pareto optimal, sorted according to the sort_by parameter
  • The second value is a dictionary containing various meta information about the pareto front:
      • 'metric_custom': the values of the first metric for the pareto optimal experiments (sorted according to the sort_by parameter)
      • 'metric_accuracy': the values of the second metric for the pareto optimal experiments (sorted according to the sort_by parameter)
      • 'metric_custom_name': the name of the first metric (as it may have been normalized to conform to "larger value is better")
      • 'metric_accuracy_name': the name of the second metric (as it may have been normalized to conform to "larger value is better")
      • 'metric_custom_all': the values of the first metric for all experiments, in no particular order
      • 'metric_accuracy_all': the values of the second metric for all experiments, in no particular order

Examples:

>>> # Illustrative usage; `df` is assumed to be a Golden Recipe dataframe
>>> ids, meta = lrd.helpers.offline.query_pareto_optimal_recipes(df, method='exact')
Source code in leip_recipe_designer/helpers/offline.py
def query_pareto_optimal_recipes(
    df,
    method="exact",
    metric_custom="od_inf_rate_ms",
    sort_by="metric_custom_best_to_worst",
    top_N_fuzzy=None,
) -> Tuple[List[int], Dict[str, Any]]:
    """
    Get the pareto optimal experiments from the Golden Recipe dataframe

    The function supports two methods:
    - exact: the strict pareto front, i.e. every returned experiment is locally optimal for the given two metrics in
    the pareto sense

    - fuzzy: a smooth approximation of the strict front. The function estimates empirically how close each
        experiment is to pareto-optimality, taking the empirical density of recipes into account, i.e. regions of
        the pareto space that are easier to reach are biased against. The result is a ranking of experiments, from
        which the top N can be queried. It is *not* guaranteed that the top N experiments are exactly the strict
        pareto optimal ones. However, the fuzzy pareto optimal method allows traversing more experiments than the
        strict pareto optimal method by choosing various N.

    To ensure that there is a trade-off (and thus a meaningful pareto frontier), one metric of the pareto frontier
    is fixed to 'relative_task_metric'; the other can be chosen by the user.

    In the following, we will refer to the fixed metric as 'metric_accuracy' or 'accuracy metric' and the user chosen
    metric as 'metric_custom' or 'custom metric'.

    The function automatically converts metrics to conform to "larger value is better" for the pareto front
    computation. The conversion is usually performed by taking the maximum value of the metric and dividing it by the
    metric value. This is done to ensure that the pareto front is computed with "higher is better" in mind.
    The conversion is done on the entire set of values, not only the potentially pareto-optimal ones.

    The clear text names of any normalized metrics are returned in the output dictionary, such as "x times faster"
    (for timings), or "x times smaller" (for sizes and energy).

    Parameters
    ----------
    df:
        The dataframe containing the experiments from the Golden Recipe Database
    method:
        The method to use to compute the pareto front. Can be either 'exact' or 'fuzzy'
    metric_custom:
        The "other" metric to use for the pareto front. The second metric is fixed to 'relative_task_metric'.
    sort_by:
        The sorting method to use for the pareto front. Can be any of the following:
        metric_custom_best_to_worst: sort strictly by provided metric, best to worst (default)
        metric_accuracy_best_to_worst: sort strictly by accuracy metric, best to worst
        angular_custom_to_accuracy: sort by angular value from common origin from custom metric to accuracy metric.
        This is useful if one wants to explore the tradeoff between the two metrics
        paretoness: sort by the "estimated paretoness" value from best to worst. There is no particular order between
        custom and accuracy metric.

    Return Values
    ----------
    A tuple of two values:
    - The first value is a list of the experiment names that are pareto optimal, sorted according to the sort_by
    parameter
    - The second value is a dictionary containing various meta information about the pareto front:
        - 'metric_custom': the values of the first metric for the pareto optimal experiments (sorted according to the
        sort_by parameter)
        - 'metric_accuracy': the values of the second metric for the pareto optimal experiments (sorted according to
        the sort_by parameter)
        - 'metric_custom_name': the name of the first metric (as it may have been normalized to conform to "larger
        value is better")
        - 'metric_accuracy_name': the name of the second metric (as it may have been normalized to conform to
        "larger value is better")
        - 'metric_custom_all': the values of the first metric for all experiments in no particular order
        - 'metric_accuracy_all': the values of the second metric for all experiments in no particular order

    Examples
    ----------
    >>> # Illustrative usage; `df` is assumed to be a Golden Recipe dataframe
    >>> ids, meta = query_pareto_optimal_recipes(df, method='exact')

    """

    metric_2 = "relative_task_metric"
    metric_1 = metric_custom

    if metric_1 not in [
        "od_inf_rate_ms",
        "if_inf_macs",
        "od_total_time",
        "od_total_mem_MB",
        "od_gpu_mem_MB",
        "od_cpu_mem_MB",
        "od_unified_mem_MB",
        "if_inf_size",
        "od_size_MB",
        "od_inf_energy_mJ",
    ]:
        logger.error(f"Unsupported metric {metric_1} for pareto front")
        raise ValueError(f"Unsupported metric {metric_1} for pareto front")

    if method == "fuzzy":
        if top_N_fuzzy is None:
            logger.error("Please provide top_N_fuzzy when using fuzzy pareto mode")
            raise ValueError("Please provide top_N_fuzzy when using fuzzy pareto mode")

    assert sort_by in [
        "metric_custom_best_to_worst",
        "metric_accuracy_best_to_worst",
        "angular_custom_to_accuracy",
        "paretoness",
    ], f"Unknown sort_by {sort_by}"
    assert method in ["exact", "fuzzy"], f"Unknown method {method}"

    filtered_df = df[df[metric_1].notna()]
    filtered_df = filtered_df[filtered_df[metric_2].notna()]

    filtered_df_ri = filtered_df.reset_index()

    ret_meta: Dict[str, List] = {
        "metric_custom": [],
        "metric_accuracy": [],
        "metric_custom_name": [],
        "metric_accuracy_name": [],
        "metric_custom_all": [],
        "metric_accuracy_all": [],
    }

    if len(filtered_df) == 0:
        logger.warning(f"No experiments found where both metrics {metric_1} and {metric_2} have values")
        return [], ret_meta

    points_x, metric_x_name = _normalize_metric_to_bigger_is_better(
        metric_1, np.array(filtered_df[metric_1]), return_new_name=True
    )
    points_y, metric_y_name = _normalize_metric_to_bigger_is_better(
        metric_2, np.array(filtered_df[metric_2]), return_new_name=True
    )

    names = np.array(filtered_df_ri["id"])

    iv = None
    if method == "fuzzy":
        iv = _get_balanced_fuzzy_paretoness(points_x, points_y)

        # here we could also threshold the fuzzy pareto front, e.g. take the top 10% of points
        if top_N_fuzzy >= len(iv):
            top_N_fuzzy = len(iv) - 1

        fuzzy_pareto_idx = iv.argsort()[:top_N_fuzzy]
    elif method == "exact":
        fuzzy_pareto_idx = _get_exact_pareto_optimal(points_x, points_y)
    else:
        logger.error(f"Unknown method {method}")
        raise ValueError(f"Unknown method {method}")

    if sort_by == "metric_custom_best_to_worst":
        # sort by metric 1
        points_x_pareto = points_x[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(-points_x_pareto)]
    elif sort_by == "metric_accuracy_best_to_worst":
        # sort by metric 2
        points_y_pareto = points_y[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(-points_y_pareto)]
    elif sort_by == "angular_custom_to_accuracy":
        points_x_pareto = points_x[fuzzy_pareto_idx]
        points_y_pareto = points_y[fuzzy_pareto_idx]
        # subtract min to make the angles start at 0
        points_x_pareto -= points_x_pareto.min()
        points_y_pareto -= points_y_pareto.min()

        angles = np.arctan2(points_y_pareto, points_x_pareto)
        # sort the pareto index by angle
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(angles)]
    elif sort_by == "paretoness":
        # sort by fuzzy paretoness
        if iv is None:
            iv = _get_balanced_fuzzy_paretoness(points_x, points_y)

        iv_pareto = iv[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(iv_pareto)]
    else:
        logger.error(f"Unknown sort_by {sort_by}")
        raise ValueError(f"Unknown sort_by {sort_by}")

    return names[fuzzy_pareto_idx], {
        # 'grdb_id' : names[fuzzy_pareto_idx],
        "metric_custom": points_x[fuzzy_pareto_idx],
        "metric_accuracy": points_y[fuzzy_pareto_idx],
        "metric_custom_name": metric_x_name,
        "metric_accuracy_name": metric_y_name,
        "metric_custom_all": points_x,
        "metric_accuracy_all": points_y,
        # 'metric_1_original': np.array(filtered_df[metric_1])[fuzzy_pareto_idx],
        # 'metric_2_original': np.array(filtered_df[metric_2])[fuzzy_pareto_idx],
    }

GRDB Recipe Handling

leip_recipe_designer.helpers.offline.GRDB_recipe

GRDB_recipe(recipe_id, pantry, grdb_volume=None, grdb_task=None, data_generator=None, allow_upgrade=True, allow_deprecated=False, verbose=0, default=False)
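Fetch a recipe from the Golden Recipe Database by its id, attach a data generator, and prepare it for offline use via single_recipe. A hedged usage sketch is shown below; the recipe id is a placeholder, and the defaults (Smoke dataset, volume 'xval_det') only apply to the vision.detection.2d task:

```python
>>> import leip_recipe_designer as lrd
>>> from leip_recipe_designer.helpers import offline
>>>
>>> pantry = lrd.Pantry.build('my_pantry')
>>> # '<recipe-id>' is a placeholder. With no data_generator or grdb_volume
>>> # given, the defaults for vision.detection.2d are used.
>>> offline.GRDB_recipe('<recipe-id>', pantry)
```

For any other grdb_task, an explicit data_generator and grdb_volume must be provided, otherwise the function raises a ValueError.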
Source code in leip_recipe_designer/helpers/offline.py
def GRDB_recipe(
    recipe_id,
    pantry,
    grdb_volume=None,
    grdb_task=None,
    data_generator=None,
    allow_upgrade=True,
    allow_deprecated=False,
    verbose=0,
    default=False,
):
    if grdb_task is None:
        grdb_task = "vision.detection.2d"

    if data_generator is None:
        if grdb_task == "vision.detection.2d":
            try:
                data_generator = lrd.helpers.data.get_data_generator_by_name(
                    pantry, "Smoke (data/sets/url/detection/smoke-pascal-like)"
                )
            except Exception:
                logger.error(
                    "I was not able to create a default dataset for this recipe. "
                    "Please provide a data generator via the data_generator parameter"
                )
                raise ValueError(
                    "I was not able to create a default dataset for this recipe. "
                    "Please provide a data generator via the data_generator parameter"
                )
        else:
            logger.error(f"Please provide a data generator for task {grdb_task}")
            raise ValueError(f"Please provide a data generator for task {grdb_task}")

    if grdb_volume is None:
        if grdb_task == "vision.detection.2d":
            grdb_volume = "xval_det"
        else:
            logger.error(f"Please provide a volume for task {grdb_task}")
            raise ValueError(f"Please provide a volume for task {grdb_task}")

    recipe = lrd.create.from_recipe_id(
        recipe_id,
        pantry=pantry,
        volume=grdb_volume,
        task=grdb_task,
        allow_upgrade=allow_upgrade,
        allow_deprecated=allow_deprecated,
        verbose=verbose,
        default=default,
    )
    lrd.helpers.data.replace_data_generator(recipe, data_generator)

    single_recipe(recipe)