
Offline Mode

These functions are designed to facilitate offline execution of recipes in LEIP Design. Below, you will find reference documentation for the key public methods available in the offline module. You can also review their source code by expanding the dropdowns.

To enable offline execution, ensure your pantry and recipes are properly set up as described in the examples below.

from leip_recipe_designer.helpers import offline
from leip_recipe_designer import Pantry

# Setup a pantry
pantry = Pantry.build("./local_pantry")
offline.setup(pantry)

Offline Execution Functions

Setup

leip_recipe_designer.helpers.offline.setup

setup(pantry)
Source code in leip_recipe_designer/helpers/offline.py
def setup(pantry):
    _setup(pantry)

Single Recipe Preparation

leip_recipe_designer.helpers.offline.single_recipe

single_recipe(recipe: RecipeNode, task='train')

Prepare a specific recipe for offline execution. The function does not modify the incoming recipe. To make a recipe offline-ready, a minimal training and validation run is performed, which requires some data. If you want certain data available offline, you can use this function to download that data as well. Note that this is not required if you bring the data yourself when offline; the latter can be addressed by using any of the BYOD (Bring Your Own Data) recipe components (see the LEIP Design documentation).

The function makes the following modifications to the recipe before running it (the incoming recipe is not modified):

  • Set the fast_dev_run flag to True
  • Set the use_pretrained flag to True (i.e. force the use of a pretrained model)
  • Set the batch size to 1 (to enable offlining a minimal-size model)

Parameters:

  • recipe (RecipeNode) –

    The recipe to be made available offline

Examples:

>>> import leip_recipe_designer as lrd
>>>
>>> pantry = lrd.Pantry.build('my_pantry')
>>> recipe = lrd.create.empty_detection_recipe(pantry=pantry)
>>> recipe.fill_empty_recursively({'data_generator': 'Smoke (data/sets/url/detection/smoke-pascal-like)'})
>>> lrd.helpers.offline.single_recipe(recipe)
Source code in leip_recipe_designer/helpers/offline.py
def single_recipe(recipe: "RecipeNode", task="train"):
    """Prepare a specific recipe for offline execution. The function does not modify the incoming recipe. The process
    to make a recipe offline-ready is to do a minimal training and validation run. It does require some data to run in
    preparation for offline execution. If you want to have certain data available offline, you can use this function to
    download the data as well. Note that this is not required if you bring the data yourself when offline. The latter
    can be addressed by using any of the BYOD (Bring Your Own Data) recipe components (see LEIP Design documentation).

    The function makes the following modifications to the recipe before running it (the incoming recipe is not
    modified):
    - Set the fast_dev_run flag to True
    - Set the use_pretrained flag to True (i.e. force to use a pretrained model)
    - Set the batch size to 1 (to enable offlining a minimal size model)

    Parameters
    ----------
    recipe:
        The recipe to be made available offline

    Examples
    ----------
    >>> import leip_recipe_designer as lrd
    >>>
    >>> pantry = lrd.Pantry.build('my_pantry')
    >>> recipe = lrd.create.empty_detection_recipe(pantry=pantry)
    >>> recipe.fill_empty_recursively({'data_generator': 'Smoke (data/sets/url/detection/smoke-pascal-like)'})
    >>> lrd.helpers.offline.single_recipe(recipe)

    """
    _single_recipe(recipe, task=task)

Query Pareto Optimal Recipes

leip_recipe_designer.helpers.offline.query_pareto_optimal_recipes

query_pareto_optimal_recipes(df, method='exact', metric_custom='od_inf_rate_ms', sort_by='metric_custom_best_to_worst', top_N_fuzzy=None) -> Tuple[List[int], Dict[str, Any]]

Get the pareto optimal experiments from the Golden Recipe dataframe

The function supports two methods:

  • exact: the strict pareto front, i.e. every returned experiment is locally optimal for the given two metrics in the pareto sense.

  • fuzzy: a smooth approximation of the strict front. The function estimates empirically how close each experiment is to pareto-optimality, taking the empirical density of recipes into account, i.e. regions of the pareto space that are easier to reach are biased against. The result is a ranking of experiments, from which the top N can be queried. The top N experiments are not guaranteed to be exactly the strict pareto-optimal ones; however, the fuzzy method allows traversing more experiments than the exact method by varying N.
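For intuition, the strict ("exact") front on two higher-is-better metrics can be sketched as follows. This is a minimal standalone illustration with made-up values, not the library's actual implementation:

```python
import numpy as np

def exact_pareto_indices(x, y):
    """Return indices of points not dominated by any other point.

    Point i is dominated if some j has x[j] >= x[i] and y[j] >= y[i]
    with at least one strict inequality (both metrics 'higher is better').
    """
    x, y = np.asarray(x), np.asarray(y)
    keep = []
    for i in range(len(x)):
        dominated = np.any(
            (x >= x[i]) & (y >= y[i]) & ((x > x[i]) | (y > y[i]))
        )
        if not dominated:
            keep.append(i)
    return np.array(keep)

# Three candidates: the middle one is dominated by the first
x = np.array([3.0, 2.0, 1.0])   # e.g. a normalized speed metric
y = np.array([0.9, 0.8, 0.95])  # e.g. a relative task metric
print(exact_pareto_indices(x, y))  # -> [0 2]
```

The fuzzy method instead scores every point by an estimated "paretoness" and returns the best N, so near-optimal points can also be explored.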

To ensure that there is a trade-off (and thus a meaningful pareto frontier), one metric of the pareto frontier is fixed to 'relative_task_metric'; the other can be chosen by the user.

In the following, the fixed metric is referred to as 'metric_accuracy' or the 'accuracy metric', and the user-chosen metric as 'metric_custom' or the 'custom metric'.

The function automatically converts metrics to conform to "larger value is better" for the pareto front computation. The conversion is usually performed by dividing the maximum value of the metric by each metric value. The conversion is applied to the entire set of values, not only the potentially pareto-optimal ones.

The clear text names of any normalized metrics are returned in the output dictionary, such as "x times faster" (for timings), or "x times smaller" (for sizes and energy).
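The described max-over-value conversion can be sketched like this; the helper name here is illustrative, not the library's internal normalization function:

```python
import numpy as np

def to_bigger_is_better(values):
    """Convert a 'lower is better' metric (e.g. inference time in ms)
    so that larger values are better, via max(values) / values.
    The best (smallest) raw value maps to the largest score."""
    values = np.asarray(values, dtype=float)
    return values.max() / values

times_ms = [10.0, 20.0, 40.0]
print(to_bigger_is_better(times_ms))  # -> [4. 2. 1.]
```

Under this convention, the fastest experiment reads naturally as "4 times faster" than the slowest, matching the clear-text metric names returned in the output dictionary.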

Parameters:

  • df

    The dataframe containing the experiments from the Golden Recipe Database

  • method

    The method to use to compute the pareto front. Can be either 'exact' or 'fuzzy'

  • metric_custom

    The "other" metric to use for the pareto front. The second metric is fixed to 'relative_task_metric'.

  • sort_by

    The sorting method to use for the pareto front. Can be any of the following:

      • metric_custom_best_to_worst: sort strictly by the provided metric, best to worst (default)
      • metric_accuracy_best_to_worst: sort strictly by the accuracy metric, best to worst
      • angular_custom_to_accuracy: sort by angular value from a common origin, from the custom metric to the accuracy metric. This is useful if one wants to explore the trade-off between the two metrics
      • paretoness: sort by the "estimated paretoness" value from best to worst. There is no particular order between the custom and accuracy metric.
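The angular ordering can be pictured with a small sketch (illustrative values, not library internals): after shifting both metrics so the smallest value sits at the origin, experiments are ordered by the angle arctan2(accuracy, custom), sweeping from custom-metric-dominant to accuracy-dominant points.

```python
import numpy as np

# Sketch of sort_by='angular_custom_to_accuracy' on three pareto points.
x = np.array([4.0, 2.5, 1.0])    # custom metric (bigger is better)
y = np.array([0.7, 0.85, 0.95])  # accuracy metric (bigger is better)

# Shift so angles start at 0, then order by arctan2(y, x):
# small angles favor the custom metric, large angles the accuracy metric.
xs, ys = x - x.min(), y - y.min()
order = np.argsort(np.arctan2(ys, xs))
print(order)  # -> [0 1 2]
```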

Return Values

A tuple of two values:

  • The first value is a list of the experiment names that are pareto optimal, sorted according to the sort_by parameter
  • The second value is a dictionary containing various meta information about the pareto front:
      • 'metric_custom': the values of the first metric for the pareto optimal experiments (sorted according to the sort_by parameter)
      • 'metric_accuracy': the values of the second metric for the pareto optimal experiments (sorted according to the sort_by parameter)
      • 'metric_custom_name': the name of the first metric (as it may have been normalized to conform to "larger value is better")
      • 'metric_accuracy_name': the name of the second metric (as it may have been normalized to conform to "larger value is better")
      • 'metric_custom_all': the values of the first metric for all experiments, in no particular order
      • 'metric_accuracy_all': the values of the second metric for all experiments, in no particular order

Examples:

>>> # Illustrative usage; `df` is assumed to be a Golden Recipe dataframe
>>> ids, meta = lrd.helpers.offline.query_pareto_optimal_recipes(df, method='exact')
Source code in leip_recipe_designer/helpers/offline.py
def query_pareto_optimal_recipes(
    df,
    method="exact",
    metric_custom="od_inf_rate_ms",
    sort_by="metric_custom_best_to_worst",
    top_N_fuzzy=None,
) -> Tuple[List[int], Dict[str, Any]]:
    """
    Get the pareto optimal experiments from the Golden Recipe dataframe

    The function supports two methods:
    - exact: the strict pareto front, i.e. every returned experiment is locally optimal for the given two metrics in
    the pareto sense

    - fuzzy: a smooth approximation of the strict front. The function estimates empirically how close each
        experiment is to pareto-optimality, taking the empirical density of recipes into account, i.e. regions of
        the pareto space that are easier to reach are biased against. The result is a ranking of experiments, from
        which the top N can be queried. It is *not* guaranteed that the top N experiments are exactly the strict
        pareto optimal ones. However, the fuzzy pareto optimal method allows traversing more experiments than the
        strict pareto optimal method by choosing various N.

    To ensure that there is a trade-off (and thus a meaningful pareto frontier), one metric of the pareto frontier
    is fixed to 'relative_task_metric'; the other can be chosen by the user.

    In the following, we will refer to the fixed metric as 'metric_accuracy' or 'accuracy metric' and the user chosen
    metric as 'metric_custom' or 'custom metric'.

    The function automatically converts metrics to conform to "larger value is better" for the pareto front
    computation. The conversion is usually performed by taking the maximum value of the metric and dividing it by the
    metric value. This is done to ensure that the pareto front is computed with "higher is better" in mind.
    The conversion is done on the entire set of values, not only the potentially pareto-optimal ones.

    The clear text names of any normalized metrics are returned in the output dictionary, such as "x times faster"
    (for timings), or "x times smaller" (for sizes and energy).

    Parameters
    ----------
    df:
        The dataframe containing the experiments from the Golden Recipe Database
    method:
        The method to use to compute the pareto front. Can be either 'exact' or 'fuzzy'
    metric_custom:
        The "other" metric to use for the pareto front. The second metric is fixed to 'relative_task_metric'.
    sort_by:
        The sorting method to use for the pareto front. Can be any of the following:
        metric_custom_best_to_worst: sort strictly by provided metric, best to worst (default)
        metric_accuracy_best_to_worst: sort strictly by accuracy metric, best to worst
        angular_custom_to_accuracy: sort by angular value from common origin from custom metric to accuracy metric.
        This is useful if one wants to explore the tradeoff between the two metrics
        paretoness: sort by the "estimated paretoness" value from best to worst. There is no particular order between
        custom and accuracy metric.

    Return Values
    ----------
    A tuple of two values:
    - The first value is a list of the experiment names that are pareto optimal, sorted according to the sort_by
    parameter
    - The second value is a dictionary containing various meta information about the pareto front:
        - 'metric_custom': the values of the first metric for the pareto optimal experiments (sorted according to the
        sort_by parameter)
        - 'metric_accuracy': the values of the second metric for the pareto optimal experiments (sorted according to
        the sort_by parameter)
        - 'metric_custom_name': the name of the first metric (as it may have been normalized to conform to "larger
        value is better")
        - 'metric_accuracy_name': the name of the second metric (as it may have been normalized to conform to
        "larger value is better")
        - 'metric_custom_all': the values of the first metric for all experiments in no particular order
        - 'metric_accuracy_all': the values of the second metric for all experiments in no particular order

    Examples
    ----------
    >>> # Illustrative usage; `df` is assumed to be a Golden Recipe dataframe
    >>> ids, meta = query_pareto_optimal_recipes(df, method='exact')

    """

    metric_2 = "relative_task_metric"
    metric_1 = metric_custom

    if metric_1 not in [
        "od_inf_rate_ms",
        "if_inf_macs",
        "od_total_time",
        "od_total_mem_MB",
        "od_gpu_mem_MB",
        "od_cpu_mem_MB",
        "od_unified_mem_MB",
        "if_inf_size",
        "od_size_MB",
        "od_inf_energy_mJ",
    ]:
        logger.error(f"Unsupported metric {metric_1} for pareto front")
        raise ValueError(f"Unsupported metric {metric_1} for pareto front")

    if method == "fuzzy":
        if top_N_fuzzy is None:
            logger.error("Please provide top_N_fuzzy when using fuzzy pareto mode")
            raise ValueError("Please provide top_N_fuzzy when using fuzzy pareto mode")

    assert sort_by in [
        "metric_custom_best_to_worst",
        "metric_accuracy_best_to_worst",
        "angular_custom_to_accuracy",
        "paretoness",
    ], f"Unknown sort_by {sort_by}"
    assert method in ["exact", "fuzzy"], f"Unknown method {method}"

    filtered_df = df[df[metric_1].notna()]
    filtered_df = filtered_df[filtered_df[metric_2].notna()]

    filtered_df_ri = filtered_df.reset_index()

    ret_meta: Dict[str, List] = {
        "metric_custom": [],
        "metric_accuracy": [],
        "metric_custom_name": [],
        "metric_accuracy_name": [],
        "metric_custom_all": [],
        "metric_accuracy_all": [],
    }

    if len(filtered_df) == 0:
        logger.warning(f"No experiments found where both metrics {metric_1} and {metric_2} have values")
        return [], ret_meta

    points_x, metric_x_name = _normalize_metric_to_bigger_is_better(
        metric_1, np.array(filtered_df[metric_1]), return_new_name=True
    )
    points_y, metric_y_name = _normalize_metric_to_bigger_is_better(
        metric_2, np.array(filtered_df[metric_2]), return_new_name=True
    )

    names = np.array(filtered_df_ri["id"])

    iv = None
    if method == "fuzzy":
        iv = _get_balanced_fuzzy_paretoness(points_x, points_y)

        # here we could also threshold the fuzzy pareto front, e.g. take the top 10% of points
        if top_N_fuzzy >= len(iv):
            top_N_fuzzy = len(iv) - 1

        fuzzy_pareto_idx = iv.argsort()[:top_N_fuzzy]
    elif method == "exact":
        fuzzy_pareto_idx = _get_exact_pareto_optimal(points_x, points_y)
    else:
        logger.error(f"Unknown method {method}")
        raise ValueError(f"Unknown method {method}")

    if sort_by == "metric_custom_best_to_worst":
        # sort by metric 1
        points_x_pareto = points_x[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(-points_x_pareto)]
    elif sort_by == "metric_accuracy_best_to_worst":
        # sort by metric 2
        points_y_pareto = points_y[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(-points_y_pareto)]
    elif sort_by == "angular_custom_to_accuracy":
        points_x_pareto = points_x[fuzzy_pareto_idx]
        points_y_pareto = points_y[fuzzy_pareto_idx]
        # subtract min to make the angles start at 0
        points_x_pareto -= points_x_pareto.min()
        points_y_pareto -= points_y_pareto.min()

        angles = np.arctan2(points_y_pareto, points_x_pareto)
        # sort the pareto index by angle
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(angles)]
    elif sort_by == "paretoness":
        # sort by fuzzy paretoness
        if iv is None:
            iv = _get_balanced_fuzzy_paretoness(points_x, points_y)

        iv_pareto = iv[fuzzy_pareto_idx]
        fuzzy_pareto_idx = fuzzy_pareto_idx[np.argsort(iv_pareto)]
    else:
        logger.error(f"Unknown sort_by {sort_by}")
        raise ValueError(f"Unknown sort_by {sort_by}")

    return names[fuzzy_pareto_idx], {
        # 'grdb_id' : names[fuzzy_pareto_idx],
        "metric_custom": points_x[fuzzy_pareto_idx],
        "metric_accuracy": points_y[fuzzy_pareto_idx],
        "metric_custom_name": metric_x_name,
        "metric_accuracy_name": metric_y_name,
        "metric_custom_all": points_x,
        "metric_accuracy_all": points_y,
        # 'metric_1_original': np.array(filtered_df[metric_1])[fuzzy_pareto_idx],
        # 'metric_2_original': np.array(filtered_df[metric_2])[fuzzy_pareto_idx],
    }

GRDB Recipe Handling

leip_recipe_designer.helpers.offline.GRDB_recipe

GRDB_recipe(recipe_id, pantry, grdb_volume=None, grdb_task=None, data_generator=None, allow_upgrade=True, allow_deprecated=False, verbose=0, default=False)
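Fetch a recipe from the Golden Recipe Database by its id, attach a data generator, and prepare it for offline use via single_recipe. A hedged usage sketch is shown below; the recipe id is a placeholder, and the defaults (Smoke dataset, volume 'xval_det') only apply to the vision.detection.2d task:

```python
>>> import leip_recipe_designer as lrd
>>> from leip_recipe_designer.helpers import offline
>>>
>>> pantry = lrd.Pantry.build('my_pantry')
>>> # '<recipe-id>' is a placeholder. With no data_generator or grdb_volume
>>> # given, the defaults for vision.detection.2d are used.
>>> offline.GRDB_recipe('<recipe-id>', pantry)
```

For any other grdb_task, an explicit data_generator and grdb_volume must be provided, otherwise the function raises a ValueError.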
Source code in leip_recipe_designer/helpers/offline.py
def GRDB_recipe(
    recipe_id,
    pantry,
    grdb_volume=None,
    grdb_task=None,
    data_generator=None,
    allow_upgrade=True,
    allow_deprecated=False,
    verbose=0,
    default=False,
):
    if grdb_task is None:
        grdb_task = "vision.detection.2d"

    if data_generator is None:
        if grdb_task == "vision.detection.2d":
            try:
                data_generator = lrd.helpers.data.get_data_generator_by_name(
                    pantry, "Smoke (data/sets/url/detection/smoke-pascal-like)"
                )
            except Exception:
                logger.error(
                    "I was not able to create a default dataset for this recipe. "
                    "Please provide a data generator via the data_generator parameter"
                )
                raise ValueError(
                    "I was not able to create a default dataset for this recipe. "
                    "Please provide a data generator via the data_generator parameter"
                )
        else:
            logger.error(f"Please provide a data generator for task {grdb_task}")
            raise ValueError(f"Please provide a data generator for task {grdb_task}")

    if grdb_volume is None:
        if grdb_task == "vision.detection.2d":
            grdb_volume = "xval_det"
        else:
            logger.error(f"Please provide a volume for task {grdb_task}")
            raise ValueError(f"Please provide a volume for task {grdb_task}")

    recipe = lrd.create.from_recipe_id(
        recipe_id,
        pantry=pantry,
        volume=grdb_volume,
        task=grdb_task,
        allow_upgrade=allow_upgrade,
        allow_deprecated=allow_deprecated,
        verbose=verbose,
        default=default,
    )
    lrd.helpers.data.replace_data_generator(recipe, data_generator)

    single_recipe(recipe)