Relay For Developers¶
Relay is an embedded language that serves as the intermediate representation for machine learning models within TVM. There is an introduction to intermediate representations told through a cooking analogy; please read that first. Below is a continuation and extension of that analogy for the developer audience.
Understanding Key Relay Expressions¶
Now, let's dive deeper into the technical terms you'll encounter in machine learning compilation with Relay, the IR that underpins the TVM compiler framework. To re-emphasize: Relay is a generalizable language (and intermediate representation) that describes the ML algorithm. Below is a breakdown of the common building blocks found within Relay.
1. IRModule: The Master Recipe Book
Think of an `IRModule` as your master recipe book. It's where you keep the adaptable versions of all your recipes. It's organized, detailed, and can be used to create specific recipes for different kitchens.
2. Function: Individual Recipes
In your recipe book, a `Function` is like an individual recipe. Each recipe has its own ingredients and steps, tailored for making a specific dish, i.e., a specific set of instructions to make a particular part of your model.
3. Var (Variable): Adjustable Ingredients
When a recipe says 'add sugar' but doesn't say how much yet (or specify a type, like brown sugar or syrup), 'sugar' is like a `Var` (short for 'Variable'). It's a placeholder for an ingredient whose amount might change depending on the dish you're making. In Relay, it's standard to represent the model's input(s) with a `Var` (and sometimes the weights as well).
4. Constant: Fixed Ingredients
Some ingredients in your recipe don't change, like the need for exactly a cup of flour in a cake. These are your `Constant` ingredients - they're always the same. In your model, a weight may be represented by a `Constant`.
5. Call: Cooking Actions
When you perform a step in your recipe, like mixing or baking, that's a `Call`. It's the action you take to bring your ingredients one step closer to a delicious dish. In your model, it's an operation like adding numbers or multiplying them.
6. Tuple: Ingredient Mixes
Sometimes you prepare a mix of ingredients together and put them in a bowl. A `Tuple` is like this bowl, holding different values or results together. It's a combination of things that you may use at the same time.
7. TupleGetItem: Selecting a Single Ingredient from a Mix
If you need to get just one thing from your mix, like only the eggs, that's like `TupleGetItem`. You're picking out a specific part of your combined ingredients.
8. GlobalVar: Famous, Widely-Used Recipes
Imagine you have a special technique that you use in many different recipes. The technique is written down once, stored separately, and referred to by name throughout your cookbook. This is like a `GlobalVar` - a named function or operation that you can reference from different parts of your model.
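To make the analogy concrete, here is a minimal sketch using TVM's Relay Python API that constructs each of these pieces by hand (the names and tensor shapes are illustrative, not taken from a real model):

```python
import numpy as np
import tvm
from tvm import relay

# Var: a placeholder input whose value is supplied at runtime.
x = relay.var("x", shape=(1, 4), dtype="float32")

# Constant: a fixed value baked into the model, e.g. a weight.
w = relay.const(np.ones((1, 4), dtype="float32"))

# Call: each operator invocation builds a Call node.
y = relay.multiply(x, w)
z = relay.sigmoid(y)

# Tuple and TupleGetItem: group results, then pick one back out.
pair = relay.Tuple([y, z])
first = relay.TupleGetItem(pair, 0)

# Function: a reusable "recipe" with parameters and a body.
func = relay.Function([x], first)

# IRModule: the "recipe book"; the function is bound to the
# GlobalVar @main automatically.
mod = tvm.IRModule.from_expr(func)
print(mod)
```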
Relay Syntax¶
Relay is pretty straightforward and should look familiar to the ML engineer/developer. It serves as a one-to-one representation of a model that originates from an ML framework like PyTorch, ONNX, or TensorFlow.
IRModule¶
One should expect to see something like the below after ingesting a model from an ML framework into Relay. The IRModule describes a whole and complete program. The example below will be used to detail all the sub-components. An IRModule is not an expression in itself, but it is composed of Relay expressions.
Note
Sections of the Relay text are contracted with ellipses.
```
def @main(%image: Tensor[(1, 3, 640, 640), float32], %onnx::Conv_1100: Tensor[(16, 3, 6, 6), float32], ..., %model.model.24.dfl.conv.weight: Tensor[(1, 16, 1, 1), float32]) -> Tensor[(1, 8400, 84), float32] {
  %0 = nn.conv2d(%image, %onnx::Conv_1100, strides=[2, 2], padding=[2, 2, 2, 2], channels=16, kernel_size=[6, 6]);
  %1 = nn.bias_add(%0, %onnx::Conv_1101);
  %2 = sigmoid(%1);
  %3 = multiply(%1, %2);
  %4 = nn.conv2d(%3, %onnx::Conv_1103, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3]);
  ...
  %404 = scatter_nd(%399, meta[relay.Constant][10], %403);
  %405 = expand_dims(%404, axis=0);
  %408 = strided_slice(%358, begin=[4], end=[84], strides=[1], axes=[2]);
  %409 = (%405, %408);
  %410 = concatenate(%409, axis=2);
  %411 = %409.0;
  %412 = (%410, %411)
}
```
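As a sketch, an IRModule like the one above can be produced by importing a model from a framework; here an ONNX file is assumed (the file name and input name are illustrative):

```python
import onnx
from tvm import relay

# Load a model exported from an ML framework (file name is illustrative).
onnx_model = onnx.load("model.onnx")

# Map the graph's input name to a static shape.
shape_dict = {"image": (1, 3, 640, 640)}

# Ingest into Relay: `mod` is the IRModule, `params` holds the weights.
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

# Print the textual Relay, similar to the example above.
print(mod)
```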
Function Expression¶
There is always a defined `main` function.
```
def @main(%image: Tensor[(1, 3, 640, 640), float32], %onnx::Conv_1100: Tensor[(16, 3, 6, 6), float32], ..., %model.model.24.dfl.conv.weight: Tensor[(1, 16, 1, 1), float32]) -> Tensor[(1, 8400, 84), float32] {
  ...
}
```
Function Syntax
- Functions are declared with a `def` keyword and may or may not have a name associated with them following the `def` keyword.
- In parentheses `(...)`, the parameter arguments for the function are declared.
- The function's return type (shape and dtype) is hinted with `->`.
- The body of the function is enclosed in curly braces `{...}`.
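A small sketch of retrieving and inspecting the main function from an IRModule (the trivial module built here is illustrative):

```python
import tvm
from tvm import relay

# Build a trivial module so there is a main function to inspect.
x = relay.var("x", shape=(1, 4), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.sigmoid(x)))

# Run type inference so the return type annotation is populated.
mod = relay.transform.InferType()(mod)

# Look up the function bound to the "main" GlobalVar.
main_fn = mod["main"]
print(main_fn.params)    # the declared parameter Vars
print(main_fn.ret_type)  # the hinted return type (shape and dtype)
```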
Var (Variable) Expression¶
Variables are commonplace in Relay models and can be used to represent both inputs and weights; they can be referenced via an assigned name. Inputs are always represented by Vars. Vars do not inherently have a value associated with them, i.e. the exact value will be provided at runtime or exists outside the model.
```
%image: Tensor[(1, 3, 640, 640), float32]
%0 = ...
%1 = ...
```
Var Syntax
- All variables are identifiable with the "%" prefix sigil.
- Variables are named with alphanumeric characters.
- Intermediate computations are stored as variables, e.g. `%1` (these intermediate values do not explicitly have an assigned variable name).
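For reference, a sketch of constructing a Var directly via the Python API (the name and shape are illustrative):

```python
from tvm import relay

# A Var carries a name, shape, and dtype, but no value; the value
# arrives at runtime or from outside the model.
image = relay.var("image", shape=(1, 3, 640, 640), dtype="float32")
print(image)  # the text form shows the %-prefixed name and its type
```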
Constant Expression¶
Constants, unlike variables, are explicitly defined with a static value. Constants do not have assigned names. Weights and other static values (i.e. non-input values) can be represented by constant expressions.
```
meta[relay.Constant][10]
```
Constant Syntax
- All constants are identifiable with the `meta[relay.Constant][...]` string.
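A sketch of constructing a Constant directly (the shape is illustrative):

```python
import numpy as np
from tvm import relay

# A Constant wraps a concrete value, e.g. a trained weight tensor.
weight = relay.const(np.zeros((16, 3, 6, 6), dtype="float32"))
print(weight.data.shape)  # (16, 3, 6, 6) - the static value is embedded
```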
Call Expression¶
Calls are steps of computation, which are "calls" to predefined functions or operators. Calls typically expect arguments.
```
%0 = nn.conv2d(%image, %onnx::Conv_1100, strides=[2, 2], padding=[2, 2, 2, 2], channels=16, kernel_size=[6, 6]);
%1 = nn.bias_add(%0, %onnx::Conv_1101);
%2 = sigmoid(%1);
```
Call Syntax
- Calls typically appear as operators, identifiable via the string representation, e.g. `nn.conv2d`, `nn.bias_add`, or `sigmoid`.
- Calls appear as lines of computation in the IRModule and the outputs are stored as intermediate variables, e.g. `%0 = ...`, `%1 = ...`.
- Calls will reference variables or constants as arguments to the operator or function, e.g. `nn.bias_add(%0, %onnx::Conv_1101)` or `sigmoid(%1)`.
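A sketch of building Call nodes via the Python API; each operator invocation below returns a Call expression (the shapes and attributes are illustrative):

```python
from tvm import relay

x = relay.var("x", shape=(1, 3, 640, 640), dtype="float32")
w = relay.var("w", shape=(16, 3, 6, 6), dtype="float32")

# nn.conv2d(...) constructs a Call node with x and w as arguments.
conv = relay.nn.conv2d(x, w, strides=(2, 2), padding=(2, 2),
                       channels=16, kernel_size=(6, 6))
act = relay.sigmoid(conv)

print(act.op.name)    # sigmoid - the operator being called
print(len(act.args))  # 1 - the argument list of the Call
```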
Tuple Expression¶
A Tuple is a sequential container of Relay expressions, akin to a list or tuple data structure seen in many programming languages.
```
%409 = (%405, %408);
%412 = (%410, %411);
```
Tuple Syntax
- Tuples are expressed with comma-separated expressions/references enclosed in parentheses, e.g. `(%x, %y, %z)`.
- Tuples can be of any size, and can contain any Relay expression.
- Tuples can be expressed literally (as shown above) but can also be implicitly expressed as the output of a Call.
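A sketch of constructing a Tuple via the Python API:

```python
from tvm import relay

a = relay.var("a", shape=(4,), dtype="float32")
b = relay.var("b", shape=(4,), dtype="float32")

# Group two expressions into a single Tuple value.
pair = relay.Tuple([a, b])
print(len(pair.fields))  # 2 - the contained expressions
```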
TupleGetItem Expression¶
TupleGetItem expressions are just a way of referencing a specific element within a Tuple by index.
```
%411 = %409.0
```
TupleGetItem Syntax
- Comprised of two parts: the reference to a tuple, and an index.
- The tuple reference and the element index are separated by a period, ".".
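A sketch of indexing into a Tuple with TupleGetItem:

```python
from tvm import relay

a = relay.var("a", shape=(4,), dtype="float32")
b = relay.var("b", shape=(4,), dtype="float32")
pair = relay.Tuple([a, b])

# Select element 0; the text form uses the period syntax shown above.
first = relay.TupleGetItem(pair, 0)
print(first.index)  # 0
```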
GlobalVar Expression¶
GlobalVars are special references that can be referenced within any "scope" of the IRModule program. Typically, only top-level functions are assigned GlobalVar references.
```
@main
```
GlobalVar Syntax
- Identifiable with the "@" prefix sigil.
- The "main" function is referenced via its assigned "main" GlobalVar.
- When graphs are partitioned, the subgraphs are represented as functions and are assigned unique GlobalVar references.
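A sketch of binding a function to a named GlobalVar in an IRModule (the name "my_subgraph" is illustrative):

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(4,), dtype="float32")
fn = relay.Function([x], relay.sigmoid(x))

# Bind the function to a GlobalVar so any scope in the module
# can reference it by name.
gv = relay.GlobalVar("my_subgraph")
mod = tvm.IRModule({gv: fn})
print(mod)  # shows: def @my_subgraph(%x: Tensor[(4), float32]) { ... }
```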