The following tables list classifier backbones that span a range of performance options. Latent AI selected these entries after testing hundreds of classifier backbones and keeping the models that gave the most accurate results at different target speeds in our device farm. The models are listed with the fastest at the top and the more accurate models at the bottom. Accuracy may vary depending on your dataset. Keep in mind that the dataset used in the tutorial is small and should not be used to assess accuracy.

Each of the models in the first table works for applications targeting ARM and x86, both CPU+GPU and CPU-only.

In addition to the backbone name in the first column (which you can pass to AF in Step One), we provide a relative latency metric. We determined this metric by training the models and then testing them on a cluster of Raspberry Pi devices. Latency depends on hardware settings, quantization settings, memory, and other factors, so the metric is best used for relative comparison between the models.
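As a rough illustration of using the metric for relative comparison, the sketch below divides one backbone's relative latency by another's. The three values are copied from Table 1; the `slowdown` helper is a hypothetical name introduced here, not part of any Latent AI tool, and the resulting ratio is only indicative because latency depends on the hardware and settings.

```python
# Relative latency values copied from Table 1 (unitless, for comparison only).
relative_latency = {
    "timm:semnasnet_075": 85,
    "timm:efficientnet_es": 318,
    "timm:resnet101": 1348,
}


def slowdown(candidate, baseline):
    """Hypothetical helper: how many times slower `candidate` is than `baseline`."""
    return relative_latency[candidate] / relative_latency[baseline]


ratio = slowdown("timm:resnet101", "timm:semnasnet_075")
print(f"timm:resnet101 is roughly {ratio:.1f}x slower than timm:semnasnet_075")
```

The ratio is more meaningful than either number on its own, since the absolute values were measured on one specific device setup.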

The per-channel column indicates whether the backbone is compatible with our per-channel quantization for CPU only (x86, ARM) Int8 optimization. In many cases, you will get better accuracy using per-channel quantization on these models. Some models utilize operators that are currently unsupported by our Int8 optimization. For optimizations targeting GPU, per-channel Int8 quantization is used by default.
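To show why per-channel quantization often preserves accuracy better, here is a minimal, generic sketch of symmetric Int8 quantization with one scale for the whole tensor versus one scale per output channel. It is an illustration of the general technique only, under the assumption of a simple max-absolute-value scale; it is not Latent AI's implementation, and the weight values and function names are made up for the example.

```python
def int8_scale(values):
    """Symmetric scale that maps the largest magnitude onto 127."""
    peak = max(abs(v) for v in values)
    return peak / 127 if peak > 0 else 1.0


def quantize_dequantize(values, scale):
    """Round trip float -> int8 -> float so the rounding error can be measured."""
    restored = []
    for v in values:
        q = max(-127, min(127, round(v / scale)))  # clamp to the int8 range
        restored.append(q * scale)
    return restored


def mean_abs_error(original, restored):
    """Average absolute difference over all weights."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(original, restored)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


# Made-up weights: two output channels with very different value ranges.
weights = [[0.01, -0.02, 0.015],
           [1.0, -0.8, 0.4]]

# Per-tensor: a single scale shared by every channel.
tensor_scale = int8_scale([v for row in weights for v in row])
per_tensor = [quantize_dequantize(row, tensor_scale) for row in weights]

# Per-channel: each output channel gets its own scale.
per_channel = [quantize_dequantize(row, int8_scale(row)) for row in weights]

per_tensor_error = mean_abs_error(weights, per_tensor)
per_channel_error = mean_abs_error(weights, per_channel)
print(f"per-tensor error:  {per_tensor_error:.6f}")
print(f"per-channel error: {per_channel_error:.6f}")
```

With a shared scale, the channel with small weights is squeezed into only a few integer levels; giving it its own scale recovers most of that precision.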

Table 1 - Classifier ML Recipe Variants

| Backbone | Relative Latency | Per-channel (CPU) | Limitations |
|---|---|---|---|
| timm:semnasnet_075 | 85 | Yes | |
| timm:efficientnet_es | 318 | Yes | |
| timm:efficientnet_em | 463 | Yes | |
| timm:gernet_m | 521 | Yes | |
| timm:regnety_032 | 623 | | |
| timm:gernet_l | 656 | Yes | |
| timm:swsl_resnet50 | 708 | Yes | |
| timm:efficientnet_el | 779 | Yes | |
| timm:swsl_resnext50_32x4d | 816 | | |
| timm:regnety_040 | 866 | | |
| timm:inception_v4 | 1160 | Yes | Width, Height > 64 |
| timm:regnety_064 | 1255 | | |
| timm:resnet101 | 1348 | Yes | |
| timm:inception_resnet_v2 | 1372 | Yes | Width, Height > 64 |
| timm:swsl_resnext101_32x4d | 1485 | | |
| timm:regnety_080 | 1666 | | |
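
One way to use Table 1 is to pick a backbone against a latency budget. The sketch below encodes the table rows and returns the slowest backbone that still fits the budget, on the assumption (taken from the ordering described above) that slower entries tend to be more accurate. `pick_backbone` and its parameters are hypothetical names for illustration; actual accuracy depends on your dataset, so treat the result as a starting point.

```python
# (backbone, relative latency, per-channel CPU support) copied from Table 1.
TABLE_1 = [
    ("timm:semnasnet_075", 85, True),
    ("timm:efficientnet_es", 318, True),
    ("timm:efficientnet_em", 463, True),
    ("timm:gernet_m", 521, True),
    ("timm:regnety_032", 623, False),
    ("timm:gernet_l", 656, True),
    ("timm:swsl_resnet50", 708, True),
    ("timm:efficientnet_el", 779, True),
    ("timm:swsl_resnext50_32x4d", 816, False),
    ("timm:regnety_040", 866, False),
    ("timm:inception_v4", 1160, True),
    ("timm:regnety_064", 1255, False),
    ("timm:resnet101", 1348, True),
    ("timm:inception_resnet_v2", 1372, True),
    ("timm:swsl_resnext101_32x4d", 1485, False),
    ("timm:regnety_080", 1666, False),
]


def pick_backbone(max_relative_latency, require_per_channel=False):
    """Return the slowest Table 1 backbone within the budget, or None.

    Heuristic only: assumes slower entries in the table are more accurate.
    """
    candidates = [
        row for row in TABLE_1
        if row[1] <= max_relative_latency
        and (row[2] or not require_per_channel)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])[0]


print(pick_backbone(700))
print(pick_backbone(650, require_per_channel=True))
```

Setting `require_per_channel=True` restricts the choice to backbones marked as compatible with per-channel quantization on CPU.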

Some additional models do not currently support Int8 quantization for GPU; they are listed in the second table. The first four may be of particular interest for low-SWaP (size, weight, and power) applications because of their low relative latency on modest CPU architectures.

Table 2 - Classifier ML Recipe Variants - No GPU Int8 Support

| Backbone | Relative Latency | Per-channel (CPU) | Limitations |
|---|---|---|---|
| ptcv:fdmobilenet_wd4 | 10 | Yes | only 224x224 |
| ptcv:fdmobilenet_wd2 | 13 | Yes | only 224x224 |
| ptcv:shufflenetv2_w2 | 18 | Yes | only 224x224 |
| ptcv:shufflenetv2b_w2 | 18 | Yes | only 224x224 |
| timm:ens_adv_inception_resnet_v2 | 1394 | Yes | Width, Height > 64 |
| timm:resnet152 | 1948 | Yes | |
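
The Limitations column can be checked before training. The sketch below is a hypothetical helper that tests an input size against the two limitation strings used in the tables. It assumes that "only 224x224" means the input must be exactly 224 by 224 and that "Width, Height > 64" means both dimensions must exceed 64; neither reading is spelled out in the tables, so confirm them for your setup.

```python
# Limitation strings copied from the tables; backbones not listed have none.
LIMITATIONS = {
    "timm:inception_v4": "Width, Height > 64",
    "timm:inception_resnet_v2": "Width, Height > 64",
    "timm:ens_adv_inception_resnet_v2": "Width, Height > 64",
    "ptcv:fdmobilenet_wd4": "only 224x224",
    "ptcv:fdmobilenet_wd2": "only 224x224",
    "ptcv:shufflenetv2_w2": "only 224x224",
    "ptcv:shufflenetv2b_w2": "only 224x224",
}


def input_size_ok(backbone, width, height):
    """Hypothetical check of an input size against the listed limitation."""
    limitation = LIMITATIONS.get(backbone)
    if limitation == "only 224x224":
        # Assumed meaning: exactly 224 by 224.
        return (width, height) == (224, 224)
    if limitation == "Width, Height > 64":
        # Assumed meaning: both dimensions strictly greater than 64.
        return width > 64 and height > 64
    return True  # no limitation listed for this backbone


print(input_size_ok("ptcv:shufflenetv2_w2", 224, 224))
print(input_size_ok("timm:inception_v4", 64, 64))
```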