Computer Vision Models - Pretrained Checkpoints

Pretrained Classification PyTorch Checkpoints

| Model | Dataset | Resolution | Top-1 | Top-5 | Latency (HW)* T4 | Latency (Production)** T4 | Latency (HW)* Jetson Xavier NX | Latency (Production)** Jetson Xavier NX | Latency Cascade Lake |
|---|---|---|---|---|---|---|---|---|---|
| ViT base | ImageNet21K | 224x224 | 84.15 | - | 4.46ms | 4.60ms | - * | - | 57.22ms |
| ViT large | ImageNet21K | 224x224 | 85.64 | - | 12.81ms | 13.19ms | - * | - | 187.22ms |
| BEiT | ImageNet21K | 224x224 | - | - | - | - | - * | - | - |
| EfficientNet B0 | ImageNet | 224x224 | 77.62 | 93.49 | 0.93ms | 1.38ms | - * | - | 3.44ms |
| RegNet Y200 | ImageNet | 224x224 | 70.88 | 89.35 | 0.63ms | 1.08ms | 2.16ms | 2.47ms | 2.06ms |
| RegNet Y400 | ImageNet | 224x224 | 74.74 | 91.46 | 0.80ms | 1.25ms | 2.62ms | 2.91ms | 2.87ms |
| RegNet Y600 | ImageNet | 224x224 | 76.18 | 92.34 | 0.77ms | 1.22ms | 2.64ms | 2.93ms | 2.39ms |
| RegNet Y800 | ImageNet | 224x224 | 77.07 | 93.26 | 0.74ms | 1.19ms | 2.77ms | 3.04ms | 2.81ms |
| ResNet 18 | ImageNet | 224x224 | 70.6 | 89.64 | 0.52ms | 0.95ms | 2.01ms | 2.30ms | 4.56ms |
| ResNet 34 | ImageNet | 224x224 | 74.13 | 91.7 | 0.92ms | 1.34ms | 3.57ms | 3.87ms | 7.64ms |
| ResNet 50 | ImageNet | 224x224 | 81.91 | 93.0 | 1.03ms | 1.44ms | 4.78ms | 5.10ms | 9.25ms |
| MobileNet V3_large-150 epochs | ImageNet | 224x224 | 73.79 | 91.54 | 0.67ms | 1.11ms | 2.42ms | 2.71ms | 1.76ms |
| MobileNet V3_large-300 epochs | ImageNet | 224x224 | 74.52 | 91.92 | 0.67ms | 1.11ms | 2.42ms | 2.71ms | 1.76ms |
| MobileNet V3_small | ImageNet | 224x224 | 67.45 | 87.47 | 0.55ms | 0.96ms | 2.01ms * | 2.35ms | 1.06ms |
| MobileNet V2_w1 | ImageNet | 224x224 | 73.08 | 91.1 | 0.46ms | 0.89ms | 1.65ms * | 1.90ms | 1.56ms |

NOTE:

  • Latency (HW)* - Hardware performance (not including IO)
  • Latency (Production)** - Production Performance (including IO)
  • Performance measured for T4 and Jetson Xavier NX with TensorRT, using FP16 precision and batch size 1
  • Performance measured for Cascade Lake CPU with OpenVINO, using FP16 precision and batch size 1
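The difference between the two latency columns comes down to what sits inside the timed region. The sketch below is an illustration only, not the benchmarking code behind the numbers above: `measure_latency_ms`, `load_input`, and `run_model` are hypothetical names, and the "model" is a plain Python function standing in for a compiled engine. On a GPU the timed region would also need a device synchronization before the clock is read.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=10, iters=100, clock=time.perf_counter):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up runs are executed but not recorded
        fn()
    samples = []
    for _ in range(iters):
        start = clock()
        fn()
        samples.append((clock() - start) * 1000.0)
    return statistics.median(samples)


# Hypothetical stand-ins for a real pipeline (not taken from any library).
def load_input():
    # Pretend to read and preprocess a single 224x224 RGB image.
    return [0.0] * (3 * 224 * 224)


def run_model(batch):
    # Pretend forward pass on a batch of size 1.
    return sum(batch)


preloaded = load_input()

# "HW" style: the input is prepared ahead of time, only the model call is timed.
hw_ms = measure_latency_ms(lambda: run_model(preloaded))

# "Production" style: IO and preprocessing are inside the timed region.
production_ms = measure_latency_ms(lambda: run_model(load_input()))

print(f"HW: {hw_ms:.3f} ms, production: {production_ms:.3f} ms")
```

The median is used here instead of the mean so that a few slow outlier runs do not dominate the result; that is a design choice of this sketch, not something the source specifies.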

Pretrained Object Detection PyTorch Checkpoints

| Model | Dataset | Resolution | mAP val 0.5:0.95 | Latency (HW)* T4 | Latency (Production)** T4 | Latency (HW)* Jetson Xavier NX | Latency (Production)** Jetson Xavier NX | Latency Cascade Lake |
|---|---|---|---|---|---|---|---|---|
| SSD lite MobileNet v2 | COCO | 320x320 | 21.5 | 0.77ms | 1.40ms | 5.28ms | 6.44ms | 4.13ms |
| SSD lite MobileNet v1 | COCO | 320x320 | 24.3 | 1.55ms | 2.84ms | 8.07ms | 9.14ms | 22.76ms |
| YOLOX nano | COCO | 640x640 | 26.77 | 2.47ms | 4.09ms | 11.49ms | 12.97ms | - |
| YOLOX tiny | COCO | 640x640 | 37.18 | 3.16ms | 4.61ms | 15.23ms | 19.24ms | - |
| YOLOX small | COCO | 640x640 | 40.47 | 3.58ms | 4.94ms | 18.88ms | 22.48ms | - |
| YOLOX medium | COCO | 640x640 | 46.4 | 6.40ms | 7.65ms | 39.22ms | 44.5ms | - |
| YOLOX large | COCO | 640x640 | 49.25 | 10.07ms | 11.12ms | 68.73ms | 77.01ms | - |

NOTE:

  • Latency (HW)* - Hardware performance (not including IO)
  • Latency (Production)** - Production Performance (including IO)
  • Latency performance measured for T4 and Jetson Xavier NX with TensorRT, using FP16 precision and batch size 1
  • Latency performance measured for Cascade Lake CPU with OpenVINO, using FP16 precision and batch size 1
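One way to read the detection table is to fix a latency budget and pick the most accurate model that fits it. The sketch below does this for the T4 production latency column; the values are copied from the table above, while the helper name `best_under_budget` and the tuple layout are assumptions made for this example.

```python
# (model, mAP val 0.5:0.95, T4 production latency in ms), copied from the table above
DETECTION_MODELS = [
    ("SSD lite MobileNet v2", 21.5, 1.40),
    ("SSD lite MobileNet v1", 24.3, 2.84),
    ("YOLOX nano", 26.77, 4.09),
    ("YOLOX tiny", 37.18, 4.61),
    ("YOLOX small", 40.47, 4.94),
    ("YOLOX medium", 46.4, 7.65),
    ("YOLOX large", 49.25, 11.12),
]


def best_under_budget(models, budget_ms):
    """Return the highest-mAP entry whose latency fits the budget, or None."""
    candidates = [m for m in models if m[2] <= budget_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[1])


print(best_under_budget(DETECTION_MODELS, 5.0))  # ('YOLOX small', 40.47, 4.94)
print(best_under_budget(DETECTION_MODELS, 1.0))  # None
```

The same helper works for any other latency column by swapping the third field of each tuple.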

Pretrained Semantic Segmentation PyTorch Checkpoints

| Model | Dataset | Resolution | mIoU | Latency b1 T4 | Latency b1 T4 including IO | Latency (Production)** Jetson Xavier NX |
|---|---|---|---|---|---|---|
| PP-LiteSeg B50 | Cityscapes | 512x1024 | 76.48 | 4.18ms | 31.22ms | 31.69ms |
| PP-LiteSeg B75 | Cityscapes | 768x1536 | 78.52 | 6.84ms | 33.69ms | 49.89ms |
| PP-LiteSeg T50 | Cityscapes | 512x1024 | 74.92 | 3.26ms | 30.33ms | 26.20ms |
| PP-LiteSeg T75 | Cityscapes | 768x1536 | 77.56 | 5.20ms | 32.28ms | 38.03ms |
| DDRNet 23 slim | Cityscapes | 1024x2048 | 78.01 | 5.74ms | 32.01ms | 45.18ms |
| DDRNet 23 | Cityscapes | 1024x2048 | 80.26 | 12.74ms | 39.01ms | 106.26ms |
| STDC 1-Seg50 | Cityscapes | 512x1024 | 75.11 | 3.34ms | 30.12ms | 27.54ms |
| STDC 1-Seg75 | Cityscapes | 768x1536 | 77.8 | 5.53ms | 32.49ms | 43.88ms |
| STDC 2-Seg50 | Cityscapes | 512x1024 | 76.44 | 4.12ms | 30.94ms | 32.03ms |
| STDC 2-Seg75 | Cityscapes | 768x1536 | 78.93 | 6.95ms | 33.89ms | 54.48ms |
| RegSeg (exp48) | Cityscapes | 1024x2048 | 78.15 | 12.03ms | 38.91ms | 78.20ms |
| Larger RegSeg (exp53) | Cityscapes | 1024x2048 | 79.2 | 22.00ms | 48.96ms | 150.78ms |

NOTE: Performance was measured on a T4 GPU with TensorRT, using FP16 precision and batch size 1 (latency), not including IO unless the column header states otherwise.

NOTE: For resolutions below 1024x2048, the input is first resized to the inference resolution and the predictions are then resized back to 1024x2048. The resizing time is included in the measurements, so the effective input size is always 1024x2048.
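The resize, predict, resize-back flow described in the note can be sketched as follows. This is a dependency-free illustration on nested lists, not the actual benchmarking pipeline: `resize_nearest`, `segment_with_resize`, and `toy_predict` are hypothetical names, a 4x8 grid stands in for a 1024x2048 image, and nearest-neighbour interpolation is used only to keep the example short (the source does not say which interpolation was used).

```python
def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h rows by out_w columns."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def segment_with_resize(image, infer_h, infer_w, predict):
    """Downscale, predict at the inference resolution, upscale the prediction."""
    full_h, full_w = len(image), len(image[0])
    small = resize_nearest(image, infer_h, infer_w)    # full size -> inference size
    small_pred = predict(small)                        # model runs at inference size
    return resize_nearest(small_pred, full_h, full_w)  # prediction -> full size


def toy_predict(grid):
    # Hypothetical stand-in for a segmentation model: class 1 where pixel > 0.5.
    return [[1 if value > 0.5 else 0 for value in row] for row in grid]


# 4x8 stand-in for a 1024x2048 image: left half dark, right half bright.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(4)]

# Inference at half resolution (2x4), analogous to 512x1024 for a 1024x2048 input.
mask = segment_with_resize(image, 2, 4, toy_predict)
print(len(mask), len(mask[0]))  # 4 8
print(mask[0])                  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Timing `segment_with_resize` as a whole, instead of only the `predict` call, is what it means for the resizing time to be included in the measurement.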
