#875 Feature/sg 761 yolo nas

Merged
Ghost merged 1 commit into Deci-AI:master from deci-ai:feature/SG-761-yolo-nas
# Distillation for semantic segmentation on Cityscapes dataset.
#
# Instructions:
# 0. Make sure that the data is stored in dataset_params.dataset_dir or add "dataset_params.data_dir=<PATH-TO-DATASET>" at the end of the command below (feel free to check ReadMe)
# 1. Move to the project root (where you will find the ReadMe and src folder)
# 2. Run the command:
# DDRNet23: python -m super_gradients.train_from_kd_recipe --config-name=cityscapes_kd_base student_architecture=ddrnet_23
# DDRNet23-Slim: python -m super_gradients.train_from_kd_recipe --config-name=cityscapes_kd_base student_architecture=ddrnet_23_slim
# Note: add "student_checkpoint_params.checkpoint_path=<ddrnet23-backbone-pretrained-path>" to use pretrained backbone
#
# Teachers specifications:
# DDRNet39-AL: mIoU: 85.17 notes: trained with Cityscapes coarse data.
#
# Validation mIoU results - Cityscapes, training time:
# DDRNet23: teacher: DDRNet39-AL input-size: [1024, 2048] mIoU: 81.48 4 X RTX A5000, 13 H
# DDRNet23-Slim: teacher: DDRNet39-AL input-size: [1024, 2048] mIoU: 79.41 4 X RTX A5000, 11 H
#
# Pretrained backbones checkpoints:
# https://deci-pretrained-models.s3.amazonaws.com/ddrnet/imagenet_pt_backbones/ddrnet23_bb_imagenet.pth
# https://deci-pretrained-models.s3.amazonaws.com/ddrnet/imagenet_pt_backbones/ddrnet23_slim_bb_imagenet.pth
#
# Logs, tensorboards and network checkpoints:
# DDRNet23: https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet23_cwd/
# DDRNet23-Slim: https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet23_slim_cwd/
#
# Learning rate and batch size parameters, using 4 RTX A5000 with DDP:
# DDRNet23: input-size: [1024, 1024] initial_lr: 0.0075 batch-size: 6 * 4gpus = 24
# DDRNet23-Slim: input-size: [1024, 1024] initial_lr: 0.0075 batch-size: 6 * 4gpus = 24
#
# Teachers checkpoints:
# DDRNet39-AL: https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet39_al/average_model_2023_02_20.pth
#
# Comments:
# * Pretrained backbones were used for the student models.
# * Default hyper-parameters are based on DDRNet model train recipes, for full resolution training [1024 x 2048]
defaults:
  - training_hyperparams: cityscapes_default_train_params
  - dataset_params: cityscapes_ddrnet_dataset_params
  - checkpoint_params: default_checkpoint_params
  - _self_
  - variable_setup
train_dataloader: cityscapes_train
val_dataloader: cityscapes_val
resume: False
training_hyperparams:
  sync_bn: True
  max_epochs: 500
  initial_lr: 0.0075 # batch size 24
  resume: ${resume}
  loss:
    _target_: super_gradients.training.losses.seg_kd_loss.SegKDLoss
    weights: [ 1. ]
    kd_loss_weights: [1., 6.]
    kd_loss:
      _target_: super_gradients.training.losses.cwd_loss.ChannelWiseKnowledgeDistillationLoss
      temperature: 3.
      normalization_mode: channel_wise
    ce_loss:
      _target_: torch.nn.CrossEntropyLoss
      ignore_index: 19
student_arch_params:
  num_classes: 19
  use_aux_heads: False
teacher_arch_params:
  num_classes: 19
  use_aux_heads: False
# KD module arch params
arch_params:
teacher_checkpoint_params:
  load_backbone:
  checkpoint_path:
  strict_load: no_key_matching
  pretrained_weights: cityscapes
student_checkpoint_params:
  load_backbone: True
  checkpoint_path: ??? # ImageNet pretrained checkpoints
  strict_load: no_key_matching
  pretrained_weights:
run_teacher_on_eval: True
multi_gpu: DDP
num_gpus: 4
architecture: kd_module
student_architecture: ???
teacher_architecture: ddrnet_39
experiment_name: ${student_architecture}_teacher-${teacher_architecture}
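
The `loss` block in the recipe above pairs a cross-entropy term with a channel-wise distillation term (`temperature: 3.`, `normalization_mode: channel_wise`) and weights them with `kd_loss_weights: [1., 6.]`. The following is a minimal pure-Python sketch of that idea, not the `super_gradients` implementation: the function names `spatial_softmax`, `channel_wise_kd` and `combined_loss` are hypothetical, the `T^2 / C` scaling is an assumption, and reading `kd_loss_weights` as `[ce_weight, kd_weight]` is also an assumption about how `SegKDLoss` combines the two terms.

```python
import math
from typing import List, Sequence


def spatial_softmax(channel: Sequence[float], temperature: float) -> List[float]:
    """Softmax over all spatial positions of one channel, softened by temperature."""
    scaled = [v / temperature for v in channel]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def channel_wise_kd(
    student: Sequence[Sequence[float]],
    teacher: Sequence[Sequence[float]],
    temperature: float = 3.0,
) -> float:
    """KL(teacher || student) per channel, averaged over channels.

    Inputs are logits for one image shaped [C][H*W].
    The T^2 / C scaling is an assumption made for this sketch.
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher must have the same number of channels")
    loss = 0.0
    for s_channel, t_channel in zip(student, teacher):
        p_student = spatial_softmax(s_channel, temperature)
        p_teacher = spatial_softmax(t_channel, temperature)
        # Skip positions where the teacher probability underflowed to zero.
        loss += sum(
            pt * math.log(pt / ps)
            for pt, ps in zip(p_teacher, p_student)
            if pt > 0.0
        )
    return loss * temperature ** 2 / len(student)


def combined_loss(
    ce: float,
    kd: float,
    kd_loss_weights: Sequence[float] = (1.0, 6.0),
    head_weight: float = 1.0,
) -> float:
    """Weighted sum, assuming kd_loss_weights = [ce_weight, kd_weight]."""
    return head_weight * (kd_loss_weights[0] * ce + kd_loss_weights[1] * kd)


if __name__ == "__main__":
    # Two channels, four spatial positions each (toy values).
    teacher_logits = [[2.0, 0.5, 0.1, -1.0], [0.0, 1.0, 3.0, 0.2]]
    student_logits = [[1.5, 0.7, 0.0, -0.5], [0.1, 0.8, 2.0, 0.4]]
    kd = channel_wise_kd(student_logits, teacher_logits, temperature=3.0)
    print("kd term:", kd)
    print("total:", combined_loss(ce=0.9, kd=kd))
```

Normalizing each channel over its spatial positions lets the student match where the teacher concentrates activation for a class, rather than matching per-pixel class distributions. With the weights in the recipe, the distillation term counts six times as much as the cross-entropy term under the assumed interpretation.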