#761 Fix requirements

Merged
Ghost merged 1 commit into Deci-AI:master from deci-ai:hotfix/SG-000-fix_requirements_onnx
# Distillation for semantic segmentation on Cityscapes dataset.
#
# Instructions:
#   0. Make sure that the data is stored in dataset_params.dataset_dir or add "dataset_params.data_dir=<PATH-TO-DATASET>" at the end of the command below (feel free to check ReadMe)
#   1. Move to the project root (where you will find the ReadMe and src folder)
#   2. Run the command:
#       DDRNet23:      python src/super_gradients/examples/train_from_kd_recipe_example/train_from_kd_recipe.py --config-name=cityscapes_kd_base student_architecture=ddrnet_23
#       DDRNet23-Slim: python src/super_gradients/examples/train_from_kd_recipe_example/train_from_kd_recipe.py --config-name=cityscapes_kd_base student_architecture=ddrnet_23_slim
#   Note: add "student_checkpoint_params.checkpoint_path=<ddrnet23-backbone-pretrained-path>" to use pretrained backbone
#
# Teachers specifications:
#   DDRNet39-AL: mIoU: 85.17  notes: trained with Cityscapes coarse data.
#
# Validation mIoU results - Cityscapes, training time:
#   DDRNet23:      teacher: DDRNet39-AL  input-size: [1024, 2048]  mIoU: 81.48  4 X RTX A5000, 13 H
#   DDRNet23-Slim: teacher: DDRNet39-AL  input-size: [1024, 2048]  mIoU: 79.41  4 X RTX A5000, 11 H
#
# Pretrained backbones checkpoints:
#   https://deci-pretrained-models.s3.amazonaws.com/ddrnet/imagenet_pt_backbones/ddrnet23_bb_imagenet.pth
#   https://deci-pretrained-models.s3.amazonaws.com/ddrnet/imagenet_pt_backbones/ddrnet23_slim_bb_imagenet.pth
#
# Logs, tensorboards and network checkpoints:
#   DDRNet23:      https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet23_cwd/
#   DDRNet23-Slim: https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet23_slim_cwd/
#
# Learning rate and batch size parameters, using 4 RTX A5000 with DDP:
#   DDRNet23:      input-size: [1024, 1024]  initial_lr: 0.0075  batch-size: 6 * 4gpus = 24
#   DDRNet23-Slim: input-size: [1024, 1024]  initial_lr: 0.0075  batch-size: 6 * 4gpus = 24
#
# Teachers checkpoints:
#   DDRNet39-AL: https://deci-pretrained-models.s3.amazonaws.com/ddrnet/cityscapes/ddrnet39_al/average_model_2023_02_20.pth
#
# Comments:
#   * Pretrained backbones were used for the student models.
#   * Default hyper-parameters are based on DDRNet model train recipes, for full resolution training [1024 x 2048]

defaults:
  - training_hyperparams: cityscapes_default_train_params
  - dataset_params: cityscapes_ddrnet_dataset_params
  - checkpoint_params: default_checkpoint_params
  - _self_

train_dataloader: cityscapes_train
val_dataloader: cityscapes_val

resume: False
training_hyperparams:
  sync_bn: True
  max_epochs: 500
  initial_lr: 0.0075   # batch size 24
  resume: ${resume}
  loss:
    _target_: super_gradients.training.losses.seg_kd_loss.SegKDLoss
    weights: [ 1. ]
    kd_loss_weights: [1., 6.]
    kd_loss:
      _target_: super_gradients.training.losses.cwd_loss.ChannelWiseKnowledgeDistillationLoss
      temperature: 3.
      normalization_mode: channel_wise
    ce_loss:
      _target_: torch.nn.CrossEntropyLoss
      ignore_index: 19

student_arch_params:
  num_classes: 19
  use_aux_heads: False

teacher_arch_params:
  num_classes: 19
  use_aux_heads: False

# KD module arch params
arch_params:

teacher_checkpoint_params:
  load_backbone:
  checkpoint_path:
  strict_load: no_key_matching
  pretrained_weights: cityscapes

student_checkpoint_params:
  load_backbone: True
  checkpoint_path: ???   # ImageNet pretrained checkpoints
  strict_load: no_key_matching
  pretrained_weights:

run_teacher_on_eval: True

multi_gpu: DDP
num_gpus: 4

architecture: kd_module
student_architecture: ???
teacher_architecture: ddrnet_39

experiment_name: ${student_architecture}_teacher-${teacher_architecture}

ckpt_root_dir:

# THE FOLLOWING PARAMS ARE DIRECTLY USED BY HYDRA
hydra:
  run:
    # Set the output directory (i.e. where .hydra folder that logs all the input params will be generated)
    dir: ${hydra_output_dir:${ckpt_root_dir}, ${experiment_name}}
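The distillation term configured above pairs a per-pixel cross-entropy with a channel-wise KD loss: each channel's logit map is turned into a spatial probability distribution via a temperature-scaled softmax, and the student's per-channel distribution is pulled toward the teacher's with a KL divergence. The numpy function below is only an illustrative sketch of that idea (the function name `channel_wise_kd_loss` is ours); the actual `ChannelWiseKnowledgeDistillationLoss` in super_gradients may differ in details such as masking of ignored pixels and reduction mode.

```python
import numpy as np

def channel_wise_kd_loss(student_logits, teacher_logits, temperature=3.0):
    """Sketch of channel-wise KD for segmentation.

    Inputs have shape (N, C, H, W). For every channel, the H*W logits are
    softmax-normalized over the spatial axis (with temperature), and
    KL(teacher || student) is averaged over batch and channels.
    """
    n, c, h, w = student_logits.shape
    s = student_logits.reshape(n, c, h * w) / temperature
    t = teacher_logits.reshape(n, c, h * w) / temperature
    # numerically stable softmax over the spatial axis, per channel
    s = np.exp(s - s.max(-1, keepdims=True)); s /= s.sum(-1, keepdims=True)
    t = np.exp(t - t.max(-1, keepdims=True)); t /= t.sum(-1, keepdims=True)
    # KL(teacher || student), scaled by T^2 as is conventional in distillation
    kl = (t * (np.log(t + 1e-8) - np.log(s + 1e-8))).sum(-1)
    return temperature ** 2 * kl.mean()
```

With identical student and teacher activations the loss is zero, and it grows as the student's spatial attention per channel drifts from the teacher's, which is exactly the signal the `kd_loss_weights: [1., 6.]` entry weighs against the cross-entropy term.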
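Values such as `experiment_name: ${student_architecture}_teacher-${teacher_architecture}` and `resume: ${resume}` rely on Hydra/OmegaConf variable interpolation: `${key}` placeholders are substituted with other config values at resolution time. The toy resolver below (the `resolve` function is ours, not a real Hydra or OmegaConf API) shows the mechanics with the standard library only:

```python
import re

def resolve(cfg: dict, value: str) -> str:
    """Toy illustration of ${key} interpolation as used in the recipe.
    Hydra/OmegaConf perform the real resolution (including nested keys
    and custom resolvers like hydra_output_dir); this only shows the idea."""
    pattern = re.compile(r"\$\{([^}]+)\}")
    while (m := pattern.search(value)):
        value = value[:m.start()] + str(cfg[m.group(1)]) + value[m.end():]
    return value

cfg = {
    "student_architecture": "ddrnet_23",
    "teacher_architecture": "ddrnet_39",
}
resolve(cfg, "${student_architecture}_teacher-${teacher_architecture}")
# -> "ddrnet_23_teacher-ddrnet_39"
```

This is why overriding `student_architecture=ddrnet_23_slim` on the command line automatically renames the experiment directory as well.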