#875 Feature/sg 761 yolo nas

Merged
Ghost merged 1 commit into Deci-AI:master from deci-ai:feature/SG-761-yolo-nas
# ResNet50 Imagenet classification training:
# This example trains with batch_size = 192 * 8 GPUs, total 1536.
# Training time on 8 x GeForce RTX A5000 is 9min / epoch.
# Reach => 81.91 Top1 accuracy.
#
# Log and tensorboard at s3://deci-pretrained-models/KD_ResNet50_Beit_Base_ImageNet/average_model.pth
# Instructions:
# 0. Make sure that the data is stored in dataset_params.dataset_dir or add "dataset_params.data_dir=<PATH-TO-DATASET>" at the end of the command below (feel free to check ReadMe)
# 1. Move to the project root (where you will find the ReadMe and src folder)
# 2. Run the command:
# python -m super_gradients.train_from_kd_recipe --config-name=imagenet_resnet50_kd

defaults:
  - training_hyperparams: imagenet_resnet50_kd_train_params
  - dataset_params: imagenet_resnet50_kd_dataset_params
  - checkpoint_params: default_checkpoint_params
  - _self_
  - variable_setup

train_dataloader: imagenet_train
val_dataloader: imagenet_val

resume: False
training_hyperparams:
  resume: ${resume}
  loss: kd_loss
  criterion_params:
    distillation_loss_coeff: 0.8
    task_loss_fn:
      _target_: super_gradients.training.losses.label_smoothing_cross_entropy_loss.LabelSmoothingCrossEntropyLoss

arch_params:
  teacher_input_adapter:
    _target_: super_gradients.training.utils.kd_trainer_utils.NormalizationAdapter
    mean_original: [0.485, 0.456, 0.406]
    std_original: [0.229, 0.224, 0.225]
    mean_required: [0.5, 0.5, 0.5]
    std_required: [0.5, 0.5, 0.5]

student_arch_params:
  num_classes: 1000

teacher_arch_params:
  num_classes: 1000
  image_size: [224, 224]
  patch_size: [16, 16]

teacher_checkpoint_params:
  load_backbone: False # whether to load only backbone part of checkpoint
  checkpoint_path: # checkpoint path that is not located in super_gradients/checkpoints
  strict_load: # key matching strictness for loading checkpoint's weights
    _target_: super_gradients.training.sg_trainer.StrictLoad
    value: True
  pretrained_weights: imagenet

checkpoint_params:
  teacher_pretrained_weights: imagenet

student_checkpoint_params:
  load_backbone: False # whether to load only backbone part of checkpoint
  checkpoint_path: # checkpoint path that is not located in super_gradients/checkpoints
  strict_load: # key matching strictness for loading checkpoint's weights
    _target_: super_gradients.training.sg_trainer.StrictLoad
    value: True
  pretrained_weights: # a string describing the dataset of the pretrained weights (for example "imagenet").

run_teacher_on_eval: True

experiment_name: resnet50_imagenet_KD_Model

multi_gpu: DDP
num_gpus: 8

architecture: kd_module
student_architecture: resnet50
teacher_architecture: beit_base_patch16_224
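
The `teacher_input_adapter` entry in the recipe exists because the student and the teacher expect differently normalized inputs: the data pipeline normalizes with the ImageNet statistics (`mean_original`, `std_original`), while the teacher is configured for a mean and std of 0.5 per channel (`mean_required`, `std_required`). The sketch below illustrates the arithmetic such an adapter would perform on a single RGB pixel, assuming it undoes the original normalization and then applies the required one, as the parameter names suggest. The function name `renormalize` is hypothetical and is not part of super_gradients; the library's `NormalizationAdapter` works on batched tensors rather than Python lists.

```python
# Statistics copied from arch_params.teacher_input_adapter in the recipe.
MEAN_ORIGINAL = [0.485, 0.456, 0.406]
STD_ORIGINAL = [0.229, 0.224, 0.225]
MEAN_REQUIRED = [0.5, 0.5, 0.5]
STD_REQUIRED = [0.5, 0.5, 0.5]


def renormalize(pixel, mean_original, std_original, mean_required, std_required):
    """Hypothetical helper: map one normalized RGB pixel from the original
    statistics to the required statistics, channel by channel."""
    result = []
    for value, m_orig, s_orig, m_req, s_req in zip(
        pixel, mean_original, std_original, mean_required, std_required
    ):
        raw = value * s_orig + m_orig        # undo the original normalization
        result.append((raw - m_req) / s_req)  # apply the required normalization
    return result


# A pixel equal to the ImageNet mean is 0.0 in every channel of the student's input.
student_view = [0.0, 0.0, 0.0]
teacher_view = renormalize(
    student_view, MEAN_ORIGINAL, STD_ORIGINAL, MEAN_REQUIRED, STD_REQUIRED
)
print(teacher_view)
```

Because both normalizations are affine, the two steps collapse into a single scale and shift per channel, so an adapter of this kind adds very little cost to the teacher's forward pass.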