#604 fix master installation

Merged
Ghost merged 1 commit into Deci-AI:master from deci-ai:feature/SG-000_fix_master_inastallation
# ResNet50 Imagenet classification training:
# This example trains with batch_size = 192 * 8 GPUs, total 1536.
# Training time on 8 x GeForce RTX A5000 is 9min / epoch.
# Reach => 81.91 Top1 accuracy.
#
# Log and tensorboard at s3://deci-pretrained-models/KD_ResNet50_Beit_Base_ImageNet/average_model.pth
# Instructions:
# 0. Make sure that the data is stored in dataset_params.dataset_dir or add "dataset_params.data_dir=<PATH-TO-DATASET>" at the end of the command below (feel free to check ReadMe)
# 1. Move to the project root (where you will find the ReadMe and src folder)
# 2. Run the command:
# python src/super_gradients/examples/train_from_kd_recipe_example/train_from_kd_recipe.py --config-name=imagenet_resnet50_kd
defaults:
  - training_hyperparams: imagenet_resnet50_kd_train_params
  - dataset_params: imagenet_resnet50_kd_dataset_params
  - checkpoint_params: default_checkpoint_params
  - _self_
train_dataloader: imagenet_train
val_dataloader: imagenet_val
resume: False
training_hyperparams:
  resume: ${resume}
  loss: kd_loss
  criterion_params:
    distillation_loss_coeff: 0.8
    task_loss_fn:
      _target_: super_gradients.training.losses.label_smoothing_cross_entropy_loss.LabelSmoothingCrossEntropyLoss
arch_params:
  teacher_input_adapter:
    _target_: super_gradients.training.utils.kd_trainer_utils.NormalizationAdapter
    mean_original: [0.485, 0.456, 0.406]
    std_original: [0.229, 0.224, 0.225]
    mean_required: [0.5, 0.5, 0.5]
    std_required: [0.5, 0.5, 0.5]
student_arch_params:
  num_classes: 1000
teacher_arch_params:
  num_classes: 1000
  image_size: [224, 224]
  patch_size: [16, 16]
teacher_checkpoint_params:
  load_backbone: False # whether to load only backbone part of checkpoint
  checkpoint_path: # checkpoint path that is not located in super_gradients/checkpoints
  strict_load: # key matching strictness for loading checkpoint's weights
    _target_: super_gradients.training.sg_trainer.StrictLoad
    value: True
  pretrained_weights: imagenet
checkpoint_params:
  teacher_pretrained_weights: imagenet
student_checkpoint_params:
  load_backbone: False # whether to load only backbone part of checkpoint
  checkpoint_path: # checkpoint path that is not located in super_gradients/checkpoints
  strict_load: # key matching strictness for loading checkpoint's weights
    _target_: super_gradients.training.sg_trainer.StrictLoad
    value: True
  pretrained_weights: # a string describing the dataset of the pretrained weights (for example "imagenet").
run_teacher_on_eval: True
experiment_name: resnet50_imagenet_KD_Model
ckpt_root_dir:
multi_gpu: DDP
num_gpus: 8
architecture: kd_module
student_architecture: resnet50
teacher_architecture: beit_base_patch16_224
# THE FOLLOWING PARAMS ARE DIRECTLY USED BY HYDRA
hydra:
  run:
    # Set the output directory (i.e. where .hydra folder that logs all the input params will be generated)
    dir: ${hydra_output_dir:${ckpt_root_dir}, ${experiment_name}}
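The recipe weights the distillation term against the hard-label term through `criterion_params.distillation_loss_coeff: 0.8`. Below is a conceptual sketch of such a weighted objective; it is not the library's `kd_loss` implementation, and the exact way the coefficient is combined with the task term (here `coeff` vs. `1 - coeff`) is an assumption.

```python
# Conceptual sketch of a weighted distillation objective (not super_gradients' kd_loss).
import torch
import torch.nn.functional as F

def kd_objective(student_logits, teacher_logits, targets, coeff=0.8, temperature=1.0):
    # Soft-target term: KL divergence between softened student and teacher distributions.
    distill = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard-target term: cross-entropy on ground-truth labels
    # (the recipe uses a label-smoothing variant for this part).
    task = F.cross_entropy(student_logits, targets)
    return coeff * distill + (1 - coeff) * task

loss = kd_objective(torch.randn(4, 1000), torch.randn(4, 1000), torch.randint(0, 1000, (4,)))
```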
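`arch_params.teacher_input_adapter` re-normalizes the student's already-normalized input batch into the statistics the BEiT teacher expects (mean/std of 0.5 rather than the ImageNet statistics). A minimal stand-alone sketch of that re-normalization follows; `RenormalizeInput` is a hypothetical stand-in, not the library's `NormalizationAdapter` class.

```python
# Sketch: undo the student's per-channel normalization, then apply the teacher's.
import torch
import torch.nn as nn

class RenormalizeInput(nn.Module):  # hypothetical name, mirrors the adapter's mean/std params
    def __init__(self, mean_original, std_original, mean_required, std_required):
        super().__init__()
        self.register_buffer("mean_o", torch.tensor(mean_original).view(1, -1, 1, 1))
        self.register_buffer("std_o", torch.tensor(std_original).view(1, -1, 1, 1))
        self.register_buffer("mean_r", torch.tensor(mean_required).view(1, -1, 1, 1))
        self.register_buffer("std_r", torch.tensor(std_required).view(1, -1, 1, 1))

    def forward(self, x):
        # x was normalized with (mean_o, std_o); re-express it under (mean_r, std_r).
        return (x * self.std_o + self.mean_o - self.mean_r) / self.std_r

adapter = RenormalizeInput(
    mean_original=[0.485, 0.456, 0.406], std_original=[0.229, 0.224, 0.225],
    mean_required=[0.5, 0.5, 0.5], std_required=[0.5, 0.5, 0.5],
)
teacher_ready = adapter(torch.randn(2, 3, 224, 224))  # same shape, teacher-style statistics
```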
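Step 0 of the header comments overrides the dataset location on the command line. The sketch below does the same programmatically with Hydra's compose API, assuming the recipe folder is reachable at a relative `config_path`; the path and override value are placeholders, not taken from this PR.

```python
# Minimal sketch: compose the recipe and override the dataset directory,
# mirroring step 0 of the header comments. "recipes" and "/data/imagenet" are assumptions.
from hydra import compose, initialize
from omegaconf import OmegaConf

with initialize(config_path="recipes", version_base=None):  # hypothetical path to the recipe YAMLs
    cfg = compose(
        config_name="imagenet_resnet50_kd",
        overrides=["dataset_params.data_dir=/data/imagenet"],  # same override as in the header comment
    )

# Inspect what was composed before launching training.
print(cfg.student_architecture, "distilled from", cfg.teacher_architecture)
print(OmegaConf.to_yaml(cfg.training_hyperparams.criterion_params))
```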