@@ -309,6 +309,7 @@ across all GPUs after the backward pass.
#### How to use it?

You can use SuperGradients to train your model with DDP in just a few lines.
+
*main.py*
```python
from super_gradients import init_trainer, Trainer
@@ -347,6 +348,54 @@ python -m torch.distributed.launch --nproc_per_node=4 main.py
torchrun --nproc_per_node=4 main.py
```

+#### Calling functions on a single node
+
+In DDP training, we often want to execute code on the master rank (i.e., rank 0) only.
+In SG, users usually execute their own code by triggering "Phase Callbacks" (see the "Using phase callbacks" section below).
+You can make sure the desired code runs on rank 0 only by using ddp_silent_mode or the multi_process_safe decorator.
+For example, consider the simple phase callback below, which uploads the first 3 images of every training batch to
+TensorBoard:
+
+```python
+from super_gradients.training.utils.callbacks import PhaseCallback, PhaseContext, Phase
+from super_gradients.common.environment.env_helpers import multi_process_safe
+
+
+class Upload3TrainImagesCallback(PhaseCallback):
+    def __init__(self):
+        super().__init__(phase=Phase.TRAIN_BATCH_END)
+
+    @multi_process_safe
+    def __call__(self, context: PhaseContext):
+        # @multi_process_safe makes this a no-op on every rank except rank 0.
+        batch_imgs = context.inputs.cpu().detach().numpy()
+        tag = "batch_" + str(context.batch_idx) + "_images"
+        context.sg_logger.add_images(tag=tag, images=batch_imgs[:3], global_step=context.epoch)
+```
+The @multi_process_safe decorator ensures that the callback is triggered by rank 0 only. Alternatively, the same can
+be achieved through the SG trainer's boolean attribute ddp_silent_mode (which the phase context has access to). It is set to False
+iff the current process rank is zero, and keeps that value even after the process group has been destroyed:
+```python
+from super_gradients.training.utils.callbacks import PhaseCallback, PhaseContext, Phase
+
+
+class Upload3TrainImagesCallback(PhaseCallback):
+    def __init__(self):
+        super().__init__(phase=Phase.TRAIN_BATCH_END)
+
+    def __call__(self, context: PhaseContext):
+        # ddp_silent_mode is False on rank 0 only, so every other rank skips the upload.
+        if not context.ddp_silent_mode:
+            batch_imgs = context.inputs.cpu().detach().numpy()
+            tag = "batch_" + str(context.batch_idx) + "_images"
+            context.sg_logger.add_images(tag=tag, images=batch_imgs[:3], global_step=context.epoch)
+```
+
+Note that ddp_silent_mode can be accessed through Trainer.ddp_silent_mode. Hence, it can also be used in scripts after calling
+Trainer.train(), when some part of the script should run on rank 0 only.
+
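+For instance, here is a minimal sketch of that pattern, assuming the Trainer is set up as in the training example above (the experiment name is illustrative):
+
+```python
+from super_gradients import init_trainer, Trainer
+
+init_trainer()
+trainer = Trainer(experiment_name="my_experiment")
+# ... build the model and dataloaders, then call trainer.train(...) as usual ...
+
+# ddp_silent_mode stays False on rank 0 only, even after training has finished,
+# so this block runs exactly once, on the master process:
+if not trainer.ddp_silent_mode:
+    print("Post-training work running on rank 0 only")
+```
+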
#### Good to know
Your total batch size will be (number of GPUs x per-GPU batch size), so you might want to increase your learning rate.
There is no clear rule, but a common rule of thumb is to [linearly increase the learning rate with the number of GPUs](https://arxiv.org/pdf/1706.02677.pdf).
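+
+As a quick illustration of that rule of thumb (the numbers below are made up for the example):
+
+```python
+# Linear LR scaling rule of thumb: multiply the single-GPU learning rate
+# by the number of GPUs participating in DDP training.
+base_lr = 0.1   # learning rate tuned for a single GPU
+num_gpus = 4    # e.g. launched with: torchrun --nproc_per_node=4 main.py
+scaled_lr = base_lr * num_gpus  # -> 0.4
+```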