SuperGradients provides implementations for many common datasets.
Supports download
from super_gradients.training.datasets import Cifar10
dataset = Cifar10(..., download=True)
Imagenet
├──train
│ ├──n02093991
│ │ ├──n02093991_1001.JPEG
│ │ ├──n02093991_1004.JPEG
│ │ └──...
│ ├──n02093992
│ └──...
└──val
   ├──n02093991
   ├──n02093992
   └──...
from super_gradients.training.datasets import ImageNetDataset
train_set = ImageNetDataset(root='.../Imagenet/train', ...)
valid_set = ImageNetDataset(root='.../Imagenet/val', ...)
coco
├── annotations
│ ├─ instances_train2017.json
│ ├─ instances_val2017.json
│ └─ ...
└── images
    ├── train2017
    │   ├─ 000000000001.jpg
    │   └─ ...
    └── val2017
        └─ ...
from super_gradients.training.datasets import COCODetectionDataset
train_set = COCODetectionDataset(data_dir='.../coco', subdir='images/train2017', json_file='instances_train2017.json', ...)
valid_set = COCODetectionDataset(data_dir='.../coco', subdir='images/val2017', json_file='instances_val2017.json', ...)
Supports download
from super_gradients.training.datasets import PascalVOCDetectionDataset
train_set = PascalVOCDetectionDataset(download=True, ...)
Dataset structure:
├─images
│ ├─ train2012
│ ├─ val2012
│ ├─ VOCdevkit
│ │ ├─ VOC2007
│ │ │ ├──JPEGImages
│ │ │ ├──SegmentationClass
│ │ │ ├──ImageSets
│ │ │ ├──ImageSets/Segmentation
│ │ │ ├──ImageSets/Main
│ │ │ ├──ImageSets/Layout
│ │ │ ├──Annotations
│ │ │ └──SegmentationObject
│ │ └──VOC2012
│ │ ├──JPEGImages
│ │ ├──SegmentationClass
│ │ ├──ImageSets
│ │ ├──ImageSets/Segmentation
│ │ ├──ImageSets/Main
│ │ ├──ImageSets/Action
│ │ ├──ImageSets/Layout
│ │ ├──Annotations
│ │ └──SegmentationObject
│ ├─train2007
│ ├─test2007
│ └─val2007
└─labels
  ├─train2012
  ├─val2012
  ├─train2007
  ├─test2007
  └─val2007
Download your dataset (can be from https://roboflow.com/universe)
You should have a structure similar to this.
data_dir
└── train/test/val
    ├── images
    │   ├─ 0001.jpg
    │   ├─ 0002.jpg
    │   └─ ...
    └── labels
        ├─ 0001.txt
        ├─ 0002.txt
        └─ ...
Note: train/test/val folders are not required; any folder structure is supported.
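In this format, each label file holds one object per line: a class index followed by the box center, width, and height, all normalized to [0, 1]. A minimal sketch of parsing such a line (the helper name is ours, not part of SuperGradients):

```python
def parse_darknet_label_line(line: str) -> tuple[int, float, float, float, float]:
    """Parse one 'class cx cy w h' line from a Darknet-format label file."""
    class_id, cx, cy, w, h = line.split()
    return int(class_id), float(cx), float(cy), float(w), float(h)

# Example: class 0, box centered at (0.5, 0.5), covering half the image in each dimension.
print(parse_darknet_label_line("0 0.5 0.5 0.25 0.25"))
```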
from super_gradients.training.datasets import YoloDarknetFormatDetectionDataset
data_set = YoloDarknetFormatDetectionDataset(data_dir='<path-to>/data_dir', images_dir="<train/test/val>/images", labels_dir="<train/test/val>/labels", classes=["<to-fill>"])
root_dir (defaults to /data/cityscapes in the recipes)
├─── gtFine
│ ├── test
│ │ ├── berlin
│ │ │ ├── berlin_000000_000019_gtFine_color.png
│ │ │ ├── berlin_000000_000019_gtFine_instanceIds.png
│ │ │ └── ...
│ │ ├── bielefeld
│ │ │ └── ...
│ │ └── ...
│ ├─── train
│ │ └── ...
│ └─── val
│ └── ...
└─── leftImg8bit
     ├── test
     │   └── ...
     ├── train
     │   └── ...
     └── val
         └── ...
lists
├── labels.csv
├── test.lst
├── train.lst
├── trainval.lst
├── val.lst
└── auto_labelling.lst
root_dir (defaults to /data/cityscapes in the recipes)
├─── gtFine
│ └── ...
├─── leftImg8bit
│ └── ...
└─── lists
     └── ...
from super_gradients.training.datasets import CityscapesDataset
train_set = CityscapesDataset(root_dir='.../root_dir', list_file='lists/train.lst', labels_csv_path='lists/labels.csv', ...)
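Each .lst file pairs an image path with its label path, both relative to root_dir, on a single whitespace-separated line. This is our reading of the list format, so verify it against your copy of the lists; a sketch of splitting such a line:

```python
def parse_lst_line(line: str) -> tuple[str, str]:
    """Split one 'image_path label_path' line from a Cityscapes .lst file."""
    image_path, label_path = line.split()
    return image_path, label_path

# Hypothetical line following the Cityscapes naming convention.
print(parse_lst_line(
    "leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png "
    "gtFine/train/aachen/aachen_000000_000019_gtFine_labelIds.png"
))
```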
The Cityscapes AutoLabelled dataset was introduced by an NVIDIA research group in the paper "Hierarchical Multi-Scale Attention for Semantic Segmentation".
"AutoLabelled" refers to the refinement of the Cityscapes coarse data and the generation of pseudo labels using their proposed hierarchical multi-scale attention model.
To download the AutoLabelled labels, please refer to the original repo.
Unzip and rename the folder to AutoLabelling as described below.
Download the coarse RGB images from cityscapes official site, leftImg8bit_train_extra: https://www.cityscapes-dataset.com/file-handling/?packageID=4
root_dir (defaults to /data/cityscapes in the recipes)
├─── gtFine
│ ├── test
│ │ └── ...
│ ├─── train
│ │ └── ...
│ └─── val
│ └── ...
├─── leftImg8bit
│    ├── test
│    │   └── ...
│    ├── train
│    │   └── ...
│    ├── val
│    │   └── ...
│    └── train_extra
│        └── ...
└─── AutoLabelling
     └─── train_extra
          └── ...
coco
├── annotations
│ ├─ instances_train2017.json
│ ├─ instances_val2017.json
│ └─ ...
└── images
    ├── train2017
    │   ├─ 000000000001.jpg
    │   └─ ...
    └── val2017
        └─ ...
from super_gradients.training.datasets import CoCoSegmentationDataSet
train_set = CoCoSegmentationDataSet(data_dir='.../coco', subdir='images/train2017', json_file='instances_train2017.json', ...)
valid_set = CoCoSegmentationDataSet(data_dir='.../coco', subdir='images/val2017', json_file='instances_val2017.json', ...)
pascal_voc_2012
└──VOCdevkit
   └──VOC2012
      ├──JPEGImages
      ├──SegmentationClass
      ├──ImageSets
      │  ├──Segmentation
      │  │  └── train.txt
      │  ├──Main
      │  ├──Action
      │  └──Layout
      ├──Annotations
      └──SegmentationObject
from super_gradients.training.datasets import PascalVOC2012SegmentationDataSet
train_set = PascalVOC2012SegmentationDataSet(
root='.../pascal_voc_2012',
list_file='VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt',
samples_sub_directory='VOCdevkit/VOC2012/JPEGImages',
targets_sub_directory='VOCdevkit/VOC2012/SegmentationClass',
...
)
valid_set = PascalVOC2012SegmentationDataSet(
root='.../pascal_voc_2012',
list_file='VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt',
samples_sub_directory='VOCdevkit/VOC2012/JPEGImages',
targets_sub_directory='VOCdevkit/VOC2012/SegmentationClass',
...
)
pascal_voc_2012
└──VOCaug
   ├── aug.txt
   └── dataset
       ├──inst
       ├──img
       └──cls
from super_gradients.training.datasets import PascalAUG2012SegmentationDataSet
train_set = PascalAUG2012SegmentationDataSet(
root='.../pascal_voc_2012',
list_file='VOCaug/dataset/aug.txt',
samples_sub_directory='VOCaug/dataset/img',
targets_sub_directory='VOCaug/dataset/cls',
...
)
NOTE: this dataset is only available for training. To test, please use PascalVOC2012SegmentationDataSet.
pascal_voc_2012
├─VOCdevkit
│ └──VOC2012
│ ├──JPEGImages
│ ├──SegmentationClass
│ ├──ImageSets
│ │ ├──Segmentation
│ │ │ └── train.txt
│ │ ├──Main
│ │ ├──Action
│ │ └──Layout
│ ├──Annotations
│ └──SegmentationObject
└──VOCaug
   ├── aug.txt
   └── dataset
       ├──inst
       ├──img
       └──cls
from super_gradients.training.datasets import PascalVOCAndAUGUnifiedDataset
train_set = PascalVOCAndAUGUnifiedDataset(root='.../pascal_voc_2012', ...)
NOTE: this dataset is only available for training. To test, please use PascalVOC2012SegmentationDataSet.
supervisely-persons
├──images
│ ├──image-name.png
│ └──...
├──images_600x800
│ ├──image-name.png
│ └──...
├──masks
└──masks_600x800
from super_gradients.training.datasets import SuperviselyPersonsDataset
train_set = SuperviselyPersonsDataset(root_dir='.../supervisely-persons', list_file='train.csv', ...)
valid_set = SuperviselyPersonsDataset(root_dir='.../supervisely-persons', list_file='val.csv', ...)
NOTE: this dataset is only available for training. To test, please use PascalVOC2012SegmentationDataSet.
coco
├── annotations
│ ├─ person_keypoints_train2017.json
│ ├─ person_keypoints_val2017.json
│ └─ ...
└── images
    ├── train2017
    │   ├─ 000000000001.jpg
    │   └─ ...
    └── val2017
        └─ ...
from super_gradients.training.datasets import COCOKeypointsDataset
train_set = COCOKeypointsDataset(data_dir='.../coco', images_dir='images/train2017', json_file='annotations/person_keypoints_train2017.json', ...)
valid_set = COCOKeypointsDataset(data_dir='.../coco', images_dir='images/val2017', json_file='annotations/person_keypoints_val2017.json', ...)
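The person_keypoints_*.json annotations store each person's keypoints as one flat [x1, y1, v1, x2, y2, v2, ...] list, where v is a visibility flag (0 = not labeled, 1 = labeled but occluded, 2 = visible). A small sketch, on a synthetic annotation rather than real COCO data, of regrouping that list into (x, y, v) triplets:

```python
def keypoints_to_triplets(keypoints: list) -> list[tuple]:
    """Regroup a flat COCO [x, y, v, ...] keypoint list into (x, y, v) triplets."""
    return [tuple(keypoints[i:i + 3]) for i in range(0, len(keypoints), 3)]

# Synthetic annotation with two labeled keypoints (not real COCO data).
annotation = {"keypoints": [120, 85, 2, 130, 90, 1], "num_keypoints": 2}
print(keypoints_to_triplets(annotation["keypoints"]))
# → [(120, 85, 2), (130, 90, 1)]
```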
Download DOTA dataset: https://captain-whu.github.io/DOTA/dataset.html
Unzip and organize it as below:
dota
├── train
│   ├── images
│   │   ├─ 000000000001.jpg
│   │   └─ ...
│   └── ann
│       └─ 000000000001.txt
└── val
    ├── images
    │   ├─ 000000000002.jpg
    │   └─ ...
    └── ann
        └─ 000000000002.txt
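Each DOTA annotation file lists one oriented box per line as eight polygon corner coordinates followed by the category name and a difficulty flag. The parser below is our sketch of that line format, not part of SuperGradients; note that some DOTA annotation files also begin with imagesource/gsd header lines, which this sketch does not handle:

```python
def parse_dota_line(line: str) -> tuple[list, str, int]:
    """Parse one 'x1 y1 x2 y2 x3 y3 x4 y4 category difficult' line from a DOTA .txt file."""
    parts = line.split()
    corners = [float(v) for v in parts[:8]]  # four (x, y) polygon corners
    category = parts[8]
    difficult = int(parts[9])
    return corners, category, difficult

# Hypothetical axis-aligned quadrilateral labeled 'plane', difficulty 0.
print(parse_dota_line("10 10 50 10 50 30 10 30 plane 0"))
```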
python src/super_gradients/examples/dota_prepare_dataset/dota_prepare_dataset.py --data_dir <path-to>/dota --output_dir <path-to>/dota_tiles
python -m super_gradients.train_from_recipe --config-name yolo_nas_r_s_dota dataset_params.data_dir=<path-to>/dota_tiles
dataset_params:
train_dataset_params:
data_dir: <path-to>/dota_tiles/train
val_dataset_params:
data_dir: <path-to>/dota_tiles/val
from super_gradients.training.datasets import DOTAOBBDataset
train_set = DOTAOBBDataset(data_dir="<path-to>/dota_tiles/train", ...)