Configuration Files and Recipes
SuperGradients supports YAML-formatted configuration files. These files can contain training hyper-parameters, architecture parameters, dataset parameters, and any other parameters required by the training process. These parameters are consumed as dictionaries or as function arguments by different parts of SuperGradients.
You can use SuperGradients without any configuration files; look into the examples directory to see how.
SuperGradients was designed to expose as many parameters as possible for configuration from outside the code, without writing a single line of it. You can control the learning rate, the weight decay, or even the loss function and metric used in training; moreover, you can control which block type or activation function your model uses. You can learn about and define all of these parameters in the configuration files.
Here is an example YAML file (training hyper-parameters in this case):
defaults:
  - default_train_params

max_epochs: 250

lr_updates:
  _target_: numpy.arange
  start: 100
  stop: 250
  step: 50

lr_decay_factor: 0.1
lr_mode: step
lr_warmup_epochs: 0
initial_lr: 0.1
loss: cross_entropy
optimizer: SGD
criterion_params: {}

optimizer_params:
  weight_decay: 1e-4
  momentum: 0.9
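The _target_ entry above tells the configuration system to call the referenced function with the listed arguments, so lr_updates resolves to the output of numpy.arange. A quick sanity check in plain Python:

import numpy as np

# The lr_updates entry above effectively becomes numpy.arange(100, 250, 50):
# the epochs at which the learning rate is multiplied by lr_decay_factor.
print(np.arange(100, 250, 50))  # -> [100 150 200]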
Why use configuration files
Using configuration files might seem complicated or redundant at first, but after a short learning period you will find them extremely convenient and useful.
Configuration files can help you manage your assets, such as datasets, models, and training recipes. Keeping your code as free of hard-coded parameters as possible lets you hold all of your configuration in one place and reuse the same code to define different objects. In the following example, we define a training set and a validation set for ImageNet. Both use the same piece of code with different configurations:
train_dataset_params:
  root: /data/Imagenet/train
  transforms:
    - RandomResizedCropAndInterpolation:
        size: 224
        interpolation: default
    - RandomHorizontalFlip
    - ToTensor
    - Normalize:
        mean: ${dataset_params.img_mean}
        std: ${dataset_params.img_std}

val_dataset_params:
  root: /data/Imagenet/val
  transforms:
    - Resize:
        size: 256
    - CenterCrop:
        size: 224
    - ToTensor
    - Normalize:
        mean: ${dataset_params.img_mean}
        std: ${dataset_params.img_std}
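Both blocks above can then be consumed by the same piece of code. Here is a minimal sketch of that idea; the build_dataset helper and the file name are hypothetical, not SuperGradients' actual API:

from omegaconf import OmegaConf

def build_dataset(params):
    # The same code builds either split; only the configuration it receives differs.
    print(f"root={params.root}, num_transforms={len(params.transforms)}")

# "imagenet_dataset_params.yaml" is a hypothetical file holding the two blocks above.
cfg = OmegaConf.load("imagenet_dataset_params.yaml")
build_dataset(cfg.train_dataset_params)
build_dataset(cfg.val_dataset_params)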
Configuration files can also help you track the exact settings used for each of your experiments, tweak and tune those settings, and share them with others. Concentrating all of these configuration parameters in one place gives you great visibility into and control over your experiments.
How to use configuration files
So, if you have made it this far, we have probably managed to convince you that configuration files are awesome and powerful tools - welcome aboard!
YAML is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted (Wikipedia). We parse each file into dictionaries, lists, and objects, and pass them to the code either as a recursive dictionary or as function arguments.
Let's try running a training session from a configuration file.
python -m super_gradients.train_from_recipe --config-name=cifar10_resnet
The recipe you have just used is a configuration file containing everything SG needs to know in order to train ResNet18 on CIFAR10. The actual YAML file is located in src/super_gradients/recipes/cifar10_resnet.yaml. In the same recipes library you can find many more configuration files defining different models, datasets, and training hyper-parameters.
Try changing the initial_lr parameter in the file src/super_gradients/recipes/training_hyperparams/cifar10_resnet_train_params.yaml and launch this script again. You will see a different result now. This is because the parameters from cifar10_resnet_train_params.yaml are used in cifar10_resnet.yaml (we will discuss this in the next section).
Two more useful entry points are
python -m super_gradients.resume_experiment --experiment_name=cifar10_resnet
which resumes the experiment from its last checkpoint, and
python -m super_gradients.evaluate_from_recipe --config-name=cifar10_resnet
which runs only the evaluation defined in the recipe, without training.
Hydra
Hydra is an open-source Python framework that provides many useful functionalities for YAML management. You can learn more about Hydra at https://hydra.cc. We use Hydra to load YAML files and convert them into dictionaries, while instantiating the objects referenced in the YAML. You can see this in the code:
from omegaconf import DictConfig
import hydra
import pkg_resources
from super_gradients import Trainer, init_trainer

@hydra.main(config_path=pkg_resources.resource_filename("super_gradients.recipes", ""), version_base="1.2")
def main(cfg: DictConfig) -> None:
    # cfg holds the fully composed recipe, which the Trainer consumes directly.
    Trainer.train_from_config(cfg)

def run():
    init_trainer()
    main()

if __name__ == "__main__":
    run()
The @hydra.main decorator looks for YAML files in super_gradients.recipes according to the name of the configuration file provided as the first argument on the command line (--config-name). In the experiment directory, a .hydra subdirectory will be created, and the configuration files related to this run will be saved by Hydra to that subdirectory.
Two Hydra features worth mentioning are YAML Composition and Command-Line Overrides.
YAML Composition
If you browse the YAML files in the recipes directory, you will see that some files contain the reserved key defaults: at the beginning of the file.
defaults:
  - training_hyperparams: cifar10_resnet_train_params
  - dataset_params: cifar10_dataset_params
  - arch_params: resnet18_cifar_arch_params
  - checkpoint_params: default_checkpoint_params
  - _self_
The parameters are referenced inside the YAML according to their origin, i.e. in the example above we can reference training_hyperparams.initial_lr (the initial_lr parameter from the cifar10_resnet_train_params.yaml file). The aggregated configuration file will be saved in the .hydra subdirectory.
Command-Line Overrides
When running with Hydra, you can override or even add configuration from the command line. These overrides apply to the specific run only.
python -m super_gradients.train_from_recipe --config-name=cifar10_resnet training_hyperparams.initial_lr=0.02 experiment_name=test_lr_002
Note that overridden parameters are given without the -- prefix and that each parameter is referenced by its full path in the configuration tree, with the levels joined by a ".".
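Hydra also lets you add a key that does not already exist in the recipe by prefixing it with + (whether the added key is actually consumed depends on the recipe; my_new_param below is a made-up name for illustration):
python -m super_gradients.train_from_recipe --config-name=cifar10_resnet +training_hyperparams.my_new_param=42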
Resolvers
Resolvers convert strings from the YAML file into Python objects or values. The most basic resolvers are the Hydra native resolvers. Here are a few simple examples:
a: 1
b: 2
c: 3
a_plus_b: "${add: ${a},${b}}"
a_plus_b_plus_c: "${add: ${a}, ${b}, ${c}}"
my_list: [10, 20, 30, 40, 50]
third_of_list: "${getitem: ${my_list}, 2}"
first_of_list: "${first: ${my_list}}"
last_of_list: "${last: ${my_list}}"
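Under the hood, such resolvers are registered with OmegaConf, the configuration library that Hydra builds on. A minimal sketch of how a resolver like add could be registered and used, for illustration only (this is not SuperGradients' actual registration code):

from omegaconf import OmegaConf

# Illustrative only - not SuperGradients' actual registration code.
OmegaConf.register_new_resolver("add", lambda *args: sum(args))

cfg = OmegaConf.create({"a": 1, "b": 2, "a_plus_b": "${add: ${a},${b}}"})
print(cfg.a_plus_b)  # -> 3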
The more advanced resolvers will instantiate objects. In the following example we define a few transforms that will be used to augment a dataset.
train_dataset_params:
  transforms:
    # for more options see common.factories.transforms_factory.py
    - SegColorJitter:
        brightness: 0.1
        contrast: 0.1
        saturation: 0.1
    - SegRandomFlip:
        prob: 0.5
    - SegRandomRescale:
        scales: [0.4, 1.6]
Each of the keys above (SegColorJitter, SegRandomFlip, SegRandomRescale) is mapped to a type, and the configuration parameters under that key will be passed to the type's constructor by name (as keyword arguments).
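Conceptually, the configuration above expands into ordinary constructor calls, roughly equivalent to the following (the import path is an assumption for illustration):

# Rough Python equivalent of the transforms configuration above.
# The import path is an assumption, for illustration purposes only.
from super_gradients.training.transforms.transforms import SegColorJitter, SegRandomFlip, SegRandomRescale

transforms = [
    SegColorJitter(brightness=0.1, contrast=0.1, saturation=0.1),
    SegRandomFlip(prob=0.5),
    SegRandomRescale(scales=[0.4, 1.6]),
]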
If you want to see where this magic happens, look for the @resolve_param decorator in the code:
class ImageNetDataset(torch_datasets.ImageFolder):
    @resolve_param("transforms", factory=TransformsFactory())
    def __init__(self, root: str, transforms: Union[list, dict] = [], *args, **kwargs):
        ...
The @resolve_param decorator wraps a function and resolves a string or dictionary argument (in the example above, "transforms") to an object. To do so, it uses a factory object that maps a string or a dictionary to a type. When __init__(..) is called, the function receives an object, not a dictionary, and the parameters under "transforms" in the YAML are passed as keyword arguments when instantiating the objects. We will learn how to add a new type of object to these mappings in the next sections.
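A minimal sketch of what such a decorator might do, assuming the factory exposes a get(...) method that turns a string or dictionary into an instantiated object (illustrative only, not SuperGradients' actual implementation):

import functools

def resolve_param_sketch(param_name, factory):
    # Illustrative sketch only - not SuperGradients' actual implementation.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            value = kwargs.get(param_name)
            if isinstance(value, (str, dict, list)):
                # Replace the raw configuration with an instantiated object.
                kwargs[param_name] = factory.get(value)
            return func(self, *args, **kwargs)
        return wrapper
    return decorator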
Registering a new object
To use a new object from your configuration file, you need to define a mapping of a string to a type. This is done using one of the many registration functions supported by SG:
- register_model
- register_detection_module
- register_metric
- register_loss
- register_dataloader
- register_callback
- register_transform
- register_dataset
These decorator functions can be imported and used as follows:
from torch import nn

from super_gradients.common.registry import register_model

@register_model(name="MyNet")
class MyExampleNet(nn.Module):
    def __init__(self, num_classes: int):
        ...
This simple decorator maps the name "MyNet" to the type MyExampleNet. Note that if your constructor includes required arguments, you will be expected to provide them when using this name in your configuration:
...
architecture:
  MyNet:
    num_classes: 8
...
Required Hyper-Parameters
Most parameters can be defined by default when including default_train_params in your defaults.
However, the following hyper-parameters are required to launch a training run:
train_dataloader:
val_dataloader:
architecture:
training_hyperparams:
  initial_lr:
  loss:
experiment_name:

multi_gpu: # When training with multi GPU
num_gpus: # When training with multi GPU

# THE FOLLOWING PARAMS ARE DIRECTLY USED BY HYDRA
hydra:
  run:
    # Set the output directory (i.e. where the .hydra folder that logs all the input params will be generated)
    dir: ${hydra_output_dir:${ckpt_root_dir}, ${experiment_name}}
Other parameters may also be required, depending on the specific model, dataset, loss function, etc. Follow the error message in case your experiment did not launch properly.
Recipes library structure
The super_gradients/recipes directory includes the following subdirectories:
- arch_params - containing configuration files for instantiating different models
- checkpoint_params - containing configuration files that define the checkpoint loading and saving parameters for training
- conversion_params - containing configuration files for the model conversion scripts (for deployment)
- dataset_params - containing configuration files for instantiating different datasets and dataloaders
- training_hyperparams - containing configuration files holding hyper-parameters for specific recipes
These configuration files will be available for use both in the installed version and in the development version of SG.