modalities.optimizers package
Submodules
modalities.optimizers.lr_schedulers module
modalities.optimizers.optimizer_factory module
- class modalities.optimizers.optimizer_factory.OptimizerFactory[source]
Bases:
object
- static get_fsdp1_checkpointed_optimizer_(checkpoint_loading, checkpoint_path, wrapped_model, optimizer)[source]
Loads an FSDP1-checkpointed optimizer from a checkpoint file (a usage sketch follows the parameter list below).
- Parameters:
checkpoint_loading (FSDP1CheckpointLoadingIF): The FSDP1 checkpoint loading strategy.
checkpoint_path (Path): The path to the checkpoint file.
wrapped_model (FullyShardedDataParallel): The FSDP1-wrapped model associated with the optimizer.
optimizer (Optimizer): The optimizer to load the checkpoint into.
- Return type:
Optimizer
- Returns:
Optimizer: The optimizer loaded from the checkpoint.
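A minimal usage sketch, assuming a single-process distributed setup on CPU: the checkpoint path is hypothetical, and `checkpoint_loading` is left abstract because constructing a concrete `FSDP1CheckpointLoadingIF` implementation depends on your configuration and is not covered by this reference.

```python
import os
from pathlib import Path

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

from modalities.optimizers.optimizer_factory import OptimizerFactory

# Single-process process group so FSDP can be constructed (sketch only).
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

wrapped_model = FSDP(nn.Linear(16, 16))

# A fresh optimizer whose state is replaced by the checkpointed state.
optimizer = torch.optim.AdamW(wrapped_model.parameters(), lr=1e-4)

# Assumption: built elsewhere from your config; must implement
# FSDP1CheckpointLoadingIF.
checkpoint_loading = ...

optimizer = OptimizerFactory.get_fsdp1_checkpointed_optimizer_(
    checkpoint_loading=checkpoint_loading,
    checkpoint_path=Path("checkpoints/optimizer.bin"),  # hypothetical path
    wrapped_model=wrapped_model,
    optimizer=optimizer,
)
```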
- modalities.optimizers.optimizer_factory.get_optimizer_groups(model, weight_decay, weight_decay_groups_excluded)[source]
Divides the model parameters into optimizer groups, with or without weight decay (see the sketch after the references below).
Inspired by:
- https://github.com/pytorch/pytorch/issues/101343
- https://github.com/karpathy/nanoGPT
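To illustrate the pattern from the nanoGPT reference, here is a minimal sketch of dimension-based parameter grouping: weight decay is applied only to matrices, while 1-D parameters (biases, normalization weights) are exempted. `build_optimizer_groups` is a hypothetical helper, not the modalities implementation; the actual get_optimizer_groups additionally supports excluding named parameter groups via `weight_decay_groups_excluded`.

```python
import torch
import torch.nn as nn


def build_optimizer_groups(model: nn.Module, weight_decay: float) -> list[dict]:
    """Split trainable parameters into a decayed and an undecayed group.

    Follows the heuristic used in nanoGPT: weight decay is applied to
    parameters with dim >= 2 (weight matrices), while 1-D parameters
    such as biases and normalization weights receive no decay.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    decay = [p for p in params if p.dim() >= 2]
    no_decay = [p for p in params if p.dim() < 2]
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


model = nn.Sequential(nn.Linear(16, 32), nn.LayerNorm(32), nn.Linear(32, 4))
groups = build_optimizer_groups(model, weight_decay=0.1)
optimizer = torch.optim.AdamW(groups, lr=3e-4)
```

Passing a list of group dicts to `torch.optim.AdamW` lets each group carry its own `weight_decay`, which is how per-group decay settings are expressed in PyTorch.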