modalities.config package

Submodules

modalities.config.component_factory module

class modalities.config.component_factory.ComponentFactory(registry)[source]

Bases: object

Factory class to build the components from a config dictionary.

Initializes the ComponentFactory with a registry.

Parameters:

registry (Registry): Registry object used to look up the component and config classes.

build_components(config_dict, components_model_type)[source]

Builds the components from a config dictionary. All components specified in components_model_type are built from the config dictionary in a recursive manner.

Parameters:
  • config_dict (dict): Dictionary with the configuration of the components.

  • components_model_type (Type[BaseModelChild]): Base model type defining the components to be built.

Return type:

TypeVar(BaseModelChild, bound=BaseModel)

Returns:

BaseModelChild: Instance of the components_model_type with the built components.
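The recursive build described above can be illustrated with a minimal, stdlib-only sketch. It is not the modalities implementation: the REGISTRY dictionary, the Tokenizer and Dataset stand-in classes, the build helper, and the component_key / variant_key / config layout of the dictionary are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for a registry: maps (component_key, variant_key)
# to a callable that builds the component from its config values.
REGISTRY: dict[tuple[str, str], Callable[..., Any]] = {}


@dataclass
class Tokenizer:
    vocab_size: int


@dataclass
class Dataset:
    path: str
    tokenizer: Tokenizer


REGISTRY[("tokenizer", "simple")] = Tokenizer
REGISTRY[("dataset", "text")] = Dataset

COMPONENT_KEYS = {"component_key", "variant_key", "config"}


def build(node: Any) -> Any:
    """Recursively replace component definitions with built objects."""
    if isinstance(node, dict):
        # Build children first so nested components are ready for the parent.
        built = {key: build(value) for key, value in node.items()}
        if built.keys() >= COMPONENT_KEYS:
            factory = REGISTRY[(built["component_key"], built["variant_key"])]
            return factory(**built["config"])
        return built
    if isinstance(node, list):
        return [build(item) for item in node]
    return node  # plain values (str, int, ...) are passed through unchanged


config_dict = {
    "train_dataset": {
        "component_key": "dataset",
        "variant_key": "text",
        "config": {
            "path": "data/train.txt",
            "tokenizer": {
                "component_key": "tokenizer",
                "variant_key": "simple",
                "config": {"vocab_size": 32000},
            },
        },
    }
}

components = build(config_dict)
```

In this sketch the nested tokenizer is built before the dataset that depends on it, which is the point of walking the dictionary depth-first.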

modalities.config.config module

class modalities.config.config.ActivationCheckpointedModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model (Annotated[FullyShardedDataParallel, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • activation_checkpointing_modules (list[str] | None)

activation_checkpointing_modules: Optional[list[str]]
model: Annotated[FullyShardedDataParallel]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.AdamOptimizerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • lr (float)

  • wrapped_model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • betas (tuple[float, float])

  • eps (float)

  • weight_decay (float)

  • weight_decay_groups_excluded (list[str])

betas: tuple[float, float]
eps: float
lr: float
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

weight_decay: float
weight_decay_groups_excluded: list[str]
wrapped_model: Annotated[Module]
class modalities.config.config.AdamWOptimizerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • lr (float)

  • wrapped_model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • betas (tuple[float, float])

  • eps (float)

  • weight_decay (float)

  • weight_decay_groups_excluded (list[str])

betas: tuple[float, float]
eps: float
lr: float
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

weight_decay: float
weight_decay_groups_excluded: list[str]
wrapped_model: Annotated[Module]
class modalities.config.config.BatchSamplerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • sampler (Annotated[Sampler, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • batch_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • drop_last (Literal[True])

batch_size: Annotated[int]
drop_last: Literal[True]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sampler: Annotated[Sampler]
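
The batch_size annotation carries Strict(strict=True) and Gt(gt=0), which in pydantic mean that the value is not coerced from other types and must be greater than zero. A plain-Python sketch of the same two checks follows; the helper name check_strict_positive_int is hypothetical, and this is not how pydantic implements the constraints.

```python
def check_strict_positive_int(value: object) -> int:
    """Accept only real ints greater than zero (no coercion from str or float)."""
    # type(...) is int also excludes bool, which is a subclass of int.
    if type(value) is not int:
        raise TypeError(f"expected int, got {type(value).__name__}")
    if value <= 0:
        raise ValueError("value must be greater than 0")
    return value
```

Under these assumptions, check_strict_positive_int(8) returns 8, while "8" is rejected for its type and 0 is rejected for its value.
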
class modalities.config.config.CLMCrossEntropyLossConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • target_key (str)

  • prediction_key (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

prediction_key: str
target_key: str
class modalities.config.config.CheckpointSavingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_saving_strategy (Annotated[CheckpointSavingStrategyIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • checkpoint_saving_execution (Annotated[CheckpointSavingExecutionABC, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

checkpoint_saving_execution: Annotated[CheckpointSavingExecutionABC]
checkpoint_saving_strategy: Annotated[CheckpointSavingStrategyIF]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.CombinedDatasetConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

datasets (list[Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>]])

datasets: list[Annotated[Dataset]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.CompiledModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • block_names (list[str])

  • fullgraph (bool | None)

  • debug (bool | None)

block_names: list[str]
debug: Optional[bool]
fullgraph: Optional[bool]
model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.ConstantLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0), Le(le=1.0)])])

  • total_iters (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

factor: Annotated[float]
last_epoch: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
total_iters: Annotated[int]
class modalities.config.config.CosineAnnealingLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • t_max (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • eta_min (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0)])])

  • last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

eta_min: Annotated[float]
last_epoch: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
t_max: Annotated[int]
class modalities.config.config.DCPAppStateConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • raw_app_state (Annotated[AppState, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • checkpoint_dir_path (Path)

checkpoint_dir_path: Path
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

raw_app_state: Annotated[AppState]
class modalities.config.config.DCPCheckpointLoadingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

global_rank: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.DCPCheckpointSavingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_path (Path)

  • global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • experiment_id (str)

checkpoint_path: Path
experiment_id: str
global_rank: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.DistributedSamplerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • num_replicas (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • shuffle (bool)

  • dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • seed (int | None)

  • drop_last (Literal[True])

dataset: Annotated[Dataset]
drop_last: Literal[True]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_replicas: Annotated[int]
rank: Annotated[int]
seed: Optional[int]
shuffle: bool
class modalities.config.config.DummyLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
class modalities.config.config.DummyProgressSubscriberConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.DummyResultSubscriberConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.FSDP1CheckpointLoadingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
block_names: list[str]
global_rank: Annotated[int]
mixed_precision_settings: MixedPrecisionSettings
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

classmethod parse_mixed_precision_setting_by_name(name)[source]
classmethod parse_sharding_strategy_by_name(name)[source]
sharding_strategy: ShardingStrategy
class modalities.config.config.FSDP1CheckpointSavingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_path (Path)

  • global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • experiment_id (str)

checkpoint_path: Path
experiment_id: str
global_rank: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.FSDP1CheckpointedModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_loading (Annotated[FSDP1CheckpointLoadingIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • checkpoint_path (Path)

  • model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

checkpoint_loading: Annotated[FSDP1CheckpointLoadingIF]
checkpoint_path: Path
model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.FSDP1CheckpointedOptimizerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_loading (Annotated[FSDP1CheckpointLoadingIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • checkpoint_path (Path)

  • wrapped_model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

checkpoint_loading: Annotated[FSDP1CheckpointLoadingIF]
checkpoint_path: Path
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
wrapped_model: Annotated[Module]
class modalities.config.config.FSDP2WrappedModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • block_names (list[str])

  • mixed_precision_settings (FSDP2MixedPrecisionSettings)

  • reshard_after_forward (bool)

  • device_mesh (Annotated[DeviceMesh, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

block_names: list[str]
device_mesh: Annotated[DeviceMesh]
mixed_precision_settings: FSDP2MixedPrecisionSettings
model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

reshard_after_forward: bool
validate_dp_mesh_existence()[source]
validate_mixed_precision_settings()[source]
class modalities.config.config.FSDPWrappedModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
block_names: list[str]
mixed_precision_settings: MixedPrecisionSettings
model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

classmethod parse_mixed_precision_setting_by_name(name)[source]
classmethod parse_sharding_strategy_by_name(name)[source]
sharding_strategy: ShardingStrategy
sync_module_states: bool
class modalities.config.config.GPT2LLMCollateFnConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • sample_key (str)

  • target_key (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sample_key: str
target_key: str
class modalities.config.config.GPT2MFUCalculatorConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • n_layer (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • sequence_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • n_embd (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • world_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • wrapped_model (Annotated[FullyShardedDataParallel, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>] | Annotated[FSDPModule, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_embd: Annotated[int]
n_layer: Annotated[int]
sequence_length: Annotated[int]
world_size: Annotated[int]
wrapped_model: Union[Annotated[FullyShardedDataParallel], Annotated[FSDPModule]]
class modalities.config.config.LLMDataLoaderConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • dataloader_tag (str)

  • dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • batch_sampler (Annotated[Sampler, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • collate_fn (Annotated[CollateFnIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>] | None)

  • num_workers (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • pin_memory (bool)

batch_sampler: Annotated[Sampler]
collate_fn: Optional[Annotated[CollateFnIF]]
dataloader_tag: str
dataset: Annotated[Dataset]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_workers: Annotated[int]
pin_memory: bool
class modalities.config.config.LinearLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • start_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0), Le(le=1.0)])])

  • end_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0), Le(le=1.0)])])

  • total_iters (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

end_factor: Annotated[float]
last_epoch: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
start_factor: Annotated[float]
total_iters: Annotated[int]
class modalities.config.config.MemMapDatasetConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • raw_data_path (Annotated[Path, PathType(path_type=file)])

  • index_path (Annotated[Path, PathType(path_type=file)] | None)

  • tokenizer (Annotated[TokenizerWrapper, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • jq_pattern (str)

  • sample_key (str)

index_path: Optional[Annotated[Path]]
jq_pattern: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

raw_data_path: Annotated[Path]
sample_key: str
tokenizer: Annotated[TokenizerWrapper]
class modalities.config.config.OneCycleLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object>])

  • max_lr (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]])

  • total_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None)

  • epochs (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None)

  • steps_per_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None)

  • pct_start (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0), Le(le=1.0)])])

  • anneal_strategy (str)

  • cycle_momentum (bool)

  • base_momentum (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]])

  • max_momentum (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]])

  • div_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])])

  • final_div_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])])

  • three_phase (bool)

  • last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

anneal_strategy: str
base_momentum: Union[Annotated[float], list[Annotated[float]]]
check_totals_steps_and_epchs()[source]
Return type:

OneCycleLRSchedulerConfig

cycle_momentum: bool
div_factor: Annotated[float]
epochs: Optional[Annotated[int]]
final_div_factor: Annotated[float]
last_epoch: Annotated[int]
max_lr: Union[Annotated[float], list[Annotated[float]]]
max_momentum: Union[Annotated[float], list[Annotated[float]]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
pct_start: Annotated[float]
steps_per_epoch: Optional[Annotated[int]]
three_phase: bool
total_steps: Optional[Annotated[int]]
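
The fields total_steps, epochs, and steps_per_epoch are all optional here, while PyTorch's OneCycleLR needs either total_steps or both epochs and steps_per_epoch. The name of the check_totals_steps_and_epchs validator suggests a consistency check of this kind, but its exact rule is not documented on this page. The following stdlib-only sketch shows one plausible form of such a check; the OneCycleSteps class and its method are hypothetical and are not part of modalities.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OneCycleSteps:
    """Hypothetical stand-in holding only the three step-count fields."""

    total_steps: Optional[int] = None
    epochs: Optional[int] = None
    steps_per_epoch: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed rule: total_steps alone is enough; otherwise both
        # epochs and steps_per_epoch must be given.
        has_epoch_pair = self.epochs is not None and self.steps_per_epoch is not None
        if self.total_steps is None and not has_epoch_pair:
            raise ValueError("Set total_steps, or both epochs and steps_per_epoch.")

    def resolved_total_steps(self) -> int:
        """Return the schedule length implied by the given fields."""
        if self.total_steps is not None:
            return self.total_steps
        return self.epochs * self.steps_per_epoch
```

With this sketch, OneCycleSteps(epochs=3, steps_per_epoch=10) resolves to 30 steps, and OneCycleSteps(epochs=3) is rejected because the schedule length cannot be derived.
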
class modalities.config.config.PackedMemMapDatasetContinuousConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • raw_data_path (Path)

  • sequence_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=1)])])

  • sample_key (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

raw_data_path: Path
sample_key: str
sequence_length: Annotated[int]
class modalities.config.config.PackedMemMapDatasetMegatronConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • raw_data_path (Path)

  • block_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=1)])])

  • sample_key (str)

block_size: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

raw_data_path: Path
sample_key: str
class modalities.config.config.PassType(value)[source]

Bases: LookupEnum

BY_REFERENCE = 'by_reference'
BY_VALUE = 'by_value'
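
The two members above map to the strings 'by_reference' and 'by_value'. As a minimal illustration of how such an enum can be recovered from a string, the sketch below uses the standard library Enum as a stand-in; the real class derives from LookupEnum, whose extra lookup behaviour is not shown on this page and is not reproduced here.

```python
from enum import Enum


class PassType(Enum):
    """Stand-in for the documented enum, built on the stdlib Enum."""

    BY_REFERENCE = "by_reference"
    BY_VALUE = "by_value"


# Lookup by value, e.g. for a string read from a config file.
by_value = PassType("by_value")

# Lookup by member name.
by_name = PassType["BY_REFERENCE"]
```
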
class modalities.config.config.PreTrainedHFTokenizerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • pretrained_model_name_or_path (str)

  • max_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])] | None)

  • truncation (bool)

  • padding (bool | str)

  • special_tokens (dict[str, str] | None)

max_length: Optional[Annotated[int]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

padding: bool | str
pretrained_model_name_or_path: str
special_tokens: Optional[dict[str, str]]
truncation: bool
class modalities.config.config.PreTrainedSPTokenizerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

tokenizer_model_file (str)

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

tokenizer_model_file: str
class modalities.config.config.PrecisionEnum(value)[source]

Bases: LookupEnum

BF16 = torch.bfloat16
FP16 = torch.float16
FP32 = torch.float32
class modalities.config.config.ProcessGroupBackendType(value)[source]

Bases: LookupEnum

nccl = 'nccl'
class modalities.config.config.RawAppStateConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca71d0>])

  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca79d0>])

  • lr_scheduler (Annotated[LRScheduler, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7c90>] | None)

lr_scheduler: Optional[Annotated[LRScheduler]]
model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
class modalities.config.config.ReferenceConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • instance_key (str)

  • pass_type (PassType)

instance_key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pass_type: PassType
class modalities.config.config.ResumableDistributedSamplerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7550>])

  • rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • num_replicas (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • shuffle (bool | None)

  • seed (int | None)

  • drop_last (Literal[True])

  • skip_num_global_samples (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

dataset: Annotated[Dataset]
drop_last: Literal[True]
epoch: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_replicas: Annotated[int]
rank: Annotated[int]
seed: Optional[int]
shuffle: Optional[bool]
skip_num_global_samples: Annotated[int]
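
To make the interplay of rank, num_replicas and skip_num_global_samples concrete, here is a minimal sketch. It assumes that already-consumed global samples are skipped first, that the incomplete tail is dropped (drop_last is fixed to True), and that the remainder is split round-robin across ranks. The helper shard_indices is hypothetical and not part of modalities; shuffling and epoch handling are left out, and the actual sampler may differ in detail.

```python
def shard_indices(dataset_len: int, rank: int, num_replicas: int,
                  skip_num_global_samples: int = 0) -> list[int]:
    """Return the sample indices one rank would iterate over (hypothetical helper)."""
    if num_replicas < 1 or not 0 <= rank < num_replicas:
        raise ValueError("rank must satisfy 0 <= rank < num_replicas")
    # Skip samples that were already consumed globally, e.g. before a warmstart.
    remaining = list(range(dataset_len))[skip_num_global_samples:]
    # drop_last=True: trim the tail so that every rank gets the same number of samples.
    usable = len(remaining) - len(remaining) % num_replicas
    # Round-robin assignment: rank r takes every num_replicas-th index starting at r.
    return remaining[rank:usable:num_replicas]


print(shard_indices(10, rank=0, num_replicas=2, skip_num_global_samples=3))
print(shard_indices(10, rank=1, num_replicas=2, skip_num_global_samples=3))
```

With 10 samples, 3 skipped and 2 ranks, 7 samples remain; one is dropped so that both ranks receive 3 indices each.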
class modalities.config.config.RichProgressSubscriberConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • eval_dataloaders (list[Annotated[LLMDataLoader, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7910>]] | None)

  • train_dataloader_tag (str)

  • num_seen_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • num_target_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

eval_dataloaders: Optional[list[Annotated[LLMDataLoader]]]
global_rank: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_seen_steps: Annotated[int]
num_target_steps: Annotated[int]
train_dataloader_tag: str
class modalities.config.config.RichResultSubscriberConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • num_ranks (int)

  • global_rank (int)

global_rank: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_ranks: int
class modalities.config.config.SaveEveryKStepsCheckpointingStrategyConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

k (Annotated[int, Gt(gt=0)])

k: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.SaveKMostRecentCheckpointsStrategyConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

k (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

k: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
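
The lower bound of -1 on k suggests a sentinel value. The following is a minimal sketch, assuming that k = -1 keeps every checkpoint, k = 0 keeps none, and a positive k keeps only the k most recent ones. The helper checkpoints_to_delete is hypothetical and is not the modalities implementation.

```python
def checkpoints_to_delete(saved_steps: list[int], k: int) -> list[int]:
    """Return the training steps whose checkpoints should be removed (hypothetical helper)."""
    if k < -1:
        raise ValueError("k must be >= -1")
    if k == -1:
        return []  # assumed sentinel: keep everything
    ordered = sorted(saved_steps)
    # Keep the k highest step numbers; everything before them is deleted.
    return ordered[: max(len(ordered) - k, 0)]


print(checkpoints_to_delete([100, 200, 300, 400], k=2))
print(checkpoints_to_delete([100, 200, 300, 400], k=-1))
```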

class modalities.config.config.SequentialSamplerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data_source (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7550>])

data_source: Annotated[Dataset]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.config.StepLRSchedulerConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca79d0>])

  • step_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])

  • gamma (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0)])])

  • last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

gamma: Annotated[float]
last_epoch: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer: Annotated[Optimizer]
step_size: Annotated[int]
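
step_size and gamma describe a step-decay schedule: the learning rate is multiplied by gamma once every step_size scheduler steps. Below is a minimal sketch of the closed form, assuming the config is forwarded to a PyTorch-style StepLR scheduler. The names step_lr and base_lr are illustrative and do not appear in the config class.

```python
def step_lr(base_lr: float, step: int, step_size: int, gamma: float) -> float:
    """Closed-form step decay: base_lr * gamma ** (step // step_size)."""
    if step_size <= 0:
        raise ValueError("step_size must be > 0")
    return base_lr * gamma ** (step // step_size)


# The learning rate stays constant within a window of step_size steps
# and is scaled by gamma at each window boundary.
for step in (0, 9, 10, 25):
    print(step, step_lr(base_lr=0.1, step=step, step_size=10, gamma=0.5))
```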
class modalities.config.config.TokenizerTypes(value)[source]

Bases: LookupEnum

GPT2TokenizerFast = <class 'transformers.models.gpt2.tokenization_gpt2_fast.GPT2TokenizerFast'>
LlamaTokenizerFast = <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>
class modalities.config.config.TorchCheckpointLoadingConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • device (Annotated[device, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8250>])

  • precision (PrecisionEnum | None)

device: Annotated[device]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

classmethod parse_device(device)[source]
Return type:

device

precision: Optional[PrecisionEnum]
class modalities.config.config.WandBEvaluationResultSubscriberConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

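Parameters:
  • config_file_path (Path)

  • directory (Path)

  • experiment_id (str)

  • global_rank (int)

  • mode (WandbMode)

  • project (str)
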
config_file_path: Path
directory: Path
experiment_id: str
global_rank: int
mode: WandbMode
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

project: str
class modalities.config.config.WandbMode(value)[source]

Bases: LookupEnum

DISABLED = 'DISABLED'
OFFLINE = 'OFFLINE'
ONLINE = 'ONLINE'
class modalities.config.config.WeightInitializedModelConfig(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca71d0>])

  • model_initializer (Annotated[ModelInitializationIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8ed0>])

model: Annotated[Module]
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_initializer: Annotated[ModelInitializationIF]
modalities.config.config.load_app_config_dict(config_file_path, experiment_id=None, additional_resolver_funs=None)[source]

Load the application configuration from the given YAML file. The function defines custom resolvers for the OmegaConf library to resolve environment variables and Modalities-specific variables.

Return type:

dict

Parameters:
  • config_file_path (Path): YAML config file.

  • experiment_id (str, optional): The experiment_id of the current run. Defaults to None.

  • additional_resolver_funs (dict[str, Callable], optional): Additional resolver functions. Defaults to None.

Returns:

dict: Dictionary representation of the config file.
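
The resolver mechanism can be pictured as follows. This is a simplified stand-in written with the standard library only, not OmegaConf itself. It assumes placeholders of the form ${name:arg} and a dict that maps resolver names to callables, mirroring the shape of additional_resolver_funs. The names resolve_value and resolvers, the "env" and "upper" resolvers, and the example keys are all hypothetical.

```python
import os
import re
from typing import Any, Callable

# Matches placeholders such as ${env:HOME}: a resolver name, a colon, and one argument.
_PATTERN = re.compile(r"\$\{(\w+):([^}]*)\}")


def resolve_value(value: Any, resolvers: dict[str, Callable[[str], Any]]) -> Any:
    """Recursively replace ${name:arg} placeholders using the given resolver functions."""
    if isinstance(value, dict):
        return {k: resolve_value(v, resolvers) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_value(v, resolvers) for v in value]
    if isinstance(value, str):
        # Each placeholder is replaced by the string form of its resolver's result.
        return _PATTERN.sub(lambda m: str(resolvers[m.group(1)](m.group(2))), value)
    return value


os.environ["DEMO_WORLD_SIZE"] = "4"
resolvers = {
    "env": lambda name: os.environ[name],  # environment variable lookup
    "upper": lambda text: text.upper(),    # stands in for an additional resolver
}
raw = {"world_size": "${env:DEMO_WORLD_SIZE}", "tags": ["${upper:debug}"], "seed": 42}
print(resolve_value(raw, resolvers))
```

Unlike this sketch, a real resolver framework can also return non-string values and handle nested interpolations; the point here is only the name-to-function dispatch.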

modalities.config.instantiation_models module

class modalities.config.instantiation_models.ConsistencyEnforcement(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • enforce_tokens_per_step_consistency (bool)

  • enforce_last_step_logged (bool)

  • enforce_last_step_evaluated (bool)

  • enforce_last_step_checkpointed (bool)

enforce_last_step_checkpointed: bool
enforce_last_step_evaluated: bool
enforce_last_step_logged: bool
enforce_tokens_per_step_consistency: bool
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class modalities.config.instantiation_models.CudaEnvSettings(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • local_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • world_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

global_rank: Annotated[int]
local_rank: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

world_size: Annotated[int]
class modalities.config.instantiation_models.Intervals(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • training_log_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • checkpointing_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • evaluation_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

checkpointing_interval_in_steps: Annotated[int]
evaluation_interval_in_steps: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

training_log_interval_in_steps: Annotated[int]
class modalities.config.instantiation_models.PackedDatasetComponentsInstantiationModel(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
class PackedDatasetSettings(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • src_path (Annotated[Path, PathType(path_type=file)])

  • dst_path (Path | None)

  • index_path (Annotated[Path, PathType(path_type=file)] | None)

  • jq_pattern (str)

  • num_cpus (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • eod_token (str)

  • processing_batch_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • raw_samples_queue_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • processed_samples_queue_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

dst_path: Optional[Path]
eod_token: str
index_path: Optional[Annotated[Path]]
jq_pattern: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_cpus: Annotated[int]
processed_samples_queue_size: Annotated[int]
processing_batch_size: Annotated[int]
raw_samples_queue_size: Annotated[int]
src_path: Annotated[Path]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

settings: PackedDatasetSettings
tokenizer: Annotated[TokenizerWrapper]
class modalities.config.instantiation_models.StepProfile(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • gradient_accumulation_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • local_train_micro_batch_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • sequence_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

gradient_accumulation_steps: Annotated[int]
local_train_micro_batch_size: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sequence_length: Annotated[int]
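
The fields of StepProfile, together with world_size from CudaEnvSettings, determine how many tokens a single optimizer step consumes. The sketch below assumes the usual data-parallel accounting (micro-batch size times sequence length times gradient accumulation steps times number of ranks). The helper name is hypothetical, and the exact formula used by the consistency checks is not stated in this reference.

```python
def global_tokens_per_step(local_train_micro_batch_size: int, sequence_length: int,
                           gradient_accumulation_steps: int, world_size: int) -> int:
    """Tokens consumed by one optimizer step across all ranks (assumed accounting)."""
    values = (local_train_micro_batch_size, sequence_length,
              gradient_accumulation_steps, world_size)
    if any(v < 1 for v in values):
        raise ValueError("all arguments must be >= 1")
    # Tokens per micro-batch on one rank, then scaled by accumulation and rank count.
    tokens_per_micro_batch = local_train_micro_batch_size * sequence_length
    return tokens_per_micro_batch * gradient_accumulation_steps * world_size


tokens = global_tokens_per_step(
    local_train_micro_batch_size=4,
    sequence_length=2048,
    gradient_accumulation_steps=8,
    world_size=16,
)
print(tokens)
```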
class modalities.config.instantiation_models.TextGenerationInstantiationModel(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
class TextGenerationSettings(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model_path (Annotated[Path, PathType(path_type=file)])

  • sequence_length (int)

  • device (Annotated[device, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8250>])

  • referencing_keys (dict[str, str])

device: Annotated[device]
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_path: Annotated[Path]
classmethod parse_device(device)[source]
Return type:

device

referencing_keys: dict[str, str]
sequence_length: int
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

settings: TextGenerationSettings
text_inference_component: Annotated[TextInferenceComponent]
class modalities.config.instantiation_models.TrainingComponentsInstantiationModel(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • settings (Settings)

  • app_state (Annotated[AppState, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce9050>])

  • loss_fn (Annotated[Loss, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8850>])

  • train_dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7550>])

  • train_dataloader (Annotated[LLMDataLoader, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7910>])

  • eval_dataloaders (list[Annotated[LLMDataLoader, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca7910>]])

  • progress_subscriber (Annotated[MessageSubscriberIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8b90>])

  • evaluation_subscriber (Annotated[MessageSubscriberIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8b90>])

  • checkpoint_saving (Annotated[CheckpointSaving, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca5650>])

  • gradient_clipper (Annotated[GradientClipperIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce8e10>])

  • mfu_calculator (Annotated[MFUCalculatorABC, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efce9190>] | None)

  • model_raw (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f67efca71d0>])

class Settings(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
class DCPWarmstartCheckpointPaths(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

checkpoint_folder_path (Path)

checkpoint_folder_path: Path
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class Paths(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • checkpoint_saving_path (Path)

  • extra_data (Any)

class Config[source]

Bases: object

extra = 'allow'
checkpoint_saving_path: Path
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class WarmstartCheckpointPaths(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • model_checkpoint_path (Path)

  • optimizer_checkpoint_path (Path | None)

model_checkpoint_path: Path
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

optimizer_checkpoint_path: Optional[Path]
config_file_path: Annotated[Path]
consistency_enforcement: ConsistencyEnforcement
cuda_env: CudaEnvSettings
experiment_id: str
intervals: Intervals
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

paths: Paths
referencing_keys: dict[str, str]
step_profile: StepProfile
training_progress: TrainingProgress
training_target: TrainingTarget
warmstart_checkpoint_paths: Union[WarmstartCheckpointPaths, DCPWarmstartCheckpointPaths, None]
app_state: Annotated[AppState]
checkpoint_saving: Annotated[CheckpointSaving]
eval_dataloaders: list[Annotated[LLMDataLoader]]
evaluation_subscriber: Annotated[MessageSubscriberIF]
gradient_clipper: Annotated[GradientClipperIF]
loss_fn: Annotated[Loss]
mfu_calculator: Optional[Annotated[MFUCalculatorABC]]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_raw: Annotated[Module]
progress_subscriber: Annotated[MessageSubscriberIF]
settings: Settings
train_dataloader: Annotated[LLMDataLoader]
train_dataset: Annotated[Dataset]
class modalities.config.instantiation_models.TrainingProgress(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • global_num_seen_tokens (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • num_seen_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • num_seen_samples (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

  • last_step (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])])

global_num_seen_tokens: Annotated[int]
last_step: Annotated[int]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_seen_samples: Annotated[int]
num_seen_steps: Annotated[int]
class modalities.config.instantiation_models.TrainingReportGenerator(training_target, intervals, step_profile, cuda_env, consistency_enforcement, train_dataset, training_progress)[source]

Bases: object

Parameters:
get_report()[source]
Return type:

str

class modalities.config.instantiation_models.TrainingTarget(**data)[source]

Bases: BaseModel

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:
  • num_target_tokens (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

  • num_target_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])])

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_target_steps: Annotated[int]
num_target_tokens: Annotated[int]

modalities.config.lookup_enum module

class modalities.config.lookup_enum.LookupEnum(value)[source]

Bases: Enum
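
The source of LookupEnum is not reproduced here. One plausible reading of the name is an Enum whose members can be looked up by member name as well as by value, which is convenient when a YAML config spells BF16 rather than a dtype object. The following sketch works under that assumption and uses only the standard library; NameLookupEnum and Precision are illustrative stand-ins, not modalities classes.

```python
from enum import Enum


class NameLookupEnum(Enum):
    @classmethod
    def _missing_(cls, value):
        # Called when value-based lookup fails; fall back to lookup by member name.
        if isinstance(value, str) and value in cls.__members__:
            return cls.__members__[value]
        return None  # Enum then raises ValueError for unknown inputs


class Precision(NameLookupEnum):
    FP16 = "float16"
    BF16 = "bfloat16"


print(Precision("bfloat16"))  # regular lookup by value
print(Precision("BF16"))      # fallback lookup by member name
```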

modalities.config.pydantic_if_types module

class modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF(third_party_type)[source]

Bases: object

modalities.config.utils module

modalities.config.utils.convert_base_model_config_to_dict(config)[source]

Converts a Pydantic BaseModel to a dictionary non-recursively.

Return type:

dict[Any, Any]

Parameters:

config (BaseModel)
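
"Non-recursively" means that only the top level is turned into a dict, while nested models stay as objects. The distinction is sketched here with standard-library dataclasses as a stand-in for Pydantic models, an assumption made to keep the example dependency-free. The helper shallow_dict and the Inner and Outer classes are hypothetical.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class Inner:
    k: int


@dataclass
class Outer:
    name: str
    inner: Inner


def shallow_dict(obj) -> dict:
    """Top-level fields only; nested objects are kept as they are."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


cfg = Outer(name="ckpt", inner=Inner(k=3))
print(shallow_dict(cfg))  # nested value is still an Inner instance
print(asdict(cfg))        # recursive conversion: nested value becomes a dict
```

Keeping nested objects intact is useful when the top-level fields are passed on as keyword arguments to a constructor that expects the nested objects themselves.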

modalities.config.utils.parse_torch_device(device)[source]
Return type:

device

Parameters:

device (str | int)

Module contents