modalities.config package
Submodules
modalities.config.component_factory module
- class modalities.config.component_factory.ComponentFactory(registry)[source]
- Bases: object
- Factory class to build the components from a config dictionary.
- Initializes the ComponentFactory with a registry.
 - Parameters:
- registry (Registry): Registry object to get the component and config classes.
 - build_components(config_dict, components_model_type)[source]
- Builds the components from a config dictionary. All components specified in components_model_type are built from the config dictionary in a recursive manner.
- Parameters:
- config_dict (dict): Dictionary with the configuration of the components.
- components_model_type (Type[BaseModelChild]): Base model type defining the components to be built.
- Return type:
- TypeVar(BaseModelChild, bound=BaseModel)
- Returns:
- BaseModelChild: Instance of the components_model_type with the built components.
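The recursive build can be pictured with a small stand-alone sketch. It does not call the modalities API: the `component_key` / `variant_key` / `config` layout, the `TOY_REGISTRY` dictionary, the `build` helper, and both toy classes are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical components standing in for real models and optimizers.
@dataclass
class ToyModel:
    hidden_size: int


@dataclass
class ToyOptimizer:
    lr: float
    model: ToyModel


# Hypothetical registry: (component_key, variant_key) -> constructor.
TOY_REGISTRY: dict[tuple[str, str], Callable[..., Any]] = {
    ("model", "toy"): ToyModel,
    ("optimizer", "toy"): ToyOptimizer,
}

COMPONENT_KEYS = {"component_key", "variant_key", "config"}


def build(node: Any) -> Any:
    """Recursively replace component descriptions with built objects."""
    if isinstance(node, dict):
        if node.keys() >= COMPONENT_KEYS:
            # Build nested components first, then the component itself.
            kwargs = {k: build(v) for k, v in node["config"].items()}
            constructor = TOY_REGISTRY[(node["component_key"], node["variant_key"])]
            return constructor(**kwargs)
        return {k: build(v) for k, v in node.items()}
    if isinstance(node, list):
        return [build(v) for v in node]
    return node  # plain values (int, str, ...) pass through unchanged


config_dict = {
    "optimizer": {
        "component_key": "optimizer",
        "variant_key": "toy",
        "config": {
            "lr": 0.001,
            "model": {
                "component_key": "model",
                "variant_key": "toy",
                "config": {"hidden_size": 64},
            },
        },
    }
}
components = build(config_dict)
```

In this sketch the nested model is built before the optimizer that depends on it, which is the point of resolving the dictionary depth-first.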
 
 
modalities.config.config module
- class modalities.config.config.ActivationCheckpointedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- ac_variant (ActivationCheckpointingVariants) 
- layers_fqn (str) 
- model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
- ac_fun_params (FullACParams | SelectiveLayerACParams | SelectiveOpACParams) 
 
 - class FullACParams(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - class SelectiveLayerACParams(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- ac_freq (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - class SelectiveOpACParams(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - ac_fun_params: FullACParams | SelectiveLayerACParams | SelectiveOpACParams
 - ac_variant: ActivationCheckpointingVariants
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.AdamOptimizerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.AdamWOptimizerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.BatchSamplerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.CLMCrossEntropyLossConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.CheckpointSavingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- checkpoint_saving_strategy (Annotated[CheckpointSavingStrategyIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ddd0>]) 
- checkpoint_saving_execution (Annotated[CheckpointSavingExecutionABC, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e8d0>]) 
 
 - checkpoint_saving_execution: Annotated[CheckpointSavingExecutionABC]
 - checkpoint_saving_strategy: Annotated[CheckpointSavingStrategyIF]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.CombinedDatasetConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- datasets (list[Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ec90>]]) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.CompiledModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.ConstantLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
- factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0), Le(le=1.0)])]) 
- total_iters (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
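The field names match the arguments of PyTorch's `torch.optim.lr_scheduler.ConstantLR`. Assuming the config is forwarded to that scheduler, the schedule can be sketched in plain Python; `constant_lr` and `base_lr` are illustrative names, not part of modalities.

```python
def constant_lr(base_lr: float, step: int, factor: float, total_iters: int) -> float:
    """Learning rate at `step`: scaled by `factor` until `total_iters`, unscaled afterwards."""
    return base_lr * factor if step < total_iters else base_lr


# Five steps with the rate halved for the first three of them.
schedule = [constant_lr(0.1, step, factor=0.5, total_iters=3) for step in range(5)]
```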
 
- class modalities.config.config.CosineAnnealingLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
- t_max (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- eta_min (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0)])]) 
- last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
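`t_max` and `eta_min` correspond to the `T_max` and `eta_min` arguments of PyTorch's `CosineAnnealingLR`. Assuming that scheduler is the target, its closed-form schedule for steps between 0 and `t_max` can be sketched as follows; the function name and `base_lr` are illustrative.

```python
import math


def cosine_annealing_lr(base_lr: float, step: int, t_max: int, eta_min: float) -> float:
    """Closed-form cosine annealing from `base_lr` (step 0) down to `eta_min` (step t_max)."""
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


# The rate starts at base_lr, passes the midpoint halfway, and ends at eta_min.
start = cosine_annealing_lr(0.1, 0, t_max=10, eta_min=0.01)
middle = cosine_annealing_lr(0.1, 5, t_max=10, eta_min=0.01)
end = cosine_annealing_lr(0.1, 10, t_max=10, eta_min=0.01)
```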
 
- class modalities.config.config.DCPAppStateConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DCPCheckpointLoadingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DCPCheckpointSavingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DebuggingEnrichedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DistributedSamplerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- num_replicas (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- shuffle (bool) 
- dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ec90>]) 
- seed (int | None) 
- drop_last (Literal[True]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
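The fields mirror the arguments of PyTorch's `torch.utils.data.distributed.DistributedSampler`. Assuming that sampler is the target, the unshuffled case with `drop_last=True` can be sketched in plain Python; `shard_indices` is an illustrative helper and requires `num_replicas >= 1`.

```python
def shard_indices(dataset_len: int, rank: int, num_replicas: int) -> list[int]:
    """Indices seen by `rank` without shuffling, dropping the uneven tail."""
    num_samples = dataset_len // num_replicas  # samples per rank
    total_size = num_samples * num_replicas    # indices beyond this are dropped
    return list(range(total_size))[rank:total_size:num_replicas]


# Ten samples over four ranks: every rank gets two, and indices 8 and 9 are dropped.
shards = [shard_indices(10, rank, num_replicas=4) for rank in range(4)]
```

Dropping the tail keeps every rank at the same number of samples, so no rank waits on another at the end of an epoch.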
 
- class modalities.config.config.DummyLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DummyProgressSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.DummyResultSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.EvaluationResultToDiscSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- output_file_path (Path) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDP1ActivationCheckpointedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[FullyShardedDataParallel, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e9d0>]) 
 
 - model: Annotated[FullyShardedDataParallel]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDP1CheckpointLoadingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- mixed_precision_settings (MixedPrecisionSettings) 
- sharding_strategy (ShardingStrategy) 
 
 - mixed_precision_settings: MixedPrecisionSettings
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - sharding_strategy: ShardingStrategy
 
- class modalities.config.config.FSDP1CheckpointSavingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDP1CheckpointedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- checkpoint_loading (Annotated[FSDP1CheckpointLoadingIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c1ab90>]) 
- checkpoint_path (Path) 
- model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
 
 - checkpoint_loading: Annotated[FSDP1CheckpointLoadingIF]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDP1CheckpointedOptimizerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- checkpoint_loading (Annotated[FSDP1CheckpointLoadingIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c1ab90>]) 
- checkpoint_path (Path) 
- wrapped_model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
 
 - checkpoint_loading: Annotated[FSDP1CheckpointLoadingIF]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDP2WrappedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
- mixed_precision_settings (FSDP2MixedPrecisionSettings) 
- reshard_after_forward (bool) 
- device_mesh (Annotated[DeviceMesh, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2f510>]) 
 
 - device_mesh: Annotated[DeviceMesh]
 - mixed_precision_settings: FSDP2MixedPrecisionSettings
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.FSDPWrappedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
- sync_module_states (bool) 
- mixed_precision_settings (MixedPrecisionSettings) 
- sharding_strategy (ShardingStrategy) 
 
 - mixed_precision_settings: MixedPrecisionSettings
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - sharding_strategy: ShardingStrategy
 
- class modalities.config.config.GPT2LLMCollateFnConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.GPT2MFUCalculatorConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- n_layer (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- sequence_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- n_embd (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- world_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- wrapped_model (Annotated[FullyShardedDataParallel, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e9d0>] | Annotated[FSDPModule, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2eb10>]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - wrapped_model: Union[Annotated[FullyShardedDataParallel], Annotated[FSDPModule]]
 
- class modalities.config.config.GPT2ModelTPConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[Module, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2e910>]) 
- device_mesh (Annotated[DeviceMesh, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2f510>]) 
 
 - device_mesh: Annotated[DeviceMesh]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.LLMDataLoaderConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- dataloader_tag (str) 
- dataset (Annotated[Dataset, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ec90>]) 
- batch_sampler (Annotated[Sampler, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ed10>]) 
- collate_fn (Annotated[CollateFnIF, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2edd0>] | None) 
- num_workers (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- pin_memory (bool) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.LinearLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
- start_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0), Le(le=1.0)])]) 
- end_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0), Le(le=1.0)])]) 
- total_iters (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.MemMapDatasetConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- raw_data_path (Annotated[Path, PathType(path_type=file)]) 
- index_path (Annotated[Path, PathType(path_type=file)] | None) 
- tokenizer (Annotated[TokenizerWrapper, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ebd0>]) 
- jq_pattern (str) 
- sample_key (str) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - tokenizer: Annotated[TokenizerWrapper]
 
- class modalities.config.config.OneCycleLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2ef10>]) 
- max_lr (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]]) 
- total_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None) 
- epochs (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None) 
- steps_per_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | None) 
- pct_start (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0), Le(le=1.0)])]) 
- anneal_strategy (str) 
- cycle_momentum (bool) 
- base_momentum (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]]) 
- max_momentum (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])] | list[Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]]) 
- div_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]) 
- final_div_factor (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0.0)])]) 
- three_phase (bool) 
- last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.PackedMemMapDatasetContinuousConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.PackedMemMapDatasetMegatronConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.ParallelDegreeConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
- device_mesh (Annotated[DeviceMesh, <modalities.config.pydantic_if_types.PydanticThirdPartyTypeIF object at 0x7f84b6c2f510>]) 
- parallelism_methods (list[ParallelismDegrees]) 
 
 - device_mesh: Annotated[DeviceMesh]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - parallelism_methods: list[ParallelismDegrees]
 
- class modalities.config.config.PassType(value)[source]
- Bases: LookupEnum
 - BY_REFERENCE = 'by_reference'
 - BY_VALUE = 'by_value'
 
- class modalities.config.config.PreTrainedHFTokenizerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments. Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model. self is explicitly positional-only to allow self as a field name.
- Parameters:
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.PreTrainedSPTokenizerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- tokenizer_model_file (str) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.PrecisionEnum(value)[source]
- Bases: LookupEnum
 - BF16 = torch.bfloat16
 - FP16 = torch.float16
 - FP32 = torch.float32
 
- class modalities.config.config.ProcessGroupBackendType(value)[source]
- Bases: LookupEnum
 - nccl = 'nccl'
 
- class modalities.config.config.RawAppStateConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[Module, PydanticThirdPartyTypeIF])
- optimizer (Annotated[Optimizer, PydanticThirdPartyTypeIF])
- lr_scheduler (Annotated[LRScheduler, PydanticThirdPartyTypeIF] | None)

 - lr_scheduler: Optional[Annotated[LRScheduler]]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.ReferenceConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.ResumableDistributedSamplerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- dataset (Annotated[Dataset, PydanticThirdPartyTypeIF])
- rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- num_replicas (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- shuffle (bool | None) 
- seed (int | None) 
- drop_last (Literal[True]) 
- skip_num_global_samples (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
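The fields above (rank, num_replicas, epoch, shuffle, seed, drop_last, skip_num_global_samples) suggest how a resumable sampler might partition a dataset across ranks after a restart. The following is a minimal sketch under stated assumptions, not the Modalities implementation: the function name resumable_indices and the order of shuffling, skipping, truncating, and striding are hypothetical and chosen only for illustration.

```python
import random


def resumable_indices(dataset_len, rank, num_replicas, epoch=0, shuffle=False,
                      seed=0, skip_num_global_samples=0):
    """Return the sample indices one rank would iterate over (illustrative only)."""
    indices = list(range(dataset_len))
    if shuffle:
        # Seed with seed + epoch so every rank draws the same permutation.
        random.Random(seed + epoch).shuffle(indices)
    # Assumption: samples already consumed globally are skipped before sharding.
    indices = indices[skip_num_global_samples:]
    # drop_last=True: truncate so every rank receives the same number of samples.
    usable = len(indices) - len(indices) % num_replicas
    # Stride over the remaining indices, one slot per rank.
    return indices[rank:usable:num_replicas]


rank0 = resumable_indices(11, rank=0, num_replicas=2, skip_num_global_samples=4)
rank1 = resumable_indices(11, rank=1, num_replicas=2, skip_num_global_samples=4)
```

With 11 samples, 4 already seen, and 2 replicas, the trailing sample is dropped and the two ranks receive disjoint, equally sized index lists.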
 
- class modalities.config.config.RichProgressSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- eval_dataloaders (list[Annotated[LLMDataLoader, PydanticThirdPartyTypeIF]] | None)
- train_dataloader_tag (str)
- num_seen_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])
- num_target_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])])
- global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])])

 - eval_dataloaders: Optional[list[Annotated[LLMDataLoader]]]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.RichResultSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.SaveEveryKStepsCheckpointingStrategyConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.SaveKMostRecentCheckpointsStrategyConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- k (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
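The constraint k >= -1 on this config hints that -1 acts as a sentinel. The sketch below assumes that -1 means "keep every checkpoint", 0 means "keep none", and a positive k keeps the k most recent ones; that reading, along with the helper name checkpoints_to_delete, is an assumption for illustration and is not taken from the Modalities source.

```python
def checkpoints_to_delete(saved_steps, k):
    """Return the training steps whose checkpoints would be removed (illustrative only)."""
    if k < -1:
        raise ValueError("k must be >= -1")
    if k == -1:
        # Assumed meaning of the sentinel: never delete anything.
        return []
    ordered = sorted(saved_steps)
    # Keep the k highest step numbers; everything older is marked for deletion.
    return ordered[: max(len(ordered) - k, 0)]


stale = checkpoints_to_delete([100, 200, 300, 400], k=2)
```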
 
- class modalities.config.config.SequentialSamplerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- data_source (Annotated[Dataset, PydanticThirdPartyTypeIF])
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.StepLRSchedulerConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- optimizer (Annotated[Optimizer, PydanticThirdPartyTypeIF])
- step_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Gt(gt=0)])]) 
- gamma (Annotated[float, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0.0)])]) 
- last_epoch (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
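Assuming this config is forwarded to a step-decay scheduler such as PyTorch's StepLR, the learning rate is multiplied by gamma once every step_size scheduler steps. The closed form can be sketched in plain Python as follows; the helper name step_lr and the base_lr argument are hypothetical, and the sketch does not touch an optimizer.

```python
import math


def step_lr(base_lr, step, step_size, gamma):
    """Closed-form step decay: base_lr * gamma ** floor(step / step_size)."""
    if step_size <= 0:
        raise ValueError("step_size must be > 0")
    if gamma < 0:
        raise ValueError("gamma must be >= 0")
    # Integer division counts how many decay boundaries have been crossed.
    return base_lr * gamma ** (step // step_size)


schedule = [step_lr(0.1, s, step_size=10, gamma=0.5) for s in (0, 9, 10, 25)]
```

The learning rate stays at 0.1 for steps 0 to 9, halves at step 10, and has been halved twice by step 25.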
 
- class modalities.config.config.TokenizerTypes(value)[source]
- Bases: LookupEnum
 - GPT2TokenizerFast = <class 'transformers.models.gpt2.tokenization_gpt2_fast.GPT2TokenizerFast'>
 - LlamaTokenizerFast = <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>
 
- class modalities.config.config.TorchCheckpointLoadingConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- device (Annotated[device, PydanticThirdPartyTypeIF])
- precision (PrecisionEnum | None) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - precision: Optional[PrecisionEnum]
 
- class modalities.config.config.WandBEvaluationResultSubscriberConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.config.WandbMode(value)[source]
- Bases: LookupEnum
 - DISABLED = 'DISABLED'
 - OFFLINE = 'OFFLINE'
 - ONLINE = 'ONLINE'
 
- class modalities.config.config.WeightInitializedModelConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- model (Annotated[Module, PydanticThirdPartyTypeIF])
- model_initializer (Annotated[ModelInitializationIF, PydanticThirdPartyTypeIF])
 
 - model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - model_initializer: Annotated[ModelInitializationIF]
 
- modalities.config.config.load_app_config_dict(config_file_path, experiment_id=None, additional_resolver_funs=None)[source]
- Load the application configuration from the given YAML file. The function defines custom resolvers for the OmegaConf library to resolve environment variables and Modalities-specific variables.
 - Args:
- config_file_path (Path): YAML config file.
- experiment_id (str, optional): The experiment_id of the current run. Defaults to None.
- additional_resolver_funs (dict[str, Callable], optional): Additional resolver functions. Defaults to None.
- Returns:
- dict: Dictionary representation of the config file. 
 
modalities.config.instantiation_models module
- class modalities.config.instantiation_models.ConsistencyEnforcement(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.CudaEnvSettings(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- local_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- world_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- global_rank (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
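The Strict(strict=True) and Ge(...) metadata shown in the parameter list can be reproduced with pydantic's Field. The stand-in class below is a sketch assuming pydantic v2; CudaEnvSketch is a hypothetical name and only mirrors the three documented fields, it is not the Modalities class.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class CudaEnvSketch(BaseModel):
    """Stand-in mirroring the documented constraints (illustrative only)."""

    local_rank: Annotated[int, Field(strict=True, ge=0)]
    world_size: Annotated[int, Field(strict=True, ge=1)]
    global_rank: Annotated[int, Field(strict=True, ge=0)]


ok = CudaEnvSketch(local_rank=0, world_size=8, global_rank=3)

failed_fields = []
bad_inputs = (
    {"local_rank": "0", "world_size": 8, "global_rank": 3},  # strict mode rejects the str "0"
    {"local_rank": 0, "world_size": 0, "global_rank": 0},    # world_size must be >= 1
)
for bad in bad_inputs:
    try:
        CudaEnvSketch(**bad)
    except ValidationError as exc:
        # Record which field the first reported error refers to.
        failed_fields.append(exc.errors()[0]["loc"])
```

Strict mode matters here because values read from environment variables arrive as strings; without it, "0" would be silently coerced to 0.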
 
- class modalities.config.instantiation_models.InstructionTuningDataInstantiationModel(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - class InstructionDataTransformation(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - class Settings(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- src_path (Annotated[Path, PathType(path_type=file)]) 
- dst_path (Path) 
- messages_key (str) 
- split_config (SplitConfig | None) 
- pbin_creation_config_file_path (Annotated[Path, PathType(path_type=file)] | None) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - split_config: SplitConfig | None
 
 - instruction_data_transformation: InstructionDataTransformation
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.Intervals(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- training_log_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- checkpointing_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- evaluation_interval_in_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.PackedDatasetComponentsInstantiationModel(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- tokenizer (Annotated[TokenizerWrapper, PydanticThirdPartyTypeIF])
- settings (PackedDatasetSettings) 
 
 - class PackedDatasetSettings(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- src_path (Annotated[Path, PathType(path_type=file)]) 
- dst_path (Path | None) 
- index_path (Annotated[Path, PathType(path_type=file)] | None) 
- jq_pattern (str) 
- num_cpus (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- eod_token (str) 
- processing_batch_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- raw_samples_queue_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- processed_samples_queue_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - settings: PackedDatasetSettings
 - tokenizer: Annotated[TokenizerWrapper]
 
- class modalities.config.instantiation_models.SplitConfig(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.Splitting(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.StepProfile(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- gradient_accumulation_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- local_train_micro_batch_size (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- sequence_length (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
- dp_degree (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.TextGenerationInstantiationModel(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- text_inference_component (Annotated[TextInferenceComponent, PydanticThirdPartyTypeIF])
- settings (TextGenerationSettings) 
 
 - class TextGenerationSettings(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - settings: TextGenerationSettings
 - text_inference_component: Annotated[TextInferenceComponent]
 
- class modalities.config.instantiation_models.TrainingComponentsInstantiationModel(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- settings (Settings)
- app_state (Annotated[AppState, PydanticThirdPartyTypeIF])
- loss_fn (Annotated[Loss, PydanticThirdPartyTypeIF])
- train_dataset (Annotated[Dataset, PydanticThirdPartyTypeIF])
- train_dataloader (Annotated[LLMDataLoader, PydanticThirdPartyTypeIF])
- eval_dataloaders (list[Annotated[LLMDataLoader, PydanticThirdPartyTypeIF]])
- progress_subscriber (Annotated[MessageSubscriberIF, PydanticThirdPartyTypeIF])
- evaluation_subscriber (Annotated[MessageSubscriberIF, PydanticThirdPartyTypeIF])
- checkpoint_saving (Annotated[CheckpointSaving, PydanticThirdPartyTypeIF])
- gradient_clipper (Annotated[GradientClipperIF, PydanticThirdPartyTypeIF])
- mfu_calculator (Annotated[MFUCalculatorABC, PydanticThirdPartyTypeIF] | None)
- model_raw (Annotated[Module, PydanticThirdPartyTypeIF])
 
 - class Settings(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- experiment_id (str) 
- config_file_path (Annotated[Path, PathType(path_type=file)]) 
- cuda_env (CudaEnvSettings) 
- paths (Paths) 
- intervals (Intervals) 
- consistency_enforcement (ConsistencyEnforcement) 
- step_profile (StepProfile) 
- training_target (TrainingTarget) 
- training_progress (TrainingProgress) 
- warmstart_checkpoint_paths (WarmstartCheckpointPaths | DCPWarmstartCheckpointPaths | None) 
 
 - class DCPWarmstartCheckpointPaths(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- checkpoint_folder_path (Path) 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - class Paths(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {'extra': 'allow'}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - class WarmstartCheckpointPaths(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
 - consistency_enforcement: ConsistencyEnforcement
 - cuda_env: CudaEnvSettings
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - step_profile: StepProfile
 - training_progress: TrainingProgress
 - training_target: TrainingTarget
 - warmstart_checkpoint_paths: Union[WarmstartCheckpointPaths, DCPWarmstartCheckpointPaths, None]
 
 - checkpoint_saving: Annotated[CheckpointSaving]
 - eval_dataloaders: list[Annotated[LLMDataLoader]]
 - evaluation_subscriber: Annotated[MessageSubscriberIF]
 - gradient_clipper: Annotated[GradientClipperIF]
 - mfu_calculator: Optional[Annotated[MFUCalculatorABC]]
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 - progress_subscriber: Annotated[MessageSubscriberIF]
 - train_dataloader: Annotated[LLMDataLoader]
 
- class modalities.config.instantiation_models.TrainingProgress(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
- Parameters:
- global_num_seen_tokens (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- num_seen_steps (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- num_seen_samples (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=0)])]) 
- last_step (Annotated[int, FieldInfo(annotation=NoneType, required=True, metadata=[Strict(strict=True), Ge(ge=-1)])]) 
 
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict]. 
 
- class modalities.config.instantiation_models.TrainingReportGenerator(training_target, intervals, step_profile, cuda_env, consistency_enforcement, train_dataset, training_progress)[source]
- Bases: object
- Parameters:
- training_target (TrainingTarget) 
- intervals (Intervals) 
- step_profile (StepProfile) 
- cuda_env (CudaEnvSettings) 
- consistency_enforcement (ConsistencyEnforcement) 
- train_dataset (Dataset) 
- training_progress (TrainingProgress) 
 
 
- class modalities.config.instantiation_models.TrainingTarget(**data)[source]
- Bases: BaseModel
- Create a new model by parsing and validating input data from keyword arguments.
- Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
- self is explicitly positional-only to allow self as a field name.
 - model_config: ClassVar[ConfigDict] = {}
- Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].