Welcome to Modalities’ documentation!
We propose a novel training framework for multimodal large language models (LLMs) that prioritizes code readability and efficiency. The codebase adheres to clean-code principles, keeping the number of lines of code (LoC) small while remaining extensible. A single, comprehensive configuration file lets users customize model and training parameters.
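To make the single-configuration-file idea concrete, here is a minimal sketch of how one nested configuration could be turned into typed settings objects. It uses only the Python standard library. The schema (`ModelSettings`, `TrainingSettings`, `ExperimentConfig`) and the helper `load_experiment_config` are hypothetical names chosen for illustration, not the framework's actual configuration classes, which are listed under `modalities.config` in the API section.

```python
from dataclasses import dataclass, fields


# Hypothetical schema for illustration only; not the framework's real config classes.
@dataclass(frozen=True)
class ModelSettings:
    n_layer: int
    n_embd: int


@dataclass(frozen=True)
class TrainingSettings:
    learning_rate: float
    num_target_steps: int


@dataclass(frozen=True)
class ExperimentConfig:
    model: ModelSettings
    training: TrainingSettings


def _build(cls, raw: dict):
    """Instantiate a dataclass from a dict, rejecting unknown or missing keys."""
    expected = {f.name for f in fields(cls)}
    if set(raw) != expected:
        raise ValueError(
            f"{cls.__name__}: expected keys {sorted(expected)}, got {sorted(raw)}"
        )
    return cls(**raw)


def load_experiment_config(raw: dict) -> ExperimentConfig:
    """Turn one nested config dict (e.g. parsed from a file) into typed settings."""
    return ExperimentConfig(
        model=_build(ModelSettings, raw["model"]),
        training=_build(TrainingSettings, raw["training"]),
    )


# The dict stands in for the parsed content of a single configuration file.
raw_config = {
    "model": {"n_layer": 12, "n_embd": 768},
    "training": {"learning_rate": 3e-4, "num_target_steps": 1000},
}
config = load_experiment_config(raw_config)
print(config.model.n_layer, config.training.learning_rate)
```

Validating the whole configuration up front means a misspelled or missing key fails before any training resources are allocated.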
A key design choice is a PyTorch-native training loop integrated with Fully Sharded Data Parallel (FSDP). FSDP shards model parameters, gradients, and optimizer states across data-parallel workers, which reduces per-GPU memory usage and improves scalability for large multimodal models. Relying on PyTorch’s native capabilities also simplifies development and eases maintenance.
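The memory benefit of FSDP can be estimated with simple arithmetic. The sketch below assumes mixed-precision training with Adam, for which a common rule of thumb is 16 bytes of model state per parameter (2 for half-precision weights, 2 for half-precision gradients, and 12 for fp32 master weights, momentum, and variance). It also assumes full sharding across all GPUs. The function name and the 7-billion-parameter example are illustrative. Activations, buffers, and the parameters that FSDP temporarily gathers during forward and backward passes are ignored, so real usage will be higher.

```python
def model_state_gb_per_gpu(
    num_params: int,
    num_gpus: int,
    bytes_per_param: int = 16,
    fully_sharded: bool = True,
) -> float:
    """Estimate model-state memory per GPU in GB (1 GB = 1e9 bytes).

    bytes_per_param=16 is an assumption for mixed-precision Adam:
    2 (weights) + 2 (gradients) + 12 (fp32 weights, momentum, variance).
    """
    if num_params <= 0 or num_gpus <= 0:
        raise ValueError("num_params and num_gpus must be positive")
    total_bytes = num_params * bytes_per_param
    # Without sharding every GPU holds a full replica of the model states;
    # with full sharding each GPU holds a 1/num_gpus slice.
    per_gpu_bytes = total_bytes / num_gpus if fully_sharded else total_bytes
    return per_gpu_bytes / 1e9


replicated = model_state_gb_per_gpu(7_000_000_000, num_gpus=8, fully_sharded=False)
sharded = model_state_gb_per_gpu(7_000_000_000, num_gpus=8, fully_sharded=True)
print(f"replicated:    {replicated:.1f} GB per GPU")  # 112.0 GB
print(f"fully sharded: {sharded:.1f} GB per GPU")     # 14.0 GB
```

Under these assumptions, a model whose states would not fit on a single accelerator when replicated drops to a per-GPU footprint that shrinks linearly with the number of data-parallel workers.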
The framework’s modular design facilitates experimentation with different multimodal architectures and training strategies. Users can integrate their own datasets and model components to explore a range of multimodal learning tasks. Together, clean code, minimal configuration, and PyTorch-native training with FSDP provide a user-friendly and efficient platform for developing state-of-the-art multimodal language models.
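One common way to achieve this kind of modularity is a component registry that maps a pair of keys to a builder, so that swapping an architecture only requires changing a key in the configuration. The sketch below is a generic illustration of that pattern. `ComponentRegistry`, `ToyTextModel`, and `ToyVisionTextModel` are hypothetical names and do not describe the actual interface of the `modalities.registry` package.

```python
from typing import Callable, Dict, Tuple

Builder = Callable[..., object]


class ComponentRegistry:
    """Hypothetical registry mapping (component_key, variant_key) to a builder."""

    def __init__(self) -> None:
        self._builders: Dict[Tuple[str, str], Builder] = {}

    def register(self, component_key: str, variant_key: str, builder: Builder) -> None:
        key = (component_key, variant_key)
        if key in self._builders:
            raise ValueError(f"Component {key} is already registered")
        self._builders[key] = builder

    def build(self, component_key: str, variant_key: str, **config) -> object:
        key = (component_key, variant_key)
        if key not in self._builders:
            raise KeyError(f"Unknown component {key}")
        # The keyword arguments play the role of the component's config section.
        return self._builders[key](**config)


# Toy stand-ins for real model classes.
class ToyTextModel:
    def __init__(self, n_layer: int) -> None:
        self.n_layer = n_layer


class ToyVisionTextModel:
    def __init__(self, n_layer: int, patch_size: int) -> None:
        self.n_layer = n_layer
        self.patch_size = patch_size


registry = ComponentRegistry()
registry.register("model", "toy_text", ToyTextModel)
registry.register("model", "toy_vision_text", ToyVisionTextModel)

# Switching architectures only changes the variant key and its parameters.
model = registry.build("model", "toy_vision_text", n_layer=4, patch_size=16)
print(type(model).__name__)
```

A user-defined dataset or model would be added the same way: register a builder under a new key, then reference that key from the configuration.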
Note
This project is under active development.
Getting Started
Entrypoints
VSCode Setup
Future Work
API
- modalities
- modalities package
- Subpackages
- modalities.checkpointing package
- Subpackages
- modalities.checkpointing.fsdp package
- modalities.checkpointing.stateful package
- modalities.checkpointing.torch package
- Submodules
- modalities.checkpointing.checkpoint_conversion module
- modalities.checkpointing.checkpoint_loading module
- modalities.checkpointing.checkpoint_saving module
- modalities.checkpointing.checkpoint_saving_execution module
- modalities.checkpointing.checkpoint_saving_instruction module
- modalities.checkpointing.checkpoint_saving_strategies module
- Module contents
- Subpackages
- modalities.config package
- Submodules
- modalities.config.component_factory module
- modalities.config.config module
ActivationCheckpointedModelConfig, ActivationCheckpointedModelConfig.FullACParams, ActivationCheckpointedModelConfig.SelectiveLayerACParams, ActivationCheckpointedModelConfig.SelectiveOpACParams, ActivationCheckpointedModelConfig.ac_fun_params, ActivationCheckpointedModelConfig.ac_variant, ActivationCheckpointedModelConfig.layers_fqn, ActivationCheckpointedModelConfig.model, ActivationCheckpointedModelConfig.model_config
AdamOptimizerConfig, AdamWOptimizerConfig, BatchSamplerConfig, CLMCrossEntropyLossConfig, CheckpointSavingConfig, CombinedDatasetConfig, CompiledModelConfig, ConstantLRSchedulerConfig, CosineAnnealingLRSchedulerConfig, DCPAppStateConfig, DCPCheckpointLoadingConfig, DCPCheckpointSavingConfig, DebuggingEnrichedModelConfig, DistributedSamplerConfig, DummyLRSchedulerConfig, DummyProgressSubscriberConfig, DummyResultSubscriberConfig, EvaluationResultToDiscSubscriberConfig, FSDP1ActivationCheckpointedModelConfig, FSDP1CheckpointLoadingConfig, FSDP1CheckpointLoadingConfig.block_names, FSDP1CheckpointLoadingConfig.global_rank, FSDP1CheckpointLoadingConfig.mixed_precision_settings, FSDP1CheckpointLoadingConfig.model_config, FSDP1CheckpointLoadingConfig.parse_mixed_precision_setting_by_name(), FSDP1CheckpointLoadingConfig.parse_sharding_strategy_by_name(), FSDP1CheckpointLoadingConfig.sharding_strategy
FSDP1CheckpointSavingConfig, FSDP1CheckpointedModelConfig, FSDP1CheckpointedOptimizerConfig, FSDP2WrappedModelConfig, FSDP2WrappedModelConfig.block_names, FSDP2WrappedModelConfig.device_mesh, FSDP2WrappedModelConfig.mixed_precision_settings, FSDP2WrappedModelConfig.model, FSDP2WrappedModelConfig.model_config, FSDP2WrappedModelConfig.reshard_after_forward, FSDP2WrappedModelConfig.validate_dp_mesh_existence(), FSDP2WrappedModelConfig.validate_mixed_precision_settings()
FSDPWrappedModelConfig, FSDPWrappedModelConfig.block_names, FSDPWrappedModelConfig.mixed_precision_settings, FSDPWrappedModelConfig.model, FSDPWrappedModelConfig.model_config, FSDPWrappedModelConfig.parse_mixed_precision_setting_by_name(), FSDPWrappedModelConfig.parse_sharding_strategy_by_name(), FSDPWrappedModelConfig.sharding_strategy, FSDPWrappedModelConfig.sync_module_states
GPT2LLMCollateFnConfig, GPT2MFUCalculatorConfig, GPT2ModelTPConfig, LLMDataLoaderConfig, LinearLRSchedulerConfig, MemMapDatasetConfig, OneCycleLRSchedulerConfig, OneCycleLRSchedulerConfig.anneal_strategy, OneCycleLRSchedulerConfig.base_momentum, OneCycleLRSchedulerConfig.check_totals_steps_and_epchs(), OneCycleLRSchedulerConfig.cycle_momentum, OneCycleLRSchedulerConfig.div_factor, OneCycleLRSchedulerConfig.epochs, OneCycleLRSchedulerConfig.final_div_factor, OneCycleLRSchedulerConfig.last_epoch, OneCycleLRSchedulerConfig.max_lr, OneCycleLRSchedulerConfig.max_momentum, OneCycleLRSchedulerConfig.model_config, OneCycleLRSchedulerConfig.optimizer, OneCycleLRSchedulerConfig.pct_start, OneCycleLRSchedulerConfig.steps_per_epoch, OneCycleLRSchedulerConfig.three_phase, OneCycleLRSchedulerConfig.total_steps
PackedMemMapDatasetContinuousConfig, PackedMemMapDatasetMegatronConfig, ParallelDegreeConfig, PassType, PreTrainedHFTokenizerConfig, PreTrainedSPTokenizerConfig, PrecisionEnum, ProcessGroupBackendType, RawAppStateConfig, ReferenceConfig, ResumableDistributedSamplerConfig, ResumableDistributedSamplerConfig.dataset, ResumableDistributedSamplerConfig.drop_last, ResumableDistributedSamplerConfig.epoch, ResumableDistributedSamplerConfig.model_config, ResumableDistributedSamplerConfig.num_replicas, ResumableDistributedSamplerConfig.rank, ResumableDistributedSamplerConfig.seed, ResumableDistributedSamplerConfig.shuffle, ResumableDistributedSamplerConfig.skip_num_global_samples
RichProgressSubscriberConfig, RichResultSubscriberConfig, SaveEveryKStepsCheckpointingStrategyConfig, SaveKMostRecentCheckpointsStrategyConfig, SequentialSamplerConfig, StepLRSchedulerConfig, TokenizerTypes, TorchCheckpointLoadingConfig, WandBEvaluationResultSubscriberConfig, WandBEvaluationResultSubscriberConfig.config_file_path, WandBEvaluationResultSubscriberConfig.directory, WandBEvaluationResultSubscriberConfig.experiment_id, WandBEvaluationResultSubscriberConfig.global_rank, WandBEvaluationResultSubscriberConfig.mode, WandBEvaluationResultSubscriberConfig.model_config, WandBEvaluationResultSubscriberConfig.project
WandbMode, WeightInitializedModelConfig, load_app_config_dict()
- modalities.config.instantiation_models module
ConsistencyEnforcement, CudaEnvSettings, InstructionTuningDataInstantiationModel, InstructionTuningDataInstantiationModel.InstructionDataTransformation, InstructionTuningDataInstantiationModel.Settings, InstructionTuningDataInstantiationModel.Settings.dst_path, InstructionTuningDataInstantiationModel.Settings.messages_key, InstructionTuningDataInstantiationModel.Settings.model_config, InstructionTuningDataInstantiationModel.Settings.pbin_creation_config_file_path, InstructionTuningDataInstantiationModel.Settings.split_config, InstructionTuningDataInstantiationModel.Settings.src_path
InstructionTuningDataInstantiationModel.chat_template_data, InstructionTuningDataInstantiationModel.instruction_data_transformation, InstructionTuningDataInstantiationModel.jinja2_chat_template, InstructionTuningDataInstantiationModel.model_config, InstructionTuningDataInstantiationModel.settings
Intervals, PackedDatasetComponentsInstantiationModel, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.dst_path, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.eod_token, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.index_path, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.jq_pattern, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.model_config, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.num_cpus, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.processed_samples_queue_size, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.processing_batch_size, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.raw_samples_queue_size, PackedDatasetComponentsInstantiationModel.PackedDatasetSettings.src_path
PackedDatasetComponentsInstantiationModel.model_config, PackedDatasetComponentsInstantiationModel.settings, PackedDatasetComponentsInstantiationModel.tokenizer
SplitConfig, Splitting, StepProfile, TextGenerationInstantiationModel, TextGenerationInstantiationModel.TextGenerationSettings, TextGenerationInstantiationModel.TextGenerationSettings.device, TextGenerationInstantiationModel.TextGenerationSettings.model_config, TextGenerationInstantiationModel.TextGenerationSettings.model_path, TextGenerationInstantiationModel.TextGenerationSettings.parse_device(), TextGenerationInstantiationModel.TextGenerationSettings.referencing_keys, TextGenerationInstantiationModel.TextGenerationSettings.sequence_length
TextGenerationInstantiationModel.model_config, TextGenerationInstantiationModel.settings, TextGenerationInstantiationModel.text_inference_component
TrainingComponentsInstantiationModel, TrainingComponentsInstantiationModel.Settings, TrainingComponentsInstantiationModel.Settings.DCPWarmstartCheckpointPaths, TrainingComponentsInstantiationModel.Settings.Paths, TrainingComponentsInstantiationModel.Settings.WarmstartCheckpointPaths, TrainingComponentsInstantiationModel.Settings.config_file_path, TrainingComponentsInstantiationModel.Settings.consistency_enforcement, TrainingComponentsInstantiationModel.Settings.cuda_env, TrainingComponentsInstantiationModel.Settings.debugging, TrainingComponentsInstantiationModel.Settings.experiment_id, TrainingComponentsInstantiationModel.Settings.intervals, TrainingComponentsInstantiationModel.Settings.model_config, TrainingComponentsInstantiationModel.Settings.paths, TrainingComponentsInstantiationModel.Settings.referencing_keys, TrainingComponentsInstantiationModel.Settings.step_profile, TrainingComponentsInstantiationModel.Settings.training_progress, TrainingComponentsInstantiationModel.Settings.training_target, TrainingComponentsInstantiationModel.Settings.warmstart_checkpoint_paths
TrainingComponentsInstantiationModel.app_state, TrainingComponentsInstantiationModel.checkpoint_saving, TrainingComponentsInstantiationModel.device_mesh, TrainingComponentsInstantiationModel.eval_dataloaders, TrainingComponentsInstantiationModel.evaluation_subscriber, TrainingComponentsInstantiationModel.gradient_clipper, TrainingComponentsInstantiationModel.loss_fn, TrainingComponentsInstantiationModel.mfu_calculator, TrainingComponentsInstantiationModel.model_config, TrainingComponentsInstantiationModel.model_raw, TrainingComponentsInstantiationModel.progress_subscriber, TrainingComponentsInstantiationModel.scheduled_pipeline, TrainingComponentsInstantiationModel.settings, TrainingComponentsInstantiationModel.train_dataloader, TrainingComponentsInstantiationModel.train_dataset
TrainingProgress, TrainingReportGenerator, TrainingTarget
- modalities.config.lookup_enum module
- modalities.config.pydantic_if_types module
- modalities.config.utils module
- Module contents
- modalities.conversion package
- Subpackages
- modalities.conversion.gpt2 package
- Submodules
- modalities.conversion.gpt2.configuration_gpt2 module
- modalities.conversion.gpt2.conversion_code module
- modalities.conversion.gpt2.conversion_model module
- modalities.conversion.gpt2.conversion_tokenizer module
- modalities.conversion.gpt2.convert_gpt2 module
- modalities.conversion.gpt2.modeling_gpt2 module
- Module contents
- modalities.conversion.gpt2 package
- Module contents
- Subpackages
- modalities.dataloader package
- Subpackages
- Submodules
- modalities.dataloader.apply_chat_template module
- modalities.dataloader.create_index module
- modalities.dataloader.create_instruction_tuning_data module
- modalities.dataloader.create_packed_data module
- modalities.dataloader.dataloader module
- modalities.dataloader.dataloader_factory module
- modalities.dataloader.dataset module
CombinedDataset, Dataset, DummyDataset, DummyDatasetConfig, DummySampleConfig, DummySampleDataType, MemMapDataset, PackedMemMapDatasetBase, PackedMemMapDatasetBase.DATA_SECTION_LENGTH_IN_BYTES, PackedMemMapDatasetBase.HEADER_SIZE_IN_BYTES, PackedMemMapDatasetBase.TOKEN_SIZE_DESCRIPTOR_LENGTH_IN_BYTES, PackedMemMapDatasetBase.np_dtype_of_tokens_on_disk_from_bytes, PackedMemMapDatasetBase.token_size_in_bytes, PackedMemMapDatasetBase.type_converter_for_torch
PackedMemMapDatasetContinuous, PackedMemMapDatasetMegatron
- modalities.dataloader.dataset_factory module
- modalities.dataloader.filter_packed_data module
- modalities.dataloader.large_file_lines_reader module
- modalities.dataloader.sampler_factory module
ResumableDistributedMultiDimSamplerConfig, ResumableDistributedMultiDimSamplerConfig.data_parallel_key, ResumableDistributedMultiDimSamplerConfig.dataset, ResumableDistributedMultiDimSamplerConfig.device_mesh, ResumableDistributedMultiDimSamplerConfig.drop_last, ResumableDistributedMultiDimSamplerConfig.epoch, ResumableDistributedMultiDimSamplerConfig.model_config, ResumableDistributedMultiDimSamplerConfig.seed, ResumableDistributedMultiDimSamplerConfig.shuffle, ResumableDistributedMultiDimSamplerConfig.skip_num_global_samples
SamplerFactory
- modalities.dataloader.samplers module
- Module contents
- modalities.inference package
- Subpackages
- modalities.inference.text package
- Submodules
- modalities.inference.text.config module
TextInferenceComponentConfig, TextInferenceComponentConfig.device, TextInferenceComponentConfig.eod_token, TextInferenceComponentConfig.model, TextInferenceComponentConfig.model_config, TextInferenceComponentConfig.parse_device(), TextInferenceComponentConfig.prompt_template, TextInferenceComponentConfig.sequence_length, TextInferenceComponentConfig.temperature, TextInferenceComponentConfig.tokenizer
- modalities.inference.text.inference_component module
- Module contents
- modalities.inference.text package
- Submodules
- modalities.inference.inference module
- Module contents
- Subpackages
- modalities.logging_broker package
- Subpackages
- modalities.logging_broker.subscriber_impl package
- Submodules
- modalities.logging_broker.subscriber_impl.progress_subscriber module
- modalities.logging_broker.subscriber_impl.results_subscriber module
- modalities.logging_broker.subscriber_impl.subscriber_factory module
- Module contents
- modalities.logging_broker.subscriber_impl package
- Submodules
- modalities.logging_broker.message_broker module
- modalities.logging_broker.messages module
- modalities.logging_broker.publisher module
- modalities.logging_broker.subscriber module
- Module contents
- Subpackages
- modalities.models package
- Subpackages
- modalities.models.coca package
- Submodules
- modalities.models.coca.attention_pooling module
- modalities.models.coca.coca_model module
CoCa, CoCaConfig, CoCaConfig.bias_attn_pool, CoCaConfig.epsilon_attn_pool, CoCaConfig.model_config, CoCaConfig.n_pool_head, CoCaConfig.n_vision_queries, CoCaConfig.prediction_key, CoCaConfig.text_cls_prediction_key, CoCaConfig.text_decoder_config, CoCaConfig.text_embd_prediction_key, CoCaConfig.vision_cls_prediction_key, CoCaConfig.vision_embd_prediction_key, CoCaConfig.vision_encoder_config
TextDecoderConfig, TextDecoderConfig.activation, TextDecoderConfig.attention_config, TextDecoderConfig.bias, TextDecoderConfig.block_size, TextDecoderConfig.dropout, TextDecoderConfig.epsilon, TextDecoderConfig.ffn_hidden, TextDecoderConfig.model_config, TextDecoderConfig.n_embd, TextDecoderConfig.n_head, TextDecoderConfig.n_layer_multimodal_text, TextDecoderConfig.n_layer_text, TextDecoderConfig.prediction_key, TextDecoderConfig.sample_key, TextDecoderConfig.vocab_size
- modalities.models.coca.collator module
- modalities.models.coca.multi_modal_decoder module
- modalities.models.coca.text_decoder module
- Module contents
- modalities.models.components package
- modalities.models.gpt2 package
- Submodules
- modalities.models.gpt2.collator module
- modalities.models.gpt2.gpt2_model module
AttentionConfig, AttentionConfig.QueryKeyValueTransformConfig, AttentionConfig.QueryKeyValueTransformConfig.IdentityTransformConfig, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig.base_freq, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig.model_config, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig.n_embd, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig.n_head, AttentionConfig.QueryKeyValueTransformConfig.RotaryTransformConfig.seq_length_dim
AttentionConfig.QueryKeyValueTransformConfig.config, AttentionConfig.QueryKeyValueTransformConfig.model_config, AttentionConfig.QueryKeyValueTransformConfig.parse_sharding_strategy_by_name(), AttentionConfig.QueryKeyValueTransformConfig.type_hint
AttentionConfig.model_config, AttentionConfig.qk_norm_config, AttentionConfig.qkv_transforms
AttentionImplementation, CausalSelfAttention, GPT2Block, GPT2LLM, GPT2LLMConfig, GPT2LLMConfig.activation_type, GPT2LLMConfig.attention_config, GPT2LLMConfig.attention_implementation, GPT2LLMConfig.attention_norm_config, GPT2LLMConfig.bias, GPT2LLMConfig.check_divisibility(), GPT2LLMConfig.dropout, GPT2LLMConfig.enforce_swiglu_hidden_dim_multiple_of, GPT2LLMConfig.ffn_hidden, GPT2LLMConfig.ffn_norm_config, GPT2LLMConfig.lm_head_norm_config, GPT2LLMConfig.model_config, GPT2LLMConfig.n_embd, GPT2LLMConfig.n_head_kv, GPT2LLMConfig.n_head_q, GPT2LLMConfig.n_layer, GPT2LLMConfig.poe_type, GPT2LLMConfig.prediction_key, GPT2LLMConfig.sample_key, GPT2LLMConfig.seed, GPT2LLMConfig.sequence_length, GPT2LLMConfig.use_meta_device, GPT2LLMConfig.use_weight_tying, GPT2LLMConfig.validate_sizes(), GPT2LLMConfig.vocab_size
IdentityTransform, LayerNormWrapperConfig, LayerNorms, PositionTypes, QueryKeyValueTransform, QueryKeyValueTransformType, RotaryTransform, TransformerMLP, manual_scaled_dot_product_attention()
- Module contents
- modalities.models.huggingface package
- Submodules
- modalities.models.huggingface.huggingface_model module
HuggingFaceModelTypes, HuggingFacePretrainedModel, HuggingFacePretrainedModelConfig, HuggingFacePretrainedModelConfig.huggingface_prediction_subscription_key, HuggingFacePretrainedModelConfig.kwargs, HuggingFacePretrainedModelConfig.model_args, HuggingFacePretrainedModelConfig.model_config, HuggingFacePretrainedModelConfig.model_name, HuggingFacePretrainedModelConfig.model_type, HuggingFacePretrainedModelConfig.prediction_key, HuggingFacePretrainedModelConfig.sample_key
- Module contents
- modalities.models.huggingface_adapters package
- modalities.models.parallelism package
- Submodules
- modalities.models.parallelism.pipeline_parallelism module
- modalities.models.parallelism.pipeline_parallelism_configs module
- modalities.models.parallelism.stages_generator module
- modalities.models.parallelism.stages_generator_configs module
- Module contents
- modalities.models.vision_transformer package
- Submodules
- modalities.models.vision_transformer.vision_transformer_model module
ImagePatchEmbedding, VisionTransformer, VisionTransformerBlock, VisionTransformerConfig, VisionTransformerConfig.add_cls_token, VisionTransformerConfig.attention_config, VisionTransformerConfig.bias, VisionTransformerConfig.dropout, VisionTransformerConfig.img_size, VisionTransformerConfig.model_config, VisionTransformerConfig.n_classes, VisionTransformerConfig.n_embd, VisionTransformerConfig.n_head, VisionTransformerConfig.n_img_channels, VisionTransformerConfig.n_layer, VisionTransformerConfig.patch_size, VisionTransformerConfig.patch_stride, VisionTransformerConfig.prediction_key, VisionTransformerConfig.sample_key
- Module contents
- modalities.models.coca package
- Submodules
- modalities.models.model module
- modalities.models.model_factory module
GPT2ModelFactory, ModelFactory, ModelFactory.get_activation_checkpointed_fsdp1_model_(), ModelFactory.get_activation_checkpointed_fsdp2_model_(), ModelFactory.get_compiled_model(), ModelFactory.get_debugging_enriched_model(), ModelFactory.get_fsdp1_checkpointed_model(), ModelFactory.get_fsdp1_wrapped_model(), ModelFactory.get_fsdp2_wrapped_model(), ModelFactory.get_weight_initialized_model()
- modalities.models.utils module
- Module contents
- Subpackages
- modalities.nn package
- Subpackages
- modalities.nn.model_initialization package
- Submodules
- modalities.nn.model_initialization.composed_initialization module
ComposedInitializationRoutines, ComposedModelInitializationConfig, ComposedModelInitializationConfig.hidden_dim, ComposedModelInitializationConfig.mean, ComposedModelInitializationConfig.model_config, ComposedModelInitializationConfig.model_type, ComposedModelInitializationConfig.num_layers, ComposedModelInitializationConfig.std, ComposedModelInitializationConfig.weight_init_type
ModelInitializerWrapper, ModelInitializerWrapperConfig
- modalities.nn.model_initialization.initialization_if module
- modalities.nn.model_initialization.initialization_routines module
- modalities.nn.model_initialization.parameter_name_filters module
- Module contents
- modalities.nn.model_initialization package
- Submodules
- modalities.nn.attention module
- modalities.nn.mlp module
- Module contents
- Subpackages
- modalities.optimizers package
- modalities.preprocessing package
- modalities.registry package
- modalities.running_env package
- Subpackages
- modalities.running_env.fsdp package
- Submodules
- modalities.running_env.fsdp.device_mesh module
DeviceMeshConfig, DeviceMeshConfig.context_parallel_degree, DeviceMeshConfig.data_parallel_replicate_degree, DeviceMeshConfig.data_parallel_shard_degree, DeviceMeshConfig.device_type, DeviceMeshConfig.enable_loss_parallel, DeviceMeshConfig.model_config, DeviceMeshConfig.pipeline_parallel_degree, DeviceMeshConfig.tensor_parallel_degree, DeviceMeshConfig.world_size
ParallelismDegrees, get_device_mesh(), get_mesh_for_parallelism_method(), get_parallel_degree(), get_parallel_rank(), has_parallelism_method()
- modalities.running_env.fsdp.fsdp_auto_wrapper module
- modalities.running_env.fsdp.reducer module
- Module contents
- modalities.running_env.fsdp package
- Submodules
- modalities.running_env.cuda_env module
- modalities.running_env.env_utils module
- Module contents
- Subpackages
- modalities.tokenization package
- modalities.training package
- Subpackages
- modalities.training.activation_checkpointing package
- modalities.training.gradient_clipping package
- Submodules
- modalities.training.gradient_clipping.fsdp_gradient_clipper module
- modalities.training.gradient_clipping.fsdp_gradient_clipper_config module
- modalities.training.gradient_clipping.gradient_clipper module
- Module contents
- Submodules
- modalities.training.training_progress module
TrainingProgress, TrainingProgress.num_seen_steps_current_run, TrainingProgress.num_seen_steps_previous_run, TrainingProgress.num_seen_steps_total, TrainingProgress.num_seen_tokens_current_run, TrainingProgress.num_seen_tokens_previous_run, TrainingProgress.num_seen_tokens_total, TrainingProgress.num_target_steps, TrainingProgress.num_target_tokens
- Module contents
- Subpackages
- modalities.utils package
- Subpackages
- modalities.utils.benchmarking package
- modalities.utils.profilers package
- Submodules
- modalities.utils.profilers.batch_generator module
- modalities.utils.profilers.modalities_profiler module
- modalities.utils.profilers.steppable_component_configs module
- modalities.utils.profilers.steppable_components module
- modalities.utils.profilers.steppable_components_if module
- Module contents
- Submodules
- modalities.utils.communication_test module
- modalities.utils.debug module
- modalities.utils.debug_components module
- modalities.utils.debugging_configs module
- modalities.utils.file_ops module
- modalities.utils.logger_utils module
- modalities.utils.mfu module
- modalities.utils.number_conversion module
LocalNumBatchesFromNumSamplesConfig, LocalNumBatchesFromNumTokensConfig, NumSamplesFromNumTokensConfig, NumStepsFromNumSamplesConfig, NumStepsFromNumTokensConfig, NumStepsFromRawDatasetIndexConfig, NumTokensFromNumStepsConfig, NumTokensFromPackedMemMapDatasetContinuousConfig, NumTokensFromPackedMemMapDatasetContinuousConfig.dataset_path, NumTokensFromPackedMemMapDatasetContinuousConfig.dp_degree, NumTokensFromPackedMemMapDatasetContinuousConfig.gradient_accumulation_steps, NumTokensFromPackedMemMapDatasetContinuousConfig.local_micro_batch_size, NumTokensFromPackedMemMapDatasetContinuousConfig.model_config, NumTokensFromPackedMemMapDatasetContinuousConfig.reuse_last_target, NumTokensFromPackedMemMapDatasetContinuousConfig.sample_key, NumTokensFromPackedMemMapDatasetContinuousConfig.sequence_length
NumberConversion, NumberConversion.get_global_num_seen_tokens_from_checkpoint_path(), NumberConversion.get_global_num_target_tokens_from_checkpoint_path(), NumberConversion.get_last_step_from_checkpoint_path(), NumberConversion.get_local_num_batches_from_num_samples(), NumberConversion.get_local_num_batches_from_num_tokens(), NumberConversion.get_num_samples_from_num_tokens(), NumberConversion.get_num_seen_steps_from_checkpoint_path(), NumberConversion.get_num_steps_from_num_samples(), NumberConversion.get_num_steps_from_num_tokens(), NumberConversion.get_num_steps_from_raw_dataset_index(), NumberConversion.get_num_target_steps_from_checkpoint_path(), NumberConversion.get_num_tokens_from_num_steps(), NumberConversion.get_num_tokens_from_packed_mem_map_dataset_continuous()
NumberConversionFromCheckpointPathConfig
- modalities.utils.seeding module
- modalities.utils.typing_utils module
- modalities.utils.verify_tokenization_consistency module
- Module contents
- Subpackages
- modalities.checkpointing package
- Submodules
- modalities.api module
FileExistencePolicy, convert_pytorch_to_hf_checkpoint(), create_filtered_tokenized_dataset(), create_raw_data_index(), create_shuffled_dataset_chunk(), create_shuffled_jsonl_dataset_chunk(), enforce_file_existence_policy(), generate_text(), merge_packed_data_files(), pack_encoded_data(), shuffle_jsonl_data(), shuffle_tokenized_data()
- modalities.batch module
- modalities.evaluator module
- modalities.exceptions module
- modalities.gym module
- modalities.loss_functions module
- modalities.main module
- modalities.trainer module
- modalities.util module
Aggregator, TimeRecorder, TimeRecorderStates, format_metrics_to_gb(), get_experiment_id_from_config(), get_local_number_of_trainable_parameters(), get_module_class_from_name(), get_synced_experiment_id_of_run(), get_synced_string(), get_total_number_of_trainable_parameters(), parse_enum_by_name(), print_rank_0(), warn_rank_0()
- Module contents
- Subpackages
- modalities package