modalities.models package
Subpackages
- modalities.models.coca package
- Submodules
- modalities.models.coca.attention_pooling module
- modalities.models.coca.coca_model module
- CoCa
- CoCaConfig: bias_attn_pool, epsilon_attn_pool, model_config, n_pool_head, n_vision_queries, prediction_key, text_cls_prediction_key, text_decoder_config, text_embd_prediction_key, vision_cls_prediction_key, vision_embd_prediction_key, vision_encoder_config
- TextDecoderConfig: activation, attention_config, bias, block_size, dropout, epsilon, ffn_hidden, model_config, n_embd, n_head, n_layer_multimodal_text, n_layer_text, prediction_key, sample_key, vocab_size
- modalities.models.coca.collator module
- modalities.models.coca.multi_modal_decoder module
- modalities.models.coca.text_decoder module
- Module contents
- modalities.models.components package
- modalities.models.gpt2 package
- Submodules
- modalities.models.gpt2.collator module
- modalities.models.gpt2.gpt2_model module
- AttentionConfig
- AttentionImplementation
- CausalSelfAttention
- GPT2Block
- GPT2LLM
- GPT2LLMConfig: activation_type, attention_config, attention_implementation, attention_norm_config, bias, check_divisibility(), dropout, enforce_swiglu_hidden_dim_multiple_of, ffn_hidden, ffn_norm_config, lm_head_norm_config, model_config, n_embd, n_head_kv, n_head_q, n_layer, poe_type, prediction_key, sample_key, seed, sequence_length, use_meta_device, use_weight_tying, validate_sizes(), vocab_size
- IdentityTransform
- LayerNormWrapperConfig
- LayerNorms
- PositionTypes
- QueryKeyValueTransform
- QueryKeyValueTransformType
- RotaryTransform
- TransformerMLP
- manual_scaled_dot_product_attention()
- modalities.models.gpt2.llama3_like_initialization module
- Module contents
- modalities.models.huggingface package
- Submodules
- modalities.models.huggingface.huggingface_model module
- HuggingFaceModelTypes
- HuggingFacePretrainedModel
- HuggingFacePretrainedModelConfig: huggingface_prediction_subscription_key, kwargs, model_args, model_config, model_name, model_type, prediction_key, sample_key
- Module contents
- modalities.models.huggingface_adapters package
- modalities.models.parallelism package
- modalities.models.vision_transformer package
- Submodules
- modalities.models.vision_transformer.vision_transformer_model module
- ImagePatchEmbedding
- VisionTransformer
- VisionTransformerBlock
- VisionTransformerConfig: add_cls_token, attention_config, bias, dropout, img_size, model_config, n_classes, n_embd, n_head, n_img_channels, n_layer, patch_size, patch_stride, prediction_key, sample_key
- Module contents
Submodules
modalities.models.model module
- class modalities.models.model.ActivationType(value)[source]
Enum class representing different activation types.
- Attributes:
GELU (str): GELU activation type.
SWIGLU (str): SWIGLU activation type.
- GELU = 'gelu'
- SWIGLU = 'swiglu'
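The enum above can be reproduced as a minimal, self-contained sketch (the real class lives in modalities.models.model; whether it mixes in any extra base classes is not shown here):

```python
from enum import Enum


class ActivationType(Enum):
    """Sketch of the documented enum: each member carries its lowercase string value."""

    GELU = "gelu"
    SWIGLU = "swiglu"


# Members can be looked up by value, e.g. when parsing a config file:
activation = ActivationType("swiglu")
```

Value-based lookup (`ActivationType("swiglu")`) is what makes string-valued enums convenient for configuration parsing.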
- class modalities.models.model.NNModel(seed=None, weight_decay_groups=None)[source]
Bases: Module
NNModel class to define a base model.
Initializes an NNModel object.
- Args:
seed (int, optional): The seed value for random number generation. Defaults to None.
weight_decay_groups (Optional[WeightDecayGroups], optional): The weight decay groups. Defaults to None.
- abstractmethod forward(inputs)[source]
Forward pass of the model.
- Args:
inputs (dict[str, torch.Tensor]): A dictionary containing input tensors.
- Returns:
dict[str, torch.Tensor]: A dictionary containing output tensors.
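The dict-in/dict-out contract of forward() can be illustrated with a minimal torch.nn.Module subclass. The class name, keys, and layer below are illustrative only, not part of modalities (the real NNModel base also handles seeding and weight-decay groups):

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Illustrative model following the NNModel forward() contract."""

    def __init__(self, sample_key: str = "input_ids", prediction_key: str = "logits"):
        super().__init__()
        self.sample_key = sample_key
        self.prediction_key = prediction_key
        self.proj = nn.Linear(8, 4)

    def forward(self, inputs: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        # Look up the input tensor by its sample key and return the
        # prediction tensor under the configured prediction key.
        return {self.prediction_key: self.proj(inputs[self.sample_key])}


out = TinyModel()({"input_ids": torch.randn(2, 8)})
```

Keying both inputs and outputs by configurable strings is what lets downstream components (losses, collators) wire models together via config rather than code.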
- class modalities.models.model.SwiGLU(n_embd, ffn_hidden, bias, enforce_swiglu_hidden_dim_multiple_of=256)[source]
Bases: Module
SwiGLU class to define the SwiGLU activation function.
Initializes the SwiGLU object.
- Args:
n_embd (int): The number of embedding dimensions.
ffn_hidden (int): The number of hidden dimensions in the feed-forward network. Best practice: 4 * n_embd (https://arxiv.org/pdf/1706.03762).
bias (bool): Whether to include bias terms in the linear layers.
enforce_swiglu_hidden_dim_multiple_of (int): The multiple that the hidden dimension must be rounded up to. This is required for FSDP + TP, as the combination does not support uneven sharding (yet). Defaults to 256.
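The effect of enforce_swiglu_hidden_dim_multiple_of can be sketched as a plain round-up-to-multiple computation. The helper name is made up here, and the exact formula used internally may differ:

```python
def round_up_to_multiple(ffn_hidden: int, multiple_of: int = 256) -> int:
    """Round the hidden dimension up to the nearest multiple of `multiple_of`,
    so that FSDP + TP can shard the weight matrices evenly."""
    return multiple_of * ((ffn_hidden + multiple_of - 1) // multiple_of)


rounded = round_up_to_multiple(2730)        # rounds up to 2816
unchanged = round_up_to_multiple(3072)      # already a multiple of 256
```

A value that is already a multiple of 256 passes through unchanged; anything else is bumped up to the next multiple.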
- modalities.models.model.model_predict_batch(model, batch)[source]
Predicts the output for a batch of samples using the given model.
- Return type:
InferenceResultBatch
- Parameters:
model (Module)
batch (DatasetBatch)
- Args:
model (nn.Module): The model used for prediction.
batch (DatasetBatch): The batch of samples to be predicted.
- Returns:
InferenceResultBatch: The batch of inference results containing the predicted targets and predictions.
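In simplified form the helper reduces to a forward call plus repackaging. The SimpleBatch dataclass and the plain-dict return value below are stand-ins for the real DatasetBatch and InferenceResultBatch types:

```python
from dataclasses import dataclass


@dataclass
class SimpleBatch:
    """Stand-in for DatasetBatch: keyed sample and target tensors."""

    samples: dict
    targets: dict


def predict_batch(model, batch: SimpleBatch) -> dict:
    # Sketch of model_predict_batch: run the forward pass on the sample
    # dict and bundle the targets together with the predictions.
    predictions = model(batch.samples)
    return {"targets": batch.targets, "predictions": predictions}


# Usage with a toy callable standing in for an nn.Module:
double = lambda samples: {"out": [2 * x for x in samples["x"]]}
result = predict_batch(double, SimpleBatch(samples={"x": [1, 2]}, targets={"y": [2, 4]}))
```

Bundling targets alongside predictions lets downstream evaluation code compute metrics from a single result object.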