GPTConfig

class tfts.models.gpt.GPTConfig(hidden_size: int = 64, num_layers: int = 2, num_attention_heads: int = 4, ffn_intermediate_size: int = 256, hidden_act: str = 'gelu', hidden_dropout_prob: float = 0.0, attention_probs_dropout_prob: float = 0.0, max_position_embeddings: int = 512, type_vocab_size: int = 2, initializer_range: float = 0.02, layer_norm_eps: float = 1e-12, pad_token_id: int = 0, positional_type: str = 'absolute', use_cache: bool = True, dense_units: Tuple[int] = (512, 1024), classifier_dropout: float | None = None, **kwargs: Dict[str, object])

Bases: BaseConfig

Configuration class for the GPT decoder model. Inherits from BaseConfig.

Parameters:
  • hidden_size – The size of the hidden layers. Default is 64.

  • num_layers – The number of hidden layers in the transformer decoder. Default is 2.

  • num_attention_heads – The number of attention heads in each attention layer. Default is 4.

  • ffn_intermediate_size – The size of the intermediate (feed-forward) layer. Default is 256.

  • hidden_act – The activation function for hidden layers. Default is “gelu”.

  • hidden_dropout_prob – The dropout probability for hidden layers. Default is 0.0.

  • attention_probs_dropout_prob – The dropout probability for attention probabilities. Default is 0.0.

  • max_position_embeddings – The maximum length of the input sequences. Default is 512.

  • type_vocab_size – The vocabulary size for token types (usually 2). Default is 2.

  • initializer_range – The standard deviation for weight initialization. Default is 0.02.

  • layer_norm_eps – The epsilon value for layer normalization. Default is 1e-12.

  • pad_token_id – The ID for the padding token. Default is 0.

  • positional_type – The type of position embedding (“absolute” or “relative”). Default is “absolute”.

  • use_cache – Whether to use the cache during inference. Default is True.

  • dense_units – Tuple of dense-layer unit counts. Default is (512, 1024).

  • classifier_dropout – Dropout probability for the classifier layer. Default is None.

  • **kwargs – Additional keyword arguments passed to the parent BaseConfig class.
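The way these parameters fit together can be sketched without installing the library. The stand-in dataclass below is hypothetical: `GPTConfigSketch` and its `head_dim` helper are not part of tfts and only mirror the defaults from the signature above. The check that `hidden_size` divides evenly across the attention heads is a common multi-head attention convention and is assumed here, not confirmed tfts behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class GPTConfigSketch:
    """Hypothetical stand-in that mirrors the documented GPTConfig defaults."""

    hidden_size: int = 64
    num_layers: int = 2
    num_attention_heads: int = 4
    ffn_intermediate_size: int = 256
    hidden_act: str = "gelu"
    hidden_dropout_prob: float = 0.0
    attention_probs_dropout_prob: float = 0.0
    max_position_embeddings: int = 512
    type_vocab_size: int = 2
    initializer_range: float = 0.02
    layer_norm_eps: float = 1e-12
    pad_token_id: int = 0
    positional_type: str = "absolute"
    use_cache: bool = True
    dense_units: Tuple[int, ...] = (512, 1024)
    classifier_dropout: Optional[float] = None

    def head_dim(self) -> int:
        # Assumed convention: hidden_size is split evenly across the heads.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError(
                "hidden_size must be divisible by num_attention_heads"
            )
        return self.hidden_size // self.num_attention_heads


# Defaults from the signature: 64 hidden units shared by 4 heads.
default_cfg = GPTConfigSketch()
# A wider variant that overrides two fields and keeps the other defaults.
wider_cfg = GPTConfigSketch(hidden_size=128, num_attention_heads=8)

print(default_cfg.head_dim())
print(wider_cfg.head_dim())
```

With tfts installed, the signature above suggests that the equivalent call would be `GPTConfig(hidden_size=128, num_attention_heads=8)`, with all other parameters left at their defaults.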

Methods

from_dict(config_dict)

from_json(json_file)

from_pretrained(pretrained_model_name_or_path)

save_pretrained(save_directory)

to_dict()

to_json(json_file)

update(config_dict)
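The method names listed here suggest a dictionary and JSON round trip. The sketch below shows one way such methods are commonly implemented. `SketchConfig` is a hypothetical stand-in, and the behavior of each method (including whether `from_dict` and `from_json` are class methods) is an assumption for illustration, not the tfts implementation.

```python
import json
import os
import tempfile
from typing import Any, Dict


class SketchConfig:
    """Hypothetical stand-in; method behavior is assumed, not taken from tfts."""

    def __init__(self, hidden_size: int = 64, num_layers: int = 2, **kwargs: Any) -> None:
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        # Assumed: extra keyword arguments are stored as plain attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self) -> Dict[str, Any]:
        # Copy the instance attributes into a plain dictionary.
        return dict(vars(self))

    @classmethod
    def from_dict(cls, config_dict: Dict[str, Any]) -> "SketchConfig":
        return cls(**config_dict)

    def update(self, config_dict: Dict[str, Any]) -> None:
        # Overwrite or add attributes from the given dictionary.
        for key, value in config_dict.items():
            setattr(self, key, value)

    def to_json(self, json_file: str) -> None:
        with open(json_file, "w", encoding="utf-8") as f:
            json.dump(self.to_dict(), f, indent=2)

    @classmethod
    def from_json(cls, json_file: str) -> "SketchConfig":
        with open(json_file, encoding="utf-8") as f:
            return cls.from_dict(json.load(f))


cfg = SketchConfig(hidden_size=128)
cfg.update({"num_layers": 4})

# Write the config to a temporary JSON file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "config.json")
    cfg.to_json(path)
    restored = SketchConfig.from_json(path)

print(restored.to_dict() == cfg.to_dict())
```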

Attributes

attribute_map

model_type