RWKVConfig
- class tfts.models.rwkv.RWKVConfig(num_layers: int = 25, hidden_size: int = 64, dense_hidden_size: int = 32, dropout: float = 0.0, max_position_embeddings: int = 512, initializer_range: float = 0.02, layer_norm_eps: float = 1e-12, pad_token_id: int = 0, **kwargs)
Bases: BaseConfig
Initializes the configuration for the RWKV model with the specified parameters.
- Parameters:
num_layers – The number of stacked RWKV layers.
hidden_size – Size of each attention head.
dense_hidden_size – The size of the dense hidden layer following the RWKV.
dropout – Dropout rate for regularization.
max_position_embeddings – Maximum sequence length for positional embeddings.
initializer_range – Standard deviation for weight initialization.
layer_norm_eps – Epsilon for layer normalization.
pad_token_id – ID for padding token.
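The parameters above can be illustrated with a minimal sketch. To stay runnable without tfts installed, it uses a hypothetical stand-in dataclass, `RWKVConfigSketch`, that copies only the field names and default values from the documented signature. It is not the library class and does not model `**kwargs` or any `BaseConfig` behavior. With tfts installed, the same keyword arguments would presumably be passed to `tfts.models.rwkv.RWKVConfig` instead.

```python
from dataclasses import asdict, dataclass


@dataclass
class RWKVConfigSketch:
    """Hypothetical stand-in mirroring the documented defaults; not the tfts class."""

    num_layers: int = 25
    hidden_size: int = 64
    dense_hidden_size: int = 32
    dropout: float = 0.0
    max_position_embeddings: int = 512
    initializer_range: float = 0.02
    layer_norm_eps: float = 1e-12
    pad_token_id: int = 0


# With no arguments, every field takes the default from the signature.
default_config = RWKVConfigSketch()

# Override only the fields that should differ, e.g. a shallower model with dropout.
small_config = RWKVConfigSketch(num_layers=4, hidden_size=32, dropout=0.1)

# Collect the fields whose values differ from the defaults.
defaults = asdict(default_config)
overrides = {
    name: value
    for name, value in asdict(small_config).items()
    if value != defaults[name]
}
print(overrides)
```

Passing only the fields that change keeps an experiment's configuration short and makes the differences from the documented defaults easy to see.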
Methods
from_dict(config_dict)
from_json(json_file)
from_pretrained(pretrained_model_name_or_path)
save_pretrained(save_directory)
to_dict()
to_json(json_file)
update(config_dict)
Attributes
attribute_map
model_type