RWKVConfig

class tfts.models.rwkv.RWKVConfig(num_layers: int = 25, hidden_size: int = 64, dense_hidden_size: int = 32, dropout: float = 0.0, max_position_embeddings: int = 512, initializer_range: float = 0.02, layer_norm_eps: float = 1e-12, pad_token_id: int = 0, **kwargs)

Bases: BaseConfig

Initializes the configuration for the RWKV model with the specified parameters.

Parameters:
  • num_layers – The number of stacked RWKV layers.

  • hidden_size – Size of each attention head.

  • dense_hidden_size – The size of the dense hidden layer that follows the RWKV layers.

  • dropout – Dropout rate for regularization.

  • max_position_embeddings – Maximum sequence length for positional embeddings.

  • initializer_range – Standard deviation for weight initialization.

  • layer_norm_eps – Epsilon added to the variance in layer normalization for numerical stability.

  • pad_token_id – ID of the padding token.
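
The parameters and defaults above can be illustrated with a small stand-in. This is a minimal sketch, not the tfts implementation: `RWKVConfigSketch` is a hypothetical dataclass that only mirrors the documented signature, so it runs without tfts installed. With tfts available, the same keyword arguments would presumably be passed to `tfts.models.rwkv.RWKVConfig` instead.

```python
from dataclasses import asdict, dataclass


@dataclass
class RWKVConfigSketch:
    """Hypothetical stand-in that mirrors the documented signature; not the tfts class."""

    num_layers: int = 25
    hidden_size: int = 64
    dense_hidden_size: int = 32
    dropout: float = 0.0
    max_position_embeddings: int = 512
    initializer_range: float = 0.02
    layer_norm_eps: float = 1e-12
    pad_token_id: int = 0


# All documented defaults.
default_config = RWKVConfigSketch()

# Override a few fields by keyword; the rest keep their defaults.
small_config = RWKVConfigSketch(num_layers=4, hidden_size=32, dropout=0.1)

print(asdict(small_config))
```

Passing every override by keyword keeps the call readable and makes it insensitive to parameter order.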

Methods

from_dict(config_dict)

from_json(json_file)

from_pretrained(pretrained_model_name_or_path)

save_pretrained(save_directory)

to_dict()

to_json(json_file)

update(config_dict)
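
The method names above suggest a dictionary and JSON round trip. The sketch below shows one plausible reading of that pattern. `ConfigSketch` is a hypothetical stand-in, not `BaseConfig`, and the behaviour of each method is an assumption inferred from its name and argument, not taken from the tfts source.

```python
import json
import os
import tempfile


class ConfigSketch:
    """Hypothetical stand-in; method semantics are assumed, not taken from tfts."""

    def __init__(self, num_layers=25, hidden_size=64, **kwargs):
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        # Keep any extra keyword arguments as attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self):
        # Return a copy so callers cannot mutate the config through it.
        return dict(vars(self))

    @classmethod
    def from_dict(cls, config_dict):
        return cls(**config_dict)

    def update(self, config_dict):
        # Overwrite or add attributes from the given mapping.
        for key, value in config_dict.items():
            setattr(self, key, value)

    def to_json(self, json_file):
        with open(json_file, "w", encoding="utf-8") as f:
            json.dump(self.to_dict(), f, indent=2)

    @classmethod
    def from_json(cls, json_file):
        with open(json_file, encoding="utf-8") as f:
            return cls.from_dict(json.load(f))


config = ConfigSketch(num_layers=4)
config.update({"hidden_size": 128})

# Write the config to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "config.json")
    config.to_json(path)
    restored = ConfigSketch.from_json(path)

print(restored.to_dict())
```

Whether the real `update` accepts unknown keys, or how `save_pretrained` and `from_pretrained` lay out files on disk, is not stated here and should be checked against the tfts source.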

Attributes

attribute_map

model_type