Attention
- class tfts.layers.attention_layer.Attention(*args, **kwargs)
Bases: Layer
Multi-head attention layer.
Initialize the Attention layer.
Parameters:
- hidden_size : int
The number of hidden units, hidden_size = attention_dim_each_head × num_attention_heads.
- num_attention_heads : int
The number of attention heads.
- attention_probs_dropout_prob : float, optional
Dropout rate for the attention weights. Defaults to 0.0.
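The relation between hidden_size and num_attention_heads can be checked with a small helper. This is a minimal sketch, not part of tfts: the name `per_head_dim` is hypothetical, and the divisibility check is an assumption inferred from the formula above rather than documented behavior of the layer.

```python
def per_head_dim(hidden_size: int, num_attention_heads: int) -> int:
    """Return attention_dim_each_head implied by the two constructor arguments.

    Hypothetical helper for illustration; assumes hidden_size must divide
    evenly across the heads.
    """
    if num_attention_heads <= 0:
        raise ValueError("num_attention_heads must be positive")
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size={hidden_size} is not divisible by "
            f"num_attention_heads={num_attention_heads}"
        )
    return hidden_size // num_attention_heads


# 64 hidden units split over 4 heads gives 16 units per head.
print(per_head_dim(64, 4))
```

Under this assumption, a pair such as hidden_size=10 with num_attention_heads=3 would be rejected, since the hidden units cannot be split evenly across heads.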
Methods
add_loss(loss): Can be called inside of the call() method to add a scalar loss.
add_metric(*args, **kwargs)
add_variable(shape, initializer[, dtype, ...]): Add a weight variable to the layer.
add_weight([shape, initializer, dtype, ...]): Add a weight variable to the layer.
build(input_shape)
build_from_config(config): Builds the layer's states with the supplied config dict.
call(q, k, v[, mask, past_key_value, ...]): Use the query and key to compute attention weights that are applied to the value, repeated across multiple heads.
compute_mask(inputs, previous_mask)
compute_output_shape(input_shape)
compute_output_spec(*args, **kwargs)
count_params(): Count the total number of scalars composing the weights.
from_config(config): Creates an operation from its config.
get_build_config(): Returns a dictionary with the layer's input shape.
get_config(): Returns the config of the object.
get_weights(): Return the values of layer.weights as a list of NumPy arrays.
load_own_variables(store): Loads the state of the layer.
quantize(mode[, type_check])
quantized_build(input_shape, mode)
quantized_call(*args, **kwargs)
rematerialized_call(layer_call, *args, **kwargs): Enable rematerialization dynamically for layer's call method.
save_own_variables(store): Saves the state of the layer.
set_weights(weights): Sets the values of layer.weights from a list of NumPy arrays.
stateless_call(trainable_variables, ...[, ...]): Call the layer without any side effects.
symbolic_call(*args, **kwargs)
Attributes
compute_dtype: The dtype of the computations performed by the layer.
dtype: Alias of layer.variable_dtype.
dtype_policy
input: Retrieves the input tensor(s) of a symbolic operation.
input_dtype: The dtype layer inputs should be converted to.
input_spec
losses: List of scalar losses from add_loss, regularizers and sublayers.
metrics: List of all metrics.
metrics_variables: List of all metric variables.
non_trainable_variables: List of all non-trainable layer state.
non_trainable_weights: List of all non-trainable weight variables of the layer.
output: Retrieves the output tensor(s) of a layer.
path: The path of the layer.
quantization_mode: The quantization mode of this layer, None if not quantized.
supports_masking: Whether this layer supports computing a mask using compute_mask.
trainable: Settable boolean, whether this layer should be trainable or not.
trainable_variables: List of all trainable layer state.
trainable_weights: List of all trainable weight variables of the layer.
variable_dtype: The dtype of the state (weights) of the layer.
variables: List of all layer state, including random seeds.
weights: List of all weight variables of the layer.
- call(q: Tensor, k: Tensor, v: Tensor, mask: Tensor | None = None, past_key_value=None, training: bool | None = None, return_attention_scores: bool = False, use_causal_mask: bool = False, **kwargs)
Use the query and key to compute attention weights that are applied to the value; the computation is repeated across multiple heads.
- Parameters:
q (tf.Tensor) – Query with shape (batch, seq_q, fea)
k (tf.Tensor) – Key with shape (batch, seq_k, fea)
v (tf.Tensor) – Value with shape (batch, seq_v, fea)
mask (tf.Tensor, optional) – Attention mask, important for preventing information leakage between positions. Defaults to None.
- Returns:
Tensor with shape (batch, seq_q, units * num_attention_heads)
- Return type:
tf.Tensor
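The shape flow described above can be illustrated with a NumPy sketch of multi-head scaled dot-product attention. This is not the tfts implementation: the learned query/key/value/output projections and dropout are omitted, the function name `multi_head_attention` is hypothetical, and the mask convention (True means a position may be attended to) is an assumption that the source does not confirm.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(q, k, v, num_heads, mask=None, use_causal_mask=False):
    """Sketch of multi-head attention without learned projections.

    q: (batch, seq_q, fea); k, v: (batch, seq_k, fea).
    mask: optional boolean array broadcastable to (batch, heads, seq_q, seq_k);
          assumed convention: True = may attend.
    """
    batch, seq_q, fea = q.shape
    seq_k = k.shape[1]
    head_dim = fea // num_heads

    def split_heads(x):
        # (batch, seq, fea) -> (batch, heads, seq, head_dim)
        b, s, _ = x.shape
        return x.reshape(b, s, num_heads, head_dim).transpose(0, 2, 1, 3)

    qh, kh, vh = split_heads(q), split_heads(k), split_heads(v)

    # Scaled dot-product scores: (batch, heads, seq_q, seq_k)
    scores = qh @ kh.transpose(0, 1, 3, 2) / np.sqrt(head_dim)

    allowed = np.ones((seq_q, seq_k), dtype=bool)
    if use_causal_mask:
        # Each query position may only attend to itself and earlier positions.
        allowed &= np.tril(np.ones((seq_q, seq_k), dtype=bool))
    if mask is not None:
        allowed = allowed & mask.astype(bool)
    scores = np.where(allowed, scores, -1e9)

    weights = softmax(scores, axis=-1)
    out = weights @ vh  # (batch, heads, seq_q, head_dim)

    # Merge heads back: (batch, seq_q, head_dim * num_heads)
    out = out.transpose(0, 2, 1, 3).reshape(batch, seq_q, num_heads * head_dim)
    return out, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(2, 5, 8))
k = rng.normal(size=(2, 5, 8))
v = rng.normal(size=(2, 5, 8))
out, weights = multi_head_attention(q, k, v, num_heads=2, use_causal_mask=True)
print(out.shape)      # (2, 5, 8)
print(weights.shape)  # (2, 2, 5, 5)
```

With the causal mask enabled, the attention weights above the diagonal are zero, so no query position draws on later positions. In time series forecasting this is the kind of leak a mask is meant to prevent.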