Attention#

class tfts.layers.attention_layer.Attention(*args, **kwargs)[source]#

Bases: Layer

Multi-head attention layer

Initialize the Attention layer.

Parameters:#

hidden_size : int

The number of hidden units, hidden_size = attention_dim_each_head x num_attention_heads.

num_attention_heads : int

The number of attention heads.

attention_probs_dropout_prob : float, optional

Dropout rate for the attention weights. Defaults to 0.0.
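
The relation hidden_size = attention_dim_each_head x num_attention_heads can be checked with a few lines of arithmetic. The helper below is a hypothetical illustration, not part of tfts, and it assumes that hidden_size must divide evenly by num_attention_heads, which the relation implies but the documentation does not state explicitly.

```python
def attention_dim_each_head(hidden_size: int, num_attention_heads: int) -> int:
    """Per-head width implied by hidden_size and num_attention_heads.

    Hypothetical helper for illustration only; the divisibility check is an
    assumption derived from the documented relation, not library behaviour.
    """
    if num_attention_heads <= 0:
        raise ValueError("num_attention_heads must be positive")
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"hidden_size={hidden_size} is not divisible by "
            f"num_attention_heads={num_attention_heads}"
        )
    return hidden_size // num_attention_heads


# 64 hidden units split across 8 heads gives 8 units per head.
print(attention_dim_each_head(64, 8))
```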

Methods

add_loss(loss)

Can be called inside of the call() method to add a scalar loss.

add_metric(*args, **kwargs)

add_variable(shape, initializer[, dtype, ...])

Add a weight variable to the layer.

add_weight([shape, initializer, dtype, ...])

Add a weight variable to the layer.

build(input_shape)

build_from_config(config)

Builds the layer's states with the supplied config dict.

call(q, k, v[, mask, past_key_value, ...])

Computes attention weights from the query and key and applies them to the value, once per attention head.

compute_mask(inputs, previous_mask)

compute_output_shape(input_shape)

compute_output_spec(*args, **kwargs)

count_params()

Count the total number of scalars composing the weights.

from_config(config)

Creates an operation from its config.

get_build_config()

Returns a dictionary with the layer's input shape.

get_config()

Returns the config of the object.

get_weights()

Return the values of layer.weights as a list of NumPy arrays.

load_own_variables(store)

Loads the state of the layer.

quantize(mode[, type_check])

quantized_build(input_shape, mode)

quantized_call(*args, **kwargs)

rematerialized_call(layer_call, *args, **kwargs)

Enable rematerialization dynamically for layer's call method.

save_own_variables(store)

Saves the state of the layer.

set_weights(weights)

Sets the values of layer.weights from a list of NumPy arrays.

stateless_call(trainable_variables, ...[, ...])

Call the layer without any side effects.

symbolic_call(*args, **kwargs)

Attributes

compute_dtype

The dtype of the computations performed by the layer.

dtype

Alias of layer.variable_dtype.

dtype_policy

input

Retrieves the input tensor(s) of a symbolic operation.

input_dtype

The dtype layer inputs should be converted to.

input_spec

losses

List of scalar losses from add_loss, regularizers and sublayers.

metrics

List of all metrics.

metrics_variables

List of all metric variables.

non_trainable_variables

List of all non-trainable layer state.

non_trainable_weights

List of all non-trainable weight variables of the layer.

output

Retrieves the output tensor(s) of a layer.

path

The path of the layer.

quantization_mode

The quantization mode of this layer, None if not quantized.

supports_masking

Whether this layer supports computing a mask using compute_mask.

trainable

Settable boolean, whether this layer should be trainable or not.

trainable_variables

List of all trainable layer state.

trainable_weights

List of all trainable weight variables of the layer.

variable_dtype

The dtype of the state (weights) of the layer.

variables

List of all layer state, including random seeds.

weights

List of all weight variables of the layer.

call(q: Tensor, k: Tensor, v: Tensor, mask: Tensor | None = None, past_key_value=None, training: bool | None = None, return_attention_scores: bool = False, use_causal_mask: bool = False, **kwargs)[source]#

Computes attention weights from the query and key and applies them to the value, once per attention head.

Parameters:
  • q (tf.Tensor) – Query with shape batch * seq_q * fea

  • k (tf.Tensor) – Key with shape batch * seq_k * fea

  • v (tf.Tensor) – Value with shape batch * seq_v * fea

  • mask (tf.Tensor, optional) – Attention mask, important for preventing information leakage (for example, from future time steps). Defaults to None.

Returns:

Tensor with shape batch * seq_q * (units * num_attention_heads)

Return type:

tf.Tensor
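
To make the shapes above concrete, here is a minimal pure-Python sketch of multi-head scaled dot-product attention for a single sequence (no batch axis). It is not the tfts implementation: it assumes identity projections instead of learned weights, omits dropout and past_key_value, and its `causal` flag is only an illustrative stand-in for the kind of masking that prevents a position from attending to later ones. All function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def single_head_attention(q, k, v, causal=False):
    """q: seq_q x d, k: seq_k x d, v: seq_k x d_v, given as lists of lists."""
    d = len(q[0])
    out = []
    for i, q_i in enumerate(q):
        # Scaled dot product between query i and every key.
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d) for k_j in k]
        if causal:
            # Position i may only attend to positions j <= i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        out.append(
            [sum(w * v_j[c] for w, v_j in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out


def multi_head_attention(q, k, v, num_heads, causal=False):
    """Split the feature axis into num_heads slices, attend per head, concatenate."""
    hidden = len(q[0])
    assert hidden % num_heads == 0, "feature size must divide evenly across heads"
    head_dim = hidden // num_heads
    heads = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads.append(
            single_head_attention(
                [row[lo:hi] for row in q],
                [row[lo:hi] for row in k],
                [row[lo:hi] for row in v],
                causal,
            )
        )
    # Concatenate the per-head outputs along the feature axis.
    return [sum((heads[h][i] for h in range(num_heads)), []) for i in range(len(q))]


x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
out = multi_head_attention(x, x, x, num_heads=2, causal=True)
print(len(out), len(out[0]))  # seq_q rows, head_dim * num_heads columns
```

The output keeps seq_q rows and has head_dim * num_heads features per row, matching the documented return shape with the batch axis dropped.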

get_config()[source]#

Returns the config of the object.

An object config is a Python dictionary (serializable) containing the information needed to re-instantiate it.
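
The round trip between get_config() and from_config(config) can be sketched with a stand-in class. `AttentionConfigSketch` is hypothetical and does not import tfts; the config keys shown are assumed to mirror the constructor parameters documented above, and the keys the real layer returns may differ.

```python
import json


class AttentionConfigSketch:
    """Hypothetical stand-in that mimics the get_config / from_config pattern."""

    def __init__(self, hidden_size, num_attention_heads, attention_probs_dropout_prob=0.0):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        self.attention_probs_dropout_prob = attention_probs_dropout_prob

    def get_config(self):
        # Assumed keys: one entry per constructor argument.
        return {
            "hidden_size": self.hidden_size,
            "num_attention_heads": self.num_attention_heads,
            "attention_probs_dropout_prob": self.attention_probs_dropout_prob,
        }

    @classmethod
    def from_config(cls, config):
        # Re-instantiate the object from its config dictionary.
        return cls(**config)


layer = AttentionConfigSketch(hidden_size=64, num_attention_heads=8)
serialized = json.dumps(layer.get_config())  # the config is plain, serializable data
clone = AttentionConfigSketch.from_config(json.loads(serialized))
print(clone.get_config() == layer.get_config())
```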