tf.contrib.seq2seq.AttentionWrapper

class tf.contrib.seq2seq.AttentionWrapper

Defined in tensorflow/contrib/seq2seq/python/ops/attention_wrapper.py.

Wraps another RNNCell with attention.

Properties

graph

losses

non_trainable_variables

non_trainable_weights

output_size

scope_name

state_size

trainable_variables

trainable_weights

updates

variables

Returns the list of all layer variables/weights.

Returns:

A list of variables.

weights

Returns the list of all layer variables/weights.

Returns:

A list of variables.

Methods

__init__

__init__(
    cell,
    attention_mechanism,
    attention_layer_size=None,
    alignment_history=False,
    cell_input_fn=None,
    output_attention=True,
    initial_cell_state=None,
    name=None
)

Construct the AttentionWrapper.

Args:

  • cell: An instance of RNNCell.
  • attention_mechanism: An instance of AttentionMechanism.
  • attention_layer_size: Python integer, the depth of the attention (output) layer. If None (default), use the context as attention at each time step. Otherwise, feed the context and cell output into the attention layer to generate attention at each time step.
  • alignment_history: Python boolean, whether to store alignment history from all time steps in the final output state (currently stored as a time major TensorArray on which you must call stack()).
  • cell_input_fn: (optional) A callable. The default is: lambda inputs, attention: array_ops.concat([inputs, attention], -1).
  • output_attention: Python bool. If True (default), the output at each time step is the attention value. This is the behavior of Luong-style attention mechanisms. If False, the output at each time step is the output of cell. This is the behavior of Bahdanau-style attention mechanisms. In both cases, the attention tensor is propagated to the next time step via the state and is used there. This flag only controls whether the attention value is passed up to the next cell in an RNN stack or to the top RNN output.
  • initial_cell_state: The initial state value to use for the cell when the user calls zero_state(). Note that if this value is provided now, and the user uses a batch_size argument of zero_state which does not match the batch size of initial_cell_state, proper behavior is not guaranteed.
  • name: Name to use when creating ops.
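
The default cell_input_fn and the output_attention flag can be illustrated with a minimal pure-Python sketch that uses nested lists in place of tensors. The helper names default_cell_input_fn and select_output are hypothetical and not part of the library; the real default operates on tensors via array_ops.concat.

```python
def default_cell_input_fn(inputs, attention):
    """Concatenate each batch row of `inputs` with the matching row of `attention`.

    Mimics concat([inputs, attention], -1) for 2-D data stored as lists of rows.
    """
    if len(inputs) != len(attention):
        raise ValueError("inputs and attention must share a batch size")
    return [list(i_row) + list(a_row) for i_row, a_row in zip(inputs, attention)]


def select_output(cell_output, attention, output_attention=True):
    """Pick what the wrapper emits at a time step (hypothetical helper)."""
    return attention if output_attention else cell_output


inputs = [[1.0, 2.0], [3.0, 4.0]]                # [batch_size=2, input_size=2]
attention = [[0.5, 0.5, 0.5], [0.1, 0.2, 0.3]]   # [batch_size=2, attention depth=3]
cell_inputs = default_cell_input_fn(inputs, attention)
```

Each row of cell_inputs has input_size plus the attention depth entries, which is what the wrapped cell would receive under the default setting.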

__call__

__call__(
    inputs,
    state,
    scope=None
)

Run this RNN cell on inputs, starting from the given state.

Args:

  • inputs: 2-D tensor with shape [batch_size x input_size].
  • state: if self.state_size is an integer, this should be a 2-D Tensor with shape [batch_size x self.state_size]. Otherwise, if self.state_size is a tuple of integers, this should be a tuple with shapes [batch_size x s] for s in self.state_size.
  • scope: VariableScope for the created subgraph; defaults to class name.

Returns:

A pair containing:

  • Output: A 2-D tensor with shape [batch_size x self.output_size].
  • New state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of state.
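
The shape contract above can be checked with a small pure-Python sketch, again using nested lists as stand-ins for 2-D tensors. The helpers zeros, shape_2d and state_matches are hypothetical names introduced only for illustration; they cover the integer and tuple-of-integers cases described for state_size.

```python
def zeros(batch_size, size):
    """Build a [batch_size x size] list-of-lists filled with 0.0."""
    return [[0.0] * size for _ in range(batch_size)]


def shape_2d(rows):
    """Return (n_rows, n_cols) for a rectangular list of lists."""
    return (len(rows), len(rows[0]) if rows else 0)


def state_matches(state, state_size, batch_size):
    """Check `state` against `state_size` as described for __call__."""
    if isinstance(state_size, int):
        return shape_2d(state) == (batch_size, state_size)
    # Tuple case: one [batch_size x s] entry for every s in state_size.
    return len(state) == len(state_size) and all(
        shape_2d(part) == (batch_size, size)
        for part, size in zip(state, state_size)
    )


single_ok = state_matches(zeros(2, 4), 4, batch_size=2)
tuple_ok = state_matches((zeros(2, 4), zeros(2, 3)), (4, 3), batch_size=2)
tuple_bad = state_matches((zeros(2, 4), zeros(2, 3)), (4, 4), batch_size=2)
```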

__deepcopy__

__deepcopy__(memo)

add_loss

add_loss(
    losses,
    inputs=None
)

Add loss tensor(s), potentially dependent on layer inputs.

Some losses (for instance, activity regularization losses) may depend on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.losses may depend on a and some on b. This method automatically keeps track of dependencies.

The get_losses_for method retrieves the losses relevant to a specific set of inputs.

Arguments:

  • losses: Loss tensor, or list/tuple of tensors.
  • inputs: Optional input tensor(s) that the loss(es) depend on. Must match the inputs argument passed to the __call__ method at the time the losses are created. If None is passed, the losses are assumed to be unconditional, and will apply across all dataflows of the layer (e.g. weight regularization losses).
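
The bookkeeping described above can be sketched as a toy class. LossTracker is a hypothetical illustration, not the library's implementation; plain strings stand in for loss tensors and input tensors, and inputs=None marks an unconditional loss.

```python
class LossTracker:
    """Toy model of conditional vs. unconditional loss tracking."""

    def __init__(self):
        self.losses = []       # every loss, in insertion order
        self._by_inputs = {}   # inputs (or None) -> losses tied to them

    def add_loss(self, losses, inputs=None):
        # Accept a single loss or a list/tuple of losses.
        if not isinstance(losses, (list, tuple)):
            losses = [losses]
        self.losses.extend(losses)
        self._by_inputs.setdefault(inputs, []).extend(losses)

    def get_losses_for(self, inputs):
        # inputs=None returns only the unconditional losses.
        return list(self._by_inputs.get(inputs, []))


tracker = LossTracker()
tracker.add_loss("weight_regularization")            # unconditional
tracker.add_loss(["activity_reg_a"], inputs="a")     # depends on input a
tracker.add_loss("activity_reg_b", inputs="b")       # depends on input b
```

Here tracker.get_losses_for("a") returns only the loss registered against a, while tracker.losses still lists all three.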

add_update

add_update(
    updates,
    inputs=None
)

Add update op(s), potentially dependent on layer inputs.

Weight updates (for instance, the updates of the moving mean and variance in a BatchNormalization layer) may depend on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.updates may depend on a and some on b. This method automatically keeps track of dependencies.

The get_updates_for method retrieves the updates relevant to a specific set of inputs.

Arguments:

  • updates: Update op, or list/tuple of update ops.
  • inputs: Optional input tensor(s) that the update(s) depend on. Must match the inputs argument passed to the __call__ method at the time the updates are created. If None is passed, the updates are assumed to be unconditional, and will apply across all dataflows of the layer.

add_variable

add_variable(
    name,
    shape,
    dtype=None,
    initializer=None,
    regularizer=None,
    trainable=True
)

Adds a new variable to the layer, or gets an existing one; returns it.

Arguments:

  • name: variable name.
  • shape: variable shape.
  • dtype: The type of the variable. Defaults to self.dtype.
  • initializer: initializer instance (callable).
  • regularizer: regularizer instance (callable).
  • trainable: whether the variable should be part of the layer's "trainable_variables" (e.g. weights, biases) or "non_trainable_variables" (e.g. BatchNorm mean, stddev).

Returns:

The created variable.
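
The "add or get" behavior and the trainable split can be sketched with a toy registry. VariableRegistry is a hypothetical stand-in, not the library's code; variables are modeled as 2-D lists, shape is assumed to be a (rows, cols) pair, and dtype and regularizer are omitted.

```python
class VariableRegistry:
    """Hypothetical stand-in for a layer's variable bookkeeping."""

    def __init__(self):
        self._by_name = {}
        self.trainable_variables = []
        self.non_trainable_variables = []

    def add_variable(self, name, shape, initializer=None, trainable=True):
        # Return the existing variable instead of creating a duplicate.
        if name in self._by_name:
            return self._by_name[name]
        init = initializer if initializer is not None else (lambda: 0.0)
        rows, cols = shape
        variable = [[init() for _ in range(cols)] for _ in range(rows)]
        self._by_name[name] = variable
        if trainable:
            self.trainable_variables.append(variable)
        else:
            self.non_trainable_variables.append(variable)
        return variable


registry = VariableRegistry()
kernel = registry.add_variable("kernel", (2, 3))
same_kernel = registry.add_variable("kernel", (2, 3))   # fetched, not recreated
moving_mean = registry.add_variable("moving_mean", (1, 3), trainable=False)
```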

apply

apply(
    inputs,
    *args,
    **kwargs
)

Apply the layer on an input.

This simply wraps self.__call__.

Arguments:

  • inputs: Input tensor(s).
  • *args: additional positional arguments to be passed to self.call.
  • **kwargs: additional keyword arguments to be passed to self.call.

Returns:

Output tensor(s).

build

build(_)

call

call(
    inputs,
    state
)

Perform a step of attention-wrapped RNN.

  • Step 1: Mix the inputs and previous step's attention output via cell_input_fn.
  • Step 2: Call the wrapped cell with this input and its previous state.
  • Step 3: Score the cell's output with attention_mechanism.
  • Step 4: Calculate the alignments by passing the score through the normalizer.
  • Step 5: Calculate the context vector as the inner product between the alignments and the attention_mechanism's values (memory).
  • Step 6: Calculate the attention output by passing the concatenated cell output and context through the attention layer (a linear layer with attention_layer_size outputs).

Args:

  • inputs: (Possibly nested tuple of) Tensor, the input at this time step.
  • state: An instance of AttentionWrapperState containing tensors from the previous time step.

Returns:

A tuple (attention_or_cell_output, next_state), where:

  • attention_or_cell_output is the attention value if output_attention is True, otherwise the output of cell.
  • next_state is an instance of AttentionWrapperState containing the state calculated at this time step.
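
The six steps above can be sketched for a single example (no batch dimension) in pure Python. This is an illustration under stated assumptions, not the library's implementation: toy_cell is a hypothetical stand-in for the wrapped cell, scoring is assumed to be a plain dot product, the normalizer is assumed to be softmax, and attention_layer is modeled as a bias-free weight matrix given as a list of rows.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def toy_cell(cell_inputs, cell_state):
    """Hypothetical wrapped cell: returns (output, next_state)."""
    drive = sum(cell_inputs) / len(cell_inputs)
    new_state = [math.tanh(drive + h) for h in cell_state]
    return new_state, new_state


def attention_step(inputs, prev_cell_state, prev_attention, memory,
                   attention_layer=None, output_attention=True):
    # Step 1: mix inputs and previous attention (default: concatenation).
    cell_inputs = inputs + prev_attention
    # Step 2: call the wrapped cell with its previous state.
    cell_output, next_cell_state = toy_cell(cell_inputs, prev_cell_state)
    # Step 3: score the cell output against every memory slot (assumed dot product).
    scores = [dot(cell_output, slot) for slot in memory]
    # Step 4: normalize the scores into alignments.
    alignments = softmax(scores)
    # Step 5: context is the alignment-weighted sum of the memory slots.
    depth = len(memory[0])
    context = [sum(a * slot[j] for a, slot in zip(alignments, memory))
               for j in range(depth)]
    # Step 6: with no attention layer the context is the attention;
    # otherwise apply a linear layer to [cell_output; context].
    if attention_layer is None:
        attention = context
    else:
        concat = cell_output + context
        attention = [dot(row, concat) for row in attention_layer]
    output = attention if output_attention else cell_output
    return output, (next_cell_state, attention, alignments)


memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 memory slots of depth 2
output, (cell_state, attention, alignments) = attention_step(
    [0.1, 0.2], [0.0, 0.0], [0.0, 0.0], memory)
```

The returned tuple plays the role of the next state: it carries the cell state, the attention to feed into the next step, and the alignments for this step.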

get_losses_for

get_losses_for(inputs)

Retrieves losses relevant to a specific set of inputs.

Arguments:

  • inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the losses were created. If you pass inputs=None, unconditional losses are returned, such as weight regularization losses.

Returns:

List of loss tensors of the layer that depend on inputs.

get_updates_for

get_updates_for(inputs)

Retrieves updates relevant to a specific set of inputs.

Arguments:

  • inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the updates were created. If you pass inputs=None, unconditional updates are returned.

Returns:

List of update ops of the layer that depend on inputs.

zero_state

zero_state(
    batch_size,
    dtype
)

© 2017 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/AttentionWrapper
