class tf.contrib.rnn.CoupledInputForgetGateLSTMCell
Defined in tensorflow/contrib/rnn/python/ops/rnn_cell.py.
See the guide: RNN and Cells (contrib) > Core RNN Cell wrappers (RNNCells that wrap other RNNCells)
Long short-term memory unit (LSTM) recurrent network cell.
The default non-peephole implementation is based on:
http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
S. Hochreiter and J. Schmidhuber. "Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.
The peephole implementation is based on:
https://research.google.com/pubs/archive/43905.pdf
Hasim Sak, Andrew Senior, and Francoise Beaufays. "Long short-term memory recurrent neural network architectures for large scale acoustic modeling." INTERSPEECH, 2014.
The coupling of input and forget gate is based on:
http://arxiv.org/pdf/1503.04069.pdf
Greff et al. "LSTM: A Search Space Odyssey"
The class uses optional peep-hole connections, and an optional projection layer.
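As a rough illustration of what coupling the input and forget gates means, the sketch below runs one step for a single unit in plain Python, without peepholes or projection. The function name and the scalar pre-activations j, f and o are hypothetical; in the actual cell they come from a learned linear map of the inputs and the previous output. This is a sketch of the idea described in Greff et al., not the library's code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coupled_lstm_unit_step(j, f, o, c_prev, forget_bias=1.0):
    """One step for a single unit (hypothetical helper, no peepholes).

    j, f, o: assumed pre-activations of the candidate value, the forget
    gate and the output gate.
    """
    f_act = sigmoid(f + forget_bias)
    # Coupling: there is no separate input gate; it is tied to 1 - f_act,
    # so the new cell state is a convex mix of old state and candidate.
    c = f_act * c_prev + (1.0 - f_act) * math.tanh(j)
    m = sigmoid(o) * math.tanh(c)
    return c, m


c, m = coupled_lstm_unit_step(j=0.5, f=0.0, o=0.0, c_prev=1.0)
```

Because the two gates sum to one, the new cell state always lies between the previous state and the candidate value tanh(j).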
Properties
graph
losses
non_trainable_variables
non_trainable_weights
output_size
scope_name
state_size
trainable_variables
trainable_weights
updates
variables
Returns the list of all layer variables/weights.
Returns:
A list of variables.
weights
Returns the list of all layer variables/weights.
Returns:
A list of variables.
Methods
__init__
__init__( num_units, use_peepholes=False, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=1, num_proj_shards=1, forget_bias=1.0, state_is_tuple=True, activation=tf.tanh, reuse=None )
Initialize the parameters for an LSTM cell.
Args:
- num_units: int, the number of units in the LSTM cell.
- use_peepholes: bool, set True to enable diagonal/peephole connections.
- initializer: (optional) The initializer to use for the weight and projection matrices.
- num_proj: (optional) int, the output dimensionality for the projection matrices. If None, no projection is performed.
- proj_clip: (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
- num_unit_shards: How to split the weight matrix. If >1, the weight matrix is stored across num_unit_shards.
- num_proj_shards: How to split the projection matrix. If >1, the projection matrix is stored across num_proj_shards.
- forget_bias: Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training.
- state_is_tuple: If True (the default in the signature above), accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The concatenated behavior will soon be deprecated.
- activation: Activation function of the inner states.
- reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
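The elementwise clipping described for proj_clip can be pictured with a small stand-alone sketch. The helper name clip_projection and the nested-list representation of a [batch x num_proj] matrix are assumptions made for illustration; the cell itself operates on tensors.

```python
def clip_projection(projected, proj_clip):
    """Clip every projected value to the range [-proj_clip, proj_clip]."""
    return [[max(-proj_clip, min(proj_clip, v)) for v in row]
            for row in projected]


# A hypothetical projected output with batch size 2 and num_proj = 3.
projected = [[-3.0, 0.25, 1.5],
             [0.9, -1.1, 4.0]]
clipped = clip_projection(projected, proj_clip=1.0)
```

Values already inside the range pass through unchanged; only the out-of-range entries are moved to the nearest bound.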
__call__
__call__( inputs, state, scope=None )
Run this RNN cell on inputs, starting from the given state.
Args:
- inputs: 2-D tensor with shape [batch_size x input_size].
- state: if self.state_size is an integer, this should be a 2-D Tensor with shape [batch_size x self.state_size]. Otherwise, if self.state_size is a tuple of integers, this should be a tuple with shapes [batch_size x s] for s in self.state_size.
- scope: VariableScope for the created subgraph; defaults to class name.
Returns:
A pair containing:
- Output: A 2-D tensor with shape [batch_size x self.output_size].
- New state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of state.
__deepcopy__
__deepcopy__(memo)
add_loss
add_loss( losses, inputs=None )
Add loss tensor(s), potentially dependent on layer inputs.
Some losses (for instance, activity regularization losses) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.losses may be dependent on a and some on b. This method automatically keeps track of dependencies.
The get_losses_for method retrieves the losses relevant to a specific set of inputs.
Arguments:
- losses: Loss tensor, or list/tuple of tensors.
- inputs: Optional input tensor(s) that the loss(es) depend on. Must match the inputs argument passed to the __call__ method at the time the losses are created. If None is passed, the losses are assumed to be unconditional, and will apply across all dataflows of the layer (e.g. weight regularization losses).
add_update
add_update( updates, inputs=None )
Add update op(s), potentially dependent on layer inputs.
Weight updates (for instance, the updates of the moving mean and variance in a BatchNormalization layer) may be dependent on the inputs passed when calling a layer. Hence, when reusing the same layer on different inputs a and b, some entries in layer.updates may be dependent on a and some on b. This method automatically keeps track of dependencies.
The get_updates_for method retrieves the updates relevant to a specific set of inputs.
Arguments:
- updates: Update op, or list/tuple of update ops.
- inputs: Optional input tensor(s) that the update(s) depend on. Must match the inputs argument passed to the __call__ method at the time the updates are created. If None is passed, the updates are assumed to be unconditional, and will apply across all dataflows of the layer.
add_variable
add_variable( name, shape, dtype=None, initializer=None, regularizer=None, trainable=True )
Adds a new variable to the layer, or gets an existing one; returns it.
Arguments:
- name: variable name.
- shape: variable shape.
- dtype: The type of the variable. Defaults to self.dtype.
- initializer: initializer instance (callable).
- regularizer: regularizer instance (callable).
- trainable: whether the variable should be part of the layer's "trainable_variables" (e.g. variables, biases) or "non_trainable_variables" (e.g. BatchNorm mean, stddev).
Returns:
The created variable.
apply
apply( inputs, *args, **kwargs )
Apply the layer on an input.
This simply wraps self.__call__.
Arguments:
- inputs: Input tensor(s).
- *args: additional positional arguments to be passed to self.call.
- **kwargs: additional keyword arguments to be passed to self.call.
Returns:
Output tensor(s).
build
build(_)
call
call( inputs, state )
Run one step of LSTM.
Args:
- inputs: input Tensor, 2D, batch x num_units.
- state: if state_is_tuple is False, this must be a state Tensor, 2-D, batch x state_size. If state_is_tuple is True, this must be a tuple of state Tensors, both 2-D, with column sizes c_state and m_state.
Returns:
A tuple containing:
- A 2-D, [batch x output_dim], Tensor representing the output of the LSTM after reading inputs when previous state was state. Here output_dim is num_proj if num_proj was set, num_units otherwise.
- Tensor(s) representing the new state of LSTM after reading inputs when the previous state was state. Same type and shape(s) as state.
Raises:
- ValueError: If input size cannot be inferred from inputs via static shape inference.
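When state_is_tuple is False, the single state tensor has to be separated into its two parts before the step is computed. The sketch below shows one way to picture that split on nested lists. It assumes the c_state columns come first and the m_state columns follow, and that m_state has num_proj columns when a projection is used and num_units otherwise; that layout is an assumption of this example, as are the helper name and arguments.

```python
def split_state(state, num_units, num_proj=None):
    """Split a concatenated [batch x state_size] state into (c_state, m_state).

    Assumed layout: the first num_units columns hold c_state and the
    remaining columns hold m_state.
    """
    m_size = num_units if num_proj is None else num_proj
    c_state = [row[:num_units] for row in state]
    m_state = [row[num_units:num_units + m_size] for row in state]
    return c_state, m_state


# Hypothetical state with batch size 2, num_units = 3 and num_proj = 2.
state = [[1, 2, 3, 10, 20],
         [4, 5, 6, 40, 50]]
c_state, m_state = split_state(state, num_units=3, num_proj=2)
```

With state_is_tuple set to True no such slicing is needed, since the two parts arrive as separate tensors.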
get_losses_for
get_losses_for(inputs)
Retrieves losses relevant to a specific set of inputs.
Arguments:
- inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the losses were created. If you pass inputs=None, unconditional losses are returned, such as weight regularization losses.
Returns:
List of loss tensors of the layer that depend on inputs.
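The distinction between conditional and unconditional losses can be illustrated with a toy registry. Everything here (the class LossRegistry, keying by object identity, string placeholders for loss tensors) is a hypothetical stand-in chosen for illustration and is not how the layer base class is implemented.

```python
class LossRegistry:
    """Toy bookkeeping of losses per input (not the TensorFlow code)."""

    def __init__(self):
        self._unconditional = []
        self._per_input = {}

    def add_loss(self, loss, inputs=None):
        if inputs is None:
            # No inputs given: the loss applies to every use of the layer.
            self._unconditional.append(loss)
        else:
            self._per_input.setdefault(id(inputs), []).append(loss)

    def get_losses_for(self, inputs):
        if inputs is None:
            return list(self._unconditional)
        return list(self._per_input.get(id(inputs), []))


registry = LossRegistry()
a, b = object(), object()  # placeholders for two different input tensors
registry.add_loss("weight_decay")
registry.add_loss("activity_on_a", inputs=a)
registry.add_loss("activity_on_b", inputs=b)
losses_for_a = registry.get_losses_for(a)
unconditional = registry.get_losses_for(None)
```

Asking for the losses of a returns only the entry registered against a, while passing None returns the unconditional entry.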
get_updates_for
get_updates_for(inputs)
Retrieves updates relevant to a specific set of inputs.
Arguments:
- inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the updates were created. If you pass inputs=None, unconditional updates are returned.
Returns:
List of update ops of the layer that depend on inputs.
zero_state
zero_state( batch_size, dtype )
Return zero-filled state tensor(s).
Args:
- batch_size: int, float, or unit Tensor representing the batch size.
- dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
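The two return cases can be mimicked in plain Python to make the shapes concrete. The function zero_state_sketch is a hypothetical illustration that uses nested lists in place of tensors, ignores dtype, handles only int and nested list/tuple sizes, and always returns tuples for nested structures.

```python
def zero_state_sketch(batch_size, state_size):
    """Build zero-filled state(s) mirroring the structure of state_size."""
    if isinstance(state_size, int):
        # Single state: a [batch_size x state_size] block of zeros.
        return [[0.0] * state_size for _ in range(batch_size)]
    # Nested sizes: one zero block per entry, same nesting as state_size.
    return tuple(zero_state_sketch(batch_size, s) for s in state_size)


single = zero_state_sketch(2, 3)       # like state_is_tuple=False
pair = zero_state_sketch(2, (3, 2))    # like a (c_state, m_state) tuple
```

Here single has 2 rows of 3 zeros, and pair holds one 2 x 3 block and one 2 x 2 block.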
© 2017 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/CoupledInputForgetGateLSTMCell