modeling
-
position_encoding_init(n_position, d_pos_vec, dtype='float32')
Generate the initial values for the sinusoid position encoding table.
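To make the shape and content of such a table concrete, here is a minimal NumPy sketch of the textbook sinusoid formulation. The function name `sinusoid_table` is hypothetical. The sketch assumes sine values in even columns and cosine values in odd columns; the library's own column layout and frequency spacing may differ.

```python
import numpy as np

def sinusoid_table(n_position, d_pos_vec, dtype="float32"):
    """Hypothetical stand-in: sin in even columns, cos in odd columns."""
    positions = np.arange(n_position)[:, None]   # [n_position, 1]
    dims = np.arange(d_pos_vec)[None, :]         # [1, d_pos_vec]
    # Columns 2i and 2i+1 share the frequency 1 / 10000^(2i / d_pos_vec).
    angles = positions / np.power(10000.0, 2 * (dims // 2) / d_pos_vec)
    table = np.zeros((n_position, d_pos_vec))
    table[:, 0::2] = np.sin(angles[:, 0::2])
    table[:, 1::2] = np.cos(angles[:, 1::2])
    return table.astype(dtype)

table = sinusoid_table(n_position=50, d_pos_vec=8)
print(table.shape, table.dtype)
```

The result has one row per position and one column per embedding dimension, so it can serve as the initial weight of a position embedding of shape [n_position, d_pos_vec].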
-
class WordEmbedding(vocab_size, emb_dim, bos_idx=0)
Bases: paddle.fluid.dygraph.layers.Layer
Word Embedding + Scale
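The "scale" step is not spelled out here. A common convention in Transformer implementations is to multiply the looked-up vectors by sqrt(emb_dim), and the NumPy sketch below assumes that convention. The names `weight` and `embed_and_scale` are hypothetical and only illustrate the idea; they are not the layer's API.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, emb_dim = 10, 4
# Hypothetical lookup table standing in for the layer's learned weights.
weight = rng.normal(0.0, emb_dim ** -0.5, size=(vocab_size, emb_dim))

def embed_and_scale(token_ids):
    """Look up each token id, then multiply by sqrt(emb_dim) (assumed scale)."""
    return weight[token_ids] * np.sqrt(emb_dim)

token_ids = np.array([[1, 2, 3], [4, 5, 0]])   # [batch_size, seq_len]
out = embed_and_scale(token_ids)               # [batch_size, seq_len, emb_dim]
print(out.shape)
```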
-
class PositionalEmbedding(emb_dim, max_length, bos_idx=0)
Bases: paddle.fluid.dygraph.layers.Layer
Positional Embedding
-
class CrossEntropyCriterion(label_smooth_eps, pad_idx=0)
Bases: paddle.fluid.dygraph.layers.Layer
-
class TransformerDecodeCell(decoder, word_embedding=None, pos_embedding=None, linear=None, dropout=0.1)
Bases: paddle.fluid.dygraph.layers.Layer
-
class TransformerBeamSearchDecoder(cell, start_token, end_token, beam_size, var_dim_in_state)
Bases: paddle.fluid.layers.rnn.BeamSearchDecoder
-
static tile_beam_merge_with_batch(t, beam_size)
Tile the batch dimension of a tensor. Specifically, this function takes a tensor t shaped [batch_size, s0, s1, ...] composed of minibatch entries t[0], ..., t[batch_size - 1] and tiles it to have a shape [batch_size * beam_size, s0, s1, ...] composed of minibatch entries t[0], t[0], ..., t[1], t[1], ... where each minibatch entry is repeated beam_size times.
- Parameters
t (Variable) – A tensor with shape [batch_size, ...]. The data type should be float32, float64, int32, int64 or bool.
beam_size (int) – The beam width used in beam search.
- Returns
A tensor with shape [batch_size * beam_size, ...], whose data type is the same as t.
- Return type
Variable
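The tiling order described above (each entry repeated consecutively, rather than the whole batch repeated) can be illustrated with NumPy. This is only a sketch of the resulting layout, not the Paddle implementation; `tile_rows` is a hypothetical name.

```python
import numpy as np

def tile_rows(t, beam_size):
    """Repeat every minibatch entry beam_size times along axis 0."""
    return np.repeat(t, beam_size, axis=0)

t = np.array([[1, 2], [3, 4]])        # [batch_size=2, s0=2]
tiled = tile_rows(t, beam_size=3)     # [batch_size * beam_size=6, s0=2]
print(tiled)
```

Note that `np.tile(t, (beam_size, 1))` would instead produce t[0], t[1], t[0], t[1], ..., which does not match the layout documented here.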
-
step(time, inputs, states, **kwargs)
Perform a beam search decoding step, which uses cell to get probabilities and then follows a beam search step to calculate scores and select candidate token ids.
- Parameters
time (Variable) – An int64 tensor with shape [1] provided by the caller, representing the current time step number of decoding.
inputs (Variable) – A tensor variable. It is the same as initial_inputs returned by initialize() for the first decoding step and next_inputs returned by step() for the others.
states (Variable) – A structure of tensor variables. It is the same as the initial_states returned by initialize() for the first decoding step and beam_search_state returned by step() for the others.
**kwargs – Additional keyword arguments, provided by the caller.
- Returns
A tuple (beam_search_output, beam_search_state, next_inputs, finished). beam_search_state and next_inputs have the same structure, shape and data type as the input arguments states and inputs, respectively. beam_search_output is a namedtuple (with fields scores, predicted_ids and parent_ids) of tensor variables, where scores, predicted_ids and parent_ids each have shape [batch_size, beam_size] and data types float32, int64 and int64, respectively. finished is a bool tensor with shape [batch_size, beam_size].
- Return type
tuple
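To show how scores, predicted_ids and parent_ids relate to each other, the following NumPy sketch performs the selection part of a single beam search step. It is a simplified illustration with stated assumptions: `beam_step` is a hypothetical helper, the per-beam log probabilities are passed in directly instead of being computed by cell, and end_token / finished handling is left out.

```python
import numpy as np

def beam_step(log_probs, beam_scores, beam_size):
    """Select the best beam_size continuations per batch entry.

    log_probs:   [batch_size, beam_size, vocab_size] next-token log probabilities
    beam_scores: [batch_size, beam_size] accumulated scores of the current beams
    """
    batch_size, _, vocab_size = log_probs.shape
    total = beam_scores[:, :, None] + log_probs    # [batch, beam, vocab]
    flat = total.reshape(batch_size, -1)           # [batch, beam * vocab]
    # Indices of the highest-scoring candidates, best first.
    top_idx = np.argsort(-flat, axis=1, kind="stable")[:, :beam_size]
    scores = np.take_along_axis(flat, top_idx, axis=1)
    parent_ids = top_idx // vocab_size             # beam each candidate extends
    predicted_ids = top_idx % vocab_size           # token appended to that beam
    return scores, predicted_ids, parent_ids

log_probs = np.log(np.array([[[0.1, 0.6, 0.3],
                              [0.5, 0.2, 0.3]]]))  # batch=1, beam=2, vocab=3
beam_scores = np.array([[0.0, -0.5]])
scores, predicted_ids, parent_ids = beam_step(log_probs, beam_scores, beam_size=2)
print(scores, predicted_ids, parent_ids)
```

All three outputs have shape [batch_size, beam_size], matching the fields of beam_search_output described above. Flattening the beam and vocabulary axes before the top-k selection is what lets a single beam contribute more than one surviving candidate.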
-
class TransformerModel(src_vocab_size, trg_vocab_size, max_length, n_layer, n_head, d_model, d_inner_hid, dropout, weight_sharing, bos_id=0, eos_id=1)
Bases: paddle.fluid.dygraph.layers.Layer
Transformer model.
-
class InferTransformerModel(src_vocab_size, trg_vocab_size, max_length, n_layer, n_head, d_model, d_inner_hid, dropout, weight_sharing, bos_id=0, eos_id=1, beam_size=4, max_out_len=256)
Bases: paddlenlp.transformers.transformer.modeling.TransformerModel