modeling

position_encoding_init(n_position, d_pos_vec, dtype='float32')

Generate the initial values for the sinusoid position encoding table.
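As an illustration of what a sinusoid position encoding table looks like, here is a minimal standard-library sketch. The function name `sinusoid_table`, the layout (sines in the first half of each row, cosines in the second half, one zero column of padding for odd widths) and the 1 to 10000 wavelength range are assumptions taken from the common Transformer formulation, not a copy of this module's code.

```python
import math


def sinusoid_table(n_position, d_pos_vec):
    """Return an n_position x d_pos_vec table of sinusoid position encodings.

    Assumed layout: [sin(...), ..., cos(...), ...] per row, padded with one
    zero column when d_pos_vec is odd.
    """
    num_timescales = d_pos_vec // 2
    # Wavelengths form a geometric progression from 1 to 10000 (assumed).
    log_increment = math.log(10000.0) / max(num_timescales - 1, 1)
    inv_timescales = [math.exp(-i * log_increment) for i in range(num_timescales)]

    table = []
    for pos in range(n_position):
        scaled = [pos * inv for inv in inv_timescales]
        row = [math.sin(s) for s in scaled] + [math.cos(s) for s in scaled]
        row += [0.0] * (d_pos_vec % 2)  # pad odd widths with a zero column
        table.append(row)
    return table


table = sinusoid_table(n_position=4, d_pos_vec=6)
```

Row 0 encodes position 0, so its sine half is all zeros and its cosine half is all ones.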

class WordEmbedding(vocab_size, emb_dim, bos_idx=0)

Bases: paddle.fluid.dygraph.layers.Layer

Word Embedding + Scale

forward(word)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
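The "Word Embedding + Scale" description can be pictured with a small standard-library sketch. It assumes the scale factor is the square root of the embedding dimension, as in the common Transformer formulation; the helper `scaled_embedding` and the toy table are hypothetical and not part of this module.

```python
import math


def scaled_embedding(table, token_ids):
    """Look up each token id in `table` and scale the vector by sqrt(emb_dim).

    The sqrt(emb_dim) factor is an assumption, not confirmed by this page.
    """
    emb_dim = len(table[0])
    scale = math.sqrt(emb_dim)
    return [[scale * value for value in table[tok]] for tok in token_ids]


# Toy embedding table: 3 tokens, emb_dim = 4.
toy_table = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.5, 0.5, 0.5],
]
embedded = scaled_embedding(toy_table, [1, 2])
```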

class PositionalEmbedding(emb_dim, max_length, bos_idx=0)

Bases: paddle.fluid.dygraph.layers.Layer

Positional Embedding

forward(pos)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class CrossEntropyCriterion(label_smooth_eps, pad_idx=0)

Bases: paddle.fluid.dygraph.layers.Layer

forward(predict, label)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
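The constructor arguments `label_smooth_eps` and `pad_idx` suggest a label-smoothed cross entropy that ignores padding tokens. The following standard-library sketch shows one way such a criterion could work. The function name, the uniform smoothing scheme and the `(sum_cost, avg_cost, token_num)` return value are assumptions made for illustration, not the documented behaviour of `CrossEntropyCriterion`.

```python
import math


def smoothed_cross_entropy(logits, labels, label_smooth_eps, pad_idx=0):
    """Return (sum_cost, avg_cost, token_num) over non-padding tokens.

    logits: list of per-token score lists, one score per vocabulary entry.
    labels: list of target token ids, one per token.
    """
    sum_cost, token_num = 0.0, 0
    for row, label in zip(logits, labels):
        if label == pad_idx:
            continue  # padding positions contribute nothing to the loss
        vocab = len(row)
        # Numerically stable log-softmax.
        row_max = max(row)
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        log_probs = [x - log_z for x in row]
        # Smoothed target: eps / vocab everywhere, plus (1 - eps) on the label.
        cost = 0.0
        for k, log_p in enumerate(log_probs):
            target = label_smooth_eps / vocab
            if k == label:
                target += 1.0 - label_smooth_eps
            cost -= target * log_p
        sum_cost += cost
        token_num += 1
    avg_cost = sum_cost / token_num if token_num else 0.0
    return sum_cost, avg_cost, token_num


logits = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
labels = [1, 0, 2]  # the middle token is padding (pad_idx = 0)
sum_cost, avg_cost, token_num = smoothed_cross_entropy(logits, labels, 0.1)
```

With uniform logits, every class has probability 1/3, so each non-padding token costs log(3) regardless of the smoothing value.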

class TransformerDecodeCell(decoder, word_embedding=None, pos_embedding=None, linear=None, dropout=0.1)

Bases: paddle.fluid.dygraph.layers.Layer

forward(inputs, states, static_cache, trg_src_attn_bias, memory)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class TransformerBeamSearchDecoder(cell, start_token, end_token, beam_size, var_dim_in_state)

Bases: paddle.fluid.layers.rnn.BeamSearchDecoder

static tile_beam_merge_with_batch(t, beam_size)

Tile the batch dimension of a tensor. Specifically, this function takes a tensor t shaped [batch_size, s0, s1, ...] composed of minibatch entries t[0], ..., t[batch_size - 1] and tiles it to have a shape [batch_size * beam_size, s0, s1, ...] composed of minibatch entries t[0], t[0], ..., t[1], t[1], ... where each minibatch entry is repeated beam_size times.

Parameters

t (Variable) – A tensor with shape [batch_size, ...]. The data type should be float32, float64, int32, int64 or bool.
beam_size (int) – The beam width used in beam search.

Returns

A tensor with shape [batch_size * beam_size, ...], whose data type is the same as t.

Return type

Variable
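The tiling order described above (each minibatch entry repeated beam_size times, consecutively) can be shown with nested Python lists standing in for tensors. The helper name `tile_batch` is hypothetical; the real method operates on framework tensors.

```python
def tile_batch(t, beam_size):
    """Repeat every entry of `t` along the batch dimension beam_size times.

    [t0, t1] with beam_size=2 becomes [t0, t0, t1, t1].
    """
    tiled = []
    for entry in t:
        # Copy each entry so the tiled rows do not alias one another.
        tiled.extend(list(entry) for _ in range(beam_size))
    return tiled


t = [[1, 2], [3, 4]]          # shape [batch_size=2, 2]
tiled = tile_batch(t, 3)      # shape [batch_size * beam_size = 6, 2]
```

Note that the entries are interleaved per batch item, not concatenated as whole copies of the batch; that ordering is what lets a later reshape to [batch_size, beam_size, ...] group the beams of one batch item together.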

step(time, inputs, states, **kwargs)

Perform a beam search decoding step, which uses cell to get probabilities, and follows a beam search step to calculate scores and select candidate token ids.

Parameters

time (Variable) – An int64 tensor with shape [1] provided by the caller, representing the current time step number of decoding.
inputs (Variable) – A tensor variable. It is the same as initial_inputs returned by initialize() for the first decoding step, and as next_inputs returned by step() for subsequent steps.
states (Variable) – A structure of tensor variables. It is the same as initial_states returned by initialize() for the first decoding step, and as beam_search_state returned by step() for subsequent steps.
**kwargs – Additional keyword arguments, provided by the caller.

Returns

A tuple (beam_search_output, beam_search_state, next_inputs, finished). beam_search_state and next_inputs have the same structure, shape and data type as the input arguments states and inputs, respectively. beam_search_output is a namedtuple (with fields scores, predicted_ids and parent_ids) of tensor variables, each shaped [batch_size, beam_size], with data types float32, int64 and int64, respectively. finished is a bool tensor with shape [batch_size, beam_size].

Return type

tuple
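To make the roles of scores, predicted_ids, parent_ids and finished concrete, here is a simplified standard-library sketch of one beam search step for a single batch entry. It is not the decoder's implementation: the function name, the use of accumulated log-probabilities as scores, and the rule that a finished beam may only emit end_token at no extra cost are all assumptions.

```python
def beam_search_step(log_probs, beam_scores, finished, end_token, beam_size):
    """Run one beam search step for a single batch entry.

    log_probs:   [beam_size][vocab_size] next-token log-probabilities.
    beam_scores: [beam_size] accumulated log-probability of each beam.
    finished:    [beam_size] bools, True if the beam already emitted end_token.
    Returns (scores, predicted_ids, parent_ids, next_finished).
    """
    candidates = []
    for beam, (score, done) in enumerate(zip(beam_scores, finished)):
        if done:
            # Assumed rule: a finished beam keeps its score and emits end_token.
            candidates.append((score, beam, end_token))
            continue
        for token, log_p in enumerate(log_probs[beam]):
            candidates.append((score + log_p, beam, token))

    # Keep the beam_size highest-scoring (beam, token) continuations.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = candidates[:beam_size]

    scores = [c[0] for c in top]
    parent_ids = [c[1] for c in top]      # which beam each survivor extends
    predicted_ids = [c[2] for c in top]   # which token it appends
    next_finished = [
        finished[parent] or token == end_token
        for parent, token in zip(parent_ids, predicted_ids)
    ]
    return scores, predicted_ids, parent_ids, next_finished


log_probs = [[-0.1, -2.5, -3.0], [-1.0, -1.5, -0.7]]
scores, predicted_ids, parent_ids, next_finished = beam_search_step(
    log_probs, beam_scores=[0.0, -1.0], finished=[False, False],
    end_token=2, beam_size=2,
)
```

In this toy run the best continuation extends beam 0 with token 0, and the second best extends beam 1 with the end token, so the second surviving beam is marked finished.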

class TransformerModel(src_vocab_size, trg_vocab_size, max_length, n_layer, n_head, d_model, d_inner_hid, dropout, weight_sharing, bos_id=0, eos_id=1)

Bases: paddle.fluid.dygraph.layers.Layer

Transformer model

forward(src_word, trg_word)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments

class InferTransformerModel(src_vocab_size, trg_vocab_size, max_length, n_layer, n_head, d_model, d_inner_hid, dropout, weight_sharing, bos_id=0, eos_id=1, beam_size=4, max_out_len=256)

Bases: paddlenlp.transformers.transformer.modeling.TransformerModel

forward(src_word)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments