senta

class Senta(network, vocab_size, num_classes, emb_dim=128, pad_token_id=0)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, seq_len=None)[source]

Runs the underlying network on the input token ids and returns its output logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor, optional) – Tensor of shape (batch_size,) holding the real length of each input sequence. Defaults to None.
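
Example (a minimal usage sketch; the import path and the 'bow' network name are assumptions, not confirmed by this page):

    import paddle
    from paddlenlp.models import Senta

    # Wrap one of the predefined networks; 'bow' is assumed to be a valid name.
    model = Senta(network='bow', vocab_size=1000, num_classes=2)

    # A fake batch of 4 sequences of 32 token ids, plus their real lengths.
    text = paddle.randint(0, 1000, shape=[4, 32], dtype='int64')
    seq_len = paddle.to_tensor([32, 20, 15, 8], dtype='int64')

    logits = model(text, seq_len)  # shape: [4, num_classes]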

class BoWModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, hidden_size=128, fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

This class implements the Bag of Words Classification Network model to classify texts. At a high level, the model starts by mapping the tokens to word embeddings. Then, we encode these representations with a BoWEncoder. Lastly, we take the output of the encoder to create a final representation, which is passed through some feed-forward layers to output logits (output_layer).

forward(text, seq_len=None)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor, optional) – Tensor of shape (batch_size,) holding the real length of each input sequence. Defaults to None.
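
Example (a minimal sketch; that BoWModel is importable from paddlenlp.models.senta is an assumption based on this module's name):

    import paddle
    from paddlenlp.models.senta import BoWModel

    model = BoWModel(vocab_size=1000, num_classes=2, emb_dim=64, fc_hidden_size=32)

    text = paddle.randint(0, 1000, shape=[8, 50], dtype='int64')  # token ids
    logits = model(text)  # seq_len is optional here; shape: [8, 2]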

class LSTMModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, lstm_hidden_size=198, direction='forward', lstm_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, seq_len)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor) – Tensor of shape (batch_size,) holding the real length of each input sequence.
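
Example (a minimal sketch; import path as above is an assumption):

    import paddle
    from paddlenlp.models.senta import LSTMModel

    model = LSTMModel(vocab_size=1000, num_classes=2,
                      lstm_layers=2, dropout_rate=0.2)

    text = paddle.randint(0, 1000, shape=[4, 40], dtype='int64')
    seq_len = paddle.to_tensor([40, 33, 21, 10], dtype='int64')
    logits = model(text, seq_len)  # shape: [4, 2]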

class GRUModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, gru_hidden_size=198, direction='forward', gru_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, seq_len)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor) – Tensor of shape (batch_size,) holding the real length of each input sequence.
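
Example (same assumptions as the sketches above; only the class differs):

    import paddle
    from paddlenlp.models.senta import GRUModel

    model = GRUModel(vocab_size=1000, num_classes=2)

    text = paddle.randint(0, 1000, shape=[4, 40], dtype='int64')
    seq_len = paddle.to_tensor([40, 33, 21, 10], dtype='int64')
    logits = model(text, seq_len)  # shape: [4, 2]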

class RNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, rnn_hidden_size=198, direction='forward', rnn_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, seq_len)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor) – Tensor of shape (batch_size,) holding the real length of each input sequence.
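
Example (same assumptions as the sketches above):

    import paddle
    from paddlenlp.models.senta import RNNModel

    model = RNNModel(vocab_size=1000, num_classes=2)

    text = paddle.randint(0, 1000, shape=[4, 40], dtype='int64')
    seq_len = paddle.to_tensor([40, 33, 21, 10], dtype='int64')
    logits = model(text, seq_len)  # shape: [4, 2]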

class BiLSTMAttentionModel(attention_layer, vocab_size, num_classes, emb_dim=128, lstm_hidden_size=196, fc_hidden_size=96, lstm_layers=1, dropout_rate=0.0, padding_idx=0)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, seq_len)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor) – Tensor of shape (batch_size,) holding the real length of each input sequence.

class SelfAttention(hidden_size)[source]

Bases: paddle.fluid.dygraph.layers.Layer

A close implementation of the attention network from the ACL 2016 paper, Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification (Zhou et al., 2016). ref: https://www.aclweb.org/anthology/P16-2034/

Parameters
  • hidden_size (int) – The number of expected features in the input x.

forward(input, mask=None)[source]
Parameters
  • input (paddle.Tensor) of shape (batch, seq_len, input_size) – Tensor containing the features of the input sequence.

  • mask (paddle.Tensor) of shape (batch, seq_len) – A bool tensor whose elements indicate whether the corresponding input token is a pad token or not. Defaults to None.
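
Example combining BiLSTMAttentionModel with SelfAttention (a sketch; the import path and the choice of hidden_size == lstm_hidden_size for SelfAttention are assumptions, not confirmed by this page):

    import paddle
    from paddlenlp.models.senta import BiLSTMAttentionModel, SelfAttention

    lstm_hidden_size = 196
    attn = SelfAttention(hidden_size=lstm_hidden_size)  # assumed pairing
    model = BiLSTMAttentionModel(attention_layer=attn,
                                 vocab_size=1000,
                                 num_classes=2,
                                 lstm_hidden_size=lstm_hidden_size)

    text = paddle.randint(0, 1000, shape=[4, 30], dtype='int64')
    seq_len = paddle.to_tensor([30, 25, 12, 7], dtype='int64')
    logits = model(text, seq_len)  # shape: [4, 2]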

class SelfInteractiveAttention(hidden_size)[source]

Bases: paddle.fluid.dygraph.layers.Layer

A close implementation of the attention network from the NAACL 2016 paper, Hierarchical Attention Networks for Document Classification (Yang et al., 2016). ref: https://www.cs.cmu.edu/~./hovy/papers/16HLT-hierarchical-attention-networks.pdf

Parameters
  • hidden_size (int) – The number of expected features in the input x.

forward(input, mask=None)[source]
Parameters
  • input (paddle.Tensor) of shape (batch, seq_len, input_size) – Tensor containing the features of the input sequence.

  • mask (paddle.Tensor) of shape (batch, seq_len) – A bool tensor whose elements indicate whether the corresponding input token is a pad token or not. Defaults to None.
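
Example of calling the attention layer directly on a fake encoder output, with shapes following the parameter descriptions above (a sketch; the import path is an assumption and the exact return form is not documented on this page):

    import paddle
    from paddlenlp.models.senta import SelfInteractiveAttention

    batch_size, max_len, input_size = 4, 30, 256
    attn = SelfInteractiveAttention(hidden_size=input_size)

    # Fake encoder output of shape (batch, seq_len, input_size).
    encoder_out = paddle.randn([batch_size, max_len, input_size])

    # Bool mask: True where a position holds a real token, False for padding.
    lengths = paddle.to_tensor([30, 22, 15, 5])
    mask = paddle.arange(max_len).unsqueeze(0) < lengths.unsqueeze(1)

    out = attn(encoder_out, mask)  # attended representation (return form assumed)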

class CNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, num_filter=128, ngram_filter_sizes=(3,), fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

This class implements the Convolution Neural Network model. At a high level, the model starts by mapping the tokens to word embeddings. Then, we encode these representations with a CNNEncoder. The CNN has one convolution layer for each ngram filter size. Each convolution operation gives out a vector of size num_filter. The number of times a convolution layer will be applied is num_tokens - ngram_size + 1. The corresponding max-pooling layer aggregates all these outputs from the convolution layer and outputs the max. Lastly, we take the output of the encoder to create a final representation, which is passed through some feed-forward layers to output logits (output_layer).

forward(text, seq_len=None)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor, optional) – Tensor of shape (batch_size,) holding the real length of each input sequence. Defaults to None.
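
Example (a minimal sketch of the CNN classifier; the import path is an assumption):

    import paddle
    from paddlenlp.models.senta import CNNModel

    model = CNNModel(vocab_size=1000, num_classes=2,
                     num_filter=64, ngram_filter_sizes=(3,))

    text = paddle.randint(0, 1000, shape=[8, 60], dtype='int64')
    logits = model(text)  # shape: [8, 2]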

class TextCNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, num_filter=128, ngram_filter_sizes=(1, 2, 3), fc_hidden_size=96)[source]

Bases: paddle.fluid.dygraph.layers.Layer

This class implements the Text Convolution Neural Network model. At a high level, the model starts by mapping the tokens to word embeddings. Then, we encode these representations with a CNNEncoder. The CNN has one convolution layer for each ngram filter size. Each convolution operation gives out a vector of size num_filter. The number of times a convolution layer will be applied is num_tokens - ngram_size + 1. The corresponding max-pooling layer aggregates all these outputs from the convolution layer and outputs the max. Lastly, we take the output of the encoder to create a final representation, which is passed through some feed-forward layers to output logits (output_layer).

forward(text, seq_len=None)[source]

Runs the model on a batch of token ids and returns the classification logits.

Parameters
  • text (paddle.Tensor) – Tensor of shape (batch_size, num_tokens) holding the input token ids.

  • seq_len (paddle.Tensor, optional) – Tensor of shape (batch_size,) holding the real length of each input sequence. Defaults to None.
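
Example (TextCNN differs from CNNModel mainly in its default ngram_filter_sizes=(1, 2, 3); same assumptions as the previous sketch):

    import paddle
    from paddlenlp.models.senta import TextCNNModel

    model = TextCNNModel(vocab_size=1000, num_classes=2,
                         ngram_filter_sizes=(1, 2, 3))

    text = paddle.randint(0, 1000, shape=[8, 60], dtype='int64')
    logits = model(text)  # shape: [8, 2]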