senta
- class Senta(network, vocab_size, num_classes, emb_dim=128, pad_token_id=0)
  Bases: paddle.fluid.dygraph.layers.Layer
- class BoWModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, hidden_size=128, fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
  This class implements the Bag of Words Classification Network model to classify texts. At a high level, the model embeds the tokens through a word-embedding layer, then encodes these representations with a BoWEncoder. Lastly, the output of the encoder is passed through some feed-forward layers (output_layer) to produce the logits.
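  The pipeline above can be sketched in plain NumPy (an illustrative sketch, not the PaddleNLP implementation; the weight names are hypothetical):

  ```python
  import numpy as np

  # Minimal sketch of the BoW classification pipeline: embed token ids,
  # sum the embeddings into one "bag of words" vector (the BoWEncoder step),
  # then apply a small feed-forward output layer to get logits.
  rng = np.random.default_rng(0)

  vocab_size, emb_dim, hidden_size, num_classes = 100, 8, 6, 2
  embedding = rng.normal(size=(vocab_size, emb_dim))   # word-embedding table
  W_hidden = rng.normal(size=(emb_dim, hidden_size))   # feed-forward layer
  W_out = rng.normal(size=(hidden_size, num_classes))  # output_layer

  token_ids = np.array([[5, 17, 42, 0]])               # (batch=1, seq_len=4)
  embedded = embedding[token_ids]                      # (1, 4, emb_dim)
  summed = embedded.sum(axis=1)                        # BoW encoding: (1, emb_dim)
  logits = np.tanh(summed @ W_hidden) @ W_out          # (1, num_classes)
  print(logits.shape)
  ```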
- class LSTMModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, lstm_hidden_size=198, direction='forward', lstm_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
- class GRUModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, gru_hidden_size=198, direction='forward', gru_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
- class RNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, rnn_hidden_size=198, direction='forward', rnn_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
- class BiLSTMAttentionModel(attention_layer, vocab_size, num_classes, emb_dim=128, lstm_hidden_size=196, fc_hidden_size=96, lstm_layers=1, dropout_rate=0.0, padding_idx=0)
  Bases: paddle.fluid.dygraph.layers.Layer
- class SelfAttention(hidden_size)
  Bases: paddle.fluid.dygraph.layers.Layer
  A close implementation of the attention network from the ACL 2016 paper, Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification (Zhou et al., 2016). ref: https://www.aclweb.org/anthology/P16-2034/
  Parameters: hidden_size (int) – The number of expected features in the input x.
  - forward(input, mask=None)
    Parameters:
    - input (paddle.Tensor) of shape (batch, seq_len, input_size) – Tensor containing the features of the input sequence.
    - mask (paddle.Tensor) of shape (batch, seq_len) – A bool tensor in which each element indicates whether the corresponding input token is the pad token. Defaults to None.
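  The masked attention pooling can be sketched in NumPy (an illustrative sketch, not PaddleNLP's internals; the attention vector w is hypothetical, and it assumes True in the mask marks a real, non-pad token):

  ```python
  import numpy as np

  # Score each timestep with a learned vector, zero out pad positions by
  # sending their scores to -inf before the softmax, then pool the encoder
  # outputs with the resulting weights.
  rng = np.random.default_rng(1)

  batch, seq_len, hidden = 1, 4, 6
  x = rng.normal(size=(batch, seq_len, hidden))   # encoder outputs
  w = rng.normal(size=(hidden,))                  # learned attention vector

  scores = np.tanh(x) @ w                         # (batch, seq_len)
  mask = np.array([[True, True, True, False]])    # last position is padding
  scores = np.where(mask, scores, -np.inf)        # pad gets zero weight
  weights = np.exp(scores - scores.max(axis=1, keepdims=True))
  weights /= weights.sum(axis=1, keepdims=True)   # softmax over seq_len
  rep = (weights[..., None] * x).sum(axis=1)      # pooled: (batch, hidden)
  print(rep.shape)
  ```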
- class SelfInteractiveAttention(hidden_size)
  Bases: paddle.fluid.dygraph.layers.Layer
  A close implementation of the attention network from the NAACL 2016 paper, Hierarchical Attention Networks for Document Classification (Yang et al., 2016). ref: https://www.cs.cmu.edu/~./hovy/papers/16HLT-hierarchical-attention-networks.pdf
  Parameters: hidden_size (int) – The number of expected features in the input x.
  - forward(input, mask=None)
    Parameters:
    - input (paddle.Tensor) of shape (batch, seq_len, input_size) – Tensor containing the features of the input sequence.
    - mask (paddle.Tensor) of shape (batch, seq_len) – A bool tensor in which each element indicates whether the corresponding input token is the pad token. Defaults to None.
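  The distinguishing piece of this attention, per Yang et al. (2016), is the one-layer projection scored against a learned context vector. A NumPy sketch (illustrative only; W, b, and u_w are hypothetical names, not PaddleNLP's):

  ```python
  import numpy as np

  # Project each hidden state through a one-layer MLP, score it against a
  # learned context vector u_w, and pool the states with the softmax weights.
  rng = np.random.default_rng(2)

  batch, seq_len, hidden = 1, 5, 6
  h = rng.normal(size=(batch, seq_len, hidden))   # encoder hidden states
  W = rng.normal(size=(hidden, hidden))
  b = rng.normal(size=(hidden,))
  u_w = rng.normal(size=(hidden,))                # learned context vector

  u = np.tanh(h @ W + b)                          # (batch, seq_len, hidden)
  scores = u @ u_w                                # (batch, seq_len)
  weights = np.exp(scores - scores.max(axis=1, keepdims=True))
  weights /= weights.sum(axis=1, keepdims=True)   # softmax over seq_len
  doc_vec = (weights[..., None] * h).sum(axis=1)  # pooled: (batch, hidden)
  print(doc_vec.shape)
  ```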
- class CNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, num_filter=128, ngram_filter_sizes=(3,), fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
This class implements the Convolution Neural Network model. At a high level, the model embeds the tokens through a word-embedding layer, then encodes these representations with a CNNEncoder. The CNN has one convolution layer for each ngram filter size; each convolution operation produces a vector of size num_filter. The number of times a convolution layer is applied is num_tokens - ngram_size + 1. The corresponding max-pooling layer aggregates all these outputs from the convolution layer and keeps the maximum. Lastly, the output of the encoder is passed through some feed-forward layers (output_layer) to produce the logits.
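  The convolution arithmetic above can be checked with a small NumPy sketch (illustrative, not the PaddleNLP implementation): an ngram filter of size k slid over num_tokens embeddings fires at num_tokens - k + 1 positions, and max-pooling reduces each filter's outputs to a single value.

  ```python
  import numpy as np

  # One filter of shape (ngram_size, emb_dim) per output channel; each
  # application is an elementwise product + sum over an ngram window.
  rng = np.random.default_rng(3)

  num_tokens, emb_dim, num_filter, ngram_size = 10, 8, 4, 3
  embedded = rng.normal(size=(num_tokens, emb_dim))
  filters = rng.normal(size=(num_filter, ngram_size, emb_dim))

  positions = num_tokens - ngram_size + 1   # number of conv applications: 8
  conv_out = np.stack([
      [(embedded[t:t + ngram_size] * f).sum() for t in range(positions)]
      for f in filters
  ])                                        # (num_filter, positions)
  pooled = conv_out.max(axis=1)             # max-pooling: (num_filter,)
  print(conv_out.shape, pooled.shape)
  ```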
- class TextCNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, num_filter=128, ngram_filter_sizes=(1, 2, 3), fc_hidden_size=96)
  Bases: paddle.fluid.dygraph.layers.Layer
This class implements the Text Convolution Neural Network model. At a high level, the model embeds the tokens through a word-embedding layer, then encodes these representations with a CNNEncoder. The CNN has one convolution layer for each ngram filter size; each convolution operation produces a vector of size num_filter. The number of times a convolution layer is applied is num_tokens - ngram_size + 1. The corresponding max-pooling layer aggregates all these outputs from the convolution layer and keeps the maximum. Lastly, the output of the encoder is passed through some feed-forward layers (output_layer) to produce the logits.
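  With several ngram filter sizes, as in the default ngram_filter_sizes=(1, 2, 3), each size contributes num_filter max-pooled features and the concatenation feeds the final layers. A NumPy sketch (illustrative, not the PaddleNLP implementation):

  ```python
  import numpy as np

  # Run one bank of num_filter filters per ngram size, max-pool each bank,
  # and concatenate: the encoding has num_filter * len(sizes) features.
  rng = np.random.default_rng(4)

  num_tokens, emb_dim, num_filter = 12, 8, 5
  ngram_filter_sizes = (1, 2, 3)
  embedded = rng.normal(size=(num_tokens, emb_dim))

  pooled_parts = []
  for k in ngram_filter_sizes:
      filters = rng.normal(size=(num_filter, k, emb_dim))
      conv = np.stack([
          [(embedded[t:t + k] * f).sum() for t in range(num_tokens - k + 1)]
          for f in filters
      ])                                   # (num_filter, num_tokens - k + 1)
      pooled_parts.append(conv.max(axis=1))
  encoded = np.concatenate(pooled_parts)   # (num_filter * len(sizes),)
  print(encoded.shape)
  ```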