simnet

class SimNet(network, vocab_size, num_classes, emb_dim=128, pad_token_id=0)

    Bases: paddle.fluid.dygraph.layers.Layer
class BoWModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, fc_hidden_size=128)

    Bases: paddle.fluid.dygraph.layers.Layer

    This class implements a Bag of Words classification network to classify texts. At a high level, the model first maps the tokens into vectors with a word embedding, then encodes these representations with a BoWEncoder. Lastly, the output of the encoder is passed through feed-forward layers (output_layer) to produce logits.

    Parameters:
        vocab_size (int): The vocabulary size.
        num_classes (int): The number of target classes.
        emb_dim (int, optional, defaults to 128): The embedding dimension.
        padding_idx (int, optional, defaults to 0): The pad token index.
        fc_hidden_size (int, optional, defaults to 128): The hidden size of the fully-connected layer.
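The forward pass described above (embed, sum-pool with the BoWEncoder, then feed-forward layers) can be sketched as follows. This is a minimal NumPy illustration, not the PaddleNLP implementation; the random weights and the tanh activation are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, emb_dim, fc_hidden_size, num_classes = 100, 128, 128, 2
embedding = rng.normal(size=(vocab_size, emb_dim))      # word embedding table
W_fc = rng.normal(size=(emb_dim, fc_hidden_size))       # feed-forward layer (assumed weights)
W_out = rng.normal(size=(fc_hidden_size, num_classes))  # output_layer

token_ids = np.array([4, 17, 23, 0, 0])    # a padded token sequence

# Embed the tokens, ...
embedded = embedding[token_ids]            # (num_tokens, emb_dim)
# ... sum them into one bag-of-words vector (the BoWEncoder step), ...
encoded = embedded.sum(axis=0)             # (emb_dim,)
# ... and pass the encoding through the feed-forward layers to get logits.
hidden = np.tanh(encoded @ W_fc)           # (fc_hidden_size,)
logits = hidden @ W_out                    # (num_classes,)
print(logits.shape)                        # (2,)
```

Note that sum pooling makes the encoding order-invariant: shuffling `token_ids` yields the same logits, which is the defining trade-off of a bag-of-words model.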
class LSTMModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, lstm_hidden_size=128, direction='forward', lstm_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=128)

    Bases: paddle.fluid.dygraph.layers.Layer
class GRUModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, gru_hidden_size=128, direction='forward', gru_layers=1, dropout_rate=0.0, pooling_type=None, fc_hidden_size=96)

    Bases: paddle.fluid.dygraph.layers.Layer
class CNNModel(vocab_size, num_classes, emb_dim=128, padding_idx=0, num_filter=256, ngram_filter_sizes=(3,), fc_hidden_size=128)

    Bases: paddle.fluid.dygraph.layers.Layer

    This class implements a Convolutional Neural Network model. At a high level, the model first maps the tokens into vectors with a word embedding, then encodes these representations with a CNNEncoder. The CNN has one convolution layer for each ngram filter size. Each convolution operation gives out a vector of size num_filter, and each convolution layer is applied num_tokens - ngram_size + 1 times. The corresponding max-pooling layer aggregates all these outputs from the convolution layer and outputs the max. Lastly, the output of the encoder is passed through feed-forward layers (output_layer) to produce logits.

    Parameters:
        vocab_size (int): The vocabulary size.
        num_classes (int): The number of target classes.
        emb_dim (int, optional, defaults to 128): The embedding dimension.
        padding_idx (int, optional, defaults to 0): The pad token index.
        num_filter (int, optional, defaults to 256): The output size of each convolution operation.
        ngram_filter_sizes (tuple of int, optional, defaults to (3,)): The ngram sizes of the convolution filters.
        fc_hidden_size (int, optional, defaults to 128): The hidden size of the fully-connected layer.
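The convolution-and-max-pooling arithmetic above can be sketched as follows. This is a minimal NumPy illustration of one ngram filter size, not the PaddleNLP implementation; the random weights and sizes are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

emb_dim, num_filter, ngram_size, num_tokens = 128, 256, 3, 10
embedded = rng.normal(size=(num_tokens, emb_dim))        # embedded token sequence
filters = rng.normal(size=(num_filter, ngram_size, emb_dim))

# The convolution layer is applied num_tokens - ngram_size + 1 times.
num_windows = num_tokens - ngram_size + 1                # 10 - 3 + 1 = 8
conv_out = np.empty((num_windows, num_filter))
for i in range(num_windows):
    window = embedded[i:i + ngram_size]                  # (ngram_size, emb_dim)
    # Each convolution operation gives out a vector of size num_filter.
    conv_out[i] = np.tensordot(filters, window, axes=([1, 2], [0, 1]))

# Max pooling aggregates the outputs over all windows into one vector.
pooled = conv_out.max(axis=0)                            # (num_filter,)
print(num_windows, pooled.shape)                         # 8 (256,)
```

With several ngram filter sizes, one such pooled vector is produced per size and the results are concatenated before the feed-forward layers.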