modeling¶

class GPT2Model(vocab_size, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=16, initializer_range=0.02, pad_token_id=0)[source]¶

Bases: paddlenlp.transformers.gpt2.modeling.GPT2PretrainedModel

The base model of GPT2.
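To make the default hyperparameters above concrete, the plain-Python arithmetic below shows the shapes they imply under the standard transformer layout (equal-sized attention heads, Q/K/V/output projections, two feed-forward matrices). This is an illustrative sketch only; it does not use PaddleNLP.

```python
# Sketch of the tensor shapes implied by GPT2Model's default arguments.
# Illustrative arithmetic only -- no PaddleNLP code is involved.

hidden_size = 768
num_attention_heads = 12
intermediate_size = 3072
max_position_embeddings = 512

# Each attention head works on an equal slice of the hidden dimension.
head_dim = hidden_size // num_attention_heads
print(head_dim)  # 64

# Per-layer weight counts under the standard transformer layout
# (Q/K/V/output projections plus the two feed-forward matrices),
# ignoring biases and layer norms.
attn_params = 4 * hidden_size * hidden_size
ffn_params = 2 * hidden_size * intermediate_size
print(attn_params, ffn_params)  # 2359296 4718592

# max_position_embeddings caps the sequence length the model can encode.
print(max_position_embeddings)  # 512
```

Note that hidden_size must be divisible by num_attention_heads for this slicing to work, which the defaults (768 and 12) satisfy.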
class GPT2PretrainedModel(name_scope=None, dtype='float32')[source]¶

Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained GPT2 models. It provides the GPT2-related model_config_file, resource_files_names, pretrained_resource_files_map, pretrained_init_configuration, and base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.

base_model_class¶
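The attributes listed above act as a registry that maps pretrained model names to their init configuration and downloadable weight files. As a minimal plain-Python sketch of that lookup pattern (all names, values, and the URL below are hypothetical placeholders, not PaddleNLP's real data):

```python
# Sketch of a pretrained-model registry like the one GPT2PretrainedModel
# provides. Every name, config value, and URL here is a hypothetical
# placeholder, not PaddleNLP's actual configuration.

pretrained_init_configuration = {
    "gpt2-small": {"hidden_size": 768, "num_hidden_layers": 12},
}
pretrained_resource_files_map = {
    "model_state": {"gpt2-small": "https://example.com/gpt2-small.pdparams"},
}

def resolve(name):
    """Look up the init kwargs and weight URL for a registered model name."""
    if name not in pretrained_init_configuration:
        raise KeyError(f"unknown pretrained model: {name}")
    config = pretrained_init_configuration[name]
    weights_url = pretrained_resource_files_map["model_state"][name]
    return config, weights_url

config, url = resolve("gpt2-small")
print(config["hidden_size"])  # 768
```

Centralizing this mapping in the pretrained base class is what lets every GPT2 subclass share the same download-and-load machinery.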
class
GPT2ForPretraining
(gpt2)[source]¶ Bases:
paddlenlp.transformers.gpt2.modeling.GPT2PretrainedModel
The pretraining model of GPT2.
It returns some logits and cached_kvs.
-
forward
(input_ids, position_ids=None, attention_mask=None, masked_positions=None, use_cache=False, cache=None)[source]¶ Defines the computation performed at every call. Should be overridden by all subclasses.
- Parameters
*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
-
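The use_cache and cache arguments support incremental decoding: past key/value state is kept so each call only needs to process the newest tokens. Independent of PaddleNLP, the following plain-Python sketch mimics that control flow (a real cache holds per-layer key/value tensors, not token ids, and the logits line is a stand-in for the actual computation):

```python
# Plain-Python sketch of the KV-cache idea behind use_cache/cache.
# Control flow only: a real cache stores key/value tensors per attention
# layer, and "logits" below is a stand-in for the model's real output.

def forward_step(input_ids, cache=None, use_cache=False):
    """Process only the tokens not already covered by the cache."""
    past = cache if cache is not None else []
    new_tokens = input_ids[len(past):]   # with a cache, just the new suffix
    full_context = past + new_tokens     # attention still sees everything
    logits = len(full_context)           # stand-in for the real computation
    if use_cache:
        return logits, full_context      # hand back the updated cache
    return logits, None

# First call: no cache, so the whole prompt is processed.
logits, cache = forward_step([1, 2, 3], use_cache=True)
print(logits, cache)  # 3 [1, 2, 3]

# Next call: only the newly appended token needs "computing".
logits, cache = forward_step([1, 2, 3, 4], cache=cache, use_cache=True)
print(cache)  # [1, 2, 3, 4]
```

With use_cache=False the second return value is None, matching the pattern of returning cached state only when the caller asks for it.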