bleu

class BLEU(trans_func=None, vocab=None, n_size=4, weights=None, name='bleu')

Bases: paddle.metric.metrics.Metric

BLEU (bilingual evaluation understudy) is an algorithm for evaluating the quality of text which has been machine-translated from one natural language to another. This metric uses a modified form of precision to compare a candidate translation against multiple reference translations.
BLEU can be used either as a paddle.metric.Metric class or as an ordinary class. When BLEU is used as a paddle.metric.Metric class, a function is needed that transforms the network output into candidate strings and the labels into reference string lists. By default, default_trans_func is provided, which obtains the target sequence ids by taking the maximum-probability id at each step. In this case, the user must provide vocab. Note that the BLEU computed here differs from the BLEU calculated at prediction time; it is intended only for observation during training and evaluation.

\[
\begin{aligned}
BP &= \begin{cases} 1, & \text{if } c > r \\ e^{1 - r/c}, & \text{if } c \leq r \end{cases} \\
BLEU &= BP \exp\left(\sum_{n=1}^{N} w_{n} \log p_{n}\right)
\end{aligned}
\]

where c is the length of the candidate sentence and r is the length of the reference sentence; p_n is the modified n-gram precision, w_n is its weight, and N is the largest n-gram order (n_size).

Parameters
trans_func (callable, optional) – Function that transforms the network output into the strings used to compute the score. Default: None.
vocab (dict|paddlenlp.data.vocab, optional) – Vocab for the target language. If trans_func is None and BLEU is used as a paddle.metric.Metric instance, default_trans_func is applied and vocab must be provided. Default: None.
n_size (int, optional) – Maximum n-gram order used by the BLEU metric. Default: 4.
weights (list, optional) – The weights of the precision for each n-gram order. Default: None.
name (str, optional) – Name of the paddle.metric.Metric instance. Default: "bleu".
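The brevity penalty and the weighted log-precision sum defined above can be sketched in plain Python. This is a minimal illustration for a single candidate, not PaddleNLP's implementation: the helper names `ngram_counts` and `sketch_bleu` are hypothetical, uniform weights are assumed when `weights` is None, r is assumed to be the reference length closest to c, and no smoothing is applied.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sketch_bleu(cand, ref_list, n_size=4, weights=None):
    """Illustrative BLEU for one candidate (hypothetical helper)."""
    if weights is None:
        weights = [1.0 / n_size] * n_size  # assumption: uniform weights
    log_sum = 0.0
    for n in range(1, n_size + 1):
        cand_counts = ngram_counts(cand, n)
        # Modified precision: clip each candidate n-gram count by the
        # largest count of that n-gram in any single reference.
        max_ref = Counter()
        for ref in ref_list:
            for gram, count in ngram_counts(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if matched == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_sum += weights[n - 1] * math.log(matched / total)
    c = len(cand)
    # Assumption: r is the reference length closest to c (ties -> shorter).
    r = min((len(ref) for ref in ref_list), key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1.0 - r / c)
    return bp * math.exp(log_sum)


cand = ["The", "cat", "The", "cat", "on", "the", "mat"]
ref_list = [["The", "cat", "is", "on", "the", "mat"],
            ["There", "is", "a", "cat", "on", "the", "mat"]]
print(sketch_bleu(cand, ref_list))  # approximately 0.4671
```

For these inputs the modified precisions are 5/7, 4/6, 2/5 and 1/4 and the brevity penalty is 1, so the score is the fourth root of 1/21. The library may pick r or handle zero counts differently, so results on other inputs are not guaranteed to agree.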
Examples

1. Using as a general evaluation object.

.. code-block:: python

    from paddlenlp.metrics import BLEU

    bleu = BLEU()
    cand = ["The", "cat", "The", "cat", "on", "the", "mat"]
    ref_list = [["The", "cat", "is", "on", "the", "mat"],
                ["There", "is", "a", "cat", "on", "the", "mat"]]
    bleu.add_inst(cand, ref_list)
    print(bleu.score())  # 0.4671379777282001
2. Using as an instance of paddle.metric.Metric.

.. code-block:: python

    # You could add the code below to the Seq2Seq example in this repo to
    # use BLEU as a `paddle.metric.Metric` class. If you run the
    # following code alone, you may get an error.
    # log example:
    # Epoch 1/12
    # step 100/507 - loss: 308.7948 - Perplexity: 541.5600 - bleu: 2.2089e-79 - 923ms/step
    # step 200/507 - loss: 264.2914 - Perplexity: 334.5099 - bleu: 0.0093 - 865ms/step
    # step 300/507 - loss: 236.3913 - Perplexity: 213.2553 - bleu: 0.0244 - 849ms/step

    from paddlenlp.data import Vocab
    from paddlenlp.metrics import BLEU

    # src_vocab, model, optimizer, CrossEntropyCriterion and ppl_metric
    # come from the Seq2Seq example and are not defined here.
    bleu_metric = BLEU(vocab=src_vocab.idx_to_token)
    model.prepare(optimizer, CrossEntropyCriterion(), [ppl_metric, bleu_metric])
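The description of default_trans_func says it obtains target ids by taking the maximum probability at each step and then needs a vocab to turn ids into tokens. The sketch below illustrates only that idea in plain Python. It is not the library's default_trans_func: the helper names `greedy_ids` and `ids_to_tokens`, the toy vocabulary, and the use of a plain `length` argument in place of seq_mask are all assumptions made for illustration.

```python
def greedy_ids(step_probs):
    """Pick the highest-probability id at every decoding step."""
    return [max(range(len(probs)), key=probs.__getitem__) for probs in step_probs]


def ids_to_tokens(ids, idx_to_token, length=None):
    """Map ids to tokens, keeping only the first `length` steps if given."""
    if length is not None:
        ids = ids[:length]
    return [idx_to_token[i] for i in ids]


# Hypothetical toy vocabulary and per-step probabilities, for illustration only.
idx_to_token = {0: "<pad>", 1: "the", 2: "cat", 3: "mat"}
step_probs = [
    [0.1, 0.7, 0.1, 0.1],    # step 1 -> id 1, "the"
    [0.1, 0.2, 0.6, 0.1],    # step 2 -> id 2, "cat"
    [0.9, 0.0, 0.05, 0.05],  # step 3 -> id 0, padding, dropped by length=2
]
cand = ids_to_tokens(greedy_ids(step_probs), idx_to_token, length=2)
print(cand)  # ['the', 'cat']
```

A token list produced this way has the same shape as the `cand` argument passed to `add_inst` in the first example.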
update(output, label, seq_mask=None)

Update states for metric.

Inputs of update are the outputs of Metric.compute. If compute is not defined, the inputs of update will be the flattened outputs of the model and the labels from data: update(output1, output2, ..., label1, label2, ...).

See Metric.compute.
class BLEUForDuReader(n_size=4, alpha=1.0, beta=1.0)

Bases: paddlenlp.metrics.bleu.BLEU

BLEU metric with bonus for DuReader contest.

Please refer to `DuReader Homepage <https://ai.baidu.com//broad/subordinate?dataset=dureader>`_ for more details.