yahoo_answer_100k
class YahooAnswer100K(lazy=None, name=None, **config)
Bases: paddlenlp.datasets.dataset.DatasetBuilder
The dataset comes from https://arxiv.org/pdf/1702.08139.pdf, which samples 100k documents from the original Yahoo Answer data; the vocabulary size is 200k.
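A minimal sketch of how this builder might be loaded, assuming it is registered with `paddlenlp.datasets.load_dataset` under the module name `yahoo_answer_100k`. The wrapper `load_yahoo_answer_100k` and the constant `DATASET_NAME` are hypothetical names introduced here for illustration, and the split name `"train"` is an assumption not stated above.

```python
# Assumption: the builder is registered under this module name.
DATASET_NAME = "yahoo_answer_100k"


def load_yahoo_answer_100k(splits="train", lazy=False):
    """Hypothetical helper: load one or more splits of the dataset.

    `splits` may be a single split name or a list/tuple of names;
    which names are available is an assumption, not confirmed by this page.
    """
    # Imported inside the function so that defining this helper
    # does not itself require PaddleNLP to be installed.
    from paddlenlp.datasets import load_dataset

    return load_dataset(DATASET_NAME, splits=splits, lazy=lazy)
```

Calling `load_yahoo_answer_100k("train")` requires PaddleNLP to be installed and is expected to download the data on first use. The `lazy` argument is forwarded unchanged and corresponds to the `lazy` parameter in the class signature above.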