SQuAD

The Stanford Question Answering Dataset (Version 1.0: https://arxiv.org/abs/1606.05250, Version 2.0: https://arxiv.org/abs/1806.03822) is a reading comprehension dataset of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage. Version 1.0 had 100k+ question-answer pairs, and Version 2.0 added 50k+ unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. This allows testing whether models know when a question cannot be answered from the passage and should abstain.
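
To make the "answer is a segment of the passage" idea concrete, here is a minimal sketch in Python. The record below is a made-up illustration, not an actual SQuAD entry. Its field names (context, qas, is_impossible, answers, answer_start) are assumed to mirror the layout of the Version 2.0 JSON files, and extract_answers is a hypothetical helper, not part of any official tooling.

```python
# Made-up passage and questions, for illustration only (not real SQuAD data).
context = "SQuAD passages are drawn from Wikipedia articles."
answer_text = "Wikipedia articles"

# Assumed layout: each paragraph has a context and a list of question records.
paragraph = {
    "context": context,
    "qas": [
        {
            "id": "q1",
            "question": "Where are SQuAD passages drawn from?",
            "is_impossible": False,
            # The answer is stored as text plus its character offset in the context.
            "answers": [
                {"text": answer_text, "answer_start": context.index(answer_text)}
            ],
        },
        {
            "id": "q2",
            "question": "Who wrote the passages?",
            # Unanswerable question in the style of Version 2.0: no answer span.
            "is_impossible": True,
            "answers": [],
        },
    ],
}


def extract_answers(paragraph):
    """Map each question id to its answer span, or None if unanswerable.

    Raises ValueError if a stored answer is not the span of the context
    that its character offset points to.
    """
    results = {}
    for qa in paragraph["qas"]:
        if qa.get("is_impossible", False):
            results[qa["id"]] = None
            continue
        answer = qa["answers"][0]
        start = answer["answer_start"]
        end = start + len(answer["text"])
        span = paragraph["context"][start:end]
        if span != answer["text"]:
            raise ValueError(f"answer for {qa['id']} is not a span of the context")
        results[qa["id"]] = span
    return results


answers = extract_answers(paragraph)
print(answers)
```

A model evaluated on data shaped like this would be expected to return a span for q1 and to abstain (here represented as None) for q2.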