Tasks

List of ParlAI tasks defined in the file task_list.py:

QA

AQuA

Tag: #AQuA

Full Path: aqua

Group Tags: #All, #QA

Description: Dataset containing algebraic word problems with rationales for their answers. From Ling et al. ‘17. Link: https://arxiv.org/pdf/1705.04146.pdf

bAbI 1k

Tag: #bAbI-1k

Full Path: babi:All1k

Group Tags: #All, #QA

Description: 20 synthetic tasks that each test a unique aspect of text and reasoning, and hence test different capabilities of learning models. From Weston et al. ‘16. Link: http://arxiv.org/abs/1502.05698

Notes: You can access just one of the bAbI tasks with e.g. ‘babi:Task1k:3’ for task 3.

bAbI 10k

Tag: #bAbI-10k

Full Path: babi:All10k

Group Tags: #All, #QA

Description: 20 synthetic tasks that each test a unique aspect of text and reasoning, and hence test different capabilities of learning models. From Weston et al. ‘16. Link: http://arxiv.org/abs/1502.05698

Notes: You can access just one of the bAbI tasks with e.g. ‘babi:Task10k:3’ for task 3.
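
The single-task strings mentioned in the bAbI notes follow a regular pattern, which can be sketched as a small helper. This is only an illustration: `babi_task_id` is a hypothetical function, not part of ParlAI, and the check that task numbers run from 1 to 20 is an assumption based on the "20 synthetic tasks" in the description.

```python
def babi_task_id(task_number: int, size: str = "1k") -> str:
    """Build a task string such as 'babi:Task1k:3' for one bAbI task.

    Hypothetical helper for illustration; it only formats the string
    and does not check that ParlAI can load the task.
    """
    if size not in ("1k", "10k"):
        raise ValueError("size must be '1k' or '10k'")
    # Assumption: the 20 bAbI tasks are numbered 1 through 20.
    if not 1 <= task_number <= 20:
        raise ValueError("task_number must be between 1 and 20")
    return f"babi:Task{size}:{task_number}"


print(babi_task_id(3))         # babi:Task1k:3
print(babi_task_id(3, "10k"))  # babi:Task10k:3
```

The resulting string is what would be passed as the task, in the same way as the full paths ‘babi:All1k’ and ‘babi:All10k’.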

MCTest

Tag: #MCTest

Full Path: mctest

Group Tags: #All, #QA

Description: Questions about short children’s stories, from Richardson et al. ‘13. Link: https://www.microsoft.com/en-us/research/publication/mctest-challenge-dataset-open-domain-machine-comprehension-text/

Movie Dialog QA

Tag: #MovieDD-QA

Full Path: moviedialog:Task:1

Group Tags: #All, #QA, #MovieDD

Description: Closed-domain QA dataset asking templated questions about movies, answerable from Wikipedia, similar to WikiMovies. From Dodge et al. ‘15. Link: https://arxiv.org/abs/1511.06931

Movie Dialog Recommendations

Tag: #MovieDD-Recs

Full Path: moviedialog:Task:2

Group Tags: #All, #QA, #MovieDD

Description: Questions asking for movie recommendations. From Dodge et al. ‘15. Link: https://arxiv.org/abs/1511.06931

MTurk WikiMovies

Tag: #MTurkWikiMovies

Full Path: mturkwikimovies

Group Tags: #All, #QA

Description: Closed-domain QA dataset asking MTurk-derived questions about movies, answerable from Wikipedia. From Li et al. ‘16. Link: https://arxiv.org/abs/1611.09823

NarrativeQA

Tag: #NarrativeQA

Full Path: narrative_qa

Group Tags: #All, #QA

Description: A dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts. From Kočiský et al. ‘17. Link: https://arxiv.org/abs/1712.07040

Notes: You can access the summaries-only task for NarrativeQA by using the task ‘narrative_qa:summaries’. By default, only stories are provided.
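
As a rough sketch of how the two NarrativeQA variants might be selected, the helper below picks the task string and builds a command line around it. Both function names are hypothetical. The `examples/display_data.py -t` invocation is borrowed from the DBLL bAbI entry on this page, and it is an assumption that it applies unchanged to NarrativeQA.

```python
import shlex


def narrative_qa_task(summaries_only: bool = False) -> str:
    """Return the task string for the stories (default) or summaries variant."""
    return "narrative_qa:summaries" if summaries_only else "narrative_qa"


def display_data_command(task: str) -> str:
    """Build a shell command that passes the given task string with -t.

    Assumption: examples/display_data.py accepts -t for this task,
    as it does in the DBLL bAbI example on this page.
    """
    return shlex.join(["python", "examples/display_data.py", "-t", task])


print(display_data_command(narrative_qa_task()))
# python examples/display_data.py -t narrative_qa
print(display_data_command(narrative_qa_task(summaries_only=True)))
# python examples/display_data.py -t narrative_qa:summaries
```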

Simple Questions

Tag: #SimpleQuestions

Full Path: simplequestions

Group Tags: #All, #QA

Description: Open-domain QA dataset based on Freebase triples from Bordes et al. ‘15. Link: https://arxiv.org/abs/1506.02075

SQuAD

Tag: #SQuAD

Full Path: squad

Group Tags: #All, #QA

Description: Open-domain QA dataset answerable from a given paragraph from Wikipedia, from Rajpurkar et al. ‘16. Link: https://arxiv.org/abs/1606.05250

TriviaQA

Tag: #TriviaQA

Full Path: triviaqa

Group Tags: #All, #QA

Description: Open-domain QA dataset with question-answer-evidence triples, from Joshi et al. ‘17. Link: https://arxiv.org/abs/1705.03551

Web Questions

Tag: #WebQuestions

Full Path: webquestions

Group Tags: #All, #QA

Description: Open-domain QA dataset built from Web queries, from Berant et al. ‘13. Link: http://www.aclweb.org/anthology/D13-1160

WikiMovies

Tag: #WikiMovies

Full Path: wikimovies

Group Tags: #All, #QA

Description: Closed-domain QA dataset asking templated questions about movies, answerable from Wikipedia. From Miller et al. ‘16. Link: https://arxiv.org/abs/1606.03126

WikiQA

Tag: #WikiQA

Full Path: wikiqa

Group Tags: #All, #QA

Description: Open-domain QA dataset built from Wikipedia, from Yang et al. ‘15. Link: https://www.microsoft.com/en-us/research/publication/wikiqa-a-challenge-dataset-for-open-domain-question-answering/

InsuranceQA

Tag: #InsuranceQA

Full Path: insuranceqa

Group Tags: #All, #QA

Description: Task which requires agents to identify high-quality answers composed by professionals with deep domain knowledge. From Feng et al. ‘15. Link: https://arxiv.org/abs/1508.01585

MS_MARCO

Tag: #MS_MARCO

Full Path: ms_marco

Group Tags: #All, #QA

Description: A large-scale machine reading comprehension dataset with questions sampled from real anonymized user queries and contexts from web documents. From Nguyen et al. ‘16. Link: https://arxiv.org/abs/1611.09268

Cloze

BookTest

Tag: #BookTest

Full Path: booktest

Group Tags: #All, #Cloze

Description: Sentence completion given a few sentences as context from a book. A larger version of CBT. From Bajgar et al. ‘16. Link: https://arxiv.org/abs/1610.00956

Children’s Book Test (CBT)

Tag: #CBT

Full Path: cbt

Group Tags: #All, #Cloze

Description: Sentence completion given a few sentences as context from a children’s book. From Hill et al. ‘16. Link: https://arxiv.org/abs/1511.02301

QA CNN

Tag: #QACNN

Full Path: qacnn

Group Tags: #All, #Cloze

Description: Cloze dataset based on a missing (anonymized) entity phrase from a CNN article. From Hermann et al. ‘15. Link: https://arxiv.org/abs/1506.03340

QA Daily Mail

Tag: #QADailyMail

Full Path: qadailymail

Group Tags: #All, #Cloze

Description: Cloze dataset based on a missing (anonymized) entity phrase from a Daily Mail article. From Hermann et al. ‘15. Link: https://arxiv.org/abs/1506.03340

Goal

Dialog Based Language Learning: bAbI Task

Tag: #DBLL-bAbI

Full Path: dbll_babi

Group Tags: #All, #Goal

Description: Short dialogs based on the bAbI tasks, but in the form of a question from a teacher, the answer from the student, and finally a comment on the answer from the teacher. The aim is to find learning models that use the comments to improve. From Weston ‘16. Link: https://arxiv.org/abs/1604.06045. Tasks can be accessed with a format like ‘python examples/display_data.py -t dbll_babi:task:2_p0.5’, which specifies task 2 and a policy with 0.5 of the answers correct. See the paper for more details of the tasks.
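
The ‘dbll_babi:task:2_p0.5’ format packs a task number and a policy value into one string. A minimal sketch of building and splitting such strings is shown below. Both helpers are hypothetical and not part of ParlAI, and which task numbers and policy values are actually available is not checked here (see the paper for that).

```python
from typing import Tuple


def dbll_babi_task_id(task_number: int, policy: float) -> str:
    """Format a string like 'dbll_babi:task:2_p0.5'.

    Assumption: the policy value is written in its shortest decimal
    form (0.5 -> '0.5'), matching the example in the description.
    """
    return f"dbll_babi:task:{task_number}_p{policy:g}"


def parse_dbll_babi_task_id(task: str) -> Tuple[int, float]:
    """Split a string like 'dbll_babi:task:2_p0.5' into (2, 0.5)."""
    name, kind, spec = task.split(":")
    if name != "dbll_babi" or kind != "task":
        raise ValueError(f"not a dbll_babi task string: {task!r}")
    number, policy = spec.split("_p")
    return int(number), float(policy)


task = dbll_babi_task_id(2, 0.5)
print(task)                           # dbll_babi:task:2_p0.5
print(parse_dbll_babi_task_id(task))  # (2, 0.5)
```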

Dialog Based Language Learning: WikiMovies Task

Tag: #DBLL-Movie

Full Path: dbll_movie

Group Tags: #All, #Goal

Description: Short dialogs based on WikiMovies, but in the form of a question from a teacher, the answer from the student, and finally a comment on the answer from the teacher. The aim is to find learning models that use the comments to improve. From Weston ‘16. Link: https://arxiv.org/abs/1604.06045

Dialog bAbI

Tag: #dialog-bAbI

Full Path: dialog_babi

Group Tags: #All, #Goal

Description: Simulated dialogs of restaurant booking, from Bordes et al. ‘16. Link: https://arxiv.org/abs/1605.07683

Dialog bAbI+

Tag: #dialog-bAbI-plus

Full Path: dialog_babi_plus

Group Tags: #All, #Goal

Description: bAbI+ is an extension of the bAbI Task 1 dialogues with everyday incremental dialogue phenomena (hesitations, restarts, and corrections) which model the disfluencies and communication problems in everyday spoken interaction in real-world environments. See https://www.researchgate.net/publication/319128941_Challenging_Neural_Dialogue_Models_with_Natural_Data_Memory_Networks_Fail_on_Incremental_Phenomena, http://aclweb.org/anthology/D17-1235

MutualFriends

Tag: #MutualFriends

Full Path: mutualfriends

Group Tags: #All, #Goal

Description: Task where two agents must discover which friend of theirs is mutual based on the friends’ attributes. From He He et al. ‘17. Link: https://stanfordnlp.github.io/cocoa/

Movie Dialog QA Recommendations

Tag: #MovieDD-QARecs

Full Path: moviedialog:Task:3

Group Tags: #All, #Goal, #MovieDD

Description: Dialogs discussing questions about movies as well as recommendations. From Dodge et al. ‘15. Link: https://arxiv.org/abs/1511.06931

Personalized Dialog Full Set

Tag: #personalized-dialog-full

Full Path: personalized_dialog:AllFull

Group Tags: #All, #Goal, #Personalization

Description: Simulated dataset of restaurant booking focused on personalization based on user profiles. From Joshi et al. ‘17. Link: https://arxiv.org/abs/1706.07503

Personalized Dialog Small Set

Tag: #personalized-dialog-small

Full Path: personalized_dialog:AllSmall

Group Tags: #All, #Goal, #Personalization

Description: Simulated dataset of restaurant booking focused on personalization based on user profiles. From Joshi et al. ‘17. Link: https://arxiv.org/abs/1706.07503

Task N’ Talk

Tag: #TaskNTalk

Full Path: taskntalk

Group Tags: #All, #Goal

Description: Dataset of synthetic shapes described by attributes, for agents to play a cooperative QA game, from Kottur et al. ‘17. Link: https://arxiv.org/abs/1706.08502

SCAN

Tag: #SCAN

Full Path: scan

Group Tags: #Goal, #All

Description: SCAN is a set of simple language-driven navigation tasks for studying compositional learning and zero-shot generalization. The SCAN tasks were inspired by the CommAI environment, which is the origin of the acronym (Simplified versions of the CommAI Navigation tasks). See the paper: https://arxiv.org/abs/1711.00350 or data: https://github.com/brendenlake/SCAN

ChitChat

Cornell Movie

Tag: #CornellMovie

Full Path: cornell_movie

Group Tags: #All, #ChitChat

Description: Fictional conversations extracted from raw movie scripts. From Danescu-Niculescu-Mizil & Lee ‘11. Link: https://arxiv.org/abs/1106.3077

Movie Dialog Reddit

Tag: #MovieDD-Reddit

Full Path: moviedialog:Task:4

Group Tags: #All, #ChitChat, #MovieDD

Description: Dialogs discussing Movies from Reddit (the Movies SubReddit). From Dodge et al. ‘15. Link: https://arxiv.org/abs/1511.06931

Open Subtitles

Tag: #OpenSubtitles

Full Path: opensubtitles

Group Tags: #All, #ChitChat

Description: Dataset of dialogs from movie scripts. Version 2018: http://opus.lingfil.uu.se/OpenSubtitles2018.php, version 2009: http://opus.lingfil.uu.se/OpenSubtitles.php. A variant of the dataset used in Vinyals & Le ‘15, https://arxiv.org/abs/1506.05869.

Ubuntu

Tag: #Ubuntu

Full Path: ubuntu

Group Tags: #All, #ChitChat

Description: Dialogs between an Ubuntu user and an expert trying to fix an issue, from Lowe et al. ‘15. Link: https://arxiv.org/abs/1506.08909

ConvAI2

Tag: #ConvAI2

Full Path: convai2

Group Tags: #All, #ChitChat

Description: A chit-chat dataset based on PersonaChat (https://arxiv.org/abs/1801.07243) for a NIPS 2018 competition. Link: http://convai.io/.

ConvAI_ChitChat

Tag: #ConvAI_ChitChat

Full Path: convai_chitchat

Group Tags: #All, #ChitChat

Description: Human-bot dialogues containing free discussions of randomly chosen paragraphs from SQuAD. Link to dataset: http://convai.io/data/

Persona-Chat

Tag: #Persona-Chat

Full Path: personachat

Group Tags: #ChitChat, #All

Description: A chit-chat dataset where paired Turkers are given assigned personas and chat to try to get to know each other. See the paper: https://arxiv.org/abs/1801.07243

Twitter

Tag: #Twitter

Full Path: twitter

Group Tags: #All, #ChitChat

Description: Twitter data from: https://github.com/Marsan-Ma/chat_corpus/. No train/valid/test split was provided, so 10k for valid and 10k for test were chosen at random.

Negotiation

Deal or No Deal

Tag: #DealNoDeal

Full Path: dealnodeal

Group Tags: #All, #Negotiation

Description: End-to-end negotiation task which requires two agents to agree on how to divide a set of items, with each agent assigning different values to each item. From Lewis et al. ‘17. Link: https://arxiv.org/abs/1706.05125

Visual

FVQA

Tag: #FVQA

Full Path: fvqa

Group Tags: #All, #Visual

Description: FVQA is a VQA dataset which requires, and supports, much deeper reasoning. It extends a conventional visual question answering dataset, which contains image-question-answer triplets, through additional image-question-answer-supporting fact tuples. The supporting fact is represented as a structural triplet, such as <Cat,CapableOf,ClimbingTrees>. Link: https://arxiv.org/abs/1606.05433

VQAv1

Tag: #VQAv1

Full Path: vqa_v1

Group Tags: #All, #Visual

Description: Open-ended question answering about visual content. From Agrawal et al. ‘15. Link: https://arxiv.org/abs/1505.00468

VQAv2

Tag: #VQAv2

Full Path: vqa_v2

Group Tags: #All, #Visual

Description: Bigger, more balanced version of the original VQA dataset. From Goyal et al. ‘16. Link: https://arxiv.org/abs/1612.00837

VisDial

Tag: #VisDial

Full Path: visdial

Group Tags: #All, #Visual

Description: Task which requires agents to hold a meaningful dialog about visual content. From Das et al. ‘16. Link: https://arxiv.org/abs/1611.08669

MNIST_QA

Tag: #MNIST_QA

Full Path: mnist_qa

Group Tags: #All, #Visual

Description: Task which requires agents to identify which number they are seeing. From the MNIST dataset.

CLEVR

Tag: #CLEVR

Full Path: clevr

Group Tags: #All, #Visual

Description: A visual reasoning dataset that tests abilities such as attribute identification, counting, comparison, spatial relationships, and logical operations. From Johnson et al. ‘16. Link: https://arxiv.org/abs/1612.06890

nlvr

Tag: #nlvr

Full Path: nlvr

Group Tags: #All, #Visual

Description: Cornell Natural Language Visual Reasoning (NLVR) is a language grounding dataset based on pairs of natural language statements grounded in synthetic images. From Suhr et al. ‘17. Link: http://lic.nlp.cornell.edu/nlvr/