Advanced Scripts

These are the more obscure and advanced scripts in ParlAI.

build_candidates

Short description: Build the candidate responses for a retrieval model

Build the candidate responses for a retrieval model.

Examples

parlai build_candidates -t convai2 --outfile /tmp/cands.txt

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:evalmode.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-n, --num-examples

Total number of exs to convert, -1 to convert all examples
Default: -1.

-of, --outfile

Output file where to save, by default will be created in /tmp

-ltim, --log-every-n-secs

Default: 2.
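
The precedence described for --init-opt above (options are read from the JSON file first, then overridden by command-line arguments) can be illustrated with a small sketch. This is not ParlAI's implementation: the helper merge_options and the option names in the sample data are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def merge_options(init_opt_path, cli_opts):
    """Load options from a JSON file, then let command-line values win."""
    with open(init_opt_path) as f:
        merged = json.load(f)
    # Later updates overwrite earlier keys, so CLI values take precedence.
    merged.update(cli_opts)
    return merged


# Hypothetical option file, written to a temporary location for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"task": "convai2", "batchsize": 1}, tmp)

opts = merge_options(tmp.name, {"batchsize": 16})
os.remove(tmp.name)
print(opts)
```

Here the file supplies the task, while the batch size given on the (simulated) command line replaces the value from the file.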


build_dict

Short description: Build a dictionary.

Generates a dictionary file from the training data.

Examples

# learn the vocabulary from one task, then train on another task.
parlai build_dict -t convai2 --dict-file premade.dict
parlai train_model -t squad --dict-file premade.dict -m seq2seq

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

--dict-maxexs

Max number of examples to build dict on
Default: -1.

--dict-include-valid

Include validation set in dictionary building for task.

--dict-include-test

Include test set in dictionary building for task.

-ltim, --log-every-n-secs

Default: 10.

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge
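
As a rough illustration of what building a dictionary from training data involves, the sketch below counts whitespace-separated tokens over a list of example strings and stops after a maximum number of examples, in the spirit of --dict-maxexs. The function name build_vocab, the whitespace tokenizer, and the sample data are assumptions; ParlAI's own dictionary code supports other tokenizers and is not reproduced here.

```python
from collections import Counter


def build_vocab(examples, max_exs=-1):
    """Count token frequencies; max_exs=-1 means use every example."""
    counts = Counter()
    for i, text in enumerate(examples):
        if max_exs >= 0 and i >= max_exs:
            break
        # Lowercased whitespace tokenization, chosen only for simplicity.
        counts.update(text.lower().split())
    return counts


# Hypothetical training utterances.
examples = ["Hello there", "hello again", "goodbye"]
vocab = build_vocab(examples, max_exs=2)
print(vocab.most_common())
```

With max_exs=2 the third utterance is never read, so "goodbye" does not appear in the resulting counts.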


convert_to_parlai

Short description: Dump a task to a standardized format

Convert a dataset into the ParlAI text format.

Examples

parlai convert_to_parlai -t babi:task1k:1 --outfile /tmp/dump

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-n, --num-examples

Total number of exs to convert, -1 to convert all examples
Default: -1.

-of, --outfile

Output file where to save, by default will be created in /tmp

-if, --ignore-fields

Ignore these fields from the message (returned with .act() )
Default: id.

-ltim, --log-every-n-secs

Default: 2.
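
The effect of --ignore-fields above can be pictured as removing keys from each message before it is written out. A minimal sketch, assuming messages behave like dictionaries and that several field names are separated by commas; the helper drop_fields and the sample message are hypothetical.

```python
def drop_fields(message, ignore_fields="id"):
    """Return a copy of message without the comma-separated ignored fields."""
    ignored = {name.strip() for name in ignore_fields.split(",") if name.strip()}
    return {key: value for key, value in message.items() if key not in ignored}


# Hypothetical message, shaped like a dictionary of fields.
message = {"id": "babi", "text": "Where is Mary?", "labels": ["kitchen"]}
print(drop_fields(message))
```

The original message is left untouched; only the returned copy omits the ignored fields.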


convo_render

Short description: Render data as HTML

CLI Arguments

Argument

Description

--init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

--input, -i

Input file to read conversations from

--output, -o

Output file to write conversations to. One of [.pdf, .png, .html] only

--width, -wd

Width of output file
Default: 8.

--height, -ht

Height of output file
Default: 10.

--user-icon, -uic

Absolute Path/URL to user image icon
Default: https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/160/apple/76/woman_1f469.png.

--alt-icon, -aic

Absolute Path/URL to alternate image icon
Default: https://emojipedia-us.s3.dualstack.us-west-1.amazonaws.com/thumbs/160/facebook/230/parrot_1f99c.png.

--num-examples, -ne

Number of conversations to render
Default: 10.


data_stats

Short description: Compute data statistics

Count and display statistics of the data.

Examples

parlai data_stats -t convai2 -dt train:ordered

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:ordered.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-n, -ne, --num-examples

Default: -1.

-ltim, --log-every-n-secs

Default: 2.

--agent

Use teacher (agent 0) or model (agent 1)
Choices: 0, 1

--new-line-new-utt

New lines treat substrings as separate utterances.

--ignore-tokens

Ignore tokens containing these substrings (comma-separated)

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge
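
The --datatype choices listed above combine a data fold (train, valid, or test) with optional colon-separated modifiers such as stream, ordered, and evalmode. The sketch below shows one way to take such a string apart; parse_datatype is a hypothetical helper written for illustration, not a ParlAI function.

```python
def parse_datatype(datatype):
    """Split a datatype such as 'train:stream:ordered' into fold and modifiers."""
    fold, *modifiers = datatype.split(":")
    return fold, set(modifiers)


for value in ("train:ordered", "train:stream:ordered", "valid"):
    fold, modifiers = parse_datatype(value)
    print(value, "->", fold, sorted(modifiers))
```

Using a set for the modifiers reflects that the choices list accepts them in either order (for example, train:ordered:stream and train:stream:ordered).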


detect_offensive

Short description: Check task for offensive language

Basic example which iterates through the tasks specified and checks them for offensive language.

Examples

parlai detect_offensive -t "convai_chitchat" --display-examples True

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:ordered.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).
Default: repeat_query.

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-ltim, --log-every-n-secs

Default: 2.

-d, --display-examples

--safety

Type of safety detector to apply to messages
Choices: all, classifier, string_matcher
Default: all.


eval_wordstat

Short description: Compute statistics from model predictions

This helper script can be used on its own with a model file and a task: the output will contain the word statistics of the model outputs. The function defined here can also be used elsewhere to get such statistics for any agent, given the agent object (with its corresponding dict) and a sequence.

It additionally provides the function get_word_stats, which can be used in other parts of runtime code since it depends only on the agent object. For example:

from parlai.scripts.eval_wordstat import get_word_stats
reqs, cnt = get_word_stats(predictions.tolist(), self.dict)

Examples

parlai eval_wordstat -mf data/model -t convai2:self --freq-bins 10,100,1000

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: valid.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

-ne, --num-examples

Default: -1.

-ltim, --log-every-n-secs

Default: 2.

-ed, --external-dict

External dictionary for stat computation

-fb, --freq-bins

Bins boundaries for rare words stat
Default: 0,100,1000,10000.

-dup, --dump-predictions-path

Dump predictions into file

-cun, --compute-unique

Compute % of unique responses from the model
Default: True.

-tblog, --tensorboard-log

Tensorboard logging of metrics, default is False

-tblogdir, --tensorboard-logdir

Tensorboard logging directory, defaults to model_file.tensorboard
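
To make the --freq-bins option above more concrete, the sketch below counts, for each bin boundary, how many tokens in a response have a dictionary frequency at or below that boundary. This is only one plausible reading of "rare words stat": the helper rare_word_counts, the "at or below" rule, and the sample frequencies are assumptions, and the script's actual computation may differ.

```python
def rare_word_counts(tokens, freqs, freq_bins="0,100,1000,10000"):
    """Count, per bin boundary, the tokens whose frequency is at most that boundary."""
    bins = [int(b) for b in freq_bins.split(",")]
    counts = {b: 0 for b in bins}
    for token in tokens:
        freq = freqs.get(token, 0)  # unseen tokens are treated as frequency 0
        for b in bins:
            if freq <= b:
                counts[b] += 1
    return counts


# Hypothetical dictionary frequencies and a tokenized model response.
freqs = {"the": 50000, "parrot": 120, "squawks": 7}
result = rare_word_counts(["the", "parrot", "squawks", "zyx"], freqs)
print(result)
```

Because the bins are cumulative in this sketch, a token counted under a small boundary is also counted under every larger one.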


extract_image_feature

Short description: Load/extract image features

Basic example which iterates through the tasks specified and loads/extracts the image features.

For more options, check parlai.core.image_featurizers

Examples

To extract the image feature of COCO images:

parlai extract_image_feature -t vqa_v1 -im resnet152

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data


interactive_web

Short description: Interactive chat with a model in a web browser

Aliases: iweb

Talk with a model using a web UI.

Examples

parlai interactive_web -mf "zoo:tutorial_transformer_generator/model"

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”
Default: interactive.

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-d, --display-examples

--display-prettify

Set to use a prettytable when displaying examples with text candidates

--display-add-fields

Display these fields when verbose is off (e.g., “--display-add-fields label_candidates,beam_texts”)

-it, --interactive-task

Create interactive version of task
Default: True.

--outfile

Saves a jsonl file containing all of the task examples and model replies. Set to the empty string to not save at all

--save-format

Format to save logs in. conversations is a jsonl format, parlai is a text format.
Choices: conversations, parlai
Default: conversations.

-fixedCands, --local-human-candidates-file

File of label_candidates to send to other agent

--single-turn

If on, assumes single turn episodes.

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

--port

Port to listen on.
Default: 8080.

--host

Host from which to allow requests; use 0.0.0.0 to allow all IPs
Default: localhost.


multiprocessing_eval

Short description: Evaluate a model

Aliases: mp_eval

Main launch script for single-host, multi-GPU evaluation.

This is a drop-in replacement for [eval_model]. This script will launch N subprocesses, each of which runs the full eval loop independently.

Uses torch.nn.parallel.DistributedDataParallel for its main uses. Agents must specifically implement the wrapper of DistributedDataParallel, but all TorchRankerAgents and TorchGeneratorAgents support this.

Examples

parlai multiprocessing_eval -mf "zoo:tutorial_transformer_generator/model" -bs 16 -t convai2

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: valid.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-rf, --report-filename

Saves a json file of the evaluation report either as an extension to the model-file (if begins with a “.”) or a whole file path. Set to the empty string to not save at all.

--save-world-logs

Saves a jsonl file containing all of the task examples and model replies. Must also specify --report-filename.

--save-format

Choices: conversations, parlai
Default: conversations.

-ne, --num-examples

Default: -1.

-d, --display-examples

-ltim, --log-every-n-secs

Default: 10.

-mcs, --metrics

List of metrics to show/compute, e.g. all, default, or give a list split by , like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

-micro, --aggregate-micro

Report micro-averaged metrics instead of macro averaged metrics.

--log-keep-fields

Fields to keep when logging. Should be a comma separated list
Default: all.

-tblog, --tensorboard-log

Tensorboard logging of metrics, default is False

-tblogdir, --tensorboard-logdir

Tensorboard logging directory, defaults to model_file.tensorboard

--distributed-world-size

Number of workers.
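
The three cases described for --report-filename above (an extension appended to the model file, a whole file path, or the empty string for no report) can be summarised in a short sketch. The helper resolve_report_path and the sample paths are hypothetical; this mirrors the description rather than the script's actual code.

```python
def resolve_report_path(report_filename, model_file):
    """Return the path to write the report to, or None to skip saving."""
    if not report_filename:
        return None  # empty string: do not save a report
    if report_filename.startswith("."):
        # A leading "." is treated as an extension of the model file.
        return model_file + report_filename
    return report_filename  # otherwise it is taken as a whole file path


print(resolve_report_path(".report.json", "/tmp/mymodel"))
print(resolve_report_path("/tmp/eval/report.json", "/tmp/mymodel"))
print(resolve_report_path("", "/tmp/mymodel"))
```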


multiprocessing_train

Short description: Train a model

Aliases: mp_train

Main launch script for single-host, multi-GPU training.

This is a drop-in replacement for [train_model]. This script will launch N subprocesses, each of which runs the full training loop independently.

Uses torch.nn.parallel.DistributedDataParallel for its main uses. Agents must specifically implement the wrapper of DistributedDataParallel, but all TorchRankerAgents and TorchGeneratorAgents support this.

Examples

parlai multiprocessing_train -m transformer/generator -bs 16 -t convai2 -mf /tmp/mymodel

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-et, --evaltask

Task to use for valid/test (defaults to the one used for training)

-eps, --num-epochs

Default: -1.

-ttim, --max-train-time

Default: -1.

-vtim, --validation-every-n-secs

Validate every n seconds. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

-stim, --save-every-n-secs

Saves the model to model_file.checkpoint after every n seconds (default -1, never).
Default: -1.

-sval, --save-after-valid

Saves the model to model_file.checkpoint after every validation (default False).

-veps, --validation-every-n-epochs

Validate every n epochs. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

-vp, --validation-patience

Number of iterations of validation where result does not improve before we stop training
Default: 10.

-vmt, --validation-metric

Key into report table for selecting best validation
Default: accuracy.

-vmm, --validation-metric-mode

How to optimize validation metric (max or min)
Choices: max, min

-mcs, --metrics

List of metrics to show/compute, e.g. all, default, or give a list split by , like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

-micro, --aggregate-micro

Report micro-averaged metrics instead of macro averaged metrics.

-tblog, --tensorboard-log

Tensorboard logging of metrics, default is False

-tblogdir, --tensorboard-logdir

Tensorboard logging directory, defaults to model_file.tensorboard

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--distributed-world-size

Number of workers.
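
The difference between the macro-averaged metrics reported by default and the micro-averaged metrics selected with --aggregate-micro above is easiest to see with tasks of unequal size. The sketch below uses the standard definitions of the two averages; the function names and the per-task counts are hypothetical and not taken from ParlAI.

```python
def macro_average(per_task):
    """Average the per-task accuracies, weighting every task equally."""
    accuracies = [correct / total for correct, total in per_task.values()]
    return sum(accuracies) / len(accuracies)


def micro_average(per_task):
    """Pool all examples, so larger tasks contribute more."""
    correct = sum(c for c, _ in per_task.values())
    total = sum(t for _, t in per_task.values())
    return correct / total


# Hypothetical (correct, total) counts for two tasks of different sizes.
per_task = {"task_a": (90, 100), "task_b": (5, 10)}
print("macro:", macro_average(per_task))
print("micro:", micro_average(per_task))
```

In this example the small task pulls the macro average down to about 0.70, while the micro average stays near 0.86 because most examples come from the larger task.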


party

Short description: Throw a party!

Aliases: parrot

Throw a party.

Examples

parlai party

CLI Arguments

Argument

Description

-n, --seconds

Number of seconds to party
Default: -1.


profile_interactive

Short description: Interactive chat with a model

Basic script for profiling interaction with a model. It uses repeat_query in place of a human, so that only the interaction itself is timed.

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”
Default: interactive.

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-d, --display-examples

Default: True.

-ne, --num-examples

Default: 5.

--display-prettify

Set to use a prettytable when displaying examples with text candidates

--display-add-fields

Display these fields when verbose is off (e.g., “--display-add-fields label_candidates,beam_texts”)

-it, --interactive-task

Create interactive version of task
Default: True.


profile_train

Short description: cProfile a training run

Runs the Python or PyTorch profiler and prints the results.

Examples

To profile a seq2seq training run on bAbI task 1 (1k exs), one can run:

parlai profile_train -t babi:task1k:1 -m seq2seq --dict-file /tmp/dict

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-et, --evaltask

Task to use for valid/test (defaults to the one used for training)

-eps, --num-epochs

Default: 1.

-ttim, --max-train-time

Default: -1.

-vtim, --validation-every-n-secs

Validate every n seconds. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

-stim, --save-every-n-secs

Saves the model to model_file.checkpoint after every n seconds (default -1, never).
Default: -1.

-sval, --save-after-valid

Saves the model to model_file.checkpoint after every validation (default False).

-veps, --validation-every-n-epochs

Validate every n epochs. Saves model to model_file (if set) whenever best val metric is found
Default: -1.

-vp, --validation-patience

Number of iterations of validation where result does not improve before we stop training
Default: 10.

-vmt, --validation-metric

Key into report table for selecting best validation
Default: accuracy.

-vmm, --validation-metric-mode

How to optimize validation metric (max or min)
Choices: max, min

-mcs, --metrics

List of metrics to show/compute, e.g. all, default, or give a list split by , like ppl,f1,accuracy,hits@1,rouge,bleu. The rouge metrics will be computed as rouge-1, rouge-2 and rouge-l.
Default: default.

-micro, --aggregate-micro

Report micro-averaged metrics instead of macro averaged metrics.

-tblog, --tensorboard-log

Tensorboard logging of metrics, default is False

-tblogdir, --tensorboard-logdir

Tensorboard logging directory, defaults to model_file.tensorboard

--bpe-vocab

Path to pre-trained tokenizer vocab

--bpe-merge

Path to pre-trained tokenizer merge

--torch

If true, use the torch profiler. Otherwise use cProfile.

--torch-cuda

If true, use the torch cuda profiler. Otherwise use cProfile.

--debug

If true, enter debugger at end of run.
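
The interaction between --validation-patience, --validation-metric, and --validation-metric-mode above can be sketched as a simple early-stopping check: training stops once the metric has gone a given number of validations without improving. The helper should_stop and the sample metric values are hypothetical, and treating a tie as "no improvement" is an assumption of this sketch rather than documented behaviour.

```python
def should_stop(history, patience=10, mode="max"):
    """Return True once the last `patience` validations failed to improve."""
    if patience <= 0 or len(history) <= patience:
        return False
    earlier, recent = history[:-patience], history[-patience:]
    if mode == "max":
        best = max(earlier)
        return all(value <= best for value in recent)
    best = min(earlier)
    return all(value >= best for value in recent)


# Hypothetical accuracy after each validation run.
print(should_stop([0.50, 0.60, 0.59, 0.58], patience=2))
print(should_stop([0.50, 0.60, 0.59, 0.70], patience=2))
```

In the first call the two most recent validations never beat the earlier best of 0.60, so the check fires; in the second, the final value of 0.70 is an improvement and training would continue.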


token_stats

Short description: Compute tokenized stats.

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream:ordered.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).
Default: test_agents/null.

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

--num-examples, -n

Default: -1.

-ltim, --log-every-n-secs

Default: 10.

--field

Default: text.

--final-only


vacuum

Short description: Shrink a model file for release.

Reduces the size of a model file by stripping the optimizer.

Assumes we are working with a TorchAgent.

CLI Arguments

Argument

Description

-mf, --model-file

Path to model file.
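
Stripping the optimizer, as described above, amounts to dropping optimizer-related entries from the saved state while keeping the model weights. A minimal sketch, assuming the checkpoint is a plain dictionary with an "optimizer" entry: the key names, the sample contents, and the helper strip_optimizer are illustrative assumptions, and a real model file would be loaded and saved with the framework's own serialization rather than built inline.

```python
def strip_optimizer(states, keys_to_drop=("optimizer",)):
    """Return a copy of the checkpoint dictionary without optimizer state."""
    return {key: value for key, value in states.items() if key not in keys_to_drop}


# Hypothetical checkpoint contents; real checkpoints hold tensors, not lists.
checkpoint = {
    "model": {"embedding.weight": [0.1, 0.2]},
    "optimizer": {"state": {"step": 1000}, "param_groups": []},
}
released = strip_optimizer(checkpoint)
print(sorted(released))
```

Optimizer state is only needed to resume training, which is why it can be removed from a model that is released for inference.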


verify_data

Short description: Check tasks for common errors

Verify data doesn’t have basic mistakes, like empty text fields or empty label candidates.

Examples

parlai verify_data -t convai2 -dt train:stream:ordered

CLI Arguments

Argument

Description

-o, --init-opt

Path to json file of options. Note: Further Command-line arguments override file-based options.

--allow-missing-init-opts

Warn instead of raising if an argument passed in with --init-opt is not in the target opt.

-t, --task

ParlAI task(s), e.g. “babi:Task1” or “babi,cbt”

-dt, --datatype

Choose from: train, train:ordered, valid, test. To stream data, add “:stream” to any option (e.g., train:stream). By default, train is random with replacement, valid is ordered, and test is ordered.
Choices: train, train:stream, train:ordered, train:ordered:stream, train:stream:ordered, train:evalmode, train:evalmode:stream, train:evalmode:ordered, train:evalmode:ordered:stream, train:evalmode:stream:ordered, valid, valid:stream, test, test:stream
Default: train:stream:ordered.

-bs, --batchsize

Batch size for minibatch training schemes
Default: 1.

-dynb, --dynamic-batching

Use dynamic batching
Choices: full, batchsort, None

--verbose

Print all messages

-dp, --datapath

Path to datasets, defaults to {parlai_dir}/data

-m, --model

The model class name. This can match a name under parlai/agents/ for agents in that directory, or it can be a fully specified module, as in from X import Y, given via -m X:Y (e.g. -m parlai.agents.seq2seq.seq2seq:Seq2SeqAgent).

-mf, --model-file

Model file name for loading and saving models

-im, --init-model

Initialize model weights and dict from this file

-ltim, --log-every-n-secs

Default: 2.

-d, --display-examples
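
The kinds of basic mistakes verify_data looks for, such as empty text fields or empty label candidates, can be illustrated with a small check over dictionary-shaped messages. This is a sketch under stated assumptions, not the script's implementation: the helper find_basic_mistakes, the field names text and label_candidates as used here, and the sample messages are chosen for illustration.

```python
def find_basic_mistakes(messages):
    """Count messages with empty text or an empty label-candidate list."""
    problems = {"empty_text": 0, "empty_label_candidates": 0}
    for msg in messages:
        if "text" in msg and not msg["text"].strip():
            problems["empty_text"] += 1
        if "label_candidates" in msg and len(msg["label_candidates"]) == 0:
            problems["empty_label_candidates"] += 1
    return problems


# Hypothetical messages: one clean, one with blank text, one with no candidates.
messages = [
    {"text": "hello", "label_candidates": ["hi", "bye"]},
    {"text": "   ", "label_candidates": ["hi"]},
    {"text": "how are you?", "label_candidates": []},
]
report = find_basic_mistakes(messages)
print(report)
```

Messages that lack a field entirely are not counted in this sketch; only fields that are present but empty are flagged.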