This section presents the set of 6 tasks for testing end-to-end dialog systems in the restaurant domain described in the paper:
Each task tests a unique aspect of dialog. The tasks are designed to complement the 20 bAbI story-understanding tasks described in the previous section.
For each task, there are 1000 dialogs for training, 1000 for development and 1000 for testing. For tasks 1-5, we also include a second test set (with suffix -OOV.txt) whose dialogs include entities not present in the training and development sets.
The file format for each task is as follows:
ID user_utterance [tab] bot_utterance
...

The IDs for a given dialog start at 1 and increase. When the IDs in a file reset to 1, the sentences that follow belong to a new dialog. When the bot speaks twice in a row, the special token “<SILENCE>” fills in for the missing user utterance. See the README included with the dataset for more details.

The goal of the tasks is to predict the bot utterances, which can be sentences or API calls (sentences starting with the special token “api_call”). Here is an example dialog (from Task 1):
1 hi	hello what can i help you with today
2 can you make a restaurant reservation with italian cuisine for six people in a cheap price range	i'm on it
3 <SILENCE>	where should it be
4 rome please	ok let me look into some options for you
5 <SILENCE>	api_call italian rome six cheap
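
The file format described above can be read with a few lines of Python. The following is a minimal sketch, not the dataset's official tooling: the names `parse_dialogs` and `is_api_call` are hypothetical, and the code assumes one turn per line, a single space after the ID, and a tab between the user and bot utterances. Lines without a tab are assumed to carry no bot utterance and are stored with an empty string.

```python
from typing import Iterable, List, Tuple

# One turn is a (user_utterance, bot_utterance) pair.
Turn = Tuple[str, str]


def parse_dialogs(lines: Iterable[str]) -> List[List[Turn]]:
    """Split the lines of a task file into dialogs.

    A new dialog starts whenever the leading ID resets to 1.
    Assumption: "ID user_utterance<TAB>bot_utterance" per line.
    """
    dialogs: List[List[Turn]] = []
    current: List[Turn] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        turn_id, _, rest = line.partition(" ")
        if int(turn_id) == 1 and current:
            # The ID reset to 1, so the previous dialog is complete.
            dialogs.append(current)
            current = []
        # If there is no tab, bot is the empty string (assumption).
        user, _, bot = rest.partition("\t")
        current.append((user, bot))
    if current:
        dialogs.append(current)
    return dialogs


def is_api_call(utterance: str) -> bool:
    """True if a bot utterance is an API call rather than a sentence."""
    return utterance.startswith("api_call")
```

For the Task 1 example, `parse_dialogs` would return a single dialog of five turns, with `<SILENCE>` as the user utterance in turns 3 and 5, and `is_api_call` would be true only for the last bot utterance.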