SimpleQuestions v2
Text
NLP
License: Unknown

Overview

SimpleQuestions is a dataset collected for research in automatic question answering with
human-generated questions. Details and baseline results on this dataset can be found in the paper:

Antoine Bordes, Nicolas Usunier, Sumit Chopra and Jason Weston. Large-Scale Simple Question
Answering with Memory Networks, arXiv:1506.02075.

The dataset consists of 108,442 questions written in natural language by English-speaking
human annotators. Each question is paired with a corresponding fact, formatted as
(subject, relationship, object), that provides the answer as well as a complete explanation.
The facts were extracted from the Freebase knowledge base. We randomly shuffle these questions
and use 70% of them (75,910) as the training set, 10% (10,845) as the validation set, and the
remaining 20% (21,687) as the test set.
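
The split arithmetic above can be checked with a short sketch. This is not the authors' script: the helper `split_questions`, the fixed seed, and the use of integer placeholders instead of real questions are assumptions made for illustration. Only the total and the training and validation counts come from the description.

```python
import random

# Counts taken from the dataset description above.
TOTAL = 108_442
N_TRAIN = 75_910
N_VALID = 10_845


def split_questions(items, n_train, n_valid, seed=0):
    """Shuffle items and cut them into train/validation/test parts.

    Hypothetical helper: the seed and the shuffling procedure are
    assumptions, not the procedure used to build the released files.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # everything that remains
    return train, valid, test


# Integer IDs stand in for the actual question/fact pairs.
train, valid, test = split_questions(range(TOTAL), N_TRAIN, N_VALID)
print(len(train), len(valid), len(test))  # 75910 10845 21687
```

The test set size is simply what is left after the training and validation sets are taken out.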

Here are some examples of questions and facts:

* What American cartoonist is the creator of Andy Lippincott?
  Fact: (andy_lippincott, character_created_by, garry_trudeau)
* Which forest is Fires Creek in?
  Fact: (fires_creek, containedby, nantahala_national_forest)
* What does Jimmy Neutron do?
  Fact: (jimmy_neutron, fictional_character_occupation, inventor)
* What dietary restriction is incompatible with kimchi?
  Fact: (kimchi, incompatible_with_dietary_restrictions, veganism)
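
As a minimal sketch, a fact written in the display form used in the examples above can be parsed into a (subject, relationship, object) triple. The `Fact` type and `parse_fact` function are hypothetical names, and the sketch assumes the `Fact: (a, b, c)` text shown on this page, which may differ from the layout of the distributed files.

```python
import re
from typing import NamedTuple


class Fact(NamedTuple):
    """A (subject, relationship, object) triple (hypothetical type)."""
    subject: str
    relationship: str
    object: str


# Matches "(a, b, c)" where no field contains commas or parentheses.
_FACT_RE = re.compile(
    r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^,()]+?)\s*\)"
)


def parse_fact(line: str) -> Fact:
    """Extract the triple from a line such as 'Fact: (a, b, c)'."""
    match = _FACT_RE.search(line)
    if match is None:
        raise ValueError(f"no (subject, relationship, object) triple in: {line!r}")
    return Fact(*match.groups())


fact = parse_fact("Fact: (fires_creek, containedby, nantahala_national_forest)")
print(fact.subject, fact.relationship, fact.object)
```

For the question "Which forest is Fires Creek in?", the object field of the parsed fact is the answer, while the subject and relationship explain how it was reached.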

Citation

Please use the following citation when referencing the dataset:

@article{bordes2015large,
  title={Large-scale simple question answering with memory networks},
  author={Bordes, Antoine and Usunier, Nicolas and Chopra, Sumit and Weston, Jason},
  journal={arXiv preprint arXiv:1506.02075},
  year={2015}
}

Data Summary

Type: Text
Amount: --
Size: 403.82MB
Provided by: Facebook Research