We introduce Mintaka, a complex, natural, and multilingual dataset designed for experimenting with end-to-end question-answering models. Mintaka is composed of 20,000 question-answer pairs collected in English, annotated with Wikidata entities, and translated into Arabic, French, German, Hindi, Italian, Japanese, Portuguese, and Spanish for a total of 180,000 samples. Mintaka includes 8 types of complex questions, including superlative, intersection, and multi-hop questions, which were naturally elicited from crowd workers. We run baselines over Mintaka, the best of which achieves 38% hits@1 in English and 31% hits@1 multilingually, showing that existing models have room for improvement. We release Mintaka at https://github.com/amazon-research/mintaka.
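For readers unfamiliar with the hits@1 metric reported above, the following is a minimal sketch of how it is commonly computed: the fraction of questions whose top-ranked prediction appears in the gold answer set. The function name `hits_at_1`, the exact-match comparison, and the toy answer strings are illustrative assumptions, not the authors' evaluation code, which may normalize or match answers differently.

```python
from typing import Sequence, Set


def hits_at_1(predictions: Sequence[Sequence[str]], gold: Sequence[Set[str]]) -> float:
    """Return the fraction of questions whose top-ranked prediction is a gold answer.

    predictions[i] is a ranked list of candidate answers for question i
    (best first); gold[i] is the set of acceptable answers for question i.
    Matching is exact string equality, which is an assumption of this sketch.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(
        1
        for ranked, answers in zip(predictions, gold)
        if ranked and ranked[0] in answers  # an empty prediction list counts as a miss
    )
    return hits / len(gold)


# Toy data (hypothetical answers, not taken from Mintaka).
toy_predictions = [["a", "b"], ["c"], []]
toy_gold = [{"a"}, {"d"}, {"e"}]
score = hits_at_1(toy_predictions, toy_gold)
```

Only the first prediction per question is inspected, so a correct answer ranked second does not count toward the score.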

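This paper introduces Mintaka, a complex, natural, and multilingual dataset designed for evaluating end-to-end question-answering models. It contains 20,000 question-answer pairs covering 8 types of complex questions, including superlative, intersection, and multi-hop questions, and is available in 9 languages: the English originals plus translations into Arabic, French, German, Hindi, Italian, Japanese, Portuguese, and Spanish. Baselines run on Mintaka reach at best 38% hits@1 in English and 31% hits@1 multilingually, showing that existing models still have room for improvement.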

Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering