ToM QA Dataset: Evaluating Theory of Mind in Question Answering

The ToM QA Dataset is designed to evaluate question-answering models' ability to reason about beliefs. It includes 3 task types and 4 question types, yielding 12 scenarios in total. The dataset is inspired by theory-of-mind experiments in developmental psychology and tests whether models can track both the true state of the world and beliefs that are inconsistent with it.

Detailed introduction

The ToM QA Dataset, introduced in the EMNLP 2018 paper 'Evaluating Theory of Mind in Question Answering', provides a comprehensive set of scenarios for testing question-answering models. Alongside first-order and second-order belief questions, it includes memory and reality questions to verify that models correctly track the state of the world as well as others' beliefs. The dataset is available in four versions: easy with noise, easy without noise, hard with noise, and hard without noise.
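The 3 × 4 structure described above can be made concrete with a short sketch that enumerates the 12 scenarios. The task and question labels below follow the paper's terminology as summarized here; the constant names themselves are illustrative, not part of any official API.

```python
from itertools import product

# Three task types and four question types, per the ToM QA paper
# (EMNLP 2018). Labels follow the paper's terminology.
TASK_TYPES = ["true-belief", "false-belief", "second-order false-belief"]
QUESTION_TYPES = ["memory", "reality", "first-order belief", "second-order belief"]

# Each scenario pairs one task type with one question type,
# giving 3 * 4 = 12 scenarios in total.
scenarios = list(product(TASK_TYPES, QUESTION_TYPES))

print(len(scenarios))  # 12
```

For example, the pair `("false-belief", "first-order belief")` is the classic Sally-Anne-style probe: an agent holds an outdated belief about an object's location, and the model is asked where that agent thinks the object is.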

Dataset

Hugging Face Dataset - bfuzzy1/gunny_x

Every veteran knows, and has had, a 'Gunny': Semper Fidelis. This dataset is designed for conversational AI systems that assist veterans across military branches, including the U.S. and U.K. armed forces.

SAMHDA: Substance Abuse and Mental Health Data Archive

The Substance Abuse and Mental Health Data Archive (SAMHDA) provides a comprehensive collection of data sets related to mental health and substance use. It includes ongoing studies, population surveys, treatment facility surveys, and client-level data, offering valuable insights for researchers and policymakers.

HuggingFaceFW/fineweb-2

FineWeb-2 is the second iteration of the popular 🍷 FineWeb dataset (over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl), bringing high-quality pretraining data to over 1000 🗣️ languages. The 🥂 FineWeb2 dataset is fully reproducible, available under the permissive ODC-By 1.0 license, and extensively validated through hundreds of ablation experiments.

On the set of 9 diverse languages used to guide its processing decisions, 🥂 FineWeb2 outperforms other popular multilingual pretraining datasets (such as CC-100, mC4, CulturaX, or HPLT) while being substantially larger, and in some cases even outperforms datasets curated specifically for a single one of those languages, across a diverse set of carefully selected evaluation tasks: FineTasks.

Categories

Keywords

ToM QA Dataset · Theory of Mind · Question Answering · Belief Reasoning · Developmental Psychology · EMNLP 2018
