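MentalManip: A Dataset for Mental Manipulation Detection

MentalManip, introduced by Wang et al. (2024b), is a dialogue dataset built specifically for detecting and classifying mental manipulation. It contains 4,000 multi-turn fictional dialogues drawn from online movie scripts, annotated at three levels: whether manipulation is present, which manipulation techniques are used, and which vulnerabilities of the target are exploited. The dataset was built with high-quality annotation to keep its labels consistent and accurate, in support of research on mental manipulation detection.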

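Detailed Introduction

MentalManip is a high-quality dialogue dataset built specifically for detecting and classifying mental manipulation. It contains 4,000 multi-turn fictional dialogues drawn from online movie scripts, annotated at three levels: whether manipulation is present, which manipulation techniques are used, and which vulnerabilities of the target are exploited. Its main application area is mental health, where early detection of manipulative behavior is intended to help protect individuals' psychological well-being.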
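The three annotation levels described above can be pictured as one record per dialogue. The following is a minimal sketch only: the class name, the field names, and the sample values are hypothetical placeholders chosen for illustration, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ManipulationRecord:
    """One annotated dialogue. All field names here are hypothetical."""
    dialogue: List[str]        # utterances of one multi-turn dialogue
    is_manipulative: bool      # level 1: is manipulation present?
    techniques: List[str] = field(default_factory=list)       # level 2
    vulnerabilities: List[str] = field(default_factory=list)  # level 3


def manipulative_share(records: List[ManipulationRecord]) -> float:
    """Return the fraction of records labeled as containing manipulation."""
    if not records:
        return 0.0
    return sum(r.is_manipulative for r in records) / len(records)


# Made-up placeholder records, not taken from the dataset.
records = [
    ManipulationRecord(
        dialogue=["A: placeholder utterance", "B: placeholder reply"],
        is_manipulative=True,
        techniques=["placeholder-technique"],
        vulnerabilities=["placeholder-vulnerability"],
    ),
    ManipulationRecord(
        dialogue=["A: placeholder utterance", "B: placeholder reply"],
        is_manipulative=False,
    ),
]

print(manipulative_share(records))
```

Keeping the technique and vulnerability labels as lists that default to empty lets a non-manipulative dialogue carry no lower-level labels. That is a design assumption of this sketch, not something the source states.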


More Datasets

tartuNLP/Reddit Anhedonia Dataset - hf-mirror

tartuNLP/reddit-anhedonia by huggingface-mirror (hf-mirror)

HuggingFaceFW/fineweb-2

FineWeb-2 is the second iteration of the popular 🍷 FineWeb dataset (over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl), bringing high-quality pretraining data to over 1000 🗣️ languages. The 🥂 FineWeb2 dataset is fully reproducible, available under the permissive ODC-By 1.0 license, and extensively validated through hundreds of ablation experiments. In particular, on the set of 9 diverse languages used to guide processing decisions, 🥂 FineWeb2 outperforms other popular multilingual pretraining datasets (such as CC-100, mC4, CulturaX or HPLT) while being substantially larger. In some cases it even performs better than datasets curated specifically for a single one of these languages, as measured on FineTasks, a diverse set of carefully selected evaluation tasks.

ToM QA Dataset: Evaluating Theory of Mind in Question Answering

The ToM QA Dataset is designed to evaluate question-answering models' ability to reason about beliefs. It includes 3 task types and 4 question types, creating 12 total scenarios. The dataset is inspired by theory-of-mind experiments in developmental psychology and is used to test models' understanding of beliefs and inconsistent states of the world.

Website URL

https://github.com/Anton-Jiayuan-MA/Manip-IAP

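Categories

Dataset, Artificial Intelligence, LLM

Keywords

MentalManip, mental manipulation detection, dialogue dataset, mental health, early detection, multi-turn dialogue, fictional dialogue, multi-level annotation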