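MentalManip, introduced by Wang et al. (2024b), is a dialogue dataset built for detecting and classifying mental manipulation. It contains 4,000 multi-turn fictional dialogues drawn from online movie scripts and annotated at multiple levels: presence of manipulation, manipulation technique, and targeted vulnerability. The dataset was created to support research on mental manipulation detection, with high-quality annotation intended to keep the labels consistent and accurate.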
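MentalManip is a high-quality dialogue dataset built for detecting and classifying mental manipulation. It contains 4,000 multi-turn fictional dialogues drawn from online movie scripts and annotated at multiple levels: presence of manipulation, manipulation technique, and targeted vulnerability. It is applied mainly in the mental health domain, where early detection of manipulation is meant to help protect individuals' psychological well-being.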
FineWeb is a dataset of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl. It is optimized for LLM performance and processed using the datatrove library. The dataset aims to provide high-quality data for training large language models and outperforms other commonly used web datasets.
Psychology LLM, LLM, The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2 - SmartFlowAI/EmoLLM
This project implements the conversion algorithm from the ToMi dataset to the T4D (Thinking is for Doing) dataset, as introduced in the paper https://arxiv.org/abs/2310.03051. It filters examples with Theory of Mind (ToM) questions and adapts the algorithm to account for second-order false beliefs.
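The filtering step mentioned above can be sketched as follows. This is a minimal illustration rather than the project's actual code: it assumes each example is a dict with a hypothetical `question_type` field, and that false-belief (ToM) questions carry labels ending in `_tom` while true-belief ones end in `_no_tom`. It covers only the filtering, not the rewriting of questions into T4D form.

```python
from typing import Dict, List


def is_tom_question(question_type: str) -> bool:
    """Return True for labels that mark a false-belief (ToM) question.

    Assumed label scheme (hypothetical): 'first_order_0_tom',
    'second_order_1_tom', 'first_order_0_no_tom', 'memory', 'reality'.
    """
    # '_no_tom' also ends with '_tom', so it has to be excluded explicitly.
    return question_type.endswith("_tom") and not question_type.endswith("_no_tom")


def filter_tom_examples(examples: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the examples whose question probes a false belief."""
    return [ex for ex in examples if is_tom_question(ex["question_type"])]


# Toy records; the field names and label values are illustrative assumptions.
examples = [
    {"id": "a", "question_type": "first_order_0_tom"},
    {"id": "b", "question_type": "first_order_1_no_tom"},
    {"id": "c", "question_type": "second_order_0_tom"},
    {"id": "d", "question_type": "memory"},
    {"id": "e", "question_type": "reality"},
]

tom_examples = filter_tom_examples(examples)
kept_ids = [ex["id"] for ex in tom_examples]
```

Both first-order and second-order false-belief questions pass the filter, which matches the description's note that second-order false beliefs are taken into account.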