Psychology Wiki Dataset

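The psychology_wiki dataset is built from English Wikipedia content in the field of psychology, collected and organized systematically for broad coverage of the domain. Each article has been filtered and annotated along several dimensions (title, body text, relevance, popularity, and ranking), which makes the dataset a useful text resource for psychology research.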

Detailed Introduction

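The dataset has five features: title (title), text (text), relevance (relevans), popularity (popularity), and ranking (ranking). The features are of type string or float. The data comes as a single training split of 989 samples with a total size of 12,359,374 bytes. The download size is 6,790,523 bytes.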
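The schema and size figures above can be checked with a few lines of Python. This is a minimal sketch, not an official loader. It assumes that title and text are the string columns and that relevans, popularity, and ranking are the float columns, since the description does not say which column has which type. The helper names validate_row and load_train_split are hypothetical. The repository id is taken from the website URL listed on this page, and loading requires the third-party datasets package plus network access.

```python
from typing import Any, List, Mapping

# Column names as listed in the description. Which columns are strings and
# which are floats is an ASSUMPTION, not something the description states.
EXPECTED_TYPES = {
    "title": str,
    "text": str,
    "relevans": float,
    "popularity": float,
    "ranking": float,
}


def validate_row(row: Mapping[str, Any]) -> List[str]:
    """Return the problems found in one row (an empty list if it looks fine)."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            problems.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(row[name]).__name__}"
            )
    return problems


def load_train_split():
    """Download the train split from the Hugging Face Hub.

    Needs the third-party `datasets` package and network access. The import
    is lazy so the helpers above still work without it.
    """
    from datasets import load_dataset

    return load_dataset("burgerbee/psychology_wiki", split="train")


# Size figures quoted in the description.
NUM_ROWS = 989
DATASET_BYTES = 12_359_374
DOWNLOAD_BYTES = 6_790_523

# Average size of one article, and how much smaller the download is.
avg_bytes_per_row = DATASET_BYTES / NUM_ROWS
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES
```

With these figures, an average row is roughly 12.5 KB and the download is about 55% of the full dataset size. Calling validate_row on each row returned by load_train_split() is one way to confirm the assumed column types against the real data.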

More Datasets

Mental Health Large Model Lingxin (SoulChat)

Lingxin (SoulChat) is a large language model for mental health, fine-tuned on millions of Chinese long-text instructions and multi-turn empathetic dialogues from the psychological counseling domain.

electronic media datasets - Mental Health Datasets

An evolving list of electronic media datasets used to model mental health status. This repository curates a variety of datasets from different sources, including social media platforms, online forums, and academic studies, to support research in mental health modeling and AI applications.

HuggingFaceFW/fineweb-2

FineWeb-2 is a dataset of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl. This is the second iteration of the popular 🍷 FineWeb dataset, bringing high quality pretraining data to over 1000 🗣️ languages. The 🥂 FineWeb2 dataset is fully reproducible, available under the permissive ODC-By 1.0 license and extensively validated through hundreds of ablation experiments. In particular, on the set of 9 diverse languages we used to guide our processing decisions, 🥂 FineWeb2 outperforms other popular pretraining datasets covering multiple languages (such as CC-100, mC4, CulturaX or HPLT, while being substantially larger) and, in some cases, even performs better than some datasets specifically curated for a single one of these languages, in our diverse set of carefully selected evaluation tasks: FineTasks.

Website URL

https://huggingface.co/datasets/burgerbee/psychology_wiki

More Categories

Dataset, AI, LLM

Keywords

Psychology Wiki, Dataset, Hugging Face