The Emotional First Aid Raw Dataset is a collection of raw, unannotated psychological counseling Q&A data, designed to support research in AI applications for mental health. It contains over 172,000 topics with 2,381,273 messages, totaling 44,514,786 characters, providing a rich source of data for natural language processing and AI development.
This dataset is a valuable resource for researchers and developers working on AI-powered psychological counseling tools. It includes a wide range of topics and detailed messages, making it suitable for tasks such as data preprocessing, model training, and dialogue generation. The data is sourced from public websites and has been anonymized and desensitized for privacy protection.
FineWeb is a dataset of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl. It is optimized for LLM performance and processed using the datatrove library. The dataset aims to provide high-quality data for training large language models and outperforms other commonly used web datasets.
Psychology LLM, LLM, The Big Model of Mental Health, Finetune, InternLM2, InternLM2.5, Qwen, ChatGLM, Baichuan, DeepSeek, Mixtral, LLama3, GLM4, Qwen2 - SmartFlowAI/EmoLLM
This study surveys the attitudes and behaviors of US higher education faculty members regarding online resources, the library, and related topics. It covers a wide range of issues, including faculty dependence on electronic scholarly resources, the transition from print to electronic journals, publishing preferences, e-books, and the preservation of scholarly journals.