Emotional First Aid Dataset: Psychological Counseling QA Corpus

The Emotional First Aid Dataset is a comprehensive Chinese psychological counseling QA corpus, featuring 20,000 multi-turn dialogues. It is designed to support the development of AI applications in the field of psychological counseling and is available for research purposes.

Detailed Introduction

The Emotional First Aid Dataset is a valuable resource for researchers and developers working on AI-powered psychological counseling tools. It includes detailed multi-turn dialogues, topic labels, and emotional annotations, making it suitable for a variety of tasks such as emotion classification and counseling dialogue generation.

More Datasets

Data Science for COVID-19 (DS4C)

The DS4C dataset is a structured collection of COVID-19 data from South Korea, based on reports from the Korea Centers for Disease Control & Prevention (KCDC) and local governments. It includes information on infections, patient routes, and various analyses. The dataset has been used for multiple research and visualization projects.

Question-Level Feature Extraction on DAIC-WOZ Dataset

The DAIC-WOZ dataset contains clinical interviews designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post-traumatic stress disorder. This repository provides code for extracting question-level features from the DAIC-WOZ dataset, which can be used for multimodal analysis of depression levels.

HuggingFaceFW/fineweb-2

FineWeb-2 is a dataset of cleaned and deduplicated web data from CommonCrawl. It is the second iteration of the popular 🍷 FineWeb dataset, bringing high-quality pretraining data to over 1000 🗣️ languages. The 🥂 FineWeb2 dataset is fully reproducible, available under the permissive ODC-By 1.0 license, and extensively validated through hundreds of ablation experiments. In particular, on the set of 9 diverse languages used to guide processing decisions, 🥂 FineWeb2 outperforms other popular pretraining datasets covering multiple languages (such as CC-100, mC4, CulturaX, or HPLT) while being substantially larger. In some cases it even performs better than datasets specifically curated for a single one of these languages on FineTasks, a diverse set of carefully selected evaluation tasks.

Website URL

https://github.com/chatopera/efaqa-corpus-zh

Categories

Dataset, Artificial Intelligence, LLM

Keywords

Emotional First Aid Dataset, Psychological Counseling, QA Corpus, Multi-turn Dialogues, Research