Emotional First Aid Dataset: Psychological Counseling QA Corpus

The Emotional First Aid Dataset is a comprehensive Chinese psychological counseling QA corpus, featuring 20,000 multi-turn dialogues. It is designed to support the development of AI applications in the field of psychological counseling and is available for research purposes.

Detailed Introduction

The Emotional First Aid Dataset is a valuable resource for researchers and developers working on AI-powered psychological counseling tools. It includes detailed multi-turn dialogues, topic labels, and emotional annotations, making it suitable for a variety of tasks such as emotion classification and counseling dialogue generation.

More Datasets

Data Science for COVID-19 (DS4C)

The DS4C dataset is a structured collection of COVID-19 data from South Korea, based on reports from the Korea Centers for Disease Control & Prevention (KCDC) and local governments. It includes information on infections, patient routes, and various analyses. The dataset has been used for multiple research and visualization projects.

Question-Level Feature Extraction on DAIC-WOZ Dataset

The DAIC-WOZ dataset contains clinical interviews designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post-traumatic stress disorder. This repository provides code for extracting question-level features from the DAIC-WOZ dataset, which can be used for multimodal analysis of depression levels.

HuggingFaceFW/fineweb-2

FineWeb-2 is the second iteration of the popular 🍷 FineWeb dataset, which consists of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl. FineWeb-2 extends this high-quality pretraining data to over 1000 🗣️ languages. The 🥂 FineWeb2 dataset is fully reproducible, available under the permissive ODC-By 1.0 license, and extensively validated through hundreds of ablation experiments. In particular, on the set of 9 diverse languages we used to guide our processing decisions, 🥂 FineWeb2 outperforms other popular pretraining datasets covering multiple languages (such as CC-100, mC4, CulturaX or HPLT, while being substantially larger) and, in some cases, even performs better than some datasets specifically curated for a single one of these languages on our diverse set of carefully selected evaluation tasks, FineTasks.

Keywords

Emotional First Aid Dataset, Psychological Counseling, QA Corpus, Multi-turn Dialogues, Research
