This dataset contains 20,000 labelled English tweets from depressed and non-depressed users. The tweets were collected using the Twitter API, and the dataset includes features extracted with techniques such as topic modelling and emoji sentiment analysis. It is designed for mental health classification at the tweet level.
The Depression: Twitter Dataset + Feature Extraction is a resource for researchers and developers working on mental health classification. It includes 20,000 labelled English tweets collected using the Twitter API, together with features extracted through techniques such as topic modelling and emoji sentiment analysis. This makes it suitable for machine learning and data analysis projects that aim to understand and predict mental health conditions from social media content.
This paper discusses Helply, a synthesized ML training dataset focused on psychology and therapy, created by Alex Scott and published by NamelessAI. The dataset is a collection of synthesized data designed to train LLMs to understand psychological and therapeutic contexts. It aims to simulate real-world interactions between therapists and patients, so that ML models can learn from a wide range of scenarios and therapeutic techniques.
Psy-Insight is a bilingual, interpretable multi-turn dataset for mental health counseling dialogues. It includes 6,208 rounds of multi-turn counseling dialogues in English and 5,776 rounds in Chinese, annotated with step-by-step reasoning labels and multi-task labels. This dataset is designed to support the application of large language models in mental health and is suitable for tasks such as emotion classification and psychological treatment interpretation.
The IC-AnnoMI repository contains source code and a synthetic dataset for mental health and therapeutic counselling, generated through in-context zero-shot LLM prompting. The project uses large language models (LLMs) to generate contextual motivational interviewing (MI) dialogues, aiming to address data scarcity and inherent bias in mental health and therapeutic counselling.