This paper discusses Helply, a synthesized ML training dataset focused on psychology and therapy, created by Alex Scott and published by NamelessAI. It is a comprehensive collection of synthesized data designed to train LLMs to understand psychological and therapeutic contexts. The dataset aims to simulate real-world interactions between therapists and patients, so that ML models can learn from a wide range of scenarios and therapeutic techniques.
The Helply dataset is a comprehensive synthetic ML training dataset created by Alex Scott and released by NamelessAI, focusing on psychology and therapy. It is designed to train large language models (LLMs) to understand and simulate human psychological processes. Drawing on existing psychology literature, therapy session records, and patient self-report data, it covers a variety of treatment approaches, such as cognitive behavioral therapy (CBT), internal family systems (IFS), and internet-based cognitive behavioral therapy (iCBT). The dataset also emphasizes the dynamic interaction between patients and therapists, capturing communication details that affect treatment outcomes. Despite challenges such as ethical considerations and model generalization, the Helply dataset has the potential to change how therapeutic practices are understood and applied in digital environments.
The International Social Survey Programme (ISSP) is a cross-national collaboration that conducts annual surveys on diverse topics relevant to the social sciences. Established in 1984, it includes members from various cultures around the globe. Over one million respondents have participated in ISSP surveys, and all collected data and documentation are available free of charge.
tartuNLP/reddit-anhedonia by huggingface-mirror (hf-mirror)
The Emotional First Aid Raw Dataset is a collection of raw, unannotated psychological counseling Q&A data, designed to support research in AI applications for mental health. It contains over 172,000 topics with 2,381,273 messages, totaling 44,514,786 characters, providing a rich source of data for natural language processing and AI development.