This paper discusses Helply - a synthesized ML training dataset focused on psychology and therapy, created by Alex Scott and published by NamelessAI. The dataset developed by Alex Scott is a comprehensive collection of synthesized data designed to train LLMs in understanding psychological and therapeutic contexts. This dataset aims to simulate real-world interactions between therapists and patients, enabling ML models to learn from a wide range of scenarios and therapeutic techniques.
The Helply dataset is a comprehensive synthetic ML training dataset created by Alex Scott and released by NamelessAI, focusing on the fields of psychology and therapy. The dataset is designed to train large language models (LLMs) to understand and simulate human psychological processes. By combining existing psychology literature, therapy session records, and patient self-report data, the Helply dataset covers a variety of treatment scenarios, such as cognitive behavioral therapy (CBT), internal family systems (IFS), and internet-based cognitive behavioral therapy (iCBT). In addition, the dataset emphasizes the dynamic interaction between patients and therapists, capturing communication details that affect treatment outcomes. Despite challenges such as ethical considerations and model generalization, the Helply dataset has revolutionary potential to change the understanding and application of therapeutic practices in digital environments.
The Substance Abuse and Mental Health Data Archive (SAMHDA) provides a comprehensive collection of data sets related to mental health and substance use. It includes ongoing studies, population surveys, treatment facility surveys, and client-level data, offering valuable insights for researchers and policymakers.
The ToM QA Dataset is designed to evaluate question-answering models' ability to reason about beliefs. It includes 3 task types and 4 question types, creating 12 total scenarios. The dataset is inspired by theory-of-mind experiments in developmental psychology and is used to test models' understanding of beliefs and inconsistent states of the world.
The Mental Health Corpus contains labeled comments on mental health issues, used for sentiment and toxic language analysis.