tartuNLP/reddit-anhedonia by huggingface-mirror (hf-mirror)
Focusing on the PRIMATE dataset, our study reveals concerns regarding annotation validity, particularly for the "lack of interest or pleasure" symptom. Through re-annotation by a mental health professional, we introduce finer labels and textual spans as evidence, identifying a notable number of false positives. Our refined annotations offer a higher-quality test set for anhedonia detection. This study underscores the necessity of addressing annotation quality issues in mental health datasets and advocates improved methodologies to enhance NLP model reliability in mental health assessments. A mental health professional (MHP) read all the posts in the subset and labelled them for the presence of loss of interest or pleasure (anhedonia). The MHP assigned three labels to each post: a) 'mentioned', if the symptom is talked about in the text but its duration or intensity cannot be inferred; b) 'answerable', if there is clear evidence of anhedonia; c) 'writer's symptoms', which indicates whether the author of the post is discussing themselves or a third person. Additionally, the MHP selected the part of the text that supports the positive label.
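The labelling scheme described above can be pictured as a small record per post. The following is only a minimal sketch: the class name, the field names, and the example post are hypothetical and chosen for illustration, so the released files may use different column names and encodings.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AnhedoniaAnnotation:
    # Hypothetical field names; the actual dataset columns may differ.
    post: str                       # full text of the post
    mentioned: bool                 # symptom talked about, duration/intensity unclear
    answerable: bool                # clear evidence of anhedonia
    writers_symptoms: bool          # True if the author describes themselves
    evidence: Optional[str] = None  # text span supporting a positive label

    def evidence_offsets(self) -> Optional[Tuple[int, int]]:
        """Return (start, end) character offsets of the evidence span, or None."""
        if not self.evidence:
            return None
        start = self.post.find(self.evidence)
        if start == -1:
            return None
        return (start, start + len(self.evidence))


def is_strict_positive(annotation: AnhedoniaAnnotation) -> bool:
    """One possible (assumed) strict criterion: clear evidence about the author."""
    return annotation.answerable and annotation.writers_symptoms


# Made-up example post, not taken from the dataset.
example = AnhedoniaAnnotation(
    post="I used to love painting, but for months nothing feels fun anymore.",
    mentioned=False,
    answerable=True,
    writers_symptoms=True,
    evidence="for months nothing feels fun anymore",
)
```

Treating only posts that are both 'answerable' and about the writer as positives is one way to build a strict test set; whether 'mentioned' posts count as positives is a choice left to the user.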
The DS4C dataset is a structured collection of COVID-19 data from South Korea, based on reports from the Korea Centers for Disease Control & Prevention (KCDC) and local governments. It includes information on infections, patient routes, and various analyses. The dataset has been used for multiple research and visualization projects.
Psych-101 is a dataset of natural language transcripts from human psychological experiments. It comprises trial-by-trial data from 160 experiments in which 60,092 participants made 10,681,650 choices. It supports research on human decision-making and is available under the Apache License 2.0.
PsychData is an online platform for hosting and conducting surveys and experiments in psychology, supporting secure data collection for researchers and students.