This dataset contains 20,000 labelled English tweets from depressed and non-depressed users. The tweets were collected using the Twitter API, and the dataset includes features derived with techniques such as topic modelling and emoji sentiment analysis. It is designed for mental health classification at the tweet level.
The Depression: Twitter Dataset + Feature Extraction is a resource for researchers and developers working on mental health classification. It includes 20,000 labelled English tweets collected using the Twitter API, together with features extracted through techniques such as topic modelling and emoji sentiment analysis. This makes it suitable for machine learning and data analysis projects that study or predict mental health conditions from social media content.
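As a rough illustration of what an emoji sentiment feature at the tweet level might look like, here is a minimal Python sketch. It is not the dataset's actual extraction method: the EMOJI_SENTIMENT lexicon, its scores, and the emoji_sentiment_features function are hypothetical and chosen only for demonstration.

```python
# Hypothetical emoji sentiment lexicon. The scores (in [-1, 1]) are
# illustrative assumptions, not values taken from the dataset.
EMOJI_SENTIMENT = {
    "😊": 0.7,
    "😂": 0.4,
    "🙂": 0.3,
    "😢": -0.6,
    "😭": -0.5,
    "💔": -0.7,
}


def emoji_sentiment_features(tweet: str) -> dict:
    """Return the number of known emojis and their mean sentiment score."""
    # Only single-code-point emojis in the lexicon are matched here.
    scores = [EMOJI_SENTIMENT[ch] for ch in tweet if ch in EMOJI_SENTIMENT]
    count = len(scores)
    mean = sum(scores) / count if count else 0.0
    return {"emoji_count": count, "emoji_sentiment": mean}


features = emoji_sentiment_features("can't sleep again 😢💔")
print(features)
```

Features like these could be joined with topic-model outputs and the tweet label before training a classifier. A real pipeline would likely rely on a published emoji sentiment lexicon and handle multi-code-point emoji sequences.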
The Substance Abuse and Mental Health Data Archive (SAMHDA) provides a comprehensive collection of data sets related to mental health and substance use. It includes ongoing studies, population surveys, treatment facility surveys, and client-level data, offering valuable insights for researchers and policymakers.
FineWeb is a dataset of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl. The data processing is optimized for LLM performance and carried out with the datatrove library. The dataset aims to provide high-quality data for training large language models, and its authors report that it outperforms other commonly used web datasets.
Tobii Pro Lab is comprehensive eye-tracking software for behavioral research that supports the full experiment workflow, from test design to data analysis.