Reddit financial image post sentiment dataset.

Data Brief

Department of Computer Science and Mathematics, Munich University of Applied Sciences, Lothstr. 34, 80335, München, Germany.

Published: December 2022


Article Abstract

The dataset presented in this paper consists of sentiment information extracted from image and text data of posts in financial subreddits. Members of these subreddits post about their trading behavior, express their opinions, and discuss capital market trends. Their posts contain sentiment information on financial topics as well as signaling information on trading decisions. Members frequently post screenshots of their portfolios from their mobile broker apps. We collected the posts, processed them to extract sentiment scores using various methods, and anonymized them. The dataset therefore contains neither post content nor information about the authors, only the processed sentiment information from each post. In addition, financial tickers mentioned in the posts are tracked, so that the sentiment in a post can be attributed to specific financial products and used for financial forecasting. The posts were collected using the Reddit [2] and Pushshift [3] APIs and processed using an Amazon Web Services architecture. A fine-tuned MobileNets artificial neural network [4] was used to classify images into four distinct categories, which had been determined in a preliminary analysis. The categories cover, for example, screenshots of mobile broker portfolios, screenshots from Twitter, and other financial screenshots such as charts. Images were classified because the categories differ so fundamentally that a different extraction method had to be applied to each. OCR methods [5] were used to extract text from the images, and custom methods were applied to extract sentiment and other information from the resulting text. The data [1] is available at 20-minute intervals and can be used in many areas, such as financial forecasting and the analysis of sentiment dynamics in social media posts.
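The ticker tracking and 20-minute aggregation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ticker whitelist, the regular expression, the function names, and the choice of averaging scores within each interval are all assumptions made for illustration, and the sentiment score is taken as an input because the abstract does not specify how it is computed.

```python
import re
from collections import defaultdict

# Hypothetical whitelist; a real pipeline would load a full symbol list.
KNOWN_TICKERS = {"GME", "AMC", "TSLA"}

# Matches "$GME" or a bare upper-case token of 1 to 5 letters (assumed format).
TICKER_RE = re.compile(r"\$?\b([A-Z]{1,5})\b")


def extract_tickers(text):
    """Return the set of whitelisted tickers mentioned in the text."""
    return {token for token in TICKER_RE.findall(text) if token in KNOWN_TICKERS}


def bucket_start(timestamp, minutes=20):
    """Floor a Unix timestamp (in seconds) to the start of its interval."""
    width = minutes * 60
    return timestamp - timestamp % width


def aggregate(posts, minutes=20):
    """Average sentiment per (interval start, ticker).

    posts: iterable of (unix_timestamp, text, sentiment_score) tuples.
    Averaging is an assumption; the source only states the 20-minute basis.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [score total, post count]
    for timestamp, text, score in posts:
        for ticker in extract_tickers(text):
            key = (bucket_start(timestamp, minutes), ticker)
            sums[key][0] += score
            sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Calling `aggregate` on a list of `(timestamp, text, score)` tuples returns a dictionary keyed by interval start and ticker, which is one way a post's sentiment could be attributed to a financial product over time.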

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9747619
DOI: http://dx.doi.org/10.1016/j.dib.2022.108759
