Article Abstract

Background: Data from the social media platform X (formerly Twitter) can provide insights into the types of language that are used when discussing drug use. In past research using latent Dirichlet allocation (LDA), we found that tweets containing "street names" of prescription drugs were difficult to classify because the terms resemble other colloquialisms and it was often unclear how they were being used. Conversely, "brand name" references were more amenable to machine-driven categorization.

Objective: This study sought to use next-generation techniques (beyond LDA) from natural language processing to reprocess X data and automatically cluster groups of tweets into topics to differentiate between street- and brand-name data sets. We also aimed to analyze the differences in emotional valence between the 2 data sets to study the relationship between engagement on social media and sentiment.

Methods: We used the Twitter application programming interface to collect tweets that contained the street and brand name of a prescription drug. Using BERTopic in combination with Uniform Manifold Approximation and Projection (UMAP) and k-means, we generated topics for the street-name corpus (n=170,618) and brand-name corpus (n=245,145). Valence Aware Dictionary and Sentiment Reasoner (VADER) scores were used to classify whether tweets within the topics had positive, negative, or neutral sentiments. Two different logistic regression classifiers were used to predict the sentiment label within each corpus. The first model used a tweet's engagement metrics and topic ID to predict the label, while the second model used those features in addition to the top 5000 tweets with the largest term frequency-inverse document frequency (TF-IDF) score.
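The VADER labeling step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not state which cut-offs were applied, so the ±0.05 thresholds below are an assumption based on the values conventionally used with VADER's compound score, and the function names `sentiment_label` and `label_counts` are hypothetical.

```python
from collections import Counter

# ASSUMPTION: +/-0.05 are the cut-offs conventionally used with VADER's
# compound score; the abstract does not say which thresholds were applied.
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def sentiment_label(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def label_counts(compounds):
    """Count how many scores fall under each sentiment label."""
    return Counter(sentiment_label(c) for c in compounds)


if __name__ == "__main__":
    # Hypothetical compound scores for the tweets in one topic.
    print(label_counts([0.6, -0.3, 0.0, 0.9]))
```

In practice the compound scores would come from a VADER implementation run over each tweet; here they are supplied directly so the sketch stays dependency-free.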

Results: Using BERTopic, we identified 40 topics for the street-name data set and 5 topics for the brand-name data set, which we generalized into 8 and 5 topics of discussion, respectively. Four of the general themes of discussion in the brand-name corpus referenced drug use, while 2 themes of discussion in the street-name corpus referenced drug use. From the VADER scores, we found that both corpora were inclined toward positive sentiment. Adding the vectorized tweet text increased the accuracy of our models by around 40% compared with the models that did not incorporate the tweet text in both corpora.
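The "vectorized tweet text" in these results refers to TF-IDF features. The sketch below shows the textbook weighting, tf × log(N / df), on a toy corpus. It is an assumption-laden illustration rather than the study's pipeline: the abstract does not say how the scores were computed or ranked, library implementations typically add smoothing and normalization, and the helper names `tfidf` and `top_terms` are hypothetical. Ranking terms (rather than tweets) by score is also an assumption made here for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def top_terms(docs, k):
    """Return the k terms with the largest TF-IDF weight in any document."""
    best = {}
    for doc_weights in tfidf(docs):
        for term, value in doc_weights.items():
            best[term] = max(best.get(term, 0.0), value)
    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


if __name__ == "__main__":
    # Hypothetical toy corpus standing in for tweets.
    corpus = ["a b", "a c"]
    print(top_terms(corpus, 2))
```

A term that appears in every document receives a weight of zero under this formula, which is why common words contribute little to the resulting feature set.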

Conclusions: BERTopic was able to classify tweets well. As with LDA, the discussion using brand names was more similar between tweets than the discussion using street names. VADER scores could only be logically applied to the brand-name corpus because of the high prevalence of non-drug-related topics in the street-name data. Brand-name tweets discussed drugs either positively or negatively, with few posts having a neutral emotionality. In our machine learning models, engagement alone was not enough to predict the sentiment label; the added context from the tweet text was needed to understand the emotionality of a tweet.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11380061
DOI: http://dx.doi.org/10.2196/57885

Publication Analysis

Top Keywords

topics street-name: 12
brand-name corpus: 12
vader scores: 12
tweets: 9
prescription drug: 8
social media: 8
tweets topics: 8
brand-name data: 8
data sets: 8
street-name corpus: 8

Similar Publications

Background: Social media is an important information source for a growing subset of the population and can likely be leveraged to provide insight into the evolving drug overdose epidemic. Twitter can provide valuable insight into trends, colloquial information available to potential users, and how networks and interactivity might influence what people are exposed to and how they engage in communication around drug use.

Objective: This exploratory study was designed to investigate the ways in which unsupervised machine learning analyses using natural language processing could identify coherent themes for tweets containing substance names.

Ergonomics and design: traffic sign and street name sign.

Work

January 2014

Department of Design and Graphic Expression, Federal University of Rio Grande do Sul, Osvaldo Aranha, 99 / 408, Porto Alegre, RS, Brasil.

This work proposes a design methodology using ergonomics and anthropometry concepts applied to traffic sign and street name sign projects. Initially, a literature review on cognitive ergonomics and anthropometry is performed. Several authors and their design methodologies are analyzed, the aspects to be considered in projects of traffic and street name signs are selected, and other specific aspects are proposed for the design methodology.

Can medical students identify recreational drugs by name?

QJM

December 2008

Guy's and St Thomas' Poisons Unit, Guy's and St Thomas' NHS Foundation Trust, Avonley Road, London, SE14 5ER, UK.

Background: Recreational drug toxicity is a common reason for presentation to the Emergency Department. Knowledge of recreational drug names is important to allow targeted assessment of patients presenting with recreational drug toxicity.

Aims: To assess final year medical student knowledge of proper and street names for recreational drugs.
