Background and purpose: To cope with the continuing risk of sudden infectious disease outbreaks and to monitor research trends in real time, this paper proposes a new prediction framework that combines public attention indicators with topic analysis of medical preprints. Because traditional topic prediction methods lag behind events, the framework introduces Google Trends data to improve the timeliness of prediction.

Methods: In this study, 18,060 COVID-19-related preprint abstracts were collected from the medRxiv platform with a web crawler. Latent Dirichlet Allocation (LDA), an unsupervised probabilistic modeling method, was used to extract the latent topic structure of the text. To analyze the dynamic relationship between research topic intensity and public attention, the Autoregressive Distributed Lag (ARDL) model, which can handle I(0) and I(1) time series simultaneously, was introduced. Text preprocessing included word segmentation, stop word removal, lemmatization, and synonym standardization. Time series data were aggregated by week and log-transformed, the Augmented Dickey-Fuller (ADF) unit root test was used to assess stationarity, and non-stationary variables were differenced. The models were implemented in Python and EViews 10.

Results: LDA modeling identified seven major research topics. ARDL analysis confirmed a significant dynamic relationship between public search trends and topic intensity, and the model showed good predictive performance.

Conclusion: This study combined LDA with ARDL models to construct a real-time prediction method for tracking the evolution of medical preprint topics. The method has theoretical and practical significance for public health informatics and provides feasible predictive support for the monitoring and prevention of future infectious diseases.
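The time series preprocessing named in the Methods (weekly aggregation, log transformation, differencing) can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the sample counts, the Monday-based week boundary, and the use of `log1p` to keep zero-count weeks defined are all assumptions made for illustration. The ADF stationarity test is not implemented here; in practice a statistics library would decide whether differencing is needed.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the week containing d (assumed week boundary)."""
    return d - timedelta(days=d.weekday())


def weekly_totals(daily):
    """Sum (date, value) observations into weekly buckets keyed by Monday."""
    buckets = defaultdict(float)
    for d, value in daily:
        buckets[week_start(d)] += value
    return dict(sorted(buckets.items()))


def log_transform(series):
    """Apply log(1 + x); log1p is an assumption so that zero weeks stay defined."""
    return [math.log1p(v) for v in series]


def first_difference(series):
    """Return x[t] - x[t-1]; the result is one element shorter than the input."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical daily counts for one topic (not data from the study).
daily = [(date(2020, 3, 2) + timedelta(days=i), float(i % 5)) for i in range(28)]

weekly = weekly_totals(daily)                    # four weekly totals
logged = log_transform(list(weekly.values()))    # log-transformed series
diffed = first_difference(logged)                # differenced series
```

The differenced, log-transformed series is the kind of input that would then be passed to a unit root test and, afterwards, to a lag model such as ARDL.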
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12248262 | PMC |
http://dx.doi.org/10.7759/cureus.85773 | DOI Listing |