Background: Popularized by ChatGPT, large language models (LLMs) are poised to transform the scalability of clinical natural language processing (NLP) downstream tasks such as medical question answering (MQA) and automated data extraction from clinical narrative reports. However, the use of LLMs in the health care setting is limited by cost, computing power, and patient privacy concerns. Specifically, as interest in LLM-based clinical applications grows, regulatory safeguards must be established to avoid exposure of patient data through the public domain.
IEEE Trans Image Process
March 2025
Ridge detection is a classical tool to extract curvilinear features in image processing. As such, it has great promise in applications to material science problems; specifically, for trend filtering relatively stable atom-shaped objects in image sequences, such as bright-field Transmission Electron Microscopy (TEM) videos. Standard analysis of TEM videos is limited to frame-by-frame object recognition.
Materials functionalities may be associated with atomic-level structural dynamics occurring on the millisecond timescale. However, the capability of electron microscopy to image structures with high spatial resolution and millisecond temporal resolution is often limited by poor signal-to-noise ratios. With an unsupervised deep denoising framework, we observed metal nanoparticle surfaces (platinum nanoparticles on cerium oxide) in a gas environment with time resolutions down to 10 milliseconds at a moderate electron dose.
Proc Natl Acad Sci U S A
February 2025
For many countries in the Global South, traditional poverty estimates are available only infrequently and at coarse spatial resolutions, if at all. This limits decision-makers' and analysts' ability to target humanitarian and development interventions and makes it difficult to study relationships between poverty and other natural and human phenomena at finer spatial scales. Advances in Earth observation and machine learning-based methods have proven capable of generating more granular estimates of relative asset wealth indices.
Vector AutoRegressive Moving Average (VARMA) models form a powerful and general model class for analyzing dynamics among multiple time series. While VARMA models encompass the Vector AutoRegressive (VAR) models, their popularity in empirical applications is dominated by the latter. Can this phenomenon be explained fully by the simplicity of VAR models? Perhaps many users of VAR models have not fully appreciated what VARMA models can provide.
We examine the use of time series data, derived from Electric Cell-substrate Impedance Sensing (ECIS), to differentiate between standard mammalian cell cultures and those infected with a mycoplasma organism. With the goal of easy visualization and interpretation, we perform low-dimensional feature-based classification, extracting application-relevant features from the ECIS time courses. We can achieve very high classification accuracy using only two features, which depend on the cell line under examination.
Background: Machine learning has been increasingly used to develop algorithms that can improve medical diagnostics and prognostication and has shown promise in improving the classification of thyroid ultrasound images. This proof-of-concept study aims to develop a multimodal machine-learning model to classify follicular carcinoma from adenoma.
Methods: This is a retrospective study of patients with follicular adenoma or carcinoma at a single institution between 2010 and 2022.
J Phys Chem A
July 2023
Markov State Models (MSMs) and related techniques have gained significant traction as tools for analyzing and guiding molecular dynamics (MD) simulations due to their ability to extract structural, thermodynamic, and kinetic information on proteins using computationally feasible MD simulations. MSM analysis often relies on spectral decomposition of empirically generated transition matrices. This work discusses an alternative approach for extracting the thermodynamic and kinetic information from the so-called rate/generator matrix rather than the transition matrix.
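To make the relationship between the two matrices concrete, here is a minimal sketch for a two-state system. It assumes continuous-time Markov dynamics, so that the transition matrix at lag time tau is T(tau) = exp(K tau) for a generator K. It is not the method of the work summarized above, and the function names `transition_matrix` and `two_state_rates` are hypothetical, chosen only for illustration.

```python
import math


def transition_matrix(a, b, tau):
    """Closed-form T(tau) = exp(K * tau) for the two-state generator
    K = [[-a, a], [b, -b]], where a is the 0->1 rate and b the 1->0 rate."""
    total = a + b
    decay = math.exp(-total * tau)  # second eigenvalue of T(tau)
    return [
        [(b + a * decay) / total, a * (1.0 - decay) / total],
        [b * (1.0 - decay) / total, (a + b * decay) / total],
    ]


def two_state_rates(T, tau):
    """Recover the rates (a, b) of the generator from a 2x2 row-stochastic
    transition matrix T observed at lag time tau (illustrative assumption:
    the dynamics are exactly Markovian at this lag)."""
    t01, t10 = T[0][1], T[1][0]
    lam2 = 1.0 - t01 - t10  # non-unit eigenvalue of a 2x2 stochastic matrix
    if not 0.0 < lam2 < 1.0:
        raise ValueError("T is not consistent with a two-state generator")
    total = -math.log(lam2) / tau  # a + b, the inverse relaxation time
    a = total * t01 / (t01 + t10)
    b = total * t10 / (t01 + t10)
    return a, b
```

For example, `two_state_rates(transition_matrix(2.0, 0.5, 0.1), 0.1)` recovers the rates used to build the matrix, up to floating-point error. In this toy setting the generator exposes the rates directly, whereas the transition matrix mixes them with the choice of lag time.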
J Am Stat Assoc
August 2021
Comput Stat Data Anal
July 2022
Independent component analysis (ICA) is an unsupervised learning method popular in functional magnetic resonance imaging (fMRI). Group ICA has been used to search for biomarkers in neurological disorders including autism spectrum disorder and dementia. However, current methods use a principal component analysis (PCA) step that may remove low-variance features.
Spatially resolved in situ transmission electron microscopy (TEM), equipped with direct electron detection systems, is a suitable technique to record information about the atom-scale dynamics with millisecond temporal resolution from materials. However, characterizing dynamics or fluxional behavior requires processing short time exposure images which usually have severely degraded signal-to-noise ratios. The poor signal-to-noise associated with high temporal resolution makes it challenging to determine the position and intensity of atomic columns in materials undergoing structural dynamics.
PLoS One
December 2021
Advances in remote sensing and machine learning enable increasingly accurate, inexpensive, and timely estimation of poverty and malnutrition indicators to guide development and humanitarian agencies' programming. However, state-of-the-art models often rely on proprietary data and/or deep or transfer learning methods whose underlying mechanics may be challenging to interpret. We demonstrate how interpretable random forest models can produce estimates of a set of (potentially correlated) malnutrition and poverty prevalence measures using free, open access, regularly updated, georeferenced data.
The electric power grid is a critical societal resource connecting multiple infrastructural domains such as agriculture, transportation, and manufacturing. The electrical grid as an infrastructure is shaped by human activity and public policy in terms of demand and supply requirements. Further, the grid is subject to changes and stresses due to diverse factors including solar weather, climate, hydrology, and ecology.
Data Min Knowl Discov
June 2021
The ability to accurately and consistently discover anomalies in time series is important in many applications. Fields such as finance (fraud detection), information security (intrusion detection), healthcare, and others all benefit from anomaly detection. Intuitively, anomalies in time series are time points or sequences of time points that deviate from normal behavior characterized by periodic oscillations and long-term trends.
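The intuition in the last sentence can be sketched as follows: remove a long-term trend and a periodic component, then flag points whose residual is unusually large. This is a simple illustrative baseline, not the method of the work summarized above. The function names, the linear-trend assumption, the known `period`, and the `threshold` default are all assumptions made for the example.

```python
import statistics


def seasonal_trend_residuals(y, period):
    """Subtract a least-squares linear trend and a per-phase median
    (the periodic component) from the series y."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    detrended = [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]
    # Median of all observations sharing the same position within the period.
    phase_median = [statistics.median(detrended[p::period]) for p in range(period)]
    return [d - phase_median[t % period] for t, d in enumerate(detrended)]


def flag_anomalies(y, period, threshold=5.0):
    """Return indices whose residual lies more than `threshold` robust
    standard deviations (scaled median absolute deviation) from the median."""
    residuals = seasonal_trend_residuals(y, period)
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # makes the MAD comparable to a Gaussian std. dev.
    if scale == 0:
        # Degenerate case: flag anything that differs from the median at all.
        return [i for i, r in enumerate(residuals) if r != center]
    return [i for i, r in enumerate(residuals) if abs(r - center) / scale > threshold]
```

Calling `flag_anomalies(series, period=12)` on a series with a yearly cycle sampled monthly returns the indices of candidate point anomalies. Medians are used instead of means so that the anomalies themselves have little influence on the estimate of normal behavior.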
Stat (Int Stat Inst)
December 2020
Int J Biostat
December 2019
We present new methods for cell line classification using multivariate time series bioimpedance data obtained from electric cell-substrate impedance sensing (ECIS) technology. The ECIS technology, which monitors the attachment and spreading of mammalian cells in real time through the collection of electrical impedance data, has historically been used to study one cell line at a time. However, we show that if applied to data from multiple cell lines, ECIS can be used to classify unknown or potentially mislabeled cells, factors which have previously been associated with the reproducibility crisis in the biological literature.
Neuroimage
November 2016
Estimating spatiotemporal models for multi-subject fMRI is computationally challenging. We propose a mixed model for localization studies with spatial random effects and time-series errors. We develop method-of-moment estimators that leverage population and spatial information and are scalable to massive datasets.
We examine differences between independent component analyses (ICAs) arising from different assumptions, measures of dependence, and starting points of the algorithms. ICA is a popular method with diverse applications including artifact removal in electrophysiology data, feature extraction in microarray data, and identifying brain networks in functional magnetic resonance imaging (fMRI). ICA can be viewed as a generalization of principal component analysis (PCA) that takes into account higher-order cross-correlations.