Publications by authors named "Patrick R Alba"

The characterization of patient smoking history is an integral part of evidence-based cancer care. We developed a schema to fully describe longitudinal history and quantified behavioral patterns for smoking consumption, including substance, modality, recency, quantity and cessation efforts, to support extraction from free-text patient histories.


Native Hawaiian and Pacific Islander (NHPI) populations are often aggregated into broad racial categories, obscuring potential disparities. This study leverages an expanded race/ethnicity lexicon and natural language processing (NLP) to identify documentation of NHPI subgroups to address gaps in electronic health records' (EHRs) recorded race. Results demonstrate the potential of NLP to classify NHPI documentation, disaggregate legacy categories, and improve health equity by incorporating more detailed subgroup data into standardized healthcare data sets.


Balancing operational feasibility with the performance of natural language processing (NLP) systems is a significant challenge. This study presents a hybrid strategy that integrates manually curated rules, a small language model (SLM), and a large language model (LLM) for cohort identification tasks. This approach demonstrates superior performance in terms of both computational efficiency and NLP validity, as shown here in two separate tasks using a large number of clinical notes from the US Department of Veterans Affairs (VA) health care system.


This study presents the development and evaluation of an annotation schema and rule-based natural language processing (NLP) system for extracting key melanoma pathology concepts from surgical pathology reports. Achieving high precision and recall, our system addresses melanoma's complex staging criteria for use in downstream staging and cohort recruitment operational needs.


In order to utilize clinical notes for research studies, it is necessary to identify the most relevant notes. Mapping to the LOINC Document Ontology makes this process easier by reducing the variability of note types. We experimented with three models to automatically identify LOINC DO entities in VA note titles.


Purpose: This study aims to develop a robust methodology using structured and semistructured health data to identify patients who have undergone radiation therapy, thereby facilitating future research on treatment outcomes.

Methods And Materials: In this retrospective cohort study, we identified Veterans receiving radiation oncology care through documentation of referrals, encounters, and billing codes from 2014 to 2023. We classified administrative codes based on the process of care and type of radiation received and then analyzed utilization patterns.


Purpose: This study introduces an integrated approach using structured and unstructured data from an electronic health record to identify and characterize patient utilization of hereditary cancer genetic testing among patients with metastatic castration-resistant prostate cancer (mCRPC). Secondary objectives were to describe factors associated with the receipt of testing.

Methods: This retrospective cohort study included Veterans diagnosed with mCRPC from January 2016 to December 2021.


Background: Germline genetic testing is a vital component of guideline-recommended cancer care for males with pancreatic, breast, or metastatic prostate cancers. We sought to determine whether there were racial disparities in germline genetic testing completion in this population.

Patients And Methods: This retrospective cohort study included non-Hispanic White and Black males with incident pancreatic, breast, or metastatic prostate cancers between January 1, 2019, and September 30, 2021.


Objective: To use natural language processing (NLP) of clinical notes to augment existing structured electronic health record (EHR) data for classification of a patient's menopausal status.

Materials And Methods: A rule-based NLP system was designed to capture evidence of a patient's menopause status including dates of a patient's last menstrual period, reproductive surgeries, and postmenopause diagnosis as well as their use of birth control and menstrual interruptions. NLP-derived output was used in combination with structured EHR data to classify a patient's menopausal status.


Electronic Nicotine Delivery Systems (ENDS) use has increased substantially in the United States since 2010. To date, there is limited evidence regarding the nature and extent of ENDS documentation in clinical notes. In this work, we investigate the effectiveness of different approaches to identifying a patient's documented ENDS use.


Standardized operational definitions are an important tool to improve reproducibility of research using secondary real-world healthcare data. This approach was leveraged for studies evaluating the effectiveness of AZD7442 as COVID-19 pre-exposure prophylaxis across multiple healthcare systems. Value sets were defined, grouped, and mapped.


Natural language processing (NLP) tools can automate the identification of cancer patients eligible for specific pathways. We developed and validated a cancer-agnostic, rule-based NLP framework to extract the dimensions and measurements of several concepts from pathology and radiology reports. This framework was then efficiently and cost-effectively deployed to identify patients eligible for breast, lung, and prostate cancer clinical pathways.
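As a rough illustration of what rule-based extraction of dimensions and measurements can look like, the sketch below uses a single regular expression to pull sizes such as "2.3 x 1.5 cm" from report text and normalize them to millimeters. The pattern, the function name `extract_measurements`, and the unit handling are assumptions made for this example; they are not taken from the framework described in the abstract.

```python
import re

# Hypothetical pattern: one to three numbers separated by "x" (or "×"),
# followed by a length unit. Real reports need far more variants than this.
MEASUREMENT = re.compile(
    r"(?P<dims>\d+(?:\.\d+)?(?:\s*[x×]\s*\d+(?:\.\d+)?){0,2})\s*(?P<unit>mm|cm)\b",
    re.IGNORECASE,
)


def extract_measurements(text):
    """Return a list of measurements, each a list of dimensions in millimeters."""
    results = []
    for match in MEASUREMENT.finditer(text):
        # Split "2.3 x 1.5" into its individual numeric dimensions.
        parts = re.split(r"\s*[x×]\s*", match.group("dims"), flags=re.IGNORECASE)
        scale = 10.0 if match.group("unit").lower() == "cm" else 1.0
        results.append([round(float(p) * scale, 3) for p in parts])
    return results


if __name__ == "__main__":
    print(extract_measurements("Mass measures 2.3 x 1.5 cm; nodule 8 mm."))
```

A production system would also need to link each measurement to the concept it describes (for example, a tumor versus a margin), which this sketch does not attempt.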


Background: Pragmatic trials are gaining popularity as a cost-effective way to examine treatment effectiveness and generate timely comparative evidence. Incorporating supplementary real-world data is recommended for robust outcome monitoring. However, detailed operational guidelines are needed to inform effective use and integration of heterogeneous databases.


Background: Although Black men are more likely than non-Hispanic White men to develop and die from prostate cancer, limited data exist to guide prostate-specific antigen (PSA) screening protocols in Black men. This study investigated whether, for a given prebiopsy PSA level, the risk of prostate cancer was higher among self-identified Black veterans than among White veterans.

Methods: Multivariable logistic regression models were estimated to predict the likelihood of prostate cancer diagnosis on first biopsy for 75,295 Black and 207,658 White male veterans.


Purpose: Several novel therapies for castration-resistant prostate cancer (CRPC) have been approved on the basis of randomized phase III studies, with continuing observational research either planned or ongoing. Accurately identifying patients with CRPC in electronic health care data is critical for quality observational research, resource allocation, and quality improvement. Previous work in this area has relied on either structured laboratory results and medication data or natural language processing (NLP) methods.


Objective: This article summarizes our approach to extracting medications and their corresponding attributes from clinical notes, which is the focus of track 1 of the 2022 National Natural Language Processing (NLP) Clinical Challenges (n2c2) shared task.

Methods: The dataset was prepared using the Contextualized Medication Event Dataset (CMED), including 500 notes from 296 patients. Our system consisted of three components: medication named entity recognition (NER), event classification (EC), and context classification (CC).


Black Veterans have a higher incidence of localized and metastatic prostate cancer compared to White Veterans yet are underrepresented in reports of frequencies of somatic and germline alterations. This retrospective analysis of somatic and putative germline alterations was conducted in a large cohort of Veterans with prostate cancer (N = 835 Black, 1613 White) who underwent next-generation sequencing through the VA Precision Oncology Program, which facilitates molecular testing for Veterans with metastatic cancer. No differences were observed in gene alterations for FDA-approved targetable therapies (13.


Importance: Reported risk of incident peripheral artery disease (PAD) by sex and race varies significantly and has not been reported in national cohorts among individuals free of baseline PAD.

Objective: To evaluate the association of sex and race, as well as prevalent cardiovascular risk factors, with limb outcomes in a national cohort of people with normal baseline ankle-brachial indices (ABIs).

Design, Setting, And Participants: This cohort study was conducted using data from participants in the Veterans Affairs Birth Cohort Study (born 1945-1965), with follow-up data between January 1, 2000, and December 31, 2016.


Background: Genetic scores may provide an objective measure of prostate cancer risk and thus inform screening decisions. We evaluated whether a polygenic hazard score based on 290 genetic variants (PHS290) is associated with prostate cancer risk in a diverse population, including Black men, who have higher average risk of prostate cancer death but are often treated as a homogeneously high-risk group.

Methods: This was a retrospective analysis of the Million Veteran Program, a national, population-based cohort study of US military veterans conducted 2011-2021.


Importance: There is controversy about the benefit of prostate-specific antigen (PSA) screening. Prostate-specific antigen screening rates have decreased since 2008 in the US, and the incidence of metastatic prostate cancer has increased. However, there is no direct epidemiologic evidence of a correlation between population PSA screening rates and subsequent metastatic prostate cancer rates.


Despite the impressive success of machine learning algorithms in clinical natural language processing (cNLP), rule-based approaches still have a prominent role. In this paper, we introduce medspaCy, an extensible, open-source cNLP library based on the spaCy framework that allows flexible integration of rule-based and machine learning-based algorithms adapted to clinical text. MedspaCy includes a variety of components that meet common cNLP needs such as context analysis and mapping to standard terminologies.
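To make the idea of rule-based concept matching with context analysis concrete, here is a minimal standard-library sketch. It deliberately does not use medspaCy or spaCy, and it does not reproduce their APIs: the term list, the negation triggers, and the function `find_entities` are all hypothetical and chosen only to illustrate the general technique of flagging a matched concept as negated when a trigger phrase precedes it in the same sentence.

```python
import re

# Hypothetical target terms mapped to labels (an assumption for illustration).
TARGETS = {"pneumonia": "PROBLEM", "afib": "PROBLEM"}

# Hypothetical negation triggers; real context rule sets are much larger.
NEGATION_TRIGGERS = ("no evidence of", "denies", "without")


def find_entities(text):
    """Return (term, label, is_negated) tuples found in the text."""
    entities = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        for term, label in TARGETS.items():
            match = re.search(r"\b" + re.escape(term) + r"\b", lowered)
            if match is None:
                continue
            # A trigger anywhere before the term in the same sentence negates it.
            preceding = lowered[: match.start()]
            negated = any(trigger in preceding for trigger in NEGATION_TRIGGERS)
            entities.append((term, label, negated))
    return entities


if __name__ == "__main__":
    for entity in find_entities("There is no evidence of pneumonia. Patient has afib."):
        print(entity)
```

The sentence-level scope used here is a simplification; a library built for clinical text would typically handle trigger direction, scope termination, and other modifiers such as historical or family context.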


Use of Electronic Nicotine Delivery Systems (ENDS, colloquially known as "electronic cigarettes") has increased substantially in the United States in the decade since 2010. However, relatively little is currently known about the documentation of ENDS use in clinical notes. In this study, we describe the development of an annotation scheme (and associated annotated corpus) consisting of 4,351 ENDS mentions derived from Department of Veterans Affairs clinical notes during the period 2010-2020.


Background: Deaths from pneumonia were decreasing globally prior to the COVID-19 pandemic, but it is unclear whether this was due to changes in patient populations, illness severity, diagnosis, hospitalization thresholds, or treatment. Using clinical data from the electronic health record among a national cohort of patients initially diagnosed with pneumonia, we examined temporal trends in severity of illness, hospitalization, and short- and long-term deaths.

Design: Retrospective cohort.

Participants: All patients >18 years presenting to emergency departments (EDs) at 118 VA Medical Centers between 1/1/2006 and 12/31/2016 with an initial clinical diagnosis of pneumonia confirmed by chest imaging report.


Importance: Prostate cancer (PCa) disproportionately affects African American men, but research evaluating the extent of racial and ethnic disparities across the PCa continuum in equal-access settings remains limited at the national level. The US Department of Veterans Affairs (VA) Veterans Health Administration health care system offers a setting of relatively equal access to care in which to assess racial and ethnic disparities between self-identified African American (or Black) veterans and White veterans.

Objective: To determine the extent of racial and ethnic disparities in the incidence of PCa, clinical stage, and outcomes between African American patients and White patients who received a diagnosis or were treated at a VA hospital.


Background: Dexamethasone decreases mortality in coronavirus disease 2019 (COVID-19) patients on intensive respiratory support (IRS) but is of uncertain benefit in less severely ill patients. We determined whether early (within 48 h) dexamethasone was associated with mortality in patients hospitalised with COVID-19 not on IRS.

Methods: We included patients admitted to US Veterans Affairs hospitals between 7 June 2020 and 31 May 2021 within 14 days after a positive test for severe acute respiratory syndrome coronavirus 2.
