Range of Radiologist Performance in a Population-based Screening Cohort of 1 Million Digital Mammography Examinations.

Radiology

From the Departments of Pathology and Oncology (M.S., F.S.), Physiology and Pharmacology (K.D., P.L.), and Medical Epidemiology and Biostatistics (M.E.), Karolinska Institute, Stockholm, Sweden; Department of Radiology (M.S.) and Breast Radiology (F.S.), Karolinska University Hospital, Dalagatan 90,

Published: October 2020




Article Abstract

Background: There is great interest in developing artificial intelligence (AI)-based computer-aided detection (CAD) systems for use in screening mammography. Comparative performance benchmarks from true screening cohorts are needed.

Purpose: To determine the range of human first-reader performance measures within a population-based screening cohort of 1 million screening mammograms to gauge the performance of emerging AI CAD systems.

Materials and Methods: This retrospective study consisted of all screening mammograms in women aged 40-74 years in Stockholm County, Sweden, who underwent screening with full-field digital mammography between 2008 and 2015. There were 110 interpreting radiologists, of whom 24 were defined as high-volume readers (ie, those who interpreted more than 5000 annual screening mammograms). A true-positive finding was defined as the presence of a pathology-confirmed cancer within 12 months. Performance benchmarks included sensitivity and specificity, examined per quartile of radiologists' performance. First-reader sensitivity was determined for each tumor subgroup, overall and by quartile of high-volume reader sensitivity. Screening outcomes were examined based on the first reader's sensitivity quartile with 10 000 screening mammograms per quartile. Linear regression models were fitted to test for a linear trend across quartiles of performance.

Results: A total of 418 041 women (mean age, 54 years ± 10 [standard deviation]) were included, and 1 186 045 digital mammograms were evaluated, with 972 899 assessed by high-volume readers. Overall sensitivity was 73% (95% confidence interval [CI]: 69%, 77%), and overall specificity was 96% (95% CI: 95%, 97%). The mean values per quartile of high-volume reader performance ranged from 63% to 84% for sensitivity and from 95% to 98% for specificity. The sensitivity difference was very large for basal cancers, with the least sensitive and most sensitive high-volume readers detecting 53% and 89% of cancers, respectively (P < .001).

Conclusion: Benchmarks showed a wide range of performance differences between high-volume readers. Sensitivity varied by tumor characteristics.

© RSNA, 2020
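The benchmark measures named in the abstract (per-reader sensitivity and specificity, means per quartile of reader performance, and a linear trend across quartiles) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the `ReaderCounts` class, the helper functions, the equal-size quartile split, and the toy numbers are all assumptions made for illustration, and the abstract does not specify how the regression models were parameterized.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ReaderCounts:
    """Hypothetical per-reader confusion counts (not data from the study)."""
    tp: int  # cancers recalled by the reader
    fn: int  # cancers the reader did not recall
    tn: int  # cancer-free examinations not recalled
    fp: int  # cancer-free examinations recalled

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def quartile_means(values):
    """Sort values, split them into four near-equal groups, and average each.

    Assumes at least four values; the equal-size split is an illustrative choice.
    """
    ordered = sorted(values)
    n = len(ordered)
    bounds = [round(i * n / 4) for i in range(5)]
    return [mean(ordered[bounds[i]:bounds[i + 1]]) for i in range(4)]


def linear_trend_slope(ys):
    """Ordinary least-squares slope of ys regressed on group index 1..k."""
    xs = list(range(1, len(ys) + 1))
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Toy example: one reader's counts and eight made-up reader sensitivities.
reader = ReaderCounts(tp=73, fn=27, tn=960, fp=40)
toy_sensitivities = [0.60, 0.62, 0.70, 0.72, 0.75, 0.77, 0.83, 0.85]
per_quartile = quartile_means(toy_sensitivities)
slope = linear_trend_slope(per_quartile)
print(reader.sensitivity, reader.specificity, per_quartile, slope)
```

A positive slope here only reflects that the groups were sorted by sensitivity; in the study, the trend tests were applied to outcomes examined across quartiles of reader performance.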


Source
http://dx.doi.org/10.1148/radiol.2020192212


Similar Publications

Purpose: The ability to accurately detect and characterize intramammary micro- and macrocalcifications without ionized radiation has significant clinical implications for early breast cancer assessment. The aim of this prospective study was to investigate the feasibility of detecting intramammary calcifications using 3D multi-echo gradient echo (ME-GRE) magnitude and true susceptibility-weighted images (tSWI) compared to digital mammography (DM) in patients with different breast sizes and densities of breast parenchyma at 1.5T.


Cost-effectiveness of genetic risk-stratified screening for breast cancer in Taiwan.

Breast

August 2025

Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan. Electronic address:

Background: Risk-stratified breast screening has gained international attention, as individualized risk assessments can inform screening initiation, frequency, and whether to screen. In this study, we evaluated the cost-effectiveness of risk-stratified screening based on genetic testing for breast cancer-associated single nucleotide polymorphisms (SNPs) compared to the current age-based screening program in Taiwan.

Methods: A Markov model was used to estimate lifetime health outcomes and costs for 35-year-old Taiwanese women without a family history of breast cancer.


Background: Breast cancer is the most common cancer among women and a leading cause of mortality in Europe. Early detection through screening reduces mortality, yet participation in mammography-based programs remains suboptimal due to discomfort, radiation exposure, and accessibility issues. Thermography, particularly when driven by artificial intelligence (AI), is being explored as a noninvasive, radiation-free alternative.


The 2019 novel coronavirus disease (COVID-19) has brought to the forefront racial disparities in health outcomes across the US, but there is limited formal analysis into factors associated with these disparities. In-depth examination of COVID-19 disparities has been challenging due to inconsistent case definition, isolation procedures, and incomplete racial and medical information. As of June 2020, over 14,000 (25%) confirmed COVID-19 cases in Georgia did not have racial information.


Background: In contrast-enhanced digital mammography (CEDM) and contrast-enhanced digital breast tomosynthesis (CEDBT), low-energy (LE) and high-energy (HE) images are acquired after injection of iodine contrast agent. Weighted subtraction is then applied to generate dual-energy (DE) images, where normal breast tissues are suppressed, leaving iodinated objects enhanced. Currently, clinical systems employ a dual-shot (DS) method, where LE and HE images are acquired with two separate exposures.
