Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365790 | PMC |
| http://dx.doi.org/10.1016/j.media.2024.103224 | DOI Listing |
IEEE Trans Comput Biol Bioinform
January 2025
With the rapid growth of high-resolution microscopy imaging data, current protein subcellular localization methods often face the problem of imbalanced, long-tailed distributions in large-scale protein data. To address this challenge, this paper proposes a self-supervised pre-training method called MC-MSTLoc. Aiming to maximize feature consistency and inconsistency in microscopy imaging data, the pre-training scheme is based on contrastive tasks at the scale and view levels, which substantially improves the quality of the learned feature representations.
Med Image Anal
July 2025
Department of Population Health Sciences, Weill Cornell Medicine, NY, USA.
The CXR-LT series is a community-driven initiative designed to enhance lung disease classification using chest X-rays (CXR). It tackles challenges in open long-tailed lung disease classification and enhances the measurability of state-of-the-art techniques. The first event, CXR-LT 2023, aimed to achieve these goals by providing high-quality benchmark CXR data for model development and conducting comprehensive evaluations to identify ongoing issues impacting lung disease classification performance.
ArXiv
June 2025
Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
The CXR-LT series is a community-driven initiative designed to enhance lung disease classification using chest X-rays (CXR). It tackles challenges in open long-tailed lung disease classification and enhances the measurability of state-of-the-art techniques. The first event, CXR-LT 2023, aimed to achieve these goals by providing high-quality benchmark CXR data for model development and conducting comprehensive evaluations to identify ongoing issues impacting lung disease classification performance.
Med Image Anal
October 2024
Department of Population Health Sciences, Weill Cornell Medicine, 10065, New York, NY, USA.
Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification.
Neural Netw
March 2024
Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, China; School of Data Science, The Chinese University of Hong Kong, Shenzhen, China.
Document-level relation extraction faces two often overlooked challenges: the long-tail problem and the multi-label problem. Previous work focuses mainly on obtaining better contextual representations for entity pairs and hardly addresses these challenges. In this paper, we analyze the co-occurrence correlation of relations and introduce it into the document-level relation extraction task for the first time.