Stepwise classification of cancer samples using clinical and molecular data.

BMC Bioinformatics

Department of Epidemiology and Biostatistics, VU University Medical Center, Amsterdam, The Netherlands.

Published: October 2011


Article Abstract

Background: Combining clinical and molecular data types may improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient.

Results: We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We turn to the relatively expensive molecular data only when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that needs to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples.
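The two-stage decision described above can be illustrated with a minimal Python sketch. It is not the stepwiseCM implementation, and the abstract does not say how uncertainty is measured. The sketch assumes a binary problem in which the clinical classifier returns a probability for the positive class, and it takes uncertainty to be how close that probability is to 0.5. The names `clinical_uncertainty`, `stepwise_classify`, `molecular_classifier` and `uncertainty_limit` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def clinical_uncertainty(p_positive: float) -> float:
    """Assumed uncertainty measure: 0 when p is 0 or 1, 1 when p is 0.5."""
    return 1.0 - 2.0 * abs(p_positive - 0.5)


def stepwise_classify(
    clinical_probs: Sequence[float],
    molecular_classifier: Callable[[int], int],
    uncertainty_limit: float,
) -> List[Tuple[int, str]]:
    """Return (label, source) per sample.

    clinical_probs[i] is the clinical classifier's probability that
    sample i is positive. molecular_classifier(i) stands in for a
    classifier trained on molecular data; it is only called (so the
    molecular test is only "ordered") when the clinical prediction
    is too uncertain.
    """
    results = []
    for i, p in enumerate(clinical_probs):
        if clinical_uncertainty(p) > uncertainty_limit:
            # Clinical prediction too uncertain: fall back to molecular data.
            results.append((molecular_classifier(i), "molecular"))
        else:
            # Clinical prediction is confident enough: keep it.
            results.append((1 if p >= 0.5 else 0, "clinical"))
    return results


def fraction_reclassified(results: Sequence[Tuple[int, str]]) -> float:
    """Proportion of samples that needed the molecular classifier."""
    if not results:
        return 0.0
    return sum(1 for _, src in results if src == "molecular") / len(results)
```

Under these assumptions, raising `uncertainty_limit` sends fewer samples to the molecular stage and lowering it sends more, which is one simple way to trade test cost against expected accuracy.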

Conclusions: Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may reduce waiting times for diagnosis and hence lower the patients' distress. Stepwise classification is implemented in the R package stepwiseCM, which is available at the Bioconductor website.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3221726
DOI: http://dx.doi.org/10.1186/1471-2105-12-422
