Citations

20

Article Abstract

We introduce a within-sample SNP calling method, called the "butterfly method", that improves the quality of SNP calling with the Illumina Infinium Omni5-4 SNP Kit. It does so by improving how no-calls are determined from allele signal intensities. High confidence in SNP allele calling is extremely important in forensic genetics and clinical diagnostics. This paper is accompanied by two open-source R packages, omni54manifest and snpbeadchip, that make SNP calling easy by helping with bookkeeping and giving easy access to meta-information about the SNPs typed with the Illumina Infinium Omni5-4 Kit (including chromosome, probe type, and SNP bases). We compared the results from our method with those obtained with the Illumina GenomeStudio software (which does not provide sample- and SNP-specific genotype probabilities or other quality measures), and with whole-genome sequencing (WGS). Given the signal intensities, the SNP calling quality was optimised using a threshold for the a posteriori probability of a SNP belonging to a SNP cluster. By lowering the a posteriori probability threshold for no-calls, we obtained a higher call rate than GenomeStudio. Using a higher a posteriori probability threshold, we achieved a higher concordance with the WGS data than GenomeStudio. Our method had SNP call and concordance rates with WGS data of approximately 99%.
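The trade-off between call rate and confidence described in the abstract can be illustrated with a minimal Python sketch. It is not the butterfly method itself and does not use the omni54manifest or snpbeadchip packages. It assumes that each SNP already has a posteriori probabilities for three hypothetical genotype clusters (AA, AB, BB), and it reports a no-call whenever the largest probability falls below a chosen threshold. The function names, the cluster labels, and the example probabilities are all invented for illustration.

```python
from typing import List, Optional, Sequence

# Hypothetical cluster labels; the paper's own cluster model is not reproduced here.
GENOTYPES = ("AA", "AB", "BB")


def call_genotype(posteriors: Sequence[float], threshold: float) -> Optional[str]:
    """Return the most probable genotype, or None (a no-call) if its
    a posteriori probability is below the threshold."""
    if len(posteriors) != len(GENOTYPES):
        raise ValueError("expected one probability per genotype cluster")
    best = max(range(len(GENOTYPES)), key=lambda i: posteriors[i])
    return GENOTYPES[best] if posteriors[best] >= threshold else None


def call_rate(calls: Sequence[Optional[str]]) -> float:
    """Fraction of SNPs that received a genotype call (not a no-call)."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


# Invented example probabilities for four SNPs in one sample.
example_posteriors = [
    (0.999, 0.001, 0.000),
    (0.020, 0.970, 0.010),
    (0.550, 0.440, 0.010),
    (0.000, 0.080, 0.920),
]

# A lower threshold yields more calls; a higher one keeps only confident calls.
lenient: List[Optional[str]] = [call_genotype(p, 0.90) for p in example_posteriors]
strict: List[Optional[str]] = [call_genotype(p, 0.99) for p in example_posteriors]

print(lenient, call_rate(lenient))
print(strict, call_rate(strict))
```

With these made-up numbers, the lenient threshold calls three of the four SNPs, while the strict threshold calls only the first. This mirrors the abstract's point that lowering the threshold raises the call rate, whereas raising it keeps only the calls most likely to agree with a reference such as WGS.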

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9556601
DOI: http://dx.doi.org/10.1038/s41598-022-22162-8

Publication Analysis

Top Keywords

snp calling: 16
snp: 12
illumina infinium: 12
infinium omni5-4: 12
posteriori probability: 12
snp allele: 8
allele calling: 8
calling illumina: 8
signal intensities: 8
probability threshold: 8

Similar Publications

Discriminatory power of the Precision ID GlobalFiler™ NGS STR panel v2 in monozygotic twins for forensic applications.

Sci Justice

September 2025

Departamento de Medicina Legal, Bioética, Medicina do Trabalho e Medicina Física e Reabilitação, Faculdade de Medicina FMUSP, Universidade de São Paulo, São Paulo, SP, Brazil. Electronic address:

Short Tandem Repeats (STRs) are the standard technique used in forensic genetics for individual identification due to their high polymorphism and robustness. Although Capillary Electrophoresis (CE) enables the analysis of many STRs, Next-Generation Sequencing (NGS) offers enhanced resolution and the ability to detect STRs' isoalleles and their flanking regions, enhancing the discrimination power of this analysis. Despite the fact that STR kits for NGS are well standardized for evaluating forensic samples, there is no data on their effectiveness in differentiating monozygotic (MZ) twins, which are indistinguishable by CE.

Bat guano may contain zoonotic parasites that contaminate the environment and/or serve as a potential source of infection to humans and animals. Repeated bat-human exposure could be a risk factor for zoonosis. To date, knowledge on the status of bat gastrointestinal parasites (GIPs) in Uganda is limited.

is an opportunistic yeast pathogen that can cause life-threatening infections in immunocompromised humans. Whole-genome sequencing studies of the species have demonstrated remarkably low diversity, with strains typically differing by about 1.5 single nucleotide polymorphisms (SNPs) per 10 kb.

Making sense of whole-genome polymorphism data is challenging, but it is essential for overcoming the biases in SNP data. Here we analyze 27 genomes of Arabidopsis thaliana to illustrate these issues. Genome size variation is mostly due to tandem repeat regions that are difficult to assemble.

Accurate calling of parental-child SNPs and Indels in family trios is very helpful for understanding genetic traits and diseases. Indel calling is even more important than SNP calling, as Indels may have led to substantial changes in protein structures that affect more of the traits of the organism. However, the best Indel calling methods have recall rates below 85%, precision below 92%, and F1 below 88% on 60× ONT Q20 data, much lower than their SNP calling's recall performance of 99.
