We introduce a within-sample SNP calling method, called the "butterfly method", that improves the quality of SNP calling with the Illumina Infinium Omni5-4 SNP Kit by improving how no-calls are determined from allele signal intensities. High confidence in SNP allele calling is extremely important in forensic genetics and clinical diagnostics. This paper is accompanied by two open-source R packages, omni54manifest and snpbeadchip, that make SNP calling easy by helping with bookkeeping and giving easy access to meta-information about the SNPs typed with the Illumina Infinium Omni5-4 Kit (including chromosome, probe type, and SNP bases). We compared the results from our method with those obtained with the Illumina GenomeStudio software (which does not provide sample- and SNP-specific genotype probabilities or other quality measures) and with whole-genome sequencing (WGS). Given the signal intensities, the SNP calling quality was optimised using a threshold for the a posteriori probability of a SNP belonging to a SNP cluster. By lowering the a posteriori probability threshold for no-calls, we obtained a higher call rate than GenomeStudio. Using a higher a posteriori probability threshold, we achieved a higher concordance with the WGS data than GenomeStudio. Our method had SNP call and concordance rates with WGS data of approximately 99%.
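The trade-off between call rate and concordance described above can be illustrated with a minimal Python sketch. It is not the butterfly method itself and does not use the API of omni54manifest or snpbeadchip: it assumes the per-SNP a posteriori probabilities for three genotype clusters (labelled AA, AB, BB here) have already been estimated, and the function names, cluster labels, and the default threshold are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical cluster labels; the source only speaks of "SNP clusters".
GENOTYPES = np.array(["AA", "AB", "BB"])


def call_genotypes(posteriors, threshold=0.99):
    """Call the most probable cluster per SNP, or None (no-call) when the
    largest a posteriori probability falls below the threshold."""
    posteriors = np.asarray(posteriors, dtype=float)
    best = posteriors.argmax(axis=1)        # index of most probable cluster
    best_prob = posteriors.max(axis=1)      # its a posteriori probability
    calls = GENOTYPES[best].astype(object)  # object dtype so None can be stored
    calls[best_prob < threshold] = None     # below threshold: no-call
    return calls


def call_rate(calls):
    """Fraction of SNPs that received a genotype call."""
    return float(np.mean([c is not None for c in calls]))


def concordance(calls, reference):
    """Fraction of called SNPs that agree with a reference (e.g. WGS)."""
    pairs = [(c, r) for c, r in zip(calls, reference) if c is not None]
    if not pairs:
        return float("nan")
    return sum(c == r for c, r in pairs) / len(pairs)


# Made-up posteriors for three SNPs and a made-up reference genotype list.
posteriors = [
    [0.995, 0.004, 0.001],
    [0.600, 0.390, 0.010],
    [0.020, 0.030, 0.950],
]
reference = ["AA", "AB", "BB"]

strict = call_genotypes(posteriors, threshold=0.99)
lenient = call_genotypes(posteriors, threshold=0.50)
```

With these made-up numbers, the strict threshold calls only the first SNP, so the call rate is low but every call matches the reference. The lenient threshold calls all three SNPs, one of which disagrees with the reference. This is the same direction of trade-off the abstract reports when the threshold is lowered or raised.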
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9556601 | PMC |
| http://dx.doi.org/10.1038/s41598-022-22162-8 | DOI Listing |
Sci Justice
September 2025
Departamento de Medicina Legal, Bioética, Medicina do Trabalho e Medicina Física e Reabilitação, Faculdade de Medicina FMUSP, Universidade de São Paulo, São Paulo, SP, Brazil.
Short Tandem Repeats (STRs) are the standard technique used in forensic genetics for individual identification due to their high polymorphism and robustness. Although Capillary Electrophoresis (CE) enables the analysis of many STRs, Next-Generation Sequencing (NGS) offers enhanced resolution and the ability to detect STRs' isoalleles and their flanking regions, enhancing the discrimination power of this analysis. Despite the fact that STR kits for NGS are well standardized for evaluating forensic samples, there is no data on their effectiveness in differentiating monozygotic (MZ) twins, which are indistinguishable by CE.
J Parasitol Res
August 2025
Department of Zoology, Entomology and Fisheries Sciences, College of Natural Sciences, Makerere University, Kampala, Uganda.
Bat guano may contain zoonotic parasites that contaminate the environment and/or serve as a potential source of infection to humans and animals. Repeated bat-human exposure could be a risk factor for zoonosis. To date, knowledge on the status of bat gastrointestinal parasites (GIPs) in Uganda is limited.
mBio
August 2025
School of Biomolecular and Biomedical Sciences, Conway Institute, University College Dublin, Dublin, Ireland.
is an opportunistic yeast pathogen that can cause life-threatening infections in immunocompromised humans. Whole-genome sequencing studies of the species have demonstrated remarkably low diversity, with strains typically differing by about 1.5 single nucleotide polymorphisms (SNPs) per 10 kb.
Nat Genet
August 2025
Gregor Mendel Institute, Austrian Academy of Sciences, Vienna, Austria.
Making sense of whole-genome polymorphism data is challenging, but it is essential for overcoming the biases in SNP data. Here we analyze 27 genomes of Arabidopsis thaliana to illustrate these issues. Genome size variation is mostly due to tandem repeat regions that are difficult to assemble.
Brief Bioinform
July 2025
Faculty of Computer Science and Control Engineering, Shenzhen University of Advanced Technology, Shenzhen, 518000, Guangdong, China.
Accurate calling of parental-child SNPs and Indels in family trios is very helpful for understanding genetic traits and diseases. Indel calling is even more important than SNP calling, as Indels may have led to substantial changes in protein structures that affect more of the traits of the organism. However, the best Indel calling methods have recall rates below 85%, precision below 92%, and F1 below 88% on $60\times$ ONT Q20 data, much lower than their SNP calling's recall performance of 99.