Low-coverage sequencing refers to sequencing the DNA of individuals to a low depth of coverage (e.g., 0.5X) and imputing the full genomic sequence from those reads using reference haplotypes from individuals sequenced to a high depth of coverage (e.g., ≥ 10X). It has been proposed as an alternative to genotyping by SNP arrays. At least one commercial product based on it is available for agricultural species. Two concerns limit adoption in its current form: 1) the cost of storing the huge volume of data it generates and 2) whether that additional data will result in improved accuracy of genetic evaluation. This work envisions a future implementation of low-coverage sequencing that reduces storage costs and enhances genetic evaluations by leveraging the additional information in the full sequence of the pangenome to account for more genetic variation. We propose addressing the storage issue by representing the genomic sequence of an individual as a pair of haplotype arrays, with each element pointing to an enumerated haplotype of the sequence within one of approximately 50,000 defined genome segments. Assuming 60 million genomic variants, the infrastructure required to translate the identifier of any enumerated haplotype into its genomic sequence would occupy less than 10 gigabytes of binary storage. Each haplotype array element would require 2 bytes, so the marginal binary storage required to represent the genomic sequence of an individual would be about 200 kilobytes (KB), similar to the genotypes from a SNP array with 200,000 markers. This estimate assumes no pedigree information and no ambiguity in the imputation, although the latter assumption is unrealistic. Strategies to minimize ambiguity and, when necessary, to manage and efficiently represent it are proposed. The genomic sequence of an individual could be stored in about 1 KB (binary) if both parents have unambiguous sequence stored as described above.
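The storage estimate above can be checked with a short sketch. The following Python is only an illustration of the idea, not the authors' implementation: the function names, the `toy_catalog` lookup structure, and the three-segment example are all hypothetical. Only the segment count (about 50,000) and the 2-byte element size come from the abstract. A 2-byte unsigned identifier can distinguish up to 65,536 enumerated haplotypes per segment.

```python
from array import array

N_SEGMENTS = 50_000     # approximate number of genome segments (from the abstract)
BYTES_PER_ELEMENT = 2   # each array element is a 2-byte haplotype identifier (from the abstract)
HAPLOTYPES_PER_INDIVIDUAL = 2  # one array per inherited haplotype


def individual_storage_bytes(n_segments=N_SEGMENTS):
    """Marginal storage for one individual, ignoring pedigree and ambiguity."""
    return HAPLOTYPES_PER_INDIVIDUAL * n_segments * BYTES_PER_ELEMENT


def new_haplotype_array(n_segments=N_SEGMENTS):
    """Hypothetical container: one unsigned 16-bit identifier per segment."""
    return array("H", [0]) * n_segments


def decode(hap_array, catalog):
    """Translate haplotype identifiers into sequence.

    `catalog[segment][hap_id]` is an assumed lookup structure standing in for
    the shared infrastructure that maps an identifier to its sequence.
    """
    return "".join(catalog[seg][hap_id] for seg, hap_id in enumerate(hap_array))


# Toy example with three segments and two enumerated haplotypes per segment.
toy_catalog = [
    ["AC", "AT"],
    ["G", "GG"],
    ["TTA", "TCA"],
]
first_haplotype = array("H", [0, 1, 1])
second_haplotype = array("H", [1, 0, 0])

if __name__ == "__main__":
    print(individual_storage_bytes())            # 200000 bytes, i.e. about 200 KB
    print(decode(first_haplotype, toy_catalog))  # ACGGTCA
    print(decode(second_haplotype, toy_catalog)) # ATGTTA
```

The per-individual cost depends only on the number of segments, not on the number of variants; the variant-level detail lives once in the shared catalog, which is what keeps the marginal storage comparable to a 200,000-marker SNP array.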
The proposed system for representing the pangenome includes algorithms for read mapping and imputation intended to leverage all known genetic variation in the target population. It is also designed to use the sequencing reads generated for imputing the genomic sequence of new individuals to identify previously unrecognized mutations, crossovers, and structural variants, thus continuously improving the genome representation, especially if widespread use of low-coverage sequencing in livestock industries is realized. This could make improvement of the genetic merit and management of livestock feasible without imposing a computational burden.
DOI: http://dx.doi.org/10.1093/jas/skaf294
Eur J Clin Microbiol Infect Dis
September 2025
School of Bioengineering and Biosciences, Department of Biochemistry, Lovely Professional University, Punjab, 144411, India.
Purpose: This study investigates codon usage and amino acid usage bias in the genus Acinetobacter to uncover the evolutionary forces shaping these patterns and their implications for pathogenicity and biotechnology.
Methods: Codon usage patterns were examined in representative genomes of the genus Acinetobacter using standard codon bias indices, including GC content, relative synonymous codon usage (RSCU), effective number of codons (ENC), and codon adaptation index (CAI). Neutrality and parity plots were employed to evaluate the relative influence of mutational pressure and natural selection on codon preferences.
Planta
September 2025
Department of Biology, University of Naples Federico II, Via Cinthia 26, 80126, Naples, Italy.
The first complete plastid genome of the critically endangered species Valeriana trinervis was sequenced, assembled and compared with other published Valeriana plastomes. In this study, we assembled the plastid genome of the critically endangered, endemic species Valeriana trinervis (= Centranthus trinervis) and compared it with all published plastomes of Valeriana. We found not only differences in the inverted repeat boundaries and in the type and abundance of repeats, but also similarities in codon usage and microsatellite numbers.
Funct Integr Genomics
September 2025
Department of Plastic Surgery, the First Affiliated Hospital of Fujian Medical University, Fuzhou, 350005, China.
Keloid scarring and Metabolic Syndrome (MS) are distinct conditions marked by chronic inflammation and tissue dysregulation, suggesting shared pathogenic mechanisms. Identifying common regulatory genes could unveil novel therapeutic targets. Methods.
Mar Biotechnol (NY)
September 2025
Yazhou Bay Innovation Institute, Hainan Tropical Ocean University, Sanya, China.
Epinephelus tukula is an economically important aquaculture animal, and a major parent in grouper crossbreeding. To better preserve and exploit E. tukula germplasm resources, a core collection (containing 34 individuals derived from 10 genetic groups) was first constructed based on phenotypic growth traits and whole-genome resequencing (WGS) data.
Funct Integr Genomics
September 2025
The First Clinical Medical College, Yunnan University of Chinese Medicine, Kunming, China.
Ischemic stroke (IS) has high morbidity/mortality with limited treatments. This study screened core copper homeostasis-related genes in IS and validated their function as precise intervention targets. Human IS gene chip data were retrieved from GEO, and copper homeostasis genes from multiple databases.