Background: Founder populations have an important role in the study of genetic diseases. Access to detailed genealogical records is often one of their advantages. These genealogical data provide unique information for researchers in evolutionary and population genetics, demography and genetic epidemiology. However, analyzing large genealogical datasets requires specialized methods and software. The GENLIB software was developed to study the large genealogies of the French Canadian population of Quebec, Canada. These genealogies are accessible through the BALSAC database, which contains over 3 million records covering the whole province of Quebec over four centuries. Using this resource, extended pedigrees of up to 17 generations can be constructed from a sample of present-day individuals.
Results: We have extended and implemented GENLIB as a package in the R environment for statistical computing and graphics, thus allowing optimal flexibility for users. The GENLIB package includes basic functions to manage genealogical data allowing, for example, extraction of a part of a genealogy or selection of specific individuals. There are also many functions providing information to describe the size and complexity of genealogies as well as functions to compute standard measures such as kinship, inbreeding and genetic contribution. GENLIB also includes functions for gene-dropping simulations. The goal of this paper is to present the full functionalities of GENLIB. We used a sample of 140 individuals from the province of Quebec (Canada) to demonstrate GENLIB's functions. Ascending genealogies for these individuals were reconstructed using BALSAC, yielding a large pedigree of 41,523 individuals. Using GENLIB's functions, we provide a detailed description of these genealogical data in terms of completeness, genetic contribution of founders, relatedness, inbreeding and the overall complexity of the genealogical tree. We also present gene-dropping simulations based on the whole genealogy to investigate identical-by-descent sharing of alleles and chromosomal segments of different lengths and estimate probabilities of identical-by-descent sharing.
Conclusions: The R package GENLIB provides a user-friendly and flexible environment for analyzing extensive genealogical data. It allows efficient and easy integration of different types of data, analytical methods and additional developments, making this tool ideal for genealogical analysis.
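The kinship, inbreeding and gene-dropping measures named in the Results can be illustrated on a toy pedigree. The following is a minimal Python sketch, not GENLIB's R implementation or API: the pedigree, the function names (`kinship`, `inbreeding`, `gene_drop`, `simulated_inbreeding`) and the ID convention (parents always have smaller IDs than their children, and founders have both parents unknown) are assumptions made for illustration only.

```python
import random
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (father, mother).
# Founders have (None, None). IDs are assumed to increase with birth order.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),  # child of full siblings
}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Probability that one allele drawn at random from a and one from b are IBD."""
    if a is None or b is None:
        return 0.0  # unknown parents are treated as unrelated to everyone
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    # Recurse through the parents of the younger individual (larger ID).
    if a < b:
        a, b = b, a
    father, mother = PEDIGREE[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))


def inbreeding(ind):
    """Inbreeding coefficient: kinship between the individual's parents."""
    father, mother = PEDIGREE[ind]
    return kinship(father, mother)


def gene_drop(rng):
    """One gene-dropping replicate: founders get unique alleles, children inherit at random."""
    genotypes = {}
    for ind in sorted(PEDIGREE):  # parents are processed before their children
        father, mother = PEDIGREE[ind]
        if father is None and mother is None:
            genotypes[ind] = ((ind, 0), (ind, 1))  # two distinct founder alleles
        else:
            genotypes[ind] = (
                rng.choice(genotypes[father]),
                rng.choice(genotypes[mother]),
            )
    return genotypes


def simulated_inbreeding(ind, n_sims=20000, seed=1):
    """Share of replicates in which both alleles of ind descend from the same founder allele."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        paternal, maternal = gene_drop(rng)[ind]
        if paternal == maternal:
            hits += 1
    return hits / n_sims
```

In this toy pedigree, `kinship(3, 4)` is the full-sibling value of 0.25, which is also `inbreeding(5)`. The gene-dropping estimate from `simulated_inbreeding(5)` should approach the same value as the number of replicates grows, which is the sense in which simulation can estimate identical-by-descent sharing probabilities on pedigrees too complex for exact recursion.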
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4431039 | PMC |
http://dx.doi.org/10.1186/s12859-015-0581-5 | DOI Listing |
Genetics
September 2025
Institute of Ecology and Evolution, School of Biological Sciences, The University of Edinburgh, Edinburgh, EH9 3FL, United Kingdom.
Recent advances in methods to infer and analyse ancestral recombination graphs (ARGs) are providing powerful new insights in evolutionary biology and beyond. Existing inference approaches tend to be designed for use with fully-phased datasets, and some rely on model assumptions about demography and recombination rate. Here I describe a simple model-free approach for genealogical inference along the genome from unphased genotype data called Sequential Tree Inference by Collecting Compatible Sites (sticcs).
Genetics
September 2025
Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
Phantom epistasis arises when, in the course of testing for gene-by-gene interactions, the omission of a causal variant with a purely additive effect on the phenotype causes the spurious inference of a significant interaction between two SNPs. This is more likely to arise when the two SNPs are in relatively close proximity, so while true epistasis between nearby variants could be commonplace, in practice there is no reliable way of telling apart true epistatic signals from false positives. By considering the causes of phantom epistasis from a genealogy-based perspective, we leverage the rich information contained within reconstructed genealogies (in the form of ancestral recombination graphs) to address this problem.
PLoS One
August 2025
Department of Digital Humanities, University of Helsinki, Helsinki, Finland.
Investigating linguistic relationships on a global scale requires analyzing diverse features such as syntax, phonology and prosody, which evolve at varying rates influenced by internal diversification, language contact, and sociolinguistic factors. Recent advances in machine learning (ML) offer complementary alternatives to traditional historical and typological approaches. Instead of relying on expert labor in analyzing specific linguistic features, these new methods enable the exploration of linguistic variation through embeddings derived directly from speech, opening new avenues for large-scale, data-driven analyses.
Mol Biol Evol
July 2025
Department of Genetics, Evolution, and Environment, University College London, Gower Street, London WC1E 6BT, UK.
The multispecies coalescent (MSC) model accounts for genealogical fluctuations across the genome and provides a framework for analyzing genomic data from closely related species to estimate species phylogenies and divergence times, infer interspecific gene flow, and delineate species boundaries. As the MSC model assumes correct sequences, sequencing and genotyping errors at low read depths may be a serious concern. Here, we use computer simulation to assess the impact of genotyping errors in phylogenomic data on Bayesian inference of the species tree and population parameters such as species split times, population sizes, and the rate of gene flow.
Commun Biol
August 2025
Département des sciences fondamentales, Université du Québec à Chicoutimi, Saguenay, Québec, Canada.
Founder events influenced the genetic diversity within the province of Quebec, increasing the frequency of certain rare pathogenic variants in regional populations. Some regions, such as Beauce, remain understudied despite evidence of a regional founder effect. Leveraging extensive genealogical data, we observe a specific regional structure emerging in Beauce following the initial settlement.