Publications by Ira Hall | LitMetric

Publications by authors named "Ira Hall"

Clade Distillation for Genome-wide Association Studies.

Ryan Christ, Xinxin Wang, Louis J M Aslett, David Steinsaltz, Ira Hall

Genetics

August 2025

Testing inferred haplotype genealogies for association with phenotypes has been a longstanding goal in human genetics given their potential to detect association signals driven by allelic heterogeneity (when multiple causal variants modulate a phenotype) in both coding and noncoding regions. Recent scalable methods for inferring locus-specific genealogical trees along the genome, or representations thereof, have made substantial progress towards this goal; however, the problem of testing these trees for association with phenotypes has remained unsolved due to the growth in the number of clades with increasing sample size. To address this issue, we introduce several practical improvements to the kalis ancestry inference engine, including a general optimal checkpointing algorithm for decoding hidden Markov models, thereby enabling efficient genome-wide analyses.

Single cell variant to enhancer to gene map for coronary artery disease.

Junedh M Amrute, Paul C Lee, Ittai Eres, Chang Jie Mick Lee, Andrea Bredemeyer, Ira M Hall

medRxiv

November 2024

Article Synopsis

The study connects genetic variants linked to coronary artery disease (CAD) with cellular and molecular traits by analyzing chromatin accessibility and gene expression in human coronary arteries.
Through single-cell analysis, researchers identified thousands of specific chromatin accessibility loci (caQTLs) and found that smooth muscle cells (SMCs) are particularly susceptible to genetic risks associated with CAD.
They developed a comprehensive mapping approach to trace disease variants to potential causal genes across different cell types and confirmed their findings using advanced techniques like genome-wide Hi-C and CRISPR interference.

Semi-supervised machine learning method for predicting homogeneous ancestry groups to assess Hardy-Weinberg equilibrium in diverse whole-genome sequencing studies.

Derek Shyr, Rounak Dey, Xihao Li, Hufeng Zhou, Eric Boerwinkle, Ira Hall

Am J Hum Genet

October 2024

Large-scale, multi-ethnic whole-genome sequencing (WGS) studies, such as the National Human Genome Research Institute Genome Sequencing Program's Centers for Common Disease Genomics (CCDG), play an important role in increasing diversity for genetic research. Before performing association analyses, assessing Hardy-Weinberg equilibrium (HWE) is a crucial step in quality control procedures to remove low quality variants and ensure valid downstream analyses. Diverse WGS studies contain ancestrally heterogeneous samples; however, commonly used HWE methods assume that the samples are homogeneous.
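
To make the homogeneity assumption concrete, here is a minimal sketch of the textbook chi-square HWE check at a single biallelic variant. It is not the semi-supervised method of the paper; the function name `hwe_chi_square` and the genotype counts are hypothetical and chosen for illustration. Two groups that are each in exact equilibrium, but with different allele frequencies, show a large departure once pooled.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for departure from
    Hardy-Weinberg proportions at one biallelic variant.

    Arguments are observed genotype counts: homozygous A, heterozygous,
    homozygous B. Illustrative helper, not taken from the paper.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("need at least one genotyped sample")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    # Skip classes with zero expectation (monomorphic variant).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts: each group alone is in exact equilibrium.
group_1 = (81, 18, 1)   # allele A frequency 0.9
group_2 = (1, 18, 81)   # allele A frequency 0.1
pooled = tuple(a + b for a, b in zip(group_1, group_2))

print(hwe_chi_square(*group_1))
print(hwe_chi_square(*group_2))
print(hwe_chi_square(*pooled))   # large: heterozygote deficit from pooling
```

In the pooled sample the deficit of heterozygotes comes from population structure alone, which is why a variant of good quality can fail an HWE filter that assumes homogeneous samples.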

Whole-genome sequencing uncovers two loci for coronary artery calcification and identifies ARSE as a regulator of vascular calcification.

Paul S de Vries, Matthew P Conomos, Kuldeep Singh, Christopher J Nicholson, Deepti Jain, Ira M Hall

Nat Cardiovasc Res

December 2023

Article Synopsis

Coronary artery calcification (CAC) is linked to heart disease; the study examined it through a genome-wide association study (GWAS) involving 22,400 participants from various backgrounds.
The study confirmed connections with four known genetic loci and discovered two new loci related to CAC, with supportive replication findings for both.
Functional tests suggest that ARSE promotes calcification in vascular smooth muscle cells and its variants may influence CAC levels, identifying ARSE as a key target for potential treatments in vascular calcific diseases.

A draft human pangenome reference.

Wen-Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness, Tobias Marschall, Ira M Hall

Nature

May 2023

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.

Semi-automated assembly of high-quality diploid human reference genomes.

Erich D Jarvis, Giulio Formenti, Arang Rhie, Andrea Guarracino, Chentao Yang, Tobias Marschall, Ira Hall

Nature

November 2022

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome.

Integrating transcriptomics, metabolomics, and GWAS helps reveal molecular mechanisms for metabolite levels and disease risk.

Xianyong Yin, Debraj Bose, Annie Kwon, Sarah C Hanks, Anne U Jackson, Ira M Hall

Am J Hum Genet

October 2022

Transcriptomics data have been integrated with genome-wide association studies (GWASs) to help understand disease/trait molecular mechanisms. The utility of metabolomics, integrated with transcriptomics and disease GWASs, to understand molecular mechanisms for metabolite levels or diseases has not been thoroughly evaluated. We performed probabilistic transcriptome-wide association and locus-level colocalization analyses to integrate transcriptomics results for 49 tissues in 706 individuals from the GTEx project, metabolomics results for 1,391 plasma metabolites in 6,136 Finnish men from the METSIM study, and GWAS results for 2,861 disease traits in 260,405 Finnish individuals from the FinnGen study.

High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios.

Marta Byrska-Bishop, Uday S Evani, Xuefang Zhao, Anna O Basile, Haley J Abel, Ira M Hall

Cell

September 2022

The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina.

The Human Pangenome Project: a global resource to map genomic diversity.

Ting Wang, Lucinda Antonacci-Fulton, Kerstin Howe, Heather A Lawson, Julian K Lucas, Tobias Marschall, Ira M Hall

Nature

April 2022

The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation.

The complete sequence of a human genome.

Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V Bzikadze, Ira M Hall, Tobias Marschall

Science

April 2022

Article Synopsis

The Telomere-to-Telomere Consortium has completed the human reference genome, addressing the previously unfinished heterochromatic regions and offering a sequence of 3.055 billion base pairs.
This new genome assembly, T2T-CHM13, includes gapless sequences for nearly all chromosomes, correcting errors found in earlier genome references.
The update introduces nearly 200 million new base pairs and includes important genomic features like centromeric satellite arrays and gene predictions, enabling more comprehensive genetic studies.

Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci.

Xianyong Yin, Lap Sum Chan, Debraj Bose, Anne U Jackson, Peter VandeHaar, Ira M Hall

Nat Commun

March 2022

Few studies have explored the impact of rare variants (minor allele frequency < 1%) on highly heritable plasma metabolites identified in metabolomic screens. The Finnish population provides an ideal opportunity for such explorations, given the multiple bottlenecks and expansions that have shaped its history, and the enrichment for many otherwise rare alleles that has resulted. Here, we report genetic associations for 1391 plasma metabolites in 6136 men from the late-settlement region of Finland.

Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space.

Michael C Schatz, Anthony A Philippakis, Enis Afgan, Eric Banks, Vincent J Carey, Ira M Hall

Cell Genom

January 2022

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts.

Structural variants are a major source of gene expression differences in humans and often affect multiple nearby genes.

Alexandra J Scott, Colby Chiang, Ira M Hall

Genome Res

December 2021

Structural variants (SVs) are an important source of human genome diversity, but their functional effects are poorly understood. We mapped 61,668 SVs in 613 individuals from the GTEx project and measured their effects on gene expression. We estimate that common SVs are causal at 2.

Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences.

Liron Ganel, Lei Chen, Ryan Christ, Jagadish Vangipurapu, Erica Young, Ira M Hall

Hum Genomics

June 2021

Background: Mitochondrial genome copy number (MT-CN) varies among humans and across tissues and is highly heritable, but its causes and consequences are not well understood. When measured by bulk DNA sequencing in blood, MT-CN may reflect a combination of the number of mitochondria per cell and cell-type composition. Here, we studied MT-CN variation in blood-derived DNA from 19184 Finnish individuals using a combination of genome (N = 4163) and exome sequencing (N = 19034) data as well as imputed genotypes (N = 17718).

Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease.

Olivia M de Goede, Daniel C Nachun, Nicole M Ferraro, Michael J Gloudemans, Abhiram S Rao, Ira M Hall

Cell

May 2021

Article Synopsis

Long non-coding RNA (lncRNA) genes play critical roles in biological functions, but identifying which ones are linked to diseases is challenging due to their large number.
The study analyzed data from the GTEx project to investigate the expression and associations of over 14,000 lncRNA genes across 49 tissues and 101 complex traits.
They discovered 1,432 lncRNA gene-trait associations, with many linked to diseases such as inflammatory bowel disease and diabetes, indicating these lncRNAs can have significant effects not explained by nearby protein-coding genes.

Association of structural variation with cardiometabolic traits in Finns.

Lei Chen, Haley J Abel, Indraniel Das, David E Larson, Liron Ganel, Ira M Hall

Am J Hum Genet

April 2021

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals.

Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Peter Ebert, Peter A Audano, Qihui Zhu, Bernardo Rodriguez-Martin, David Porubsky, Ira M Hall, Tobias Marschall

Science

April 2021

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci.
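
The contiguity measure in parentheses above (the minimum contig length needed to cover 50% of the genome) is commonly called N50. A minimal sketch of how it can be computed follows; the function name `n50`, the optional `target_size` parameter, and the contig lengths are assumptions for illustration. The abstract does not say whether the 50% is taken relative to the assembly total or to an external genome size, so the sketch supports both.

```python
def n50(contig_lengths, target_size=None):
    """Smallest contig length such that contigs at least that long
    together cover half of the target size.

    target_size defaults to the summed contig lengths (N50); passing an
    external genome size gives the variant usually called NG50.
    Illustrative helper, not taken from the paper.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = target_size if target_size is not None else sum(contig_lengths)
    covered = 0
    # Walk from the longest contig down until half the target is covered.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= total:
            return length
    return None  # the contigs cover less than half of target_size


# Hypothetical contig lengths, in arbitrary units.
lengths = [2, 3, 4, 5, 6]
print(n50(lengths))                  # relative to the assembly total of 20
print(n50(lengths, target_size=30))  # relative to an assumed genome size
```

Sorting in descending order means the loop stops at the shortest contig that is still needed to reach the halfway point, which is the value reported as N50.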

Polygenic burden has broader impact on health, cognition, and socioeconomic outcomes than most rare and high-risk copy number variants.

Elmo Christian Saarentaus, Aki Samuli Havulinna, Nina Mars, Ari Ahola-Olli, Tuomo Tapio Johannes Kiiskinen, Ira M Hall

Mol Psychiatry

September 2021

Copy number variants (CNVs) are associated with syndromic and severe neurological and psychiatric disorders (SNPDs), such as intellectual disability, epilepsy, schizophrenia, and bipolar disorder. Although considered high-impact, CNVs are also observed in the general population. This presents a diagnostic challenge in evaluating their clinical significance.

Transcriptomic signatures across human tissues identify functional rare genetic variation.

Nicole M Ferraro, Benjamin J Strober, Jonah Einson, Nathan S Abell, Francois Aguet, Ira Hall

Science

September 2020

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs.

Telomere-to-telomere assembly of a complete human X chromosome.

Karen H Miga, Sergey Koren, Arang Rhie, Mitchell R Vollger, Ariel Gershman, Ira Hall

Nature

September 2020

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome.

Mapping and characterization of structural variation in 17,795 human genomes.

Haley J Abel, David E Larson, Allison A Regier, Colby Chiang, Indraniel Das, Ira M Hall

Nature

July 2020

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes.

Author Correction: Exome sequencing of Finnish isolates enhances rare-variant association power.

Adam E Locke, Karyn Meltz Steinberg, Charleston W K Chiang, Susan K Service, Aki S Havulinna, Ira M Hall

Nature

November 2019

An Amendment to this paper has been published and can be accessed via a link at the top of the paper.

Exome sequencing of Finnish isolates enhances rare-variant association power.

Adam E Locke, Karyn Meltz Steinberg, Charleston W K Chiang, Susan K Service, Aki S Havulinna, Ira M Hall

Nature

August 2019

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits.

svtools: population-scale analysis of structural variation.

David E Larson, Haley J Abel, Colby Chiang, Abhijit Badve, Indraniel Das, Ira M Hall

Bioinformatics

November 2019

Summary: Large-scale human genetics studies are now employing whole genome sequencing with the goal of conducting comprehensive trait mapping analyses of all forms of genome variation. However, methods for structural variation (SV) analysis have lagged far behind those for smaller scale variants, and there is an urgent need to develop more efficient tools that scale to the size of human populations. Here, we present a fast and highly scalable software toolkit (svtools) and cloud-based pipeline for assembling high quality SV maps (including deletions, duplications, mobile element insertions, inversions and other rearrangements) in many thousands of human genomes.

Genomic Analysis in the Age of Human Genome Sequencing.

Tuuli Lappalainen, Alexandra J Scott, Margot Brandt, Ira M Hall

Cell

March 2019

Affordable genome sequencing technologies promise to revolutionize the field of human genetics by enabling comprehensive studies that interrogate all classes of genome variation, genome-wide, across the entire allele frequency spectrum. Ongoing projects worldwide are sequencing many thousands (and soon millions) of human genomes as part of various gene mapping studies, biobanking efforts, and clinical programs. However, while genome sequencing data production has become routine, genome analysis and interpretation remain challenging endeavors with many limitations and caveats.
