Publications by authors named "Peter Ebert"


Author Correction: Complex genetic variation in nearly complete human genomes.

Glennis A Logsdon , Peter Ebert , Peter A Audano , Mark Loftus , David Porubsky

Nature

August 2025


Complex genetic variation in nearly complete human genomes.

Glennis A Logsdon , Peter Ebert , Peter A Audano , Mark Loftus , David Porubsky

Nature

August 2025

Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (median continuity of 130 Mb), closing 92% of all previous assembly gaps and reaching telomere-to-telomere status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8 and AMY1/AMY2, and fully resolve 1,852 complex structural variants.


Human de novo mutation rates from a four-generation pedigree reference.

David Porubsky , Harriet Dashnow , Thomas A Sasani , Glennis A Logsdon , Pille Hallast , Peter Ebert

Nature

July 2025

Understanding the human de novo mutation (DNM) rate requires complete sequence information. Here using five complementary short-read and long-read sequencing technologies, we phased and assembled more than 95% of each diploid human genome in a four-generation, twenty-eight-member family (CEPH 1463). We estimate 98-206 DNMs per transmission, including 74.


Magnitude and dynamics of the T-cell response to SARS-CoV-2 infection at both individual and population levels.

Thomas M Snyder , Rachel M Gittelman , Mark Klinger , Damon H May , Edward J Osborne , Peter Ebert

Front Immunol

May 2025

Introduction: T cells are involved in the early identification and clearance of viral infections and also support the development of antibodies by B cells. This central role for T cells makes them a desirable target for assessing the immune response to SARS-CoV-2 infection.

Methods: Here, we combined two high-throughput immune profiling methods to create a quantitative picture of the T-cell response to SARS-CoV-2.


Graphasing: phasing diploid genome assembly graphs with single-cell strand sequencing.

Mir Henglin , Maryam Ghareghani , William T Harvey , David Porubsky , Sergey Koren , Peter Ebert

Genome Biol

October 2024

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de novo haplotypes for diploid genomes.


Complex genetic variation in nearly complete human genomes.

Glennis A Logsdon , Peter Ebert , Peter A Audano , Mark Loftus , David Porubsky

bioRxiv

September 2024

Article Synopsis

* The study achieves a high level of completeness, closing 92% of previous assembly gaps and fully assembling complex regions, including 1,852 complex structural variants and 1,246 human centromeres.
* The findings lead to significant improvements in genotyping accuracy and enable the detection of over 26,000 structural variants per sample, enhancing the potential for future disease association research.


A familial, telomere-to-telomere reference for human mutation and recombination from a four-generation pedigree.

David Porubsky , Harriet Dashnow , Thomas A Sasani , Glennis A Logsdon , Pille Hallast , Peter Ebert

bioRxiv

August 2024

Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463), allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 single-nucleotide variants (SNVs), 7.


Phasing Diploid Genome Assembly Graphs with Single-Cell Strand Sequencing.

Mir Henglin , Maryam Ghareghani , William Harvey , David Porubsky , Sergey Koren , Peter Ebert

bioRxiv

June 2024

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale haplotypes for diploid genomes.


Whole-genome long-read sequencing downsampling and its effect on variant-calling precision and recall.

William T Harvey , Peter Ebert , Jana Ebler , Peter A Audano , Katherine M Munson

Genome Res

December 2023

Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS.


Systematic discovery of neoepitope-HLA pairs for neoantigens shared among patients and tumor types.

Hem R Gurung , Amy J Heidersbach , Martine Darwish , Pamela Pui Fung Chan , Jenny Li , Peter J R Ebert

Nat Biotechnol

July 2024

The broad application of precision cancer immunotherapies is limited by the number of validated neoepitopes that are common among patients or tumor types. To expand the known repertoire of shared neoantigen-human leukocyte antigen (HLA) complexes, we developed a high-throughput platform that coupled an in vitro peptide-HLA binding assay with engineered cellular models expressing individual HLA alleles in combination with a concatenated transgene harboring 47 common cancer neoantigens. From more than 24,000 possible neoepitope-HLA combinations, biochemical and computational assessment yielded 844 unique candidates, of which 86 were verified after immunoprecipitation mass spectrometry analyses of engineered, monoallelic cell lines.


Assembly of 43 human Y chromosomes reveals extensive complexity and variation.

Pille Hallast , Peter Ebert , Mark Loftus , Feyza Yilmaz , Peter A Audano

Nature

September 2023

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes.


Multimodal, broadly neutralizing antibodies against SARS-CoV-2 identified by high-throughput native pairing of BCRs from bulk B cells.

Gladys J Keitany , Benjamin E R Rubin , Meghan E Garrett , Andrea Musa , Jeff Tracy , Peter Ebert

Cell Chem Biol

November 2023

TruAB Discovery is an approach that integrates cellular immunology, high-throughput immunosequencing, bioinformatics, and computational biology in order to discover naturally occurring human antibodies for prophylactic or therapeutic use. We adapted our previously described pairSEQ technology to pair B cell receptor heavy and light chains of SARS-CoV-2 spike protein-binding antibodies derived from enriched antigen-specific memory B cells and bulk antibody-secreting cells. We identified approximately 60,000 productive, in-frame, paired antibody sequences, from which 2,093 antibodies were selected for functional evaluation based on abundance, isotype and patterns of somatic hypermutation.


Alterations in the hepatocyte epigenetic landscape in steatosis.

Ranjan Kumar Maji , Beate Czepukojc , Michael Scherer , Sascha Tierling , Cristina Cadenas , Peter Ebert

Epigenetics Chromatin

July 2023

Fatty liver disease, or the accumulation of fat in the liver, has been reported to affect the global population. This comes with an increased risk for the development of fibrosis, cirrhosis, and hepatocellular carcinoma. Yet, little is known about the effects of a diet containing high fat and alcohol on epigenetic aging, with respect to changes in transcriptional and epigenomic profiles.


Whole-genome long-read sequencing downsampling and its effect on variant calling precision and recall.

William T Harvey , Peter Ebert , Jana Ebler , Peter A Audano , Katherine M Munson

bioRxiv

May 2023

Article Synopsis

Advances in long-read sequencing (LRS) technology improve whole-genome sequencing, making it more complete, affordable, and accurate than short-read methods.
LRS helps in identifying complex structural variants and phased genome assembly but faces challenges related to cost and accuracy.
A comparison of Oxford Nanopore and PacBio HiFi platforms shows that while both efficiently detect variants, PacBio typically offers better quality of results, especially for structural variants and indels.


A draft human pangenome reference.

Wen-Wei Liao , Mobin Asri , Jana Ebler , Daniel Doerr , Marina Haukness , Peter Ebert

Nature

May 2023

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.


Gaps and complex structurally variant loci in phased genome assemblies.

David Porubsky , Mitchell R Vollger , William T Harvey , Allison N Rozanski , Peter Ebert

Genome Res

April 2023

Article Synopsis

Recent advancements in phased genome assembly, especially using long-read data and parental information, still leave significant gaps, averaging over 140 per assembly from trio-hifiasm methods.
A comprehensive analysis of 182 haploid assemblies shows that chromosome-wide accuracy is similar when using Strand-seq instead of parental data, with many gaps clustering near large repeat regions.
The research highlights that a considerable amount of human DNA is misoriented and includes notable variations like deletions and insertions, suggesting key areas for future algorithm improvements and better pangenome models.


Benchmarking challenging small variants with linked and long reads.

Justin Wagner , Nathan D Olson , Lindsay Harris , Ziad Khan , Jesse Farek , Peter Ebert

Cell Genom

May 2022

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as .


Read-Based Phasing and Analysis of Phased Variants with WhatsHap.

Marcel Martin , Peter Ebert , Tobias Marschall

Methods Mol Biol

November 2022

Article Synopsis

WhatsHap is a command-line tool for read-based phasing, that is, for inferring an individual's haplotypes from sequencing reads.
It works with diploid and polyploid samples and relies on reads, preferably long ones, that each cover at least two heterozygous variants.
The tool also provides features for analyzing phased variant calls, including calculating statistics, comparing phasings, and assigning reads to specific haplotypes in alignment files.


Semi-automated assembly of high-quality diploid human reference genomes.

Erich D Jarvis , Giulio Formenti , Arang Rhie , Andrea Guarracino , Chentao Yang , Peter Ebert

Nature

November 2022

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome.


Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders.

David Porubsky , Wolfram Höps , Hufsah Ashraf , PingHsun Hsieh , Bernardo Rodriguez-Martin , Peter Ebert

Cell

May 2022

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs.


Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes.

Jana Ebler , Peter Ebert , Wayne E Clarke , Tobias Rausch , Peter A Audano

Nat Genet

April 2022

Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers.


ASHLEYS: automated quality control for single-cell Strand-seq data.

Christina Gros , Ashley D Sanders , Jan O Korbel , Tobias Marschall , Peter Ebert

Bioinformatics

October 2021

Article Synopsis

Strand-seq is a technique that allows for detailed chromosome analysis, including haplotype phasing and structural variant discovery.
ASHLEYS is a new tool that automates the quality control process for single-cell sequencing libraries, reducing manual labor and achieving near-expert accuracy.
The tool is available for use on GitHub under the MIT license, and supplementary data can be found in Bioinformatics online.


Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Peter Ebert , Peter A Audano , Qihui Zhu , Bernardo Rodriguez-Martin , David Porubsky

Science

April 2021

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci.


An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action.

Hartwig Anzt , Felix Bach , Stephan Druskat , Frank Löffler , Axel Loewe , Peter Ebert

F1000Res

March 2021

Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively.


Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads.

David Porubsky , Peter Ebert , Peter A Audano , Mitchell R Vollger , William T Harvey

Nat Biotechnol

March 2021

Article Synopsis

Human genomes are usually represented as consensus sequences, which don't include parental haplotype details.
The research presents a new method for creating a complete and phased diploid genome assembly using advanced sequencing techniques, specifically for an individual of Puerto Rican descent.
The resulting assemblies show high accuracy and continuity, yielding precise genetic variations while identifying common regions where genome breaks occur across different sequencing platforms.
