Publications by authors named "Louxin Zhang"

Causal relationships between different entities are often modeled as labeled directed acyclic graphs (DAGs) in biology and healthcare, in particular for depicting the progression of malignant tumor cells. Comparing labeled DAGs is essential for developing methods that infer and evaluate DAG models, so a robust dissimilarity metric is critical for such comparison tasks.

Unrooted phylogenetic networks are commonly used to represent evolutionary data in the presence of incompatibilities. While rooted phylogenetic networks offer a more explicit framework for depicting evolutionary histories involving reticulate events, they are reported less frequently, probably due to a lack of tools that are as easily applicable as those for unrooted networks. Here, we introduce PhyloFusion, a fast and user-friendly method for constructing rooted phylogenetic networks from sets of rooted phylogenetic trees.

Motivation: Drug combinations can not only enhance drug efficacy but also reduce toxic side effects and mitigate drug resistance. Advances in drug combination screening technologies have generated large amounts of data, which enable researchers to develop deep learning methods for predicting drug targets for synergistic combination.

Good representations for phylogenetic trees and networks are important for enhancing storage efficiency and scalability for the inference and analysis of evolutionary trees for genes, genomes and species. We propose a new representation for rooted phylogenetic trees that encodes a tree on n ordered taxa as a vector of length 2n in which each taxon appears exactly twice. Using this new tree representation, we introduce a novel tree rearrangement operator, termed an , that results in a tree space of linear diameter and quadratic neighbourhood size.

Motivation: Personalized cancer treatments require accurate drug response predictions. Existing deep learning methods show promise, but higher accuracy is needed for precision medicine. Prediction accuracy can be improved by using not only the topology but also the geometric information of drugs.

The reconstruction of phylogenetic networks is an important but challenging problem in phylogenetics and genome evolution, as the space of phylogenetic networks is vast and cannot be sampled well. One approach to the problem is to solve the minimum phylogenetic network problem, in which phylogenetic trees are first inferred, and then the smallest phylogenetic network that displays all the trees is computed. The approach takes advantage of the fact that the theory of phylogenetic trees is mature, and there are excellent tools available for inferring phylogenetic trees from a large number of biomolecular sequences.

The Sackin and Colless indices are two widely used metrics for measuring the balance of trees and for testing evolutionary models in phylogenetics. This short paper contributes two results about the Sackin and Colless indices of trees. One result is the asymptotic analysis of the expected Sackin and Colless indices of tree shapes (which are full binary rooted unlabelled trees) under the uniform model where tree shapes are sampled with equal probability.
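For readers unfamiliar with the two indices, the following is a minimal Python sketch using their standard definitions: the Sackin index is the sum of the depths of all leaves, and the Colless index is the sum, over internal nodes, of the absolute difference between the leaf counts of the two child subtrees. The nested-tuple tree encoding and the function name `sackin_and_colless` are illustrative assumptions and are not taken from the paper.

```python
def sackin_and_colless(tree):
    """Return (Sackin index, Colless index) of a full binary rooted tree.

    Assumed encoding (hypothetical, for illustration only): an internal
    node is a 2-tuple (left, right); anything else is a leaf.
    """
    def visit(node, depth):
        # Returns (number of leaves, sum of leaf depths, Colless sum)
        # for the subtree rooted at `node`.
        if not isinstance(node, tuple):
            return 1, depth, 0
        left, right = node
        n_left, s_left, c_left = visit(left, depth + 1)
        n_right, s_right, c_right = visit(right, depth + 1)
        imbalance = abs(n_left - n_right)
        return (n_left + n_right,
                s_left + s_right,
                c_left + c_right + imbalance)

    _, sackin, colless = visit(tree, 0)
    return sackin, colless


# Two tree shapes on four leaves: a caterpillar and a fully balanced tree.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))

print(sackin_and_colless(caterpillar))  # (9, 3)
print(sackin_and_colless(balanced))     # (8, 0)
```

Both indices are larger for the caterpillar than for the balanced shape, which is why they are used as measures of imbalance.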

We present a data integration framework that uses non-negative matrix factorization of patient-similarity networks to integrate continuous multi-omics datasets for molecular subtyping. It is demonstrated to have the capability to handle missing data without using imputation and to be consistently among the best in detecting subtypes with differential prognosis and enrichment of clinical associations in a large number of cancers. When applying the approach to data from individuals with lower-grade gliomas, we identify a subtype with a significantly worse prognosis.

The drug response prediction problem arises from personalized medicine and drug discovery. Deep neural networks have been applied to the multi-omics data available for over 1000 cancer cell lines and tissues to improve drug response prediction. We summarize and examine state-of-the-art deep learning methods that have been published recently.

Background: Mutation trees are rooted trees in which nodes are of arbitrary degree and labeled with a mutation set. These trees, also referred to as clonal trees, are used in computational oncology to represent the mutational history of tumours. Classical tree metrics such as the popular Robinson-Foulds distance are of limited use for the comparison of mutation trees.
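As background on the classical metric mentioned above, here is a minimal Python sketch of the Robinson-Foulds distance in its rooted, cluster-based form: each internal node contributes the set of leaf labels below it, and the distance is the number of clusters found in exactly one of the two trees. The nested-tuple encoding and the names `clusters` and `rf_distance` are illustrative assumptions. The sketch handles ordinary leaf-labelled trees over the same label set only and is not the metric proposed for mutation trees.

```python
def clusters(tree):
    """Collect the leaf-label set below every internal node.

    Assumed encoding (hypothetical): an internal node is a tuple of
    children; any non-tuple value is a leaf label.
    """
    found = set()

    def leaves(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def rf_distance(t1, t2):
    """Rooted Robinson-Foulds distance: clusters in exactly one tree."""
    return len(clusters(t1) ^ clusters(t2))


t1 = (("a", "b"), ("c", "d"))
t2 = (("a", "c"), ("b", "d"))
t3 = ((("a", "b"), "c"), "d")

print(rf_distance(t1, t1))  # 0
print(rf_distance(t1, t2))  # 4
print(rf_distance(t1, t3))  # 2
```

The definition relies on labels sitting at the leaves, one label per leaf, whereas mutation trees label every node with a set of mutations.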

Background: Understanding the mechanisms underlying the malignant progression of cancer cells is crucial for the early diagnosis and therapeutic treatment of cancer. The mutational heterogeneity of breast cancer suggests that about a dozen cancer genes mutate consistently in patients, while many other genes mutate only occasionally.

Methods: Using the whole-exome sequences and clinical information of 468 patients in the TCGA project data portal, we analyzed mutated protein domains and signaling pathway alterations in order to understand how infrequent mutations contribute aggregately to tumor progression in different stages.

We collated contact tracing data from COVID-19 clusters in Singapore and Tianjin, China and estimated the extent of pre-symptomatic transmission by estimating incubation periods and serial intervals. The mean incubation periods accounting for intermediate cases were 4.91 days (95%CI 4.

Drug response prediction arises from both basic and clinical research of personalized therapy, as well as drug discovery for cancers. With gene expression profiles and other omics data being available for over 1000 cancer cell lines and tissues, different machine learning approaches have been applied to drug response prediction. These methods appear in a body of literature and have been evaluated on different datasets with only one or two accuracy metrics.

Background: Inference of cancer-causing genes and their biological functions is crucial but challenging due to the heterogeneity of somatic mutations. This heterogeneity reveals that only a handful of oncogenes mutate frequently, whereas a number of cancer-causing genes mutate rarely.

Results: We develop a Cytoscape app, named ZDOG, for visualization of the extent to which mutated genes may affect cancer pathways using the dominating tree model.

Background: Galled trees are studied as a recombination model in theoretical population genetics. This class of phylogenetic networks has been generalized to tree-child networks and other network classes by relaxing a structural condition imposed on galled trees. Although these networks are simple, their topological structures have yet to be fully understood.

Rooted phylogenetic networks are rooted acyclic digraphs. They are used to model complex evolution where hybridization, recombination, and other reticulation events play a role. A rigorous definition of network compression is introduced on the basis of recent studies of relationships between cluster, tree, and rooted phylogenetic networks.

Motivation: Comparative genomic studies indicate that extant genomes are more properly considered to be a fusion product of random mutations over generations (vertical evolution) and genomic material transfers between individuals of different lineages (reticulate transfer). This has motivated biologists to use phylogenetic networks and other general models to study genome evolution. Two fundamental algorithmic problems arising from verification of phylogenetic networks and from computing Robinson-Foulds distance in the space of phylogenetic networks are the tree and cluster containment problems.

Motivation: A reconciliation is an annotation of the nodes of a gene tree with evolutionary events (for example, speciation, gene duplication, transfer, and loss), along with a mapping onto a species tree. Many algorithms and software tools produce or use reconciliations, but they often use different reconciliation formats, which differ in the types of events considered and in whether the species tree is dated.

Background: Human cancer cell lines are used in research to study the biology of cancer and to test cancer treatments. Several large panels, each comprising several hundred human cancer cell lines characterized with genomic and pharmacological data, have recently become available. The ability to predict drug responses using these pharmacogenomics data can facilitate the development of precision cancer medicines.

Background: Over the past two decades, phylogenetic networks have been studied to model reticulate evolutionary events. The relationships among phylogenetic networks, phylogenetic trees and clusters serve as the basis for reconstruction and comparison of phylogenetic networks. To understand these relationships, two problems are raised: the tree containment problem, which asks whether a phylogenetic tree is displayed in a phylogenetic network, and the cluster containment problem, which asks whether a cluster is represented at a node in a phylogenetic network.

Motivation: Genetic material is transferred in a non-reproductive manner across species more frequently than commonly thought, particularly in the bacterial kingdom. Extant genomes are thus more properly considered as a fusion product of both reproductive and non-reproductive genetic transfers. This has motivated researchers to adopt phylogenetic networks to study genome evolution.

A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa.
