From The Zhang Lab


Next-Generation Bioinformatics


Data Integration

Rapid advances in high-throughput experimental technologies are causing biological data to grow at an unprecedented, exponential rate. Answering the most important and complex biological questions often requires integrating diverse data from multiple sources, which in turn calls for harnessing collective contributions and building bioinformatic Web APIs for massive data integration.

Data Analysis

The fast-growing volume of biological data makes it imperative to develop time-efficient applications for large-scale data analysis. This requires the use of highly efficient computing technologies (e.g., cloud and parallel computing) and the establishment of lightweight programming environments that make full use of both computing and storage resources.

Data Sharing

Data in the broad sense, including raw data, algorithms, results, pipelines, publications, knowledge and even connections among people, are growing at an unparalleled pace. It is therefore necessary to link researchers all over the world and build scientific social networks for efficient and effective data sharing.

Ongoing Projects

  • Omics Bioinformatic Cloud: a Hadoop-based bioinformatics cloud for large-scale NGS data storage, analysis and sharing
Participants: Siqi Liu, Dong Zou, Ang Li, Chao Xu
  • RiceWiki: a community-based annotation platform for rice genes
Participants: Chao Xu, Siqi Liu, Dong Zou, Ang Li, Lina Ma, Hao Wu, Gang Wu, Dawei Huang

Computational Molecular Evolution


Modeling Compositional Dynamics

Sequence composition at different levels (e.g., codon) reflects the interplay of mutation and selection. To better understand sequence evolution, it is of fundamental importance to study sequence composition, which is closely related to gene expression, translation speed and/or accuracy, gene function, protein structure, the intrinsic nature of the genetic code, and so on.
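
As an illustration of what codon-level composition means in practice, the following minimal Python sketch counts codons in a coding sequence and computes GC content at the third codon position (GC3), a commonly used composition statistic. The function names `codon_counts` and `gc3_content` and the example sequence are hypothetical, chosen only for this illustration; the sketch assumes an in-frame DNA sequence and is not a description of the lab's own methods.

```python
from collections import Counter


def codon_counts(cds):
    """Count codons in a coding sequence (hypothetical helper).

    Assumes the sequence starts in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def gc3_content(cds):
    """Fraction of codons whose third position is G or C."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    gc3 = sum(n for codon, n in counts.items() if codon[2] in "GC")
    return gc3 / total


# Made-up example sequence: ATG GCC GCG AAA TAA
example = "ATGGCCGCGAAATAA"
counts = codon_counts(example)
gc3 = gc3_content(example)  # three of the five codons end in G or C
```

Comparing such statistics across genes or genomes is one simple way to start asking how much of the observed composition is attributable to mutation bias and how much to selection.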

Detecting Mutation and Selection

A number of models have been proposed for the evolution of protein-coding sequences. It would be desirable to model sequence evolution and detect selective pressure not merely in protein-coding sequences but also in non-coding sequences.

Simulating Evolutionary Process

Simulating the evolution of molecular sequences over time is essential for a broad range of evolutionary studies. To perform simulations in a biologically realistic way, it is necessary to take full account of multiple parameters, such as mutation rate, functional and structural constraints, patterns of site substitution, co-evolving sites and site-specific evolutionary constraints.
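
For contrast with the realistic setting described above, here is a deliberately simplified Python sketch of sequence simulation. It assumes a Jukes-Cantor-style process in which every site mutates independently with the same per-generation probability and each of the three alternative bases is equally likely; it models none of the constraints listed above (functional or structural constraints, co-evolving sites, site-specific rates). The function name `simulate_jc` and its parameters are hypothetical and are not taken from any project on this page.

```python
import random

BASES = "ACGT"


def simulate_jc(seq, mu, generations, seed=None):
    """Evolve a DNA sequence under a simplified Jukes-Cantor-style process.

    seq         -- starting sequence over the alphabet A, C, G, T
    mu          -- per-site, per-generation substitution probability (assumed uniform)
    generations -- number of discrete generations to simulate
    seed        -- optional seed for reproducible runs
    """
    rng = random.Random(seed)
    sites = list(seq)
    for _ in range(generations):
        for i, base in enumerate(sites):
            if rng.random() < mu:
                # Replace with one of the three other bases, chosen uniformly.
                sites[i] = rng.choice([b for b in BASES if b != base])
    return "".join(sites)


ancestor = "ATGGCCGCGAAATAA"
descendant = simulate_jc(ancestor, mu=0.01, generations=100, seed=1)
differences = sum(a != b for a, b in zip(ancestor, descendant))
```

A more realistic simulator would replace the single `mu` with site-specific rates and a non-uniform substitution matrix, and would couple sites that co-evolve.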

Ongoing Projects

  • Non-coding sequences: composition, evolution, expression, function
Participants: Lina Ma, Hao Wu, Gang Wu, Dawei Huang, Siqi Liu, Ang Li
  • Prokaryotic genomes: three dnaE-based groups
Participants: Hao Wu, Gang Wu, Dawei Huang, Lina Ma, Dong Zou, Ang Li
  • Circadian genes: identification, clustering, expression
Participants: Gang Wu, Hao Wu, Dawei Huang
  • Substitution features & evolutionary models
Participants: Dawei Huang

We look forward to world-wide collaborations as well as comments, suggestions and guidance from colleagues and peers with common research interests.

Permission and Copyright of Images

Permission is required to use the above two images. High-resolution versions of these images are available upon request.
