|
My career goal is to understand the mechanisms of complex
biological systems. To achieve this goal, I focus my research on three
areas:
1. How does the interplay of nonlinear system behavior and genetic diversity
influence complex traits?
2. Evolution and organization of biological networks.
3. Algorithm and method development in computational biology.
Interplay of nonlinear system behavior and genetic diversity and
its implications for complex traits
Biological complexity results from the interplay between heterogeneous
system properties and genetic variation. The former is nongenic and
arises from the nonlinear behavior produced by the complex relations among
biological components. The latter is the direct result of the evolutionary
process. System properties should be addressed by studying a complex
biological system as a whole (in contrast to a reductionist approach).
Genetic variation is best understood through population genetics and
evolutionary studies.
Environmental cues usually contain random noise. For example, growth
hormone concentration in a Petri dish may be modeled as a uniform distribution
with normally distributed noise. The normally distributed noise (input
to the cellular system) can be amplified by signal transduction
pathways, metabolic pathways, transcriptional regulation, and so on. The
outputs of the cellular system may exhibit a myriad of heterogeneous
behaviors due to the nonlinear nature of the complex cellular system.
For example, a simple positive feedback loop can lead to a bimodal distribution
of responses. Therefore, understanding the nonlinear behavior of cellular
systems is critically important to our understanding of complex phenotypes.
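The bimodality argument can be illustrated with a small simulation. This is only a sketch under assumed conditions: the one-variable model, the Hill-type feedback term, and every parameter value (`beta`, `hill`, `mean_signal`, `noise_sd`) are hypothetical choices made for illustration, not measurements or a model taken from this research.

```python
import random

def settle(signal, beta=2.0, hill=4, dt=0.1, steps=2000):
    """Integrate dx/dt = signal + beta*x^h/(1+x^h) - x from x = 0 (Euler steps).

    The middle term is an assumed positive feedback: the output x
    promotes its own production in a sigmoidal fashion.
    """
    x = 0.0
    for _ in range(steps):
        feedback = beta * x**hill / (1.0 + x**hill)
        x += dt * (signal + feedback - x)
    return x

def simulate_population(n_cells=400, mean_signal=0.38, noise_sd=0.05, seed=1):
    """Give each cell a normally distributed input and return its settled output."""
    rng = random.Random(seed)
    return [
        settle(max(0.0, rng.gauss(mean_signal, noise_sd)))
        for _ in range(n_cells)
    ]

outputs = simulate_population()
low = sum(1 for x in outputs if x < 1.0)    # cells that stayed in the low state
high = sum(1 for x in outputs if x > 2.0)   # cells that switched to the high state
print(f"low: {low}, high: {high}, in between: {len(outputs) - low - high}")
```

With these assumed parameters the mean input sits close to the switching threshold of the feedback loop, so a unimodal, normally distributed input is split into two distinct output populations with few cells in between.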
Evolution and organization of biological networks
There are two layers in the big picture of biological network research:
(1) The physical biological networks, such as the protein interaction
network and the transcription network. (2) The conceptual biological
networks, such as the genetic networks and mathematical models. The aim of
network research is to generate conceptual models from phenotypic observations
(the reverse approach) and then use these models to predict phenotypic
changes (the forward approach). To model biological networks, it is
imperative to understand the network configuration and the evolutionary
process that led to the configuration.
I have developed an isotemporal approach to study network evolution.
Based on the occurrence of orthologues in various groups of genomes,
I first estimated the evolutionary histories of all annotated yeast
proteins and classified them into isotemporal categories. Next, I analyzed
the interaction tendencies within and between categories, and discovered
that proteins with similar evolutionary histories tend to interact with
each other and form isotemporal clusters. Finally, I converted interaction
tendencies into distances and reconstructed the path of the network
growth. I found that network evolution is driven by synergistic selection
and that network growth correlates with the evolutionary history of yeast. These
evolutionary mechanisms provide insight into the hierarchical modularity
of biological networks in general.
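One way to make the notion of an interaction tendency concrete is an observed-to-expected ratio for each pair of categories. The sketch below is an assumption for illustration only: the function name `interaction_tendencies`, the random-mixing null model, and the toy proteins and categories are all hypothetical, and the statistic actually used in the analysis may differ.

```python
from collections import Counter
from itertools import combinations_with_replacement

def interaction_tendencies(category_of, edges):
    """Observed/expected interaction ratio for every unordered category pair.

    category_of: dict mapping protein -> category label
    edges: list of (protein, protein) pairs, no self-loops or duplicates
    Expected counts assume every protein pair is equally likely to interact.
    """
    sizes = Counter(category_of.values())
    observed = Counter()
    for u, v in edges:
        observed[tuple(sorted((category_of[u], category_of[v])))] += 1

    n = len(category_of)
    total_pairs = n * (n - 1) // 2
    tendencies = {}
    for a, b in combinations_with_replacement(sorted(sizes), 2):
        if a == b:
            possible = sizes[a] * (sizes[a] - 1) // 2
        else:
            possible = sizes[a] * sizes[b]
        if possible == 0:
            continue  # a single-member category has no within-category pairs
        expected = len(edges) * possible / total_pairs
        tendencies[(a, b)] = observed[(a, b)] / expected
    return tendencies

# Toy network: three "old" and three "new" proteins (hypothetical labels).
category_of = {"p1": "old", "p2": "old", "p3": "old",
               "p4": "new", "p5": "new", "p6": "new"}
edges = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"),
         ("p4", "p5"), ("p5", "p6"), ("p3", "p4")]

for pair, ratio in sorted(interaction_tendencies(category_of, edges).items()):
    print(pair, round(ratio, 2))
```

A ratio above 1 indicates that two categories interact more often than random mixing would predict, and a ratio below 1 indicates avoidance. In the toy network the within-category ratios exceed 1, which is the pattern described above as isotemporal clustering.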
I developed two independent methods to analyze the interaction pattern
and infer the path of network evolution. I reached the same conclusion
using both methods. These methods may be applicable to studying the evolution
of many other complex networks.
One important discovery of the above work is the isotemporal clusters.
With the availability of many yeast genome sequences (S. paradoxus, S. mikatae,
S. bayanus, S. kudriavzevii, S. castellii, S. kluyveri), I
am studying the recent isotemporal clusters of the yeast protein interaction
network in detail.
Algorithm and method development in computational biology
A great challenge in computational biology is how to extract meaningful
information from large-scale biological data, such as genomes, microarray
data, and high-throughput screens. Because of my interest in biological
complexity, I focus my research on large-scale data analysis.
I therefore devote a significant amount of my research effort to developing
computational tools for network and pathway analysis, expression profiling,
and genome comparison.
I have designed a super sequence method to analyze the global codon
spatial patterns in various genomes. I then applied isotonic regression
for quantitative analysis of spatial pattern variations. Spatial pattern
variations can be found in many other kinds of biological data, such as amino
acids in protein sequences, gene locations in chromosomes, single-nucleotide
polymorphisms, and GC content variation along chromosomes. Therefore, I am
planning to implement my spatial pattern recognition method in an open-source
software package, in order to address spatial pattern variations in general.
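For readers unfamiliar with isotonic regression, the following is a minimal sketch of the standard pool-adjacent-violators algorithm, which finds the least-squares non-decreasing fit to a sequence. It is a generic illustration, not the implementation used in this work; the function name `isotonic_regression` and the example values are hypothetical.

```python
def isotonic_regression(values, weights=None):
    """Least-squares non-decreasing fit by pooling adjacent violators."""
    if weights is None:
        weights = [1.0] * len(values)

    blocks = []  # each block holds [weighted mean, total weight, point count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

# Hypothetical signal measured at successive positions along a sequence.
signal = [1, 3, 2, 4]
print(isotonic_regression(signal))  # [1.0, 2.5, 2.5, 4.0]
```

The fit replaces each run of out-of-order points with their weighted mean, so the result is a monotone step function. The size of the gap between the raw signal and the fit gives one possible quantitative measure of how strongly a spatial pattern departs from a monotone trend.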
The algorithm I designed for network evolution is a general method for
categorical data analysis in networks. I have used this method to infer
the functional architecture of the yeast protein interaction network.
Biological network analysis based on functional inference should be of
great interest to many biologists. Therefore, I am also planning to
implement my network analysis methods in an open-source package in
the future.
|