Zhang Lab

Plant Bioinformatics and Functional Epigenomics Group

Software and Database

CGT-seq: Core Genome Targeted Sequencing

CGT-seq employs epigenomic information from both active and repressive epigenetic marks to guide the assembly of the core genome, which is mainly composed of promoter and intragenic regions. The method is relatively easy to implement and displayed high sensitivity and specificity in capturing the core genome of bread wheat.

In wheat, 95% of intragenic regions and 89% of promoter regions were covered by CGT-seq reads. We further demonstrated in rice that CGT-seq captured hundreds of novel genes and regulatory sequences from a previously unsequenced ecotype.

Together, with specific enrichment and sequencing of regions within and nearby genes, CGT-seq is a time- and resource-effective approach to profiling functionally relevant regions in sequenced and non-sequenced populations with large genomes.

Go to CGT-seq >>>

Plant Regulome: Data-driven Interface for Retrieving Upstream Regulators

Plant Regulome is a data-driven interface for retrieving upstream regulators from plant multi-omics data. It integrates 19,925 transcriptomic and epigenomic data sets and diverse sources of functional evidence (58,112 terms) from six plant species, namely Arabidopsis thaliana, Oryza sativa, Zea mays, Glycine max, Solanum lycopersicum and Triticum aestivum, along with the orthologous genes from 56 whole-genome sequenced plant species. These data were organized into gene modules and integrated into a common statistical framework.

For any input gene list or genomic loci, Plant Regulome retrieves the factors, treatments, and experimental/environmental conditions regulating the input from the integrated omics data. Additionally, multiple tools and an interactive visualization are available through a user-friendly web interface.

Go to Plant Regulome >>>

GSHR: Gene Set-level Analyses of Hormone Responses in Arabidopsis

GSHR is a web server that provides analyses based on integrated hormone response gene sets in Arabidopsis thaliana. We developed it to facilitate cross-study and cross-platform comparisons of transcriptomic responses to hormones.

GSHR is user-friendly and has several features that distinguish it from similar tools:

1. GSHR focuses specifically on genes that respond to hormones in Arabidopsis thaliana. It provides hormone response gene sets that users can compare with their own gene lists using Fisher's exact test.

2. Other analysis tools are also provided, including cluster analysis, co-expression network analysis, and enrichment analysis of KEGG, GO and InterPro terms, to help users uncover the biological insights underlying their gene lists.
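
As a rough illustration of the gene list comparison described in item 1, the sketch below computes a one-sided (over-representation) Fisher's exact test p-value for the overlap between a user gene list and a hormone response gene set. The function name, gene identifiers, and background size are hypothetical, and the choice of a one-sided test against an explicit background is an assumption; it is not a description of how GSHR itself is implemented.

```python
from math import comb


def fisher_enrichment(user_genes, hormone_set, background):
    """Return (overlap size, one-sided Fisher's exact p-value).

    The p-value is the probability of seeing at least the observed overlap
    if the user list were drawn at random from the background.
    """
    background = set(background)
    user = set(user_genes) & background   # restrict both lists to the background
    ref = set(hormone_set) & background
    n_total, n_user, n_ref = len(background), len(user), len(ref)
    overlap = len(user & ref)
    # Upper tail of the hypergeometric distribution: P(X >= overlap)
    tail = sum(
        comb(n_ref, i) * comb(n_total - n_ref, n_user - i)
        for i in range(overlap, min(n_ref, n_user) + 1)
    )
    return overlap, tail / comb(n_total, n_user)


# Hypothetical toy data: 20 background genes, 5 of them in a hormone response set.
background = [f"g{i}" for i in range(20)]
hormone_set = ["g0", "g1", "g2", "g3", "g4"]
user_genes = ["g0", "g1", "g2", "g10"]

overlap, p_value = fisher_enrichment(user_genes, hormone_set, background)
print(overlap, round(p_value, 4))  # 3 0.032
```

In practice the background would be all annotated genes, and p-values across many gene sets would normally be corrected for multiple testing.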

Go to GSHR >>>

CARMO: Comprehensive Annotation of Rice Multi-Omics

CARMO is a web-based platform providing comprehensive annotations for multi-omics data, including transcriptomic data sets, epigenomic modification sites, SNPs from genome re-sequencing, and large gene lists derived from these omics studies. Well-organized results, as well as multiple tools for interactive visualization, are available through a user-friendly web interface.

The power of CARMO lies in the comprehensive collection and integration of information from both multi-omics data and diverse functional evidence of rice, which was further curated into gene sets and higher level gene modules. In this way, the high-throughput data can easily be compared across studies and platforms, and notably, integration of multiple types of evidence provides biological interpretation from the level of modules with high confidence. Examples in the manuscripts demonstrated that CARMO not only reproduced reported evidence, but also proposed novel functional insights for further experimental exploration.

Go to CARMO >>>

MAnorm: Quantitative Comparison of ChIP-Seq Data

ChIP-Seq is widely used to characterize genome-wide binding patterns of transcription factors and other chromatin-associated proteins. Although comparison of ChIP-Seq data sets is critical for understanding cell type-dependent and cell state-specific binding, and thus the study of cell-specific gene regulation, few quantitative approaches have been developed. Here, we present a simple and effective method, MAnorm, for quantitative comparison of ChIP-Seq data sets describing transcription factor binding sites and epigenetic modifications. The quantitative binding differences inferred by MAnorm showed strong correlation with both the changes in expression of target genes and the binding of cell type-specific regulators.

Go to MAnorm >>>

Motif-Scan: scan genomic regions for targets of given motifs and perform enrichment analysis

With the accumulation of ChIP-seq data across different cell types, an effective and accurate method is essential for unraveling the relationship between regulator binding and epigenetic modifications in different cell types. We present an integrative computational toolkit, MAmotif, to infer cell type-specific regulators. Based on the hypothesis that regions with larger epigenetic changes are more likely to be directly targeted by key cell type-specific regulators, we combine MAnorm’s quantitative comparison of two cell types with transcription factor binding site (TFBS) information to infer cell type-specific regulators. Here, MAnorm is a model for quantitative comparison of ChIP-seq data between two cell types, while TFBSs are detected in the regions of epigenetic change by our newly developed motif scanning package.

Our motif scan algorithm is a probabilistic model based on the position weight matrix (PWM): the score of motif A is calculated as the ratio of A’s probability of occurrence on the target sequence to its probability of occurrence on the genome background. A sequence is called a target of motif A when its score exceeds the score threshold, which is derived from the distribution of motif A scores calculated over the whole genome sequence. Once the epigenetic modification changes and TFBS information are prepared, several statistical tests and clustering methods are applied to determine the linkage between epigenetic modification changes and motif binding affinity in a specific cell type.
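
The PWM scoring idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's actual code or API: the 4 bp motif, the uniform background, the function names, the 0.99 quantile used as the cutoff, and the randomly generated stand-in "genome" are all hypothetical.

```python
import random

# Hypothetical 4 bp motif with consensus ACGT; each dict is one PWM column.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform


def window_ratio(window, pwm=PWM, background=BACKGROUND):
    """Ratio of P(window | motif) to P(window | background)."""
    ratio = 1.0
    for base, column in zip(window, pwm):
        ratio *= column[base] / background[base]
    return ratio


def window_scores(sequence, pwm=PWM, background=BACKGROUND):
    """Score every motif-width window of the sequence."""
    width = len(pwm)
    return [
        window_ratio(sequence[i:i + width], pwm, background)
        for i in range(len(sequence) - width + 1)
    ]


def score_threshold(genome, quantile=0.99):
    """Cutoff taken from the distribution of scores over the genome.

    The quantile is an assumed parameter chosen for this toy example.
    """
    scores = sorted(window_scores(genome))
    return scores[int(quantile * (len(scores) - 1))]


def is_target(sequence, threshold):
    """A sequence is a target if its best window scores above the cutoff."""
    return max(window_scores(sequence), default=0.0) > threshold


random.seed(0)
genome = "".join(random.choices("ACGT", k=10_000))  # stand-in for a real genome
threshold = score_threshold(genome)

print(is_target("TTACGTTT", threshold))  # contains the consensus ACGT
print(is_target("TTTTTTTT", threshold))  # no good match to the motif
```

Real implementations typically work with log ratios to avoid numerical underflow on longer motifs and also scan the reverse strand; both are omitted here for brevity.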

Go to Motif-Scan >>>