STAG-CNS: An Order-Aware Conserved Noncoding Sequences Discovery Tool for Arbitrary Numbers of Species

Xianjun Lai, Sairam Behera, Zhikai Liang, Yanli Lu, Jitender S. Deogun, James C. Schnable

Research output: Contribution to journalArticlepeer-review

8 Scopus citations


One method for identifying noncoding regulatory regions of a genome is to quantify rates of divergence between related species, as functional sequence will generally diverge more slowly. Most approaches to identifying these conserved noncoding sequences (CNSs) based on alignment have had relatively large minimum sequence lengths (≥15 bp) compared with the average length of known transcription factor binding sites. To circumvent this constraint, STAG-CNS that can simultaneously integrate the data from the promoters of conserved orthologous genes in three or more species was developed. Using the data from up to six grass species made it possible to identify conserved sequences as short as 9 bp with false discovery rate ≤0.05. These CNSs exhibit greater overlap with open chromatin regions identified using DNase I hypersensitivity assays, and are enriched in the promoters of genes involved in transcriptional regulation. STAG-CNS was further employed to characterize loss of conserved noncoding sequences associated with retained duplicate genes from the ancient maize polyploidy. Genes with fewer retained CNSs show lower overall expression, although this bias is more apparent in samples of complex organ systems containing many cell types, suggesting that CNS loss may correspond to a reduced number of expression contexts rather than lower expression levels across the entire ancestral expression domain.

Original languageEnglish (US)
Pages (from-to)990-999
Number of pages10
JournalMolecular Plant
Issue number7
StatePublished - Jul 5 2017


  • comparative genomics
  • conserved noncoding sequence
  • grain crops
  • longest path algorithm
  • suffix tree

ASJC Scopus subject areas

  • Molecular Biology
  • Plant Science


Dive into the research topics of 'STAG-CNS: An Order-Aware Conserved Noncoding Sequences Discovery Tool for Arbitrary Numbers of Species'. Together they form a unique fingerprint.

Cite this