Tissue specificity, the traditional predictor of gene function, has recently been used to interpret the selective pressure associated with gene architecture. In this work, we examine gene structures and their relation to the number of tissues in which a gene is expressed and to the number of co-expressed genes, using a recent atlas of microarray-based mouse gene expression in 55 normal tissues. We define tissue specificity by the number of tissues in which a gene is expressed, and expression-pattern specificity by the number of genes co-expressed with it. We find that, consistent with previous findings, tissue non-specific (housekeeping) genes are short in all gene regions (coding regions, introns, and 5′ and 3′ untranslated regions). However, in contrast to the previous suggestion that tissue-specific genes are long, the most tissue-specific genes (those expressed in only one tissue) are also short. We further show that both expression-pattern-specific and non-specific genes are long in coding and non-coding regions. The origins of short tissue-specific genes and long expression-pattern-specific genes are not clear. Genes with highly non-specific expression patterns (i.e., genes with a large number of co-expressed genes) comprise genes expressed across all tissues but are overwhelmingly enriched in the central nervous system (e.g., brain). Thus, the large sizes of these genes are possibly related to the functional complexity and/or accelerated evolution of the central nervous system.
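The two specificity measures defined above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not state how a gene is called "expressed" in a tissue or how co-expression is scored, so the expression threshold, the use of Pearson correlation, the correlation cutoff, the gene names, and the toy values below are all assumptions made for illustration.

```python
from math import sqrt


def tissue_count(profile, threshold):
    """Number of tissues in which a gene is called expressed.

    Assumption: a gene counts as expressed when its value exceeds `threshold`.
    """
    return sum(1 for value in profile if value > threshold)


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def coexpressed_count(profiles, gene, min_corr):
    """Number of other genes whose profile correlates with `gene` at >= min_corr.

    Assumption: co-expression is scored by Pearson correlation across tissues.
    """
    target = profiles[gene]
    return sum(
        1
        for other, profile in profiles.items()
        if other != gene and pearson(target, profile) >= min_corr
    )


# Hypothetical toy data: 4 genes measured in 5 tissues (not from the atlas).
profiles = {
    "gene_a": [9.0, 8.5, 9.2, 8.8, 9.1],  # expressed everywhere
    "gene_b": [0.1, 0.2, 7.5, 0.1, 0.3],  # expressed in one tissue
    "gene_c": [0.2, 0.1, 7.0, 0.2, 0.1],  # same single tissue as gene_b
    "gene_d": [5.0, 0.1, 0.2, 6.0, 0.1],  # expressed in two tissues
}

THRESHOLD = 1.0  # assumed expression cutoff
MIN_CORR = 0.9   # assumed co-expression cutoff

tissues = {g: tissue_count(p, THRESHOLD) for g, p in profiles.items()}
partners = {g: coexpressed_count(profiles, g, MIN_CORR) for g in profiles}
```

In this toy setting, a low tissue count marks a tissue-specific gene and a low partner count marks an expression-pattern-specific gene; the two measures are computed independently, which is why a gene can be specific under one definition and non-specific under the other.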
ASJC Scopus subject areas
- Molecular Biology