Serial analysis of gene expression (SAGE) is a powerful technique for genome-wide analysis of gene expression. However, two-thirds of SAGE tags cannot be used directly for gene identification for two reasons. First, many SAGE tags match several known expressed sequences, owing to the short length of SAGE tag sequences. Second, many SAGE tags do not match any known expressed sequences, presumably because the sequences corresponding to these SAGE tags have not been identified. These two problems can be solved by extension of the SAGE tags into 3′ complementary DNAs (cDNAs) by use of the GLGI technique (generation of longer cDNA fragments from SAGE tags for gene identification). We have improved the original GLGI technique into a high-throughput procedure for simultaneous conversion of a large number of SAGE tags into corresponding 3′ cDNAs. The whole process is simple, rapid, low-cost, and highly efficient, as shown by our use of this procedure for analyzing hundreds of SAGE tags. In addition to identifying the correct gene for SAGE tags with multiple matches, GLGI can be used for large-scale identification of novel genes by converting novel SAGE tags into 3′ cDNAs. Applying this high-throughput procedure should accelerate the rate of gene identification significantly in the human and other eukaryotic genomes.
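The two problems described above (tags that match several known expressed sequences, and tags that match none) can be illustrated with a minimal sketch. It is not the authors' procedure: GLGI is a laboratory technique, and this code only shows how tags might be sorted into the categories the abstract names before any are selected for extension. The 10-base tag length, the CATG anchoring site, the toy sequences, and all function names are assumptions introduced for illustration and do not come from the abstract.

```python
from collections import defaultdict

# Assumptions for illustration only (not stated in the abstract):
TAG_LENGTH = 10       # assumed length of a SAGE tag
ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme recognition site

# Hypothetical toy "known expressed sequences"; real data would come from a database.
KNOWN_SEQUENCES = {
    "geneA": "GGATCCATGAAACCCGGGTTTAAAAAA",
    "geneB": "TTTCATGAAACCCGGGTCCCC",
    "geneC": "ACGTCATGTTTTGGGGCCAAGG",
}


def extract_tag(sequence):
    """Return the TAG_LENGTH bases after the 3'-most anchor site, or None."""
    pos = sequence.rfind(ANCHOR_SITE)
    if pos == -1:
        return None
    start = pos + len(ANCHOR_SITE)
    tag = sequence[start:start + TAG_LENGTH]
    # Discard tags truncated by the end of the sequence.
    return tag if len(tag) == TAG_LENGTH else None


def build_tag_index(known_sequences):
    """Map each predicted tag to the set of sequence names that produce it."""
    index = defaultdict(set)
    for name, seq in known_sequences.items():
        tag = extract_tag(seq)
        if tag is not None:
            index[tag].add(name)
    return index


def classify_tag(tag, index):
    """Classify an observed tag as 'unique', 'multiple', or 'novel'."""
    matches = index.get(tag, set())
    if not matches:
        return "novel"      # no known sequence: candidate for novel-gene discovery
    if len(matches) == 1:
        return "unique"     # usable directly for gene identification
    return "multiple"       # ambiguous: needs a longer 3' cDNA to resolve


if __name__ == "__main__":
    index = build_tag_index(KNOWN_SEQUENCES)
    for observed in ["AAACCCGGGT", "TTTTGGGGCC", "GGGGGGGGGG"]:
        print(observed, classify_tag(observed, index))
```

In this toy data set, `geneA` and `geneB` yield the same tag, so that tag is classified as `multiple`, while a tag absent from the index is classified as `novel`. Under the abstract's description, tags in these two categories are the ones that would be extended into 3′ cDNAs.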
ASJC Scopus subject areas
- Cancer Research