We have developed a technique called the Integrated Procedure for Gene Identification that modifies and integrates parts of several existing techniques to increase the efficiency of genome-wide gene identification. The procedure has the following features: (i) only the 3' portion of the expressed templates is used, to ensure a match to 3' expressed sequence tag (EST) sequences; (ii) the 3' portion of the cDNA is poly dA/poly dT minus, which maintains complete representation of the expressed copies, particularly the rare copies, which otherwise would be lost heavily because of random poly dA/poly dT hybridization in the subtraction reaction; (iii) redundancy is decreased substantially by the subtraction reaction, reducing the sequencing effort; (iv) the nonsubtracted templates, which largely contain the rare copies, are amplified selectively with suppression PCR and are sequenced directly or through serial analysis of gene expression (SAGE); and (v) the identified sequences are matched to databases to determine whether they are cloned genes, ESTs, or novel sequences. Using this procedure in a model system, we showed that the redundant copies were largely removed and that the rates of EST matches and of novel sequence identification were significantly increased. Most of the plasmids containing the matched ESTs are readily available from the IMAGE consortium. This technique can be used to index genome-wide expressed genes and to identify differentially expressed genes in different cells. Compared with the existing techniques, this procedure is relatively efficient, simple, less expensive, and less labor intensive. It is especially useful for standard molecular laboratories performing genome-wide studies.
|Original language||English (US)|
|Number of pages||6|
|Journal||Proceedings of the National Academy of Sciences of the United States of America|
|State||Published - Sep 29 1998|