TY - JOUR
T1 - Derivation of stable microarray cancer-differentiating signatures using consensus scoring of multiple random sampling and gene-ranking consistency evaluation
AU - Tang, Zhi Qun
AU - Han, Lian Yi
AU - Lin, Hong Huang
AU - Cui, Juan
AU - Jia, Jia
AU - Low, Boon Chuan
AU - Li, Bao Wen
AU - Chen, Yu Zong
PY - 2007/10/15
Y1 - 2007/10/15
AB - Microarrays have been explored for deriving molecular signatures to determine disease outcomes, mechanisms, targets, and treatment strategies. Although they exhibit good predictive performance, some derived signatures are unstable because of noise arising from measurement variability and biological differences. Improvements in measurement, annotation, and signature selection methods have been proposed. We explored a new signature selection method that incorporates consensus scoring of multiple random sampling and multistep evaluation of gene-ranking consistency to avoid, as far as possible, erroneous elimination of predictor genes. This method was tested by using a well-studied 62-sample colon cancer data set and two other cancer data sets (86-sample lung adenocarcinoma and 60-sample hepatocellular carcinoma). For the colon cancer data set, the derived signatures of 20 sampling sets, composed of 10,000 training-test sets, are fairly stable, with 80% of the top 50 and 69% to 93% of all predictor genes shared by all 20 signatures. These shared predictor genes include 48 cancer-related and 16 cancer-implicated genes, as well as 50% of the previously derived predictor genes. The derived signatures outperform all previously derived signatures in predicting colon cancer outcomes from an independent data set collected from the Stanford Microarray Database. Our method showed similar performance for the other two data sets, suggesting its usefulness in deriving stable signatures for biomarker and target discovery.
UR - http://www.scopus.com/inward/record.url?scp=35448932757&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35448932757&partnerID=8YFLogxK
U2 - 10.1158/0008-5472.CAN-07-1601
DO - 10.1158/0008-5472.CAN-07-1601
M3 - Article
C2 - 17942933
AN - SCOPUS:35448932757
SN - 0008-5472
VL - 67
SP - 9996
EP - 10003
JO - Cancer Research
JF - Cancer Research
IS - 20
ER -