Journal of Clinical Investigation, 2005, Volume 115, Issue 6

Stem cell–ness: a “magic marker” for cancer

Source: Journal of Clinical Investigation

1Department of Molecular Therapeutics and

2Department of Biostatistics and Applied Mathematics, University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA.

    Abstract

Transcriptional profiling of patient tumors is a much-heralded advancement in cancer therapy, as it provides the opportunity to identify patients who would benefit from more or less aggressive therapy and thus allows the development of individualized treatment. However, translation of this promise into patient benefit has proven challenging. In this issue of the JCI, Glinsky and colleagues used human and murine models to identify a potential stem cell mRNA signature, based on the hypothesis that tumors with stem cell–like characteristics are likely to have a poor prognosis. Remarkably, an 11-gene "expression signature" associated with "stem cell–ness" separated patients with different cancers into good- and poor-prognosis groups. Such a "magic marker" would, if validated, have a major impact on patient care. However, challenges inherent in creating and validating such signatures remain.

See the related article beginning on page 1503

    Prediction and cancer

The inception of high-throughput analyses using oligonucleotide microarrays has given biologists the ability to globally assess RNA levels in a patient’s tumor sample. A typical microarray study generates data on the expression of the approximately 20,000 human genes, and soon studies will be able to analyze the more than 150,000 splice variants of RNA that are likely to have functional roles. The inherent challenge is to convert these data into applicable knowledge. A potential strength of the technology lies in its ability to uncover complex gene interaction patterns and correlate those patterns with clinically relevant outcomes. This "holy grail" could ultimately predict not only the therapeutic response of the tumor present in each patient, but also the patient's survival, which would subsequently lead to the development of individualized therapy for each patient based both on the genetic aberrations in the tumor and on the patient’s own genetic makeup. However, despite early enthusiasm, there have been considerable challenges in converting the promise of individualized molecular medicine into clinical practice.

In this issue of the JCI, Glinsky and colleagues outline a possible expression signature comprising 11 genes that has the ability, according to the authors’ analysis, to segregate tumor samples from multiple tumor lineages into those that have good or poor prognoses (1). The authors have applied this gene set to multiple tissue types from disparate data sets and have repeatedly observed its predictive power. The application of their 11-gene signature to these independent sets addresses an analytical limitation that is often overlooked when "predictive" gene expression signatures are found in microarray experiments (2, 3). When thousands of measurements are taken on each patient, the number of ways to select some of those measurements as a pattern classifying tumors or predicting outcomes is enormous. When selecting multiple gene measurements, the probability of finding a combination with apparent clinical relevance just by chance is even higher.
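The multiple-measurements problem described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis from the study: the "expression" values are pure Gaussian noise, the group sizes (20 good-prognosis and 20 poor-prognosis patients), the number of genes (2,000), and the |t| > 2 cutoff are all hypothetical choices, and the function names are invented for illustration.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def count_chance_hits(n_genes=2000, n_per_group=20, t_cutoff=2.0, seed=0):
    """Count noise-only genes that appear to separate two outcome groups.

    Every gene is drawn from the same distribution in both groups, so any
    gene passing the cutoff is a false positive by construction.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        good = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        poor = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if abs(welch_t(good, poor)) > t_cutoff:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 2000 pure-noise genes pass |t| > 2")
```

Because roughly 5% of noise-only genes are expected to pass a cutoff of this kind, a screen of thousands of genes will typically yield on the order of a hundred apparent "markers" with no biological meaning, which is why validation on independent data matters.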

This multiple-measurements problem can be addressed using a "training and test set" approach, wherein predictive models are validated on separate, independent data sets of adequate size. However, maintaining consistency of the thresholds and cutoffs used across data sets is fundamental to this paradigm. If a certain threshold is used to indicate a specific change on the training set, then it is important that the same threshold be used with subsequent test sets. While Glinsky et al. (1) do separate each data set into training and test sets, they require a separate training and test set and different cutoffs for each cell lineage and potentially for each RNA analysis platform. Since the authors use different cutoffs to identify prognosis in different data sets, one must remain cautious about whether their approach will be generally applicable to patient management. Furthermore, splitting a single data set into 2 smaller sets for training and test purposes also introduces statistical and analytical challenges, since the smaller data sets will have diminished power (4).
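The point about keeping cutoffs fixed can be made concrete with a toy example. The sketch below is illustrative only: the scores are made-up numbers, and using the training-set median as the cutoff rule is an assumption chosen for simplicity, not the rule used by Glinsky et al.

```python
import statistics


def fit_cutoff(train_scores):
    """Derive the decision threshold once, from training data only.

    The median is a hypothetical choice of rule for this illustration.
    """
    return statistics.median(train_scores)


def classify(scores, cutoff):
    """Label a sample 'poor' when its score exceeds the given cutoff."""
    return ["poor" if s > cutoff else "good" for s in scores]


train_scores = [0.10, 0.25, 0.40, 0.55, 0.70]
test_scores = [0.50, 0.60, 0.65, 0.80]

# Correct use of the paradigm: freeze the training cutoff, then apply it.
cutoff = fit_cutoff(train_scores)
frozen_labels = classify(test_scores, cutoff)

# What the paradigm forbids: re-deriving the cutoff on the test set itself.
refit_labels = classify(test_scores, fit_cutoff(test_scores))

print("frozen cutoff:", frozen_labels)
print("refit cutoff: ", refit_labels)
```

With the frozen cutoff, every test sample is called poor-prognosis; with a cutoff re-derived on the test set, the same samples are split evenly. The two answers differ only because the threshold moved, which is the reason a test set evaluated with its own cutoff is no longer an independent check.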

    Stem cells and cancer progression

A key insight in the Glinsky et al. study (1) is the biological motivation driving their selection of the gene signature. The authors begin with the plausible hypothesis that transformed cells, in which self-renewal or stem cell–related pathways are activated, may contribute to the survival of cancer cells in tumors and promote tumor progression and poor prognosis for patients. They apply this idea across species, combining a study of stem cells in a murine leukemia viral-1 (Bmi-1) knockout mouse model with a study of primary and metastatic tumors in a model of transgenic adenocarcinoma of the mouse prostate (TRAMP) in order to select genes that consistently display a stem cell self-renewal–like expression profile in multiple models. This approach builds on the paradigm that cancer likely arises in a limited population of stem cells. These stem cells could potentially have a set of common characteristics, and thus gene expression patterns, across tumor lineages. Characterization of a common expression signature with supplemental tissue-specific gene changes could reflect the cell of origin of a cancer. This concept is compatible with the observation that most tumors are less differentiated than the putative precursor cell and that only a small number of normal cells have the potential for self-renewal (5).

Moreover, if this signature can delineate those patients whose tumors rely on stem cell–like gene expression, then targeting those genes within the tumors may result in a cell population with limited proliferative potential. There is strong support for the hypothesis that the clones that initiated the cancer are different from the majority of cells within the tumor (6). Different pathways may be activated within different clones, and thus therapeutic targeting of these initiating cells may lead to a better outcome. Traditional therapies are aimed at the rapidly dividing cells within the tumor; while these may reduce tumor mass, they may not lead to cures.

    The significance of the Bmi-1–based 11-gene signature

Most statisticians (and many biologists) are leery of studies that claim to find "gene signatures" or "patterns of gene expression" that can be used to predict clinical outcomes across tumor lineages. Much of the uneasiness arises because many studies neglect to precisely define the notion of a signature. By contrast, Glinsky and colleagues produce a concrete definition of a gene signature (1). Their signature is defined quantitatively as an 11-dimensional vector of expression fold-change values in the base-10 logarithm of 11 genes in the peripheral nervous system (PNS) neurospheres in the mouse Bmi-1 knockout model. Deviations from the average expression of these 11 genes in individual tumor samples are correlated with this 11-dimensional vector. While this signature had classification and predictive value when assessed on human tumors, it is important to note that other potential "stem cell–ness" signatures, such as the 14-gene group discussed in the study, did not demonstrate predictive value.

The transcription factor Bmi-1 appears to play a role in gene regulation in both cancerous and normal stem cell proliferation through epigenetic mechanisms, that is, changes that affect gene expression without altering gene structure, such as methylation or acetylation of chromatin. As the authors note, Bmi-1 has previously been shown to be required for maintenance of self-renewing hematopoietic stem cells (HSCs) (7) and for the self-renewal of leukemic stem cells (8). Bmi-1 has been implicated in extending the replicative potential of human fibroblasts through the suppression of the senescence pathway dependent on p16 (a cyclin-dependent kinase inhibitor) in a retinoblastoma protein–dependent manner (9). Further, a deficiency in the p16INK4a gene partially reverses the self-renewal defect in Bmi-1 dominant-negative neural stem cells (10). Additionally, a cooperative interaction between Bmi-1 and the oncogene c-myc has been demonstrated, through enhancement of mouse embryonic fibroblast proliferation, as a result of inhibiting c-myc–induced apoptosis and p19arf (11). It is apparent that the Polycomb group gene Bmi-1 is potentially involved in various phases of tumorigenesis. The 11-gene expression signature highlighted by Glinsky and colleagues (1) was created from changes in gene expression induced by altering Bmi-1 in different backgrounds and validated on data collected by different researchers at different times; it must therefore be taken seriously and subjected to extensive evaluation and validation in independent laboratories.

    The 11-gene signature

At the very least, the 11-gene signature suggests the possibility of more accurately identifying patients with poor prognoses as candidates for more aggressive or investigational therapy. As the mode of treatment is different for each tumor lineage, it also suggests that the signature truly indicates prognosis rather than predicting response to therapy. However, it could include a general indication of sensitivity to cell death as a component of its prognostic load, although the individual components of the 11-gene signature have not been previously indicated as major regulators of cell survival.

The components of this 11-gene set vary in function and exist in different pathways (see Table 1 for gene names, signatures, and values). Both budding uninhibited by benzimidazoles 1 (BUB1) and kinetochore-associated 2 (KNTC2; also known as HEC) have been implicated as mitotic checkpoint proteins, and as such, aberrations in their function could contribute to genomic instability and aneuploidy. Mutations in the BUB1 gene lead to its inactivation and increase microsatellite instability in colon cancer as well as a predisposition to certain types of cancers (12, 13). Increased gastrulation homeobox 2 (GBX2) expression stimulates growth of human prostate cancer cells through upregulation of the gene coding for IL-6 (14). Overexpression of cyclin B1 (CCNB1) has been observed in high-grade large-cell and small-cell lung carcinoma (15), and CCNB1 levels have been shown to be downregulated as a result of p53 induction in non–small-cell lung cancer (16). Additionally, CCNB1 expression was highly correlated with the labeling index for antigen identified by mAb ki-67 (Ki-67, associated with increased tumor cell proliferation), which suggests a key role for CCNB1 in the regulation of neuroendocrine tumor cell proliferation (15). In breast cancer cell lines, overexpression of the FGF receptor 2 (FGFR2) gene resulted in activation of the MAPK and PI3K pathways (17). Interestingly, restoration of this gene product into a malignant prostate epithelial cancer cell line, PC3, led to suppression of malignancy and restoration of nonmalignant traits. Thus, 3 of the 11 genes identified in the Glinsky et al. gene signature (1) are related to cell proliferation and 2 to transition through mitosis. While little is known regarding the specific biological functions of ubiquitin-specific protease 22 (USP22), ubiquitin-specific proteases have been implicated in control of regulatory molecules such as p53 and cyclins (18).
The ring finger 2 (RNF2) protein is part of the Polycomb group of proteins, like Bmi-1, that play key roles in hematopoiesis and cell-cycle regulation (19). Ankyrins are transmembrane proteins shown to be involved in cellular functions relating to the influx and efflux of sodium and calcium (20, 21). Carboxylesterase (CES) enzymes are found in many animal species and play a role in the hydrolysis of drugs such as steroids and anticancer agents (22). Mutations in CES1 have been implicated pharmacogenomically in the activation status of cancer drugs and prodrugs (23). At this time, the role of host cell factor c1 (HCFC1) in cancer has yet to be evaluated; however, as the host cell factor family is implicated in immunomodulation, it is possible that HCFC1 plays a role in limiting the immune response to cancer or in the production of cytokines such as IL-6 or IL-8, which can contribute to neovascularization or tumor growth.

   Table 1

Gene expression signature detected by Glinsky et al. (1) and reported to predict good or poor prognosis of patients with multiple types of cancer

As demonstrated with the prostate cancer data set, the 11-gene set can be divided into 2 groups: those for which elevated expression levels are associated with stem cell–ness and a poor prognosis (Ki67, CCNB1, GBX2, BUB1, KNTC2, USP22, and RNF2, in descending order of strength of association) and those for which decreased expression levels are associated with stem cell–ness and a good prognosis (CES1, FGFR2, and ankyrin 3; see Table 1) (1). High levels of these stem cell–related genes indicate a potential for self-renewal within tumors and for increased tumor aggressiveness within patients. Intriguingly, those genes that are positively associated with tumor cell proliferation (Ki67, CCNB1, and GBX2) and with the mitotic spindle (BUB1 and KNTC2) are also positively associated with the stem cell–ness signature. FGFR2, which has been shown to decrease the growth of prostate cancer cells, is negatively associated with the stem cell–ness signature. The association between this 11-gene signature and good and poor prognosis therefore makes sense, at least for those genes that have been characterized as associated with the behavior of cancer cells.

While the relevance to cancer progression and malignancy has been studied and even established in certain members of this 11-gene set, others have yet to be studied in detail. This underscores another strength of high-throughput analysis: the use of such a global approach may identify new and unexpected targets for study, some of which may prove to be potential therapeutic targets. The inclusion of genes involved in cancer cell proliferation as well as unexplored genes within this predictive group suggests that further analysis into their mechanisms and potential involvement in signaling pathways germane to cancer is warranted.

    Can their analysis be replicated?

In many microarray studies, how the data are analyzed may be more important than how they were generated. It is therefore incumbent upon researchers to describe their analytical methods in enough detail to allow independent researchers to replicate their computations on the same data. Ideally, these details take the form of precise equations or algorithms or the actual code used to analyze the data (24-26). Although the Glinsky et al. study (1) reports many of the critical values needed to replicate their results, their reliance on purely verbal descriptions leaves room for some ambiguity.

We attempted to replicate part of their analysis and turned our attention to the same lung cancer study used by the authors (27), which included survival data on 125 patients and microarray data from Affymetrix U95Av2 GeneChips. Our estimate of the standard gene expression signature is listed in Table 1. Because of redundancy, a total of 18 probe sets represent these 11 genes on the U95Av2 array (Table 2). We used all 18 probe sets, which expanded the fixed gene signature vector from Table 1 into an 18-dimensional vector. For each of the 18 probe sets, we computed the average expression over all 125 samples. We divided the expression vectors in individual tumor samples by the average expression value and then transformed the ratios by computing the base-10 logarithm, producing 18-dimensional vectors. We then computed the Pearson correlation coefficient between the fixed gene expression signature vector and the individual vector of log ratios. This procedure yielded 1 number, x_i, for each tumor sample i = 1, ..., 125, which represented our best estimate of the stem cell–like phenotype association index (SPAI) described in the Glinsky et al. study (1).
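The computation described in this paragraph can be sketched in a few lines. This is one reading of the verbal description, written under stated assumptions: the function names are invented, the 3-probe signature and the intensities are toy values rather than the entries of Table 1 or real array data, and nothing here is confirmed to match the procedure actually used by Glinsky et al.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spai_scores(expression, signature):
    """Estimate one association index per tumor sample.

    expression: list of samples, each a list of positive probe-set intensities.
    signature:  fixed reference vector of log10 fold changes, one per probe set.
    """
    n_samples = len(expression)
    n_probes = len(signature)
    # Average expression of each probe set over all samples.
    means = [sum(sample[j] for sample in expression) / n_samples
             for j in range(n_probes)]
    scores = []
    for sample in expression:
        # Ratio to the average, then base-10 logarithm.
        log_ratios = [math.log10(sample[j] / means[j]) for j in range(n_probes)]
        # Correlate the sample's log ratios with the fixed signature.
        scores.append(pearson(signature, log_ratios))
    return scores


# Toy data (hypothetical): 3 probe sets, 3 tumor samples.
signature = [1.0, 0.0, -1.0]
expression = [
    [200.0, 100.0, 50.0],   # pattern resembles the signature
    [50.0, 100.0, 200.0],   # pattern opposes the signature
    [100.0, 120.0, 110.0],  # close to average
]
scores = spai_scores(expression, signature)
print([round(s, 3) for s in scores])
```

Each score lies between -1 and 1. A sample whose deviations from the average follow the signature scores positive, and one whose deviations run the other way scores negative. A sample exactly equal to the average on every probe set would have zero variance in its log ratios, so the correlation would be undefined; the sketch does not handle that edge case.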

   Table 2

Probe sets on the Affymetrix U95Av2 array representing the 11 genes

We used SPAI as a continuous covariate in a Cox proportional hazards model to predict survival in the full data set and found that SPAI was significant (likelihood ratio test, P = 0.0179). Because the coefficient of x_i in the model was positive (0.798), higher values of SPAI were associated with shorter survival. Using trial and error, we determined a threshold on the full data set that, when used to split the samples into 2 groups, yielded the best results in a survival analysis. The optimum threshold occurred at 0.32 and separated the data into 40 samples with a stem cell–like profile (x_i > 0.32) and 85 samples without a stem cell–like profile. To this extent, we were able to confirm the analysis reported by Glinsky et al. (Figure 1).
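The trial-and-error threshold search can be sketched as a scan over candidate cutoffs. The sketch below makes several assumptions that should be read as such: it scores each split with a two-group log-rank statistic as a simple stand-in for "best results in a survival analysis" (the text does not say which criterion was used, and the Cox model itself is not reimplemented here), the minimum group size is a hypothetical guard, and the scores and survival times are invented toy values.

```python
def logrank_chi2(times, events, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times:    follow-up time per patient
    events:   1 if death was observed, 0 if censored
    in_group: True for patients in the group of interest
    """
    observed_minus_expected = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)                       # at risk
        n1 = sum(1 for ti, g in zip(times, in_group) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # deaths
        d1 = sum(1 for ti, e, g in zip(times, events, in_group)
                 if ti == t and e and g)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_threshold(scores, times, events, min_group=2):
    """Scan every observed score as a cutoff; return (best chi2, best cutoff)."""
    best_chi2, best_cut = 0.0, None
    for cut in sorted(set(scores)):
        high = [s > cut for s in scores]
        if min(sum(high), len(high) - sum(high)) < min_group:
            continue  # skip splits that leave a group too small
        chi2 = logrank_chi2(times, events, high)
        if chi2 > best_chi2:
            best_chi2, best_cut = chi2, cut
    return best_chi2, best_cut


# Toy data (hypothetical): higher score goes with shorter survival.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
times = [50, 48, 46, 44, 10, 8, 6, 4]
events = [1, 1, 1, 1, 1, 1, 1, 1]

chi2, cut = best_threshold(scores, times, events)
print(f"best cutoff {cut} with log-rank chi-square {chi2:.2f}")
```

A cutoff found this way is tuned to the data set it was chosen on, so the statistic it produces is optimistic by construction. That is why the threshold would need to be frozen and applied unchanged to independent samples before being interpreted.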

   Figure 1

Stem cell–positive samples correlate with poor prognosis in lung adenocarcinoma patients. Cox proportional hazards model of lung cancer patients (n = 125) using the SPAI confirmed the correlation between patient samples with stem cell–like expression pattern and poor overall survival (P = 0.0179). Analysis based on patient data from the lung cancer study by Bhattacharjee et al. (27).

Based on this preliminary attempt to replicate the authors’ analyses, we believe that their results should be greeted with cautious optimism. However, we were unable to validate their method when splitting the data multiple times into training and test sets. Our splits were more challenging than the ones used by the authors, since we did not use outcome information to balance the splits. It is also not clear whether we followed the same procedure for computing the SPAI that was used by Glinsky et al. (1).

    Conclusions

Glinsky and colleagues (1) have produced a stimulating analysis of a collection of microarray data experiments. Using ideas that were well motivated by the underlying biology and combining data across species from 2 different microarray studies, they identified an 11-gene signature that might be related to tumor behavior, and thus patient survival, in cancer. In order to validate this signature, they tested it retrospectively in a broad spectrum of microarray studies of different kinds of cancer.

In our hands, unsupervised analyses of expression levels across some of the same data sets used by Glinsky and colleagues (1) did not reveal any clinically relevant or statistically significant correlations with outcome. By introducing a weighting coefficient into their predictor model, they are in fact adding an element of supervision into the analysis. Further, as these coefficients are re-derived on each subsequent data set, the statistical and analytical issues inherent in high-throughput technologies are insufficiently addressed (2).

Glinsky et al. proposed several models, but only the 11-gene metastatic TRAMP tumor sample/PNS set consistently differentiated samples in a clinically relevant manner (1). An important aspect of this gene set and the corresponding coefficients used in determining the weighted survival predictor is that those genes that have been previously identified as potential indicators of poor prognosis are given the greatest coefficients. That is, negative weighting correlates with longer survival. For example, in the prostate cancer SPAI, the authors assign the largest coefficient value to Ki-67, which is associated with increased proliferation in cancer and could therefore lead to greater aggressiveness and poorer prognosis.

In the search for a powerfully predictive set of cancer genes, various signatures have been proposed that contain anywhere from 10 to 100 genes. However, recent studies have cast doubt on the power of those sets that were validated within a single data set or even a single tumor type (2–4). As such, validation of a proposed gene expression signature across independent data sets is warranted. To their credit, Glinsky and colleagues have gone to great lengths to validate their 11-gene signature, albeit in a supervised manner, across multiple tumor samples and multiple tissue types. As researchers begin to question the results reported from microarray studies (2–4, 28), the need for replicable analyses and independent corroboration grows more acute. Regardless of whether it was said by Niels Bohr or Yogi Berra, it is still the case that "Prediction is very difficult, especially about the future."

        References

1. Glinsky, G.V., Berezovska, O., and Glinskii, A.B. 2005. Microarray analysis identifies a death-from-cancer signature predicting therapy failure in patients with multiple types of cancer. J. Clin. Invest. 115:1503-1521. doi:10.1172/JCI23412.

2. Michiels, S., Koscielny, S., and Hill, C. 2005. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet. 365:488-492.

3. Ransohoff, D.F. 2004. Rules of evidence for cancer molecular-marker discovery and validation. Nat. Rev. Cancer. 4:309-314.

4. Ransohoff, D.F. 2005. Lessons from controversy: ovarian cancer screening and serum proteomics. J. Natl. Cancer Inst. 97:315-319.

5. Cheng, W., Liu, J., Yoshida, H., Rosen, D., and Naora, H. 2005. Lineage infidelity of epithelial ovarian cancers is controlled by HOX genes that specify regional identity in the reproductive tract. Nat. Med. 11:531-537.

6. Dick, J.E. 2003. Breast cancer stem cells revealed. Proc. Natl. Acad. Sci. U. S. A. 100:3547-3549.

7. Park, I.K. et al. 2003. Bmi-1 is required for maintenance of adult self-renewing haematopoietic stem cells. Nature. 423:302-305.

8. Lessard, J., and Sauvageau, G. 2003. Bmi-1 determines the proliferative capacity of normal and leukaemic stem cells. Nature. 423:255-260.

9. Itahana, K. et al. 2003. Control of the replicative life span of human fibroblasts by p16 and the polycomb protein Bmi-1. Mol. Cell. Biol. 23:389-401.

10. Molofsky, A.V. et al. 2003. Bmi-1 dependence distinguishes neural stem cell self-renewal from progenitor proliferation. Nature. 425:962-967.

11. Jacobs, J.J. et al. 1999. Bmi-1 collaborates with c-Myc in tumorigenesis by inhibiting c-Myc-induced apoptosis via INK4a/ARF. Genes Dev. 13:2678-2690.

12. Cahill, D.P. et al. 1998. Mutations of mitotic checkpoint genes in human cancers. Nature. 392:300-303.

13. Hanks, S. et al. 2004. Constitutional aneuploidy and cancer predisposition caused by biallelic mutations in BUB1B. Nat. Genet. 36:1159-1161.

14. Gao, A.C., Lou, W., and Isaacs, J.T. 2000. Enhanced GBX2 expression stimulates growth of human prostate cancer cells via transcriptional up-regulation of the interleukin 6 gene. Clin. Cancer Res. 6:493-497.

15. Igarashi, T. et al. 2004. Divergent cyclin B1 expression and Rb/p16/cyclin D1 pathway aberrations among pulmonary neuroendocrine tumors. Mod. Pathol. 17:1259-1267.

16. Dubrez, L. et al. 2001. Cell cycle arrest is sufficient for p53-mediated tumor regression. Gene Ther. 8:1705-1712.

17. Moffa, A.B., Tannheimer, S.L., and Ethier, S.P. 2004. Transforming potential of alternatively spliced variants of fibroblast growth factor receptor 2 in human mammary epithelial cells. Mol. Cancer Res. 2:643-652.

18. Chen, C. et al. 2005. Ubiquitin-proteasome degradation of KLF5 transcription factor in cancer and untransformed epithelial cells. Oncogene. doi:10.1038/sj.onc.1208497.

19. Sanchez-Beato, M. et al. 2004. Abnormal PcG protein expression in Hodgkin’s lymphoma. Relation with E2F6 and NFkappaB transcription factors. J. Pathol. 204:528-537.

20. Hayashi, T., and Su, T.P. 2001. Regulating ankyrin dynamics: Roles of sigma-1 receptors. Proc. Natl. Acad. Sci. U. S. A. 98:491-496.

21. Kretschmer, T. et al. 2002. Painful neuromas: a potential role for a structural transmembrane protein, ankyrin G. J. Neurosurg. 97:1424-1431.

22. Redinbo, M.R., Bencharit, S., and Potter, P.M. 2003. Human carboxylesterase 1: from drug metabolism to drug discovery. Biochem. Soc. Trans. 31:620-624.

23. Marsh, S. et al. 2004. Pharmacogenomic assessment of carboxylesterases 1 and 2. Genomics. 84:661-668.

24. Gentleman, R. 2005. Reproducible research: a bioinformatics case study. Statistical Applications in Genetics and Molecular Biology. 4:Article 2. http://www.bepress.com/sagmb/vol4/iss1/art2.

25. Gentleman, R., and Lang, D.T. 2004. Statistical Analyses and Reproducible Research. Bioconductor Project Working Papers. Working Paper 2. http://www.bepress.com/bioconductor/paper2.

26. Leisch, F., and Rossini, A. 2003. Reproducible statistical research. Chance. 16:46-50.

27. Bhattacharjee, A. et al. 2001. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. U. S. A. 98:13790-13795.

28. Ioannidis, J.P. 2005. Microarrays and molecular research: noise discovery? Lancet. 365:454-455.

 

Authors: John P. Lahad1, Gordon B. Mills1 and Kevin R. Coom 2007-5-11