Genome-Wide Expression Studies of Atherosclerosis

Source: Arteriosclerosis, Thrombosis, and Vascular Biology, 2006, Volume 26, Issue 6

Abstract

During the past 6 years, gene expression profiling of atherosclerosis has been used to identify genes and pathways relevant in vascular (patho)physiology. This review discusses some critical issues in the methodology, analysis, and interpretation of the data of gene expression studies that have made use of vascular specimens from animal models and humans. Analysis of gene expression studies has evolved toward the genome-wide expression profiling of large series of individual samples of well-characterized donors. Despite the advances in statistical and bioinformatical analysis of expression data sets, studies have not yet fully exploited the potential of gene expression data sets to obtain novel insights into the molecular mechanisms underlying atherosclerosis. To assess the potential of published expression data, we compared the data of a CC chemokine gene cluster between 18 murine and human gene expression profiling articles. Our analysis revealed that an adequate comparison is mainly hindered by the incompleteness of available data sets. The challenge for future vascular genomic profiling studies will be to further improve the experimental design and the statistical and bioinformatical analysis, and to make data sets freely accessible.

This review discusses some critical issues in the methodology, analysis, and interpretation of gene expression studies using vascular specimens from animals and humans. Our analysis demonstrates that future studies may benefit from recent developments in statistical and bioinformatical analysis methods to exploit the full potential of transcriptomics data.

Keywords: atherosclerosis, gene expression, genetically altered mice, pathology, vascular biology


Introduction


Since the beginning of this millennium, various techniques to quantify gene expression on a genome-wide scale have been applied to identify genes and pathways underlying atherosclerosis. Genome-wide approaches provide the ability to survey the expression level of thousands of genes simultaneously. Common study objectives are: (1) the identification of significantly differentially expressed genes between samples or groups of samples; (2) the generation of hypotheses about the mechanisms underlying the observed phenotypes 1,2; and (3) the identification of gene expression patterns for classification purposes.


Gene expression experiments produce complex and large data sets, and many investigators are not experienced in the analytical steps needed to convert tens of thousands of data points into reliable and interpretable biologic information. It is important to realize that several prerequisites need to be fulfilled to select the appropriate genes to meet the objectives described above. The experimental design needs to fulfill some minimal criteria to obtain meaningful results, 3 such as a clear description of analyzed samples, and sample sizes representative of the population and its intrinsic variance. Furthermore, the analysis strategy for a genome-wide experiment should be determined in light of the overall objective of the study. 1 For instance, cluster analysis (ie, a method for partitioning samples into groups on the basis of the similarities and differences among their gene expression profiles) can help in generating hypotheses but does not provide statistically valid quantitative information about the degree of differential expression between classes. 1,2


In this review, we give an overview and critical analysis of the design, analytical approaches, and outcome of published data sets of atherosclerosis. Focusing on murine and human atherosclerosis, we discuss the possibility to integrate information obtained from different expression studies. We exemplify shortcomings in available data by comparing published expression data of seven CC chemokine family members in murine and human atherosclerosis. Our analysis reveals the methodological limitations in the published studies and highlights challenges for future genomic profiling.


Approaches for Gene Expression Profiling of Atherosclerosis Using Vascular Samples


In gene expression profiling studies in atherosclerosis, transcript levels have been determined in human as well as animal vascular samples. Mice are the most commonly used animals (7 studies; Table 2), but larger animals such as rats 4-6 and pigs 7 have also been used.


TABLE 2. Overview of Gene Expression Profiling Studies in Murine Atherosclerosis


Most studies have been performed on small numbers of biological replicates, resulting in a low statistical power for detecting differentially expressed genes. Although the value of studies with small sample sizes should not be underestimated, recent studies 8 clearly demonstrate that increasing the sample size increases the statistical power and decreases the error rate. If the set of samples is not a true representation of the population and its intrinsic variance, the distribution of parameters will be biased toward those specific for the type of samples collected. For the same statistical power, fewer individuals are needed when using inbred animals than when using outbred human subjects. An approach to using fewer human samples while maintaining the same statistical power is to select samples with lower variability in the gene expression profile (eg, samples from the same site in the vascular bed, or from donors of the same gender, similar age, or similar clinical record). Finding consistent data within an outbred population can be among the most convincing kinds of evidence. Stronger arguments can be made against the validity of studies performed in inbred animals, which are theoretically equivalent to only one individual within an outbred population. This is exemplified by the variations in expression levels of inbred strains with different genetic backgrounds. 9
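The relationship between sample size, within-group variability, and statistical power described above can be illustrated with a rough calculation. The sketch below is not taken from any of the reviewed studies; it assumes a single gene, a two-sided two-sample z-test (a normal approximation that is optimistic for very small groups, where a t test would be used), equal standard deviations in both groups, and hypothetical effect sizes. The function names approx_power and replicates_needed are illustrative only.

```python
from statistics import NormalDist


def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for one gene.

    effect: true difference in mean log expression between the groups
    sd: within-group standard deviation (assumed equal in both groups)
    n_per_group: number of biological replicates per group
    """
    z = NormalDist()
    se = sd * (2.0 / n_per_group) ** 0.5    # standard error of the difference
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    shift = abs(effect) / se                # standardized true difference
    # Probability that the test statistic lands in either rejection region
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def replicates_needed(effect, sd, target_power=0.8, alpha=0.05, max_n=10000):
    """Smallest number of replicates per group that reaches the target power."""
    for n in range(2, max_n + 1):
        if approx_power(effect, sd, n, alpha) >= target_power:
            return n
    return None  # target not reachable within max_n


if __name__ == "__main__":
    # Power grows with the number of replicates (hypothetical effect and sd)
    for n in (3, 5, 10, 20):
        print(n, round(approx_power(effect=1.0, sd=1.0, n_per_group=n), 3))
    # Lower within-group variability means fewer replicates for the same power
    print(replicates_needed(effect=1.0, sd=1.0))
    print(replicates_needed(effect=1.0, sd=0.5))
```

Under these assumptions, halving the within-group standard deviation cuts the required number of replicates substantially, which is the rationale for selecting samples from a single vascular site or from comparable donors.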


In the majority of gene expression profiling studies, the analysis has been performed on entire vessel segments 10-25 to get information about all the molecular processes involved in the development of atherosclerosis. A limitation of this approach is the lack of insight into the underlying reason for observed differences in gene expression levels. These differences may reflect the changed composition of the vessel wall during atherosclerosis (eg, thinning of the medial smooth muscle cell layer), result from the presence of different cell types (eg, infiltration of T-lymphocytes), or reflect a change in the gene expression profile of cells or a subpopulation of cells caused by a pro-atherosclerotic environment (eg, differentiation of macrophages to foam cells).


To circumvent the problem of analyzing complex tissues made up of multiple cell types, 3 different approaches have been used that correct for differences in plaque composition by the isolation of relatively pure cell populations. In 3 studies, 7,26,27 macrodissection was used to separate the smooth muscle cell (SMC)-rich fibrous cap, 26,27 media, 7,26 and nonatherosclerotic intima. 26 To this end, the adventitia was trimmed, the endothelium was scraped off, and the fibrous cap was dissected from the necrotic core and shoulder regions. In 2 studies, 10,28 laser capture microdissection was used to dissect SMCs or macrophages from whole mount specimens for subsequent cell-specific RNA isolation. A drawback of both macrodissection and laser capture microdissection is the low tissue and RNA yield. This necessitates pooling of samples and/or (several rounds of) amplification techniques. Pooling of samples limits the possibilities for downstream statistical analysis (see "Key Features of Gene Expression Studies of Atherosclerosis"). Pros and cons of amplification have been discussed in several recent articles and include issues such as reproducibility 29 and effects on magnitude of differential expression 30 caused by amplification. An alternative approach to obtain relatively pure cell populations and to remove cell products is to culture cells after isolation from entire vessel wall samples. An advantage of this approach is that cell type-specific transcripts are amplified in culture, omitting the need for pooling or amplification procedures. However, a disadvantage is that the in vitro culture may cause a shift in the transcriptome. As a result, the expression profiles of cultured cells may not be entirely representative of the in vivo situation. In the current review, we do not aim to provide an overview of articles using in vitro cultures to study atherosclerosis.


Platforms Applied for Expression Profiling of Atherosclerosis


Various screening methods have been applied to study gene expression patterns in atherosclerosis: array analysis, 10-22,27,28 suppression subtractive hybridization (SSH), 23 and cDNA representational difference analysis (RDA). 24,25,31 Array studies 32 used genome-wide platforms to analyze nearly the complete transcriptome or "selected" arrays to measure the mRNA levels of several hundreds of genes. SSH 23 and RDA 24,25,31 are polymerase chain reaction (PCR)-based screening techniques to identify genes with a differential expression level. The aim of SSH and RDA is to amplify transcripts that are differentially expressed and to identify the corresponding genes after sequencing of the cloned amplicons. In contrast with array studies, these techniques do not provide quantitative information with respect to the differences in mRNA abundance. An advantage of SSH and RDA is that these techniques are not limited to the evaluation of genes represented on an array. Currently, microarrays, 33,34 as well as SSH 7 and RDA, 24,35 allow for the detection of low-abundance transcripts.


Key Features of Gene Expression Studies of Atherosclerosis


Critical issues regarding the quality of a gene expression study are experimental design, analysis, and the availability of data sets. An overview of how these criteria are met in atherosclerosis studies is presented in Table 1. For this analysis we included all of the 20 published articles using whole mount human and murine vessels.


TABLE 1. Overview of Gene Expression Studies in Atherosclerosis


Design


To assess the design, we scored the published studies based on the following parameters: (1) morphological data of the vessel samples; (2) general donor data (gender and age); (3) clinical data (eg, diabetes, blood pressure, medication); (4) comparison of samples with similar cellular composition; and (5) profiling of individual plaques. This analysis shows that gene expression profiling studies over the past 6 years have evolved from experiments assessing differences in expression levels of several hundreds of genes between pools of poorly characterized samples, to profiling of thousands of genes in well-characterized individual samples from donors with detailed information about age, gender, and, to a lesser extent, clinical data. Human plaques have been classified in different ways: based on the American Heart Association guidelines, 13,18,19,21,23,25,28 based on pulse wave velocity measurements, 20 or after macroscopic inspection. 16,22 However, a detailed microscopic characterization of human samples has only been given in 2 studies. 18,28 Consequently, this lack of transparency in the morphology of analyzed samples hinders extrapolation and comparison of gene expression results. In only a minority of studies, 10,25-28 the expression profiles of samples with a uniform cell content have been compared. However, one of these studies compares the transcriptome of SMC-enriched and macrophage-enriched cell populations 28 and does not exploit the advantage of isolating relatively pure cell populations. Last, but not least, we observed an evolution from analyzing pools of samples toward individual samples, which has a positive effect on the possibilities for downstream statistical analysis.


Statistical and Bioinformatical Analysis


In evaluating analytical procedures, we have taken into account various statistical and bioinformatical issues, as well as the need for validation of array results. To identify genes that are differentially expressed in the studied conditions, the degree of differential expression was evaluated in different ways (Table 1). Most studies ranked the magnitude of differential expression based on fold-change in expression level. 10-12,14-16,18,19,27,28 Although this method is simple and intuitive, fold-change does not address the reproducibility of the observed difference and cannot be used to determine the statistical significance (for a review, see Draghici et al 36 ). To address this, comparison statistics (eg, t test, 13,15,19,21 ANOVA, 17 Wilcoxon ANOVA 19 ) need to be used to assign a confidence level to the differential expression. These statistics require replicates and use the variability within the replicates to assign a probability value that indicates the probability of incorrectly classifying a gene as differentially regulated. It needs to be considered that expression levels of thousands of genes are analyzed simultaneously in a microarray. This implies that a subset of genes will always be classified as "significantly different" simply by chance. To reduce the chance of false-positives, the probability value needs to be adjusted for multiple testing error (for a review, see Draghici et al 36 ). At present, however, there is no consensus about the best method to correct for multiple testing in microarray experiments. Five recent studies have used permutation-based methods for this purpose: significance analysis of microarrays 37 (SAM), 13,21 custom-made analysis algorithms including permutations to study multiple sets of time-course microarray data, 9,13 or comparison with randomized data sets to quantify the false-positive frequency in the data sets. 26
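To make the multiple testing problem concrete, the following sketch shows how many genes would pass an unadjusted threshold by chance alone and how one widely used adjustment changes a list of probability values. The Benjamini-Hochberg false discovery rate procedure is used here only as an illustration; it is not the permutation-based approach (such as SAM) applied in the studies cited above, and the probability values in the example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    n_genes, alpha = 10000, 0.05
    # With no true differences, roughly alpha * n_genes genes pass an unadjusted test
    print("expected chance findings:", n_genes * alpha)

    # Hypothetical per-gene probability values from a comparison statistic
    toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    adj = benjamini_hochberg(toy_p)
    print([round(a, 3) for a in adj])
    print("unadjusted hits:", sum(p < alpha for p in toy_p))
    print("FDR-adjusted hits:", sum(a < alpha for a in adj))
```

In this toy list, fewer genes remain below the 0.05 threshold after adjustment than before it, which is the intended effect of correcting for the number of genes tested.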


The statistical techniques discussed above aim to identify genes that are differentially expressed on a gene-by-gene basis. An alternative method of analyzing microarray data sets is to exploit the correlation in expression patterns between genes that perform similar functions or belong to the same biological pathway. To this end, various multivariate analysis methods have been developed to identify patterns of gene expression in microarray data. 38 In atherosclerosis research, hierarchical clustering, 17 K-means clustering, 22 self-organizing maps, 11 principal component analysis, 13 or custom-made methods 12 have been used. This "unsupervised" clustering 38 is performed without previous knowledge of groups and may lead to the identification of genes that share previously unknown common expression patterns (identification of novel subsets of genes) or the discovery of new subtypes of disease (identification of subsets in experimental groups). This type of analysis has not yet led to follow-up studies to confirm the proposed coregulation of genes or novel disease subsets in atherosclerosis. Clustering procedures have also been performed on differentially expressed genes (eg, on genes with P <0.05 using Student t test 19 ) instead of on complete expression data sets. When clustering is used on a selected data set, the added value of cluster analysis is limited because it will yield the same results as in the previous selection procedure and will not provide additional information about differences in expression patterns. 19
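As a minimal sketch of unsupervised clustering, the example below groups genes by the similarity of their expression profiles using average-linkage agglomerative clustering with a distance of 1 minus the Pearson correlation. The gene names and values are hypothetical, the choice of linkage and distance is an assumption made for illustration, and none of the reviewed studies is claimed to have used this exact procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles.

    Profiles with zero variance are not handled (division by zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering of genes using distance = 1 - Pearson r."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]  # start with one gene per cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)  # mean pairwise distance
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical log-ratios for five genes across six samples
    toy = {
        "gene_A": [0.1, 0.4, 0.9, 1.5, 2.1, 2.8],
        "gene_B": [0.0, 0.5, 1.1, 1.4, 2.3, 2.6],
        "gene_C": [2.5, 2.0, 1.6, 1.0, 0.4, 0.1],
        "gene_D": [2.7, 2.2, 1.4, 1.1, 0.5, 0.0],
        "gene_E": [0.2, 0.3, 1.0, 1.6, 2.0, 2.9],
    }
    print(average_linkage(toy, n_clusters=2))
```

The grouping is driven entirely by co-variation across samples, so it proposes candidate coregulated genes but carries no statistical statement about differential expression between classes.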


Apart from these studies, 2 studies 13,22 classified vessel samples according to location 22 or disease state 13,22 based on the expression levels of classifier genes. Classifier genes are a fixed subset of "informative genes" chosen based on the correlation of their expression level with a class distinction and used to make a prediction about a new sample on the basis of the expression level of these genes. The statistics associated with sample classification differ considerably from the statistics associated with the comparison of gene expression levels. A critical overview 2 and a typical, well-documented example from the cancer field 1 are added as references.


In atherosclerosis, classifier genes have been identified after analysis of a large series of samples (7 to 32 individual samples of the same artery per experimental group; Table 3 ). The validation has been performed using out-of-sample cross validation 22 or independent (murine and human) test sets. 13 In contrast to comparable studies in cancer research (eg, predicting the survival from breast or prostate cancer), 39,40 these studies have no direct clinical application. However, they indicate the existence of expression profiles in different stages or sites of atherosclerotic plaque development, which are conserved despite inter-individual 22 or even inter-species 13 variations.


TABLE 3. Overview of Gene Expression Profiling Studies in Human Atherosclerosis


To unveil the biological relevance of gene expression profiles, biological information about differentially expressed genes needs to be obtained. To this end, literature mining (eg, using NCBI, Gene Ontology, Kyoto Encyclopedia of Genes and Genomes, or Biocarta databases) can be performed. Several atherosclerosis studies with a limited number of differentially expressed genes 10,11,14-17,23,25,27,31 have been analyzed at the individual gene level (possibly followed by grouping of functionally related genes 19 ). To overcome the enormous efforts that are needed to perform literature mining on a gene-by-gene basis, especially across a data set of several thousands of genes, tools have been developed to link genomics data to literature data in an automated way (for a review see Curtis et al 41 ). As demonstrated in Table 1, Gene Ontology (GO)-based analyses are widespread. GO is a manually curated database using a standardized vocabulary of terms describing biological processes, cellular components, and molecular functions of genes. In addition, it provides a hierarchical structure for organizing genes into biologically relevant groups. In 7 studies, 12,13,18,20-22,28 differentially expressed genes were categorized based on GO terms. However, to identify genes that warrant further study, GO-based enrichment procedures are more appropriate to identify GO terms that are overrepresented in a set of differentially expressed genes (eg, in King et al 21 ). In these enrichment procedures several statistical methods 41 can be used to calculate whether more genes from a particular pathway or classification are differentially expressed than would be expected by chance (expressed as probability value, z-score, or odds ratio 41 ).
Important drawbacks of GO-based procedures are: (1) results are restricted to the genes for which information is available in the databases (sometimes only half of the genes that are represented in a microarray); and (2) abundantly studied genes may be associated with a variety of GO terms, including terms unlikely to be relevant to the studied process. As an example, we compared the GO terms of human, mouse, and rat CCL2 using the "AmiGO" tool (http://www.godatabase.org/). The data in Table I (please see http://atvb.ahajournals.org) exemplify the bias that can be introduced when categorizing genes based on GO terms. Whereas the GO database indicates that human CCL2 is involved in 11 biological functions (including "humoral immune response"), the function of mouse CCL2 is not described at all and the function of rat CCL2 is limited to "inflammatory response." Even more striking is the discrepancy in cellular components of CCL2 according to the GO database (extracellular space for human and murine CCL2, but cytoplasmic for rat CCL2) and the low degree of similarity in molecular function of CCL2 among the various species. This comparison illustrates that GO-based analysis procedures need to be carried out and interpreted with caution.
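The enrichment calculation mentioned above, whether more genes carrying a given GO term are differentially expressed than expected by chance, can be sketched with a one-sided hypergeometric test. This is only one of the statistical methods that can be used for this purpose, the counts in the example are invented, and the function name enrichment_p_value is illustrative. Note that the background is restricted to annotated genes, which is exactly where the first drawback listed above enters.

```python
from math import comb


def enrichment_p_value(n_array, n_term, n_selected, n_overlap):
    """One-sided hypergeometric probability of seeing at least n_overlap genes.

    n_array: annotated genes represented on the array (the background)
    n_term: background genes annotated with the GO term of interest
    n_selected: differentially expressed genes that have an annotation
    n_overlap: differentially expressed genes annotated with the term
    """
    total = comb(n_array, n_selected)
    upper = min(n_term, n_selected)
    # Sum the probabilities of every overlap at least as large as observed
    tail = sum(comb(n_term, k) * comb(n_array - n_term, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 200 of 10000 annotated genes carry the term,
    # and 8 of 100 differentially expressed genes carry it (2 expected by chance)
    expected = 100 * 200 / 10000
    p = enrichment_p_value(n_array=10000, n_term=200, n_selected=100, n_overlap=8)
    print("expected overlap:", expected)
    print("enrichment p-value:", p)
```

When many GO terms are tested, these probability values are themselves subject to the multiple testing problem and would need adjustment before interpretation.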


Moreover, it should be obvious that GO analysis can only be meaningful when performed on a statistically valid data set. For instance, the results of one particular study describing mouse strain-specific differences in vascular wall gene expression 9 need to be interpreted with caution because GO analysis was performed on genes that were determined to be differentially expressed using a false discovery rate of nearly 50%.


As statistical and bioinformatical analysis techniques improve, detection of smaller changes in expression is becoming feasible. However, it is still unclear how statistical and bioinformatical significance relates to biological relevance. Subtle but consistent changes in expression of a group of genes with a related function may result in significant changes at the pathway level. However, experimental validation is especially important for small changes in expression levels before concluding that these changes contribute to biologically relevant differences. Taking this into account, validation of "in silico" pathways consisting of genes with slight differences in expression level is required for a valid interpretation of the biological significance of biostatistical results. In this respect, one should be aware that differences in RNA transcript levels do not necessarily reflect differences at the protein or functional level.


Moreover, clustering methods and literature network analysis may be biased because they are based on coregulation and copublication, respectively; a biological interaction can only be demonstrated using "wet laboratory" experiments (eg, by coimmunoprecipitation to prove physical interactions, or by reporter assays for transcriptional activity). However, as summarized in Table 1, recent articles that apply state-of-the-art statistical analysis to identify subtle differences in expression level do not validate these results on the same or independent samples. Neither do they validate results at the RNA level using independent methods, nor at the protein or functional level. Thus, they have failed to show the biological relevance of the observed differences at the RNA level.


Availability of Data Sets


Finally, our analysis reveals that data from only 3 expression studies are available in a public repository for microarray data (GEO data sets GSE1560, 9 GSE420, 20 and GSE2143 21 ). Even more striking, GSE1560 contains the expression data of only two-thirds of the experimental groups (ie, for all C3H/HeJ and C57BL/6 groups, but not for the apolipoprotein E -/-tm1Unc groups). Moreover, GSE2143 does not include a clear description of the samples, which makes the expression data impossible to interpret. As is illustrated in the section entitled "Integration of Genome-Wide Expression Data Sets," this lack of public data compromises our ability to gain insight into the molecular basis of atherosclerosis.


Together, based on our evaluation of gene expression studies in atherosclerosis, we consider 5 critical issues important in determining the quality of gene expression experiments using whole vessel samples: (1) transparent and detailed description of analyzed samples; (2) use of individual samples representative of the population and its intrinsic variance; (3) comparison statistics correcting for multiple testing error; (4) experimental validation of the biological significance of differentially expressed genes; and (5) public availability of data sets.


Gene Expression Profiling of Murine Atherosclerosis


Genetically engineered C57Bl/6J mice lacking functional apolipoprotein E (apoE -/- ) or low-density lipoprotein receptor (LDLR -/- ) are widely used models for atherosclerosis, and as such they have also been used for genome-wide expression studies (in 6 and 2 studies, respectively; Table 2).


A limiting factor for gene expression studies in mice is the small amount of RNA that can be extracted from murine vessels. Therefore, 3 to 30 samples per experimental condition are pooled. RNA is extracted from entire aortas or aortic arches (thus including diseased and nondiseased areas) rather than from individual plaques, or T7 polymerase-based linear amplification protocols are used before microarray analysis ( Table 2 ).


Array studies analyzed the gene expression profiles of arterial wall samples of mice of different ages or those fed various diets, 9,11-13 after cytomegalovirus infection, 14 or after maternal hypercholesterolemia 15 ( Table 2 ). Remarkably, the primary objective of genome-wide studies in mice has been to identify individual differentially expressed genes whose functions are well-described, as opposed to unraveling novel genes/pathways. Genes that were repeatedly reported to be differentially expressed were chemokines, 11-15 chemokine receptors, 11,13,15 and cathepsins. 11,15 Based on information obtained from GO and other databases, differentially expressed genes represented biological themes such as inflammation, 12,13 matrix degradation, 12,13 and ossification 13 (which is in agreement with our current knowledge that these processes have an important role in the development of this disease), but also less expected pathways such as carbohydrate metabolism. 13 Only one study used cDNA representational difference analysis (RDA) 24 to analyze the expression in the ascending aorta and arch in apoE -/- and LDLR -/- mice. This yielded mostly ESTs and uncharacterized genes. This suggests a major contribution of genes that were novel at the time when the arrays were performed.


Gene Expression Profiling of Human Atherosclerosis


As shown in Table 3, most expression profiling studies of human atherosclerosis use experimental groups of 3 to 5 human vascular specimens from surgery or autopsy. Carotid and coronary arteries are most often used. In 8 of 15 reported studies ( Table 3 ), profiles were made from individual samples. In the other 7 studies, RNA from different patients was pooled, thus abandoning informative data about interpersonal variability, and, consequently, about the relationship between diversity in patient data (eg, genetics, environmental factors) and gene expression levels. In 5 studies, 10,18,23,28,31 samples from different sites were combined. A drawback of combining samples from different sites is that differences in shear stress, embryonic origin, or elastic versus muscular phenotype between arteries may have an effect on expression profiles and thus obscure the data. 7 Therefore, the use of samples from a single site may reduce the variability in gene expression and fewer samples may be needed for the same statistical power.


Four studies used "selected arrays" to evaluate vascular expression levels of a few hundred genes, 10,16,17,27 selected, for example, because of their involvement in apoptosis. 10,16,27 These arrays do not provide a genome-wide view but give the expression profile of the selected subset of genes. Therefore, they are suited to identify individual genes differentially expressed in atherosclerosis (eg, death-associated protein kinase, 10 Egr-1 10,16,27 and Egr-1-inducible genes 10,16,27 ), but not to provide an unbiased expression profile.


Six articles 18-22,28 used genome-wide arrays to screen the expression of genes in human atherosclerosis (Table 3). Because these studies assay tens of thousands of genes simultaneously, they provide an unbiased profile of expressed genes, which may lead to insights into previously unknown molecular interactions. However, most of these studies aimed to identify the expression level of individual genes, and not pathways. 18-20 In these studies, GO analysis suggested that the differentially expressed genes were involved in inflammation, 19,21,22 cell turnover, 19,21,22 matrix degradation, 18,19 or lipid metabolism, 18,19 coded for matrix proteins, 20 or reflected SMC proliferation and dedifferentiation. 21 Thus, to date, gene expression studies have merely validated the differential expression of genes and pathways known to be involved in atherosclerosis, 42-45 and have yet to fully exploit the power and possibilities of identifying novel players (and eventually novel pathways) underlying atherosclerosis.


To compare gene expression patterns from different parts of the atherosclerotic vessel wall, laser capture microdissection 10,28 and macrodissection 26,27 have been used to isolate relatively pure cell populations from whole mount specimens (Table 3). As mentioned, these techniques help to overcome one of the shortcomings of the whole mount approach because they correct for the cellular composition of a sample. Using a "selected array" of genes involved in apoptosis, Martinet et al 10 compared the expression of apoptosis-related genes between microdissected medial SMCs from atherosclerotic plaques and nondiseased mammary arteries, which revealed no differences between these SMC subpopulations. Yet the expression profile of genes involved in apoptosis in medial SMCs was shown to differ from the profile of the SMCs in the fibrous cap. 10,16,27 A cluster of stress-responding genes was shown to be upregulated in the fibrous cap. Adams et al 26 showed more consistent differences in gene expression between the intimal and medial SMCs than between the cap and the media, because the adjacent nonatherosclerotic intima appears much more media-like than does the cap.


Expression profiling of vascular intimal SMC-rich and macrophage-rich shoulder regions 28 revealed differences in the expression level of genes involved in a variety of biological processes, such as cell signaling, structure, and metabolism. Several, but not all, of these differences were also present in PMA-stimulated versus nonstimulated THP-1 macrophages in vitro. Most remarkable was the 16-fold increased expression level of 3-hydroxy-3-methylglutaryl (HMG) coenzyme A (CoA) reductase in macrophage-rich areas compared with vascular intimal SMCs. This suggests that the plaque stabilizing effect of statins may result from the direct effect of these drugs on HMG CoA reductase expression in plaque macrophages compared with intimal SMCs.


Apart from arrays, other techniques have been used to analyze human atherosclerosis. One study used RDA to examine differential gene expression between normal and atherosclerotic human vessels. 25,31 In these studies, a high proportion (33% to 55%) of differentially expressed sequences represented genes that have not yet been annotated or functionally characterized. The article by Faber et al 23 is the only one to our knowledge that reports the gene expression profile of atherosclerotic plaques with a thrombus ( Table 3 ). In this study, SSH is used to make an inventory of genes that are differentially expressed in whole-mount stable versus thrombus-containing plaques. Several genes that had not previously been linked to atherosclerosis (including perilipin 23 and a previously unidentified gene named vasculin 46 ) were identified as being differentially expressed.


In conclusion, over the past 6 years gene expression profiling of human atherosclerosis has led to the identification of individual genes involved in atherosclerosis and has, to a lesser extent, allowed the relative significance of pathways involved in atherosclerosis to be evaluated. Pathways associated with the differentially expressed genes were inflammation, (smooth muscle) cell proliferation, cell death, matrix formation and degradation, and lipid metabolism. Although genome-wide analysis offers the opportunity to obtain an unbiased picture of the genes expressed during atherosclerosis, the majority of studies have focused on known genes and have not exploited the full potential of genome-wide screening to identify novel players in atherosclerosis. The high proportion of uncharacterized ESTs at the time when the microarray, SSH, and RDA studies were analyzed suggests that the molecular basis of atherosclerosis is far more complicated than our current knowledge of genes and pathways. Therefore, complete insight into the molecular mechanisms of atherosclerosis may only be reached if gene expression data are combined with research that focuses on the characterization of currently unknown genes.


Integration of Genome-Wide Expression Data Sets


Based on the results of published gene profiling studies in mice and humans, we conclude that studies of the expression of individual genes give only molecular snapshots of the atherosclerotic process. A way to complete the picture of the complex biological network involved in atherosclerosis is to combine and integrate data from different expression profiling studies.


To evaluate the possibility of integrating published information from various genome-wide expression studies, we analyzed the expression data of 7 chemokines with a cysteine-cysteine (CC) motif (CCLs), which were recently demonstrated by our group 12 and by others 47-49 to be differentially expressed during murine atherosclerosis. CCLs are members of a family of small secreted proteins (8 to 16 kDa) that mediate the activation of monocytes and T-lymphocytes and their migration into the tissue. An important role for CC chemokines in the pathogenesis of atherosclerosis has been demonstrated in murine atherosclerosis models, for example, using anti-CCL2 (anti-MCP-1) therapy, 47,48 antagonists of cysteine-cysteine chemokine receptor 5 (CCR5, RANTES receptor), 49 or mice lacking functional CCR2. 50,51


A major bottleneck in analyzing these studies was the lack of publicly available data sets. Expression data from only 3 of 20 gene expression studies 9,20,21 have been deposited in a public repository for microarray data (GEO data sets GSE1560, GSE420, and GSE2143). As mentioned in the section entitled "Key Features of Gene Expression Studies in Atherosclerosis," GSE420 is incomplete and GSE2143 does not include a clear description of the samples. Other studies include incomplete lists of differentially expressed genes, with or without quantitative details. Taken together, this makes an adequate comparison impossible.


Comparison of our own data set (7 CCLs) 12 with 6 other murine atherosclerosis studies 11,15,24 yielded little information about the differential expression of CCLs (Table I). CCL2 levels were higher after murine cytomegalovirus infection of the aorta of ApoE-/- mice, coinciding with an increase in atherosclerotic plaque area, 14 and were higher in ApoE-/- mice fed a high-fat diet than in those on a chow diet, as well as in comparison with C57Bl/6 and C3H/HeJ mice. 9,13 CCL4 expression was higher in the atherosclerotic aorta of ApoE-/- mice than in the normal aorta of C57Bl/6 mice. 11 Comparison of gene expression profiles from nonatherosclerotic aortas of C57Bl/6 mice before and after 40 weeks of a high-fat diet (lipid deposits) 9 revealed that CCL2, CCL3, CCL4, and CCL8 are only marginally expressed in C57Bl/6 aortas even after high-fat feeding, and that CCL12 was significantly downregulated after lipid accumulation. Although it is difficult to draw a conclusion from such a limited data set, our analysis suggests that the available gene expression data on murine atherosclerosis are in line with each other.


Only 3 of 13 studies on gene expression profiles of human vascular specimens reported expression levels of selected CCLs (Table I). In total, 9 hits were found, which correspond to 10% of the query of 7 genes in 13 data sets (91 potential hits). In contrast to the murine data set, which indicates an upregulation of CCL2 throughout plaque progression, CCL2 was downregulated in coronary arteries from patients with unstable angina versus stable angina 17 and was not differentially expressed according to the degree of aortic stiffness. 20 However, this comparison is hampered by the fact that plaque instability and aortic stiffness in humans do not necessarily reflect the same molecular processes as plaque progression in mice. CCL3 was mentioned in the list of genes associated with disease burden, 22 but neither quantitative nor qualitative details were given. CCL4 and CCL5 were expressed in 2 of 4 stiff and 1 of 4 distensible aortic biopsies, 20 suggesting higher CCL4 and CCL5 expression levels in more advanced stages of atherosclerosis. If so, the expression pattern of CCL5 in humans may differ from its expression in mice. Unfortunately, the CCL5 data reported by Seo et al 22 cannot be used to give a conclusive answer, because that study does not specify whether CCL5 is upregulated or downregulated according to disease burden. CCL8 (MCP-2) levels were lower in patients with unstable angina than in those with stable angina 17 and were higher in stiff than in distensible aortas. 20 For CCL12 and CCL21, we did not find any evidence for differential expression in human atherosclerosis. Therefore, our analysis indicates that the available information is not sufficient to draw a definitive conclusion as to whether data from mouse studies can be translated to the human situation.
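The coverage figure quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `query_coverage` and its parameters are hypothetical, and the only inputs are the counts stated in the text (7 CCLs, 13 human data sets, 9 reported hits).

```python
# Seven CC chemokines queried in the comparison, as named in the text.
CCL_QUERY = ["CCL2", "CCL3", "CCL4", "CCL5", "CCL8", "CCL12", "CCL21"]


def query_coverage(n_genes, n_datasets, n_hits):
    """Return (potential hits, fraction of gene-by-data-set cells that were reported).

    Hypothetical helper: a "hit" is one gene for which one data set reports
    any expression information.
    """
    potential = n_genes * n_datasets
    return potential, n_hits / potential


# Counts taken from the text: 13 human data sets, 9 hits in total.
potential_hits, coverage = query_coverage(len(CCL_QUERY), n_datasets=13, n_hits=9)
print(f"{potential_hits} potential hits, coverage {coverage:.1%}")
```

Nine of 91 gene-by-data-set combinations is slightly under 10%, which is the rounded figure given in the text.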


Our analysis revealed that the incompleteness of published data sets is the major hindrance to the integration of information. Unexpectedly, our comparison showed that in a few studies differences in gene expression levels were insignificant for genes that are already considered to play an important role in atherosclerosis in terms of function or protein expression (eg, CCL2 and CCL5; Table II, available online at http://atvb.ahajournals.org). A possible explanation is that RNA levels are a poor representation of the protein levels and functional activity of these chemokines. Additional bottlenecks are the different ways in which the degree of atherosclerosis can be classified and the use of different models, platforms, statistical approaches, and reference tissues. As indicated in Tables 2 and 3, there is no consensus about a reference tissue in atherosclerosis, and the reference varies with the research question addressed. One way to deal with this is to use a general common reference. 52 To facilitate the comparison of data sets, initiatives have been taken to investigate the differences in array data gathered on different platforms 53 and to agree on a reference standard. 52 Apart from this, it is important to note that the dissimilarities identified are the results of comparisons made at the level of individual genes. Because subtle changes in single gene expression levels may act in concert to result in significant changes at the pathway level, it may be worthwhile to move from the differential expression of individual genes to the identification of important biological processes underlying atherosclerosis. Recent insights 54,55 have demonstrated that single gene-based analyses (used in all atherosclerosis studies so far) may miss important changes at the pathway level. As an example, pathway analysis revealed differences in gene expression at the pathway level in type 2 diabetes, whereas no significant differences could be detected at the level of individual genes. 56
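The idea that subtle, coordinated changes can be significant at the pathway level although no single gene is can be illustrated with a toy calculation. The sketch below is not the gene set enrichment method of references 54 and 56; it swaps in Stouffer's method for combining z-scores, which assumes independent genes (an assumption that real pathway members generally violate). The 16-gene pathway and its z-scores are invented for illustration.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def stouffer_z(z_scores):
    """Combine per-gene z-scores into one set-level z-score.

    Stouffer's method; assumes the genes are independent, which is a
    simplification for genes that belong to the same pathway.
    """
    return sum(z_scores) / math.sqrt(len(z_scores))


# Hypothetical pathway: 16 genes, each shifted by a modest z = 1.0.
pathway_z = [1.0] * 16
alpha = 0.05

# Single-gene view: which genes pass the threshold on their own?
significant_genes = [z for z in pathway_z if two_sided_p(z) < alpha]

# Pathway view: the same small shifts, combined across the gene set.
set_z = stouffer_z(pathway_z)
set_p = two_sided_p(set_z)

print(f"genes significant individually: {len(significant_genes)} of {len(pathway_z)}")
print(f"pathway-level z = {set_z:.2f}, p = {set_p:.2g}")
```

In this constructed case no gene reaches significance individually, whereas the combined score does. Permutation-based approaches such as the one in reference 54 avoid the independence assumption made here.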


Conclusions


Although gene array technology has brought the possibility of examining changes in the entire transcriptome of vessel wall specimens, it remains a challenge to use these data to characterize atherosclerosis at the molecular level. A critical appraisal of the methodology and analysis of published genome-wide gene expression studies on murine and human atherosclerotic vascular samples reveals significant improvements in the design and analysis of recent array studies.


Acknowledgments


A.P.J.J.B. is a postdoctoral fellow of the NWO Innovational Research VENI program (grant 916.46.083). E.L. is a postdoctoral fellow of the Dr E. Dekker program of the Dutch Heart Foundation (2000T41). M.J.A.P.D. is involved in the NWO-genomics grant 050-10-014. The Department of Pathology, University of Maastricht; the Department of Medical Biochemistry, University of Amsterdam, Amsterdam; and the Division of Biopharmaceutics, Leiden/Amsterdam Center for Drug Research, Leiden, are all involved in the European Vascular Genomics Network (grant LSHM-CT-2003-503254). The authors thank Kitty Schapira for proofreading the manuscript.

References

Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science. 1999; 286: 531-537.

Simon R, Radmacher MD, Dobbin K, McShane LM. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst. 2003; 95: 14-18.

Churchill GA. Fundamentals of experimental design for cDNA microarrays. Nat Genet. 2002; 32 (Suppl): 490-495.

Csiszar A, Ungvari Z, Koller A, Edwards JG, Kaley G. Aging-induced proinflammatory shift in cytokine expression profile in coronary arteries. FASEB J. 2003; 17: 1183-1185.

Csiszar A, Ungvari Z, Koller A, Edwards JG, Kaley G. Proinflammatory phenotype of coronary arteries promotes endothelial apoptosis in aging. Physiol Genomics. 2004; 17: 21-30.

Herrera VM, Didishvili T, Lopez LV, Ruiz-Opazo N. Differential regulation of functional gene clusters in overt coronary artery disease in a transgenic atherosclerosis-hypertensive rat model. Mol Med. 2002; 8: 367-375.

Qin M, Zeng Z, Zheng J, Shah PK, Schwartz SM, Adams LD, Sharifi BG. Suppression subtractive hybridization identifies distinctive expression markers for coronary and internal mammary arteries. Arterioscler Thromb Vasc Biol. 2003; 23: 425-433.

Wei C, Li J, Bumgarner RE. Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics. 2004; 5: 87.

Tabibiazar R, Wagner RA, Spin JM, Ashley EA, Narasimhan B, Rubin EM, Efron B, Tsao PS, Tibshirani R, Quertermous T. Mouse strain-specific differences in vascular wall gene expression and their relationship to vascular disease. Arterioscler Thromb Vasc Biol. 2005; 25: 302-308.

Martinet W, Schrijvers DM, De Meyer GR, Thielemans J, Knaapen MW, Herman AG, Kockx MM. Gene expression profiling of apoptosis-related genes in human atherosclerosis: upregulation of death-associated protein kinase. Arterioscler Thromb Vasc Biol. 2002; 22: 2023-2029.

Wuttge DM, Sirsjo A, Eriksson P, Stemme S. Gene expression in atherosclerotic lesion of ApoE deficient mice. Mol Med. 2001; 7: 383-392.

Lutgens E, Faber B, Schapira K, Evelo CT, van Haaften R, Heeneman S, Cleutjens KB, Bijnens AP, Beckers L, Porter JG, Mackay CR, Rennert P, Bailly V, Jarpe M, Dolinski B, Koteliansky V, de Fougerolles T, Daemen MJ. Gene profiling in atherosclerosis reveals a key role for small inducible cytokines: validation using a novel monocyte chemoattractant protein monoclonal antibody. Circulation. 2005; 111: 3443-3452.

Tabibiazar R, Wagner RA, Ashley EA, King JY, Ferrara R, Spin JM, Sanan DA, Narasimhan B, Tibshirani R, Tsao PS, Efron B, Quertermous T. Signature patterns of gene expression in mouse atherosclerosis and their correlation to human coronary disease. Physiol Genomics. 2005; 22: 213-226.

Burnett MS, Durrani S, Stabile E, Saji M, Lee CW, Kinnaird TD, Hoffman EP, Epstein SE. Murine cytomegalovirus infection increases aortic expression of proatherosclerotic genes. Circulation. 2004; 109: 893-897.

Napoli C, de Nigris F, Welch JS, Calara FB, Stuart RO, Glass CK, Palinski W. Maternal hypercholesterolemia during pregnancy promotes early atherogenesis in LDL receptor-deficient mice and alters aortic gene expression determined by microarray. Circulation. 2002; 105: 1360-1367.

Woodside KJ, Hernandez A, Smith FW, Xue XY, Hu M, Daller JA, Hunter GC. Differential gene expression in primary and recurrent carotid stenosis. Biochem Biophys Res Commun. 2003; 302: 509-514.

Randi AM, Biguzzi E, Falciani F, Merlini P, Blakemore S, Bramucci E, Lucreziotti S, Lennon M, Faioni EM, Ardissino D, Mannucci PM. Identification of differentially expressed genes in coronary atherosclerotic plaques from patients with stable or unstable angina by cDNA array analysis. J Thromb Haemost. 2003; 1: 829-835.

Hiltunen MO, Tuomisto TT, Niemi M, Brasen JH, Rissanen TT, Toronen P, Vajanto I, Yla-Herttuala S. Changes in gene expression in atherosclerotic plaques analyzed using DNA array. Atherosclerosis. 2002; 165: 23-32.

Archacki SR, Angheloiu G, Tian XL, Tan FL, DiPaola N, Shen GQ, Moravec C, Ellis S, Topol EJ, Wang Q. Identification of new genes differentially expressed in coronary artery disease by expression profiling. Physiol Genomics. 2003; 15: 65-74.

Durier S, Fassot C, Laurent S, Boutouyrie P, Couetil JP, Fine E, Lacolley P, Dzau VJ, Pratt RE. Physiological genomics of human arteries: quantitative relationship between gene expression and arterial stiffness. Circulation. 2003; 108: 1845-1851.

King JY, Ferrara R, Tabibiazar R, Spin JM, Chen MM, Kuchinsky A, Vailaya A, Kincaid R, Tsalenko A, Deng DX, Connolly A, Zhang P, Yang E, Watt C, Yakhini Z, Ben-Dor A, Adler A, Bruhn L, Tsao P, Quertermous T, Ashley EA. Pathway analysis of coronary atherosclerosis. Physiol Genomics. 2005; 23: 103-118.

Seo D, Wang T, Dressman H, Herderick EE, Iversen ES, Dong C, Vata K, Milano CA, Rigat F, Pittman J, Nevins JR, West M, Goldschmidt-Clermont PJ. Gene expression phenotypes of atherosclerosis. Arterioscler Thromb Vasc Biol. 2004; 24: 1922-1927.

Faber BC, Cleutjens KB, Niessen RL, Aarts PL, Boon W, Greenberg AS, Kitslaar PJ, Tordoir JH, Daemen MJ. Identification of genes potentially involved in rupture of human atherosclerotic plaques. Circ Res. 2001; 89: 547-554.

Borang S, Andersson T, Thelin A, Odeberg J, Lundeberg J. Vascular gene expression in atherosclerotic plaque-prone regions analyzed by representational difference analysis. Pathobiology. 2004; 71: 107-114.

Tyson KL, Weissberg PL, Shanahan CM. Heterogeneity of gene expression in human atheroma unmasked using cDNA representational difference analysis. Physiol Genomics. 2002; 9: 121-130.

Adams LD, Geary RL, Li J, Rossini A, Schwartz SM. Expression profiling identifies smooth muscle cell diversity within human intima and plaque fibrous cap: loss of RGS5 distinguishes the cap. Arterioscler Thromb Vasc Biol. 2006; 26: 319-325.

McCaffrey TA, Fu C, Du B, Eksinar S, Kent KC, Bush H, Jr, Kreiger K, Rosengart T, Cybulsky MI, Silverman ES, Collins T. High-level expression of Egr-1 and Egr-1-inducible genes in mouse and human atherosclerosis. J Clin Invest. 2000; 105: 653-662.

Tuomisto TT, Korkeela A, Rutanen J, Viita H, Brasen JH, Riekkinen MS, Rissanen TT, Karkola K, Kiraly Z, Kolble K, Yla-Herttuala S. Gene expression in macrophage-rich inflammatory cell infiltrates in human atherosclerotic lesions as studied by laser microdissection and DNA array: overexpression of HMG-CoA reductase, colony stimulating factor receptors, CD11A/CD18 integrins, and interleukin (IL) receptors. Arterioscler Thromb Vasc Biol. 2003; 23: 2235-2240.

Polacek DC, Passerini AG, Shi C, Francesco NM, Manduchi E, Grant GR, Powell S, Bischof H, Winkler H, Stoeckert CJ Jr, Davies PF. Fidelity and enhanced sensitivity of differential transcription profiles following linear amplification of nanogram amounts of endothelial mRNA. Physiol Genomics. 2003; 13: 147-156.

Schneider J, Buness A, Huber W, Volz J, Kioschis P, Hafner M, Poustka A, Sultmann H. Systematic analysis of T7 RNA polymerase based in vitro linear RNA amplification for use in microarray experiments. BMC Genomics. 2004; 5: 29.

Zhang QJ, Goddard M, Shanahan C, Shapiro L, Bennett M. Differential gene expression in vascular smooth muscle cells in primary atherosclerosis and in stent stenosis in humans. Arterioscler Thromb Vasc Biol. 2002; 22: 2030-2036.

Holloway AJ, van Laar RK, Tothill RW, Bowtell DD. Options available-from start to finish-for obtaining data from DNA microarrays II. Nat Genet. 2002; 32 (Suppl): 481-489.

Chudin E, Walker R, Kosaka A, Wu SX, Rabert D, Chang TK, Kreder DE. Assessment of the relationship between signal intensities and transcript concentration for Affymetrix GeneChip arrays. Genome Biol. 2002; 3: RESEARCH0005.

Carter MG, Sharov AA, VanBuren V, Dudekula DB, Carmack CE, Nelson C, Ko MS. Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray. Genome Biol. 2005; 6: R61.

Andersson T, Unneberg P, Nilsson P, Odeberg J, Quackenbush J, Lundeberg J. Monitoring of representational difference analysis subtraction procedures by global microarrays. Biotechniques. 2002; 32: 1348-1358.

Draghici S. Statistical intelligence: effective analysis of high-density microarray data. Drug Discov Today. 2002; 7: S55-63.

Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001; 98: 5116-5121.

Quackenbush J. Computational analysis of microarray data. Nat Rev Genet. 2001; 2: 418-427.

Glinsky GV, Glinskii AB, Stephenson AJ, Hoffman RM, Gerald WL. Gene expression profiling predicts clinical outcome of prostate cancer. J Clin Invest. 2004; 113: 913-923.

Glinsky GV, Higashiyama T, Glinskii AB. Classification of human breast cancer using gene expression profiling as a component of the survival predictor algorithm. Clin Cancer Res. 2004; 10: 2272-2283.

Curtis RK, Oresic M, Vidal-Puig A. Pathways to the analysis of microarray data. Trends Biotechnol. 2005; 23: 429-435.

Kockx MM, Herman AG. Apoptosis in atherosclerosis: beneficial or detrimental? Cardiovasc Res. 2000; 45: 736-746.

Libby P, Ridker PM, Maseri A. Inflammation and atherosclerosis. Circulation. 2002; 105: 1135-1143.

Schwartz SM, Virmani R, Rosenfeld ME. The good smooth muscle cells in atherosclerosis. Curr Atheroscler Rep. 2000; 2: 422-429.

Galis ZS, Khatri JJ. Matrix metalloproteinases in vascular remodeling and atherogenesis: the good, the bad, and the ugly. Circ Res. 2002; 90: 251-262.

Bijnens AP, Gils A, Jutten B, Faber BC, Heeneman S, Kitslaar PJ, Tordoir JH, de Vries CJ, Kroon AA, Daemen MJ, Cleutjens KB. Vasculin, a novel vascular protein differentially expressed in human atherogenesis. Blood. 2003; 102: 2803-2810.

Ni W, Egashira K, Kitamoto S, Kataoka C, Koyanagi M, Inoue S, Imaizumi K, Akiyama C, Nishida KI, Takeshita A. New anti-monocyte chemoattractant protein-1 gene therapy attenuates atherosclerosis in apolipoprotein E-knockout mice. Circulation. 2001; 103: 2096-2101.

Inoue S, Egashira K, Ni W, Kitamoto S, Usui M, Otani K, Ishibashi M, Hiasa K, Nishida K, Takeshita A. Anti-monocyte chemoattractant protein-1 gene therapy limits progression and destabilization of established atherosclerosis in apolipoprotein E-knockout mice. Circulation. 2002; 106: 2700-2706.

Veillard NR, Kwak B, Pelli G, Mulhaupt F, James RW, Proudfoot AE, Mach F. Antagonism of RANTES receptors reduces atherosclerotic plaque formation in mice. Circ Res. 2004; 94: 253-261.

Boring L, Gosling J, Cleary M, Charo IF. Decreased lesion formation in CCR2-/- mice reveals a role for chemokines in the initiation of atherosclerosis. Nature. 1998; 394: 894-897.

Guo J, Van Eck M, Twisk J, Maeda N, Benson GM, Groot PH, Van Berkel TJ. Transplantation of monocyte CC-chemokine receptor 2-deficient bone marrow into ApoE3-Leiden mice inhibits atherogenesis. Arterioscler Thromb Vasc Biol. 2003; 23: 447-453.

Baker SC, Bauer SR, Beyer RP, Brenton JD, Bromley B, Burrill J, Causton H, Conley MP, Elespuru R, Fero M, Foy C, Fuscoe J, Gao X, Gerhold DL, Gilles P, Goodsaid F, Guo X, Hackett J, Hockett RD, Ikonomi P, Irizarry RA, Kawasaki ES, Kaysser-Kranich T, Kerr K, Kiser G, Koch WH, Lee KY, Liu C, Liu ZL, Lucas A, Manohar CF, Miyada G, Modrusan Z, Parkes H, Puri RK, Reid L, Ryder TB, Salit M, Samaha RR, Scherf U, Sendera TJ, Setterquist RA, Shi L, Shippy R, Soriano JV, Wagar EA, Warrington JA, Williams M, Wilmer F, Wilson M, Wolber PK, Wu X, Zadro R. The External RNA Controls Consortium: a progress report. Nat Methods. 2005; 2: 731-734.

Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martinez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, Yu W. Multiple-laboratory comparison of microarray platforms. Nat Methods. 2005; 2: 345-350.

Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005; 102: 15545-15550.

Segal E, Friedman N, Kaminski N, Regev A, Koller D. From signatures to models: understanding cancer using microarrays. Nat Genet. 2005; 37 (Suppl): S38-S45.

Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E, Houstis N, Daly MJ, Patterson N, Mesirov JP, Golub TR, Tamayo P, Spiegelman B, Lander ES, Hirschhorn JN, Altshuler D, Groop LC. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet. 2003; 34: 267-273.


Author affiliations: A.P.J.J. Bijnens; E. Lutgens; T. Ayoubi; J. Kuiper; A.J. Horrevoets; M.J.A.P. Daemen. From the Department of Pathology (A.P.J.J.B., E.L., M.J.A.P.D.), Cardiovascular Research Institute Maastricht (CARIM), University of Maastricht; the Department of Population Genetics (T.A.), Cardiovascular Research I
