接上一篇DESeq2包差异分析,使用clusterProfiler包进行KEGG富集分析。
1. 提取基因集ID
得到差异分析中基因上下调的结果之后,就可以进行KEGG分析了,KEGG分析只需要一串基因id就可以了,因此首先先提取上下调基因集,这里我们使用entrez id进行基因富集。
1 2 3 4 5 6 7 8 9 10
| library(clusterProfiler) library(ggplot2) library(ggrepel) library("AnnotationDbi") library("org.Dm.eg.db")
gene_up <- DEG[DEG$change == "UP","entrez"] gene_down <- DEG[DEG$change == "DOWN","entrez"] gene_diff <- c(gene_up,gene_down) gene_all <- as.character(DEG[,"entrez"])
|
2. KEGG富集
接下来进行KEGG富集,enrichKEGG函数的keytype参数有4个选项:”kegg”、’ncbi-geneid’、’ncib-proteinid’、’uniprot’。
1 2 3 4 5 6 7 8 9 10 11 12
| kk.up <- enrichKEGG(gene = gene_up, organism = "dme", keyType = "ncbi-geneid", universe = gene_all, pvalueCutoff = 0.8, qvalueCutoff = 0.8) kk.down <- enrichKEGG(gene = gene_down, organism = "dme", keyType = "ncbi-geneid", universe = gene_all, pvalueCutoff = 0.8, qvalueCutoff = 0.8)
|
提取KEGG分析的结果,将上调和下调的基因富集结果整合在一起。
1 2 3 4 5 6 7 8 9 10 11 12
| kegg_down_dt <- as.data.frame(kk.down) kegg_up_dt <- as.data.frame(kk.up) down_kegg <- kegg_down_dt[kegg_down_dt$pvalue < 0.05,] down_kegg$group = -1 up_kegg <- kegg_up_dt[kegg_up_dt$pvalue < 0.05,] up_kegg$group = 1
dat <- rbind(up_kegg, down_kegg) dat$pvalue <- -log10(dat$pvalue) dat$pvalue <- dat$pvalue * dat$group
dat <- dat[order(dat$pvalue, decreasing = F),]
|
绘图。
1 2 3 4 5 6 7 8
| ggplot(dat, aes(x = reorder(Description, order(pvalue, decreasing=F)), y = pvalue, fill = group)) + geom_bar(stat = "identity") + scale_fill_gradient(low = "blue", high = "red", guide = FALSE) + scale_x_discrete(name = "Pathway names") + scale_y_continuous(name = "log10P-value") + coord_flip() + theme_bw() + theme(plot.title = element_text(hjust = 0.5)) + ggtitle("Pathway Enrichment")
|