0%

RNA-seq分析(4)之KEGG富集分析

接上一篇DESeq2包差异分析,使用clusterProfiler包进行KEGG富集分析。

1. 提取基因集ID

得到差异分析中基因上下调的结果之后,就可以进行KEGG分析了,KEGG分析只需要一串基因id就可以了,因此首先先提取上下调基因集,这里我们使用entrez id进行基因富集。

1
2
3
4
5
6
7
8
9
10
library(clusterProfiler)
library(ggplot2)
library(ggrepel)
library("AnnotationDbi")
library("org.Dm.eg.db")

gene_up <- DEG[DEG$change == "UP","entrez"]
gene_down <- DEG[DEG$change == "DOWN","entrez"]
gene_diff <- c(gene_up,gene_down)
gene_all <- as.character(DEG[,"entrez"])

2. KEGG富集

接下来进行KEGG富集,enrichKEGG函数的keytype参数有4个选项:”kegg”、’ncbi-geneid’、’ncib-proteinid’、’uniprot’。

1
2
3
4
5
6
7
8
9
10
11
12
kk.up <- enrichKEGG(gene          =  gene_up,
organism = "dme",
keyType = "ncbi-geneid",
universe = gene_all,
pvalueCutoff = 0.8,
qvalueCutoff = 0.8)
kk.down <- enrichKEGG(gene = gene_down,
organism = "dme",
keyType = "ncbi-geneid",
universe = gene_all,
pvalueCutoff = 0.8,
qvalueCutoff = 0.8)

提取KEGG分析的结果,将上调和下调的基因富集结果整合在一起。

1
2
3
4
5
6
7
8
9
10
11
12
kegg_down_dt <- as.data.frame(kk.down)
kegg_up_dt <- as.data.frame(kk.up)
down_kegg <- kegg_down_dt[kegg_down_dt$pvalue < 0.05,]
down_kegg$group = -1
up_kegg <- kegg_up_dt[kegg_up_dt$pvalue < 0.05,]
up_kegg$group = 1

dat <- rbind(up_kegg, down_kegg)
dat$pvalue <- -log10(dat$pvalue)
dat$pvalue <- dat$pvalue * dat$group

dat <- dat[order(dat$pvalue, decreasing = F),]

绘图。

1
2
3
4
5
6
7
8
ggplot(dat, aes(x = reorder(Description, order(pvalue, decreasing=F)), 
y = pvalue, fill = group)) +
geom_bar(stat = "identity") +
scale_fill_gradient(low = "blue", high = "red", guide = FALSE) +
scale_x_discrete(name = "Pathway names") +
scale_y_continuous(name = "log10P-value") +
coord_flip() + theme_bw() + theme(plot.title = element_text(hjust = 0.5)) +
ggtitle("Pathway Enrichment")