Coloring a PCA plot by cluster in R
I have some biological data with two different types of clusters (A and B). I clean the data as follows:
CAGE<-read.table("CAGE_expression_matrix.txt", header=T)
CAGE_data <- as.data.frame(CAGE)
#Remove clusters with 0 expression for all 6 samples
CAGE_filter <- CAGE[rowSums(abs(CAGE[,2:7]))>0,]
#Filter whole file to keep only clusters with at least 5 TPM in at least 3 files
CAGE_filter_more <- CAGE_filter[apply(CAGE_filter[,2:7] >= 5,1,sum) >= 3,]
CAGE_data <- as.data.frame(CAGE_filter_more)
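To make the two filters easier to check, here is a minimal sketch on a made-up table. The data frame `toy`, its column names, and its values are hypothetical and only mirror the layout implied above (one ID column followed by six sample columns). It also uses `rowSums()` on the logical matrix, which should give the same result as the `apply(..., 1, sum)` form.

```r
# Hypothetical toy table: one ID column followed by six sample columns
toy <- data.frame(
  cluster = c("c1", "c2", "c3", "c4"),
  A1 = c(0, 10, 6, 1),
  A2 = c(0, 12, 7, 0),
  A3 = c(0,  9, 5, 2),
  B1 = c(0,  8, 0, 0),
  B2 = c(0, 11, 1, 3),
  B3 = c(0,  7, 2, 0)
)

# Step 1: drop rows that are zero in all six samples
nonzero <- toy[rowSums(abs(toy[, 2:7])) > 0, ]

# Step 2: keep rows with at least 5 TPM in at least 3 samples
kept <- nonzero[rowSums(nonzero[, 2:7] >= 5) >= 3, ]

print(kept)
```

Only the rows that pass both filters remain in `kept`.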
The errors I get are shown below:
> qplot(PCA.CAGE$x[,1:3],PCA.CAGE$x[4:6,], xlab="Data 1", ylab="Data 2")
Error: Aesthetics must either be length one, or the same length as the dataProblems:PCA.CAGE$x[4:6, ]
> qplot(PC1, PC2, colour = CAGE_data, geom=c("point"), label=CAGE_data, data=as.data.frame(PCA.CAGE$x))
Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous
Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous
Error: Aesthetics must either be length one, or the same length as the dataProblems:CAGE_data, CAGE_data
> ggplot(data=PCA.CAGE, aes(x=PCA1, y=PCA2, colour=CAGE_filter_more, label=CAGE_filter_more)) + geom_point() + geom_text()
Error: ggplot2 doesn't know how to deal with data of class
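All three errors point at the same two requirements: `ggplot()` wants a data frame as its `data` argument, and every aesthetic such as `colour` or `label` must be a single value or a vector with one entry per row of that data frame. The question does not show how `PCA.CAGE` was created, so the following is only a sketch under assumptions: it uses simulated values in place of the real expression matrix, assumes the object comes from `prcomp()`, and assumes the goal is to plot the six samples as points colored by group A or B. The names `expr`, `scores`, `sample`, and `group` are made up for illustration.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the filtered expression columns:
# 50 clusters (rows) by 6 samples (columns A1..A3, B1..B3)
expr <- matrix(rexp(50 * 6, rate = 0.1), nrow = 50,
               dimnames = list(paste0("cluster", 1:50),
                               c("A1", "A2", "A3", "B1", "B2", "B3")))

# Transpose so that each sample becomes one row (one point in the plot)
pca <- prcomp(t(expr), scale. = TRUE)

# ggplot() needs a data frame, so convert the score matrix pca$x
scores <- as.data.frame(pca$x)
scores$sample <- rownames(scores)                    # one label per row
scores$group <- factor(rep(c("A", "B"), each = 3))   # one colour value per row

p <- ggplot(scores, aes(x = PC1, y = PC2, colour = group, label = sample)) +
  geom_point() +
  geom_text(vjust = -1)
print(p)
```

The key difference from the failing calls is that `colour` and `label` refer to columns of the plotted data frame rather than to a whole separate data frame such as `CAGE_data`.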
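Your question doesn't make sense (to me, at least). You appear to have two groups of 3 variables (group A and group B). When you run PCA on these 6 variables, you get 6 principal components, each of which is a (different) linear combination of all 6 variables. Clustering is based on the cases (rows). If you want to cluster the data based on the first two PCs (a common approach), then you need to do that explicitly. Here is an example using the built-in iris dataset: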
pca <- prcomp(iris[,1:4], scale.=TRUE)
clust <- kmeans(pca$x[,1:2], centers=3)$cluster
library(ggbiplot)
ggbiplot(pca, groups=factor(clust)) + xlim(-3,3)
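If ggbiplot is not installed, the same idea (cluster on the first two PCs, then color the points by cluster) can be sketched with base graphics only. This is an alternative rendering, not the ggbiplot output: it draws the scores without the variable arrows of a biplot. The `set.seed()` call and the `nstart` argument are additions here to make the k-means result more stable between runs.

```r
set.seed(42)  # kmeans() starts from random centers

pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Cluster the cases (rows) using only their PC1 and PC2 scores
clust <- kmeans(pca$x[, 1:2], centers = 3, nstart = 20)$cluster

# Scatter plot of PC1 vs PC2, one color per cluster
plot(pca$x[, 1:2], col = clust, pch = 19,
     main = "iris: first two PCs colored by k-means cluster")
legend("topright", legend = paste("cluster", 1:3), col = 1:3, pch = 19)
```

Because `clust` has one integer per row of `pca$x`, it can be passed directly as the `col` vector.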
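Comments:
What errors are you getting?
I have edited the question above to show you!
I have never used qplot, but it is clear that the error from your last call occurs because PCA.CAGE is not a data.frame.
I set it to a data frame at the start... Do you have any other suggestions for plotting a PCA in R?
You did not set PCA.CAGE to a data.frame at any point.
How did you get the PCA$x matrix? I don't understand.
prcomp(…) returns a "prcomp" object, which is a named list. One of its elements, x, is the matrix containing the principal components. Type str(pca).
Thanks, I made a plot using: 'pca.cage
This code should not run: the "text" geom in ggplot draws labels instead of points, so you have to specify what to use for the labels. I suggest you read a tutorial on ggplot.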
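The point made in the comments about what `prcomp()` returns can be checked directly. A short sketch using the iris example (the variable name `scores` is just for illustration):

```r
pca <- prcomp(iris[, 1:4], scale. = TRUE)

class(pca)   # the object class
names(pca)   # the elements of the list
str(pca)     # compact overview of every element

# The prcomp object itself is not a data frame ...
is.data.frame(pca)

# ... but its score matrix x can be converted into one for plotting
scores <- as.data.frame(pca$x)
head(scores)
```

Passing `scores` (rather than `pca`) as the `data` argument is what `ggplot()` expects.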
For reference, the plotting commands attempted in the question were:
qplot(PC1, PC2, colour = CAGE_data, geom=c("point"), label=CAGE_data, data=as.data.frame(PCA.CAGE$x))
ggplot(data=PCA.CAGE, aes(x=PCA1, y=PCA2, colour=CAGE_filter_more, label=CAGE_filter_more)) + geom_point() + geom_text()
qplot(PCA.CAGE[1:3], PCA.CAGE[4:6], label=colnames(PC1, PC2, PC3), geom=c("point", "text"))