R: Grouping columns based on information in an annotation matrix
I am looking for advice on how to accomplish the following task: I am analyzing a single-cell RNA-seq dataset. I have my normalized expression data in one table (each column is a unique cell ID, each row is a gene). I also have an annotation matrix with information on each cell (each row is a cell ID, each column is one piece of information such as patient ID, site, etc.). For downstream analysis, I would like to form different groupings based on the information available in the annotation matrix. Do you have any suggestions on how I could do this? For example, I have this:
expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
  dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                  c("cell1", "cell2", "cell3", "cell4")))
annotation_matrix <- matrix(c("1526", "1788", "1526", "1788",
                              "controller", "noncontroller", "controller", "noncontroller",
                              "LN", "PB", "LN", "PB"),
  nrow = 4, ncol = 3,
  dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                  c("ID", "Status", "Site")))
It would be very helpful if you could provide an example, along with a sample of your data and the code you have tried so far.
expression_matrix <- matrix(c(1:4), nrow = 4, ncol = 4,
  dimnames = list(c("gene1", "gene2", "gene3", "gene4"),
                  c("cell1", "cell2", "cell3", "cell4")))
# cell1 cell2 cell3 cell4
# gene1 1 1 1 1
# gene2 2 2 2 2
# gene3 3 3 3 3
# gene4 4 4 4 4
annotation_matrix <- matrix(c("1526", "1788", "1526", "1788",
                              "controller", "noncontroller", "controller", "noncontroller",
                              "LN", "PB", "LN", "PB"),
  nrow = 4, ncol = 3,
  dimnames = list(c("cell1", "cell2", "cell3", "cell4"),
                  c("ID", "Status", "Site")))
# ID Status Site
# cell1 "1526" "controller" "LN"
# cell2 "1788" "noncontroller" "PB"
# cell3 "1526" "controller" "LN"
# cell4 "1788" "noncontroller" "PB"
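library(dplyr)
library(tidyr)  # gather() and spread() come from tidyr, not dplyr
# Reshape the expression matrix to long format: one row per gene/cell pair
expression_df <- expression_matrix %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  mutate(gene = rownames(.)) %>%
  gather(cell, value, -gene)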
# gene cell value
# 1 gene1 cell1 1
# 2 gene2 cell1 2
# 3 gene3 cell1 3
# 4 gene4 cell1 4
# 5 gene1 cell2 1
# 6 gene2 cell2 2
# 7 gene3 cell2 3
# 8 gene4 cell2 4
# 9 gene1 cell3 1
# 10 gene2 cell3 2
# 11 gene3 cell3 3
# 12 gene4 cell3 4
# 13 gene1 cell4 1
# 14 gene2 cell4 2
# 15 gene3 cell4 3
# 16 gene4 cell4 4
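The tidyr documentation marks `gather()` as superseded in favor of `pivot_longer()`. As a rough sketch of the same reshaping step with the newer function (the name `expression_long` is an illustrative choice, not part of the original answer):

```r
library(dplyr)
library(tidyr)

# Same toy expression matrix as in the question
expression_matrix <- matrix(1:4, nrow = 4, ncol = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))

# Long format: one row per gene/cell pair
# (expression_long is a hypothetical name used only in this sketch)
expression_long <- expression_matrix %>%
  as.data.frame() %>%
  mutate(gene = rownames(.)) %>%
  pivot_longer(cols = -gene, names_to = "cell", values_to = "value")
```

Unlike `gather()`, `pivot_longer()` walks the table row by row, so the rows come out ordered by gene first rather than by cell; the content is the same.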
# Convert the annotation matrix to a data frame and keep the cell IDs as a column
annotation_df <- annotation_matrix %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  mutate(cell = rownames(.))
# ID Status Site cell
# 1 1526 controller LN cell1
# 2 1788 noncontroller PB cell2
# 3 1526 controller LN cell3
# 4 1788 noncontroller PB cell4
# Keep only the cells from site "LN" and attach their expression values
example1 <- annotation_df %>%
  filter(Site == "LN") %>%
  inner_join(expression_df, by = "cell")
# ID Status Site cell gene value
# 1 1526 controller LN cell1 gene1 1
# 2 1526 controller LN cell1 gene2 2
# 3 1526 controller LN cell1 gene3 3
# 4 1526 controller LN cell1 gene4 4
# 5 1526 controller LN cell3 gene1 1
# 6 1526 controller LN cell3 gene2 2
# 7 1526 controller LN cell3 gene3 3
# 8 1526 controller LN cell3 gene4 4
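Once the annotation and the long-format expression values are joined, any annotation column can serve as a grouping variable. A minimal sketch, assuming the goal is a per-group summary such as the mean expression of each gene within each `Status` (the names `expression_long` and `mean_by_status` are hypothetical, and the toy tables are rebuilt here so the block runs on its own):

```r
library(dplyr)

# Toy long-format expression table (same values as the example data)
expression_long <- data.frame(
  gene  = rep(paste0("gene", 1:4), times = 4),
  cell  = rep(paste0("cell", 1:4), each = 4),
  value = rep(1:4, times = 4),
  stringsAsFactors = FALSE
)

# Toy annotation table with the cell ID as an ordinary column
annotation_df <- data.frame(
  cell   = paste0("cell", 1:4),
  ID     = c("1526", "1788", "1526", "1788"),
  Status = c("controller", "noncontroller", "controller", "noncontroller"),
  Site   = c("LN", "PB", "LN", "PB"),
  stringsAsFactors = FALSE
)

# Join on the cell ID, then summarise per Status and gene
mean_by_status <- annotation_df %>%
  inner_join(expression_long, by = "cell") %>%
  group_by(Status, gene) %>%
  summarise(mean_value = mean(value), n_cells = n(), .groups = "drop")
```

Swapping `Status` for `Site` or `ID` in `group_by()` gives the other groupings without changing the rest of the pipeline.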
# Reshape back to wide format: one row per cell, one column per gene
example2 <- example1 %>%
  spread(gene, value)
# ID Status Site cell gene1 gene2 gene3 gene4
# 1 1526 controller LN cell1 1 2 3 4
# 2 1526 controller LN cell3 1 2 3 4
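If the data should stay in matrix form (long tables can become very large for real single-cell data), the columns can also be selected directly in base R. This is only a sketch of an alternative to the answer above, not its method; `expression_ln`, `cells_by_status`, and `expression_by_status` are illustrative names, and it assumes every column of the expression matrix has a matching row in the annotation matrix:

```r
expression_matrix <- matrix(1:4, nrow = 4, ncol = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))
annotation_matrix <- matrix(
  c("1526", "1788", "1526", "1788",
    "controller", "noncontroller", "controller", "noncontroller",
    "LN", "PB", "LN", "PB"),
  nrow = 4, ncol = 3,
  dimnames = list(paste0("cell", 1:4), c("ID", "Status", "Site")))

# Put the annotation rows in the same order as the expression columns
annotation_matrix <- annotation_matrix[colnames(expression_matrix), , drop = FALSE]

# One group: all cells from site "LN"
ln_cells <- annotation_matrix[, "Site"] == "LN"
expression_ln <- expression_matrix[, ln_cells, drop = FALSE]

# All groups at once: a list with one sub-matrix per Status level
cells_by_status <- split(rownames(annotation_matrix), annotation_matrix[, "Status"])
expression_by_status <- lapply(cells_by_status, function(cells) {
  expression_matrix[, cells, drop = FALSE]
})

# Example downstream step: mean expression per gene within each group
group_means <- sapply(expression_by_status, rowMeans)
```

`drop = FALSE` keeps the result a matrix even when a group contains a single cell.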