R: Grouping columns based on information in an annotation matrix


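I'm looking for advice on how to accomplish the following:

I'm analyzing a single-cell RNA-seq dataset. My normalized expression data is in one table (each column is a unique cell ID, each row is a gene).

I also have an annotation matrix with information about each cell (each row is a cell ID, each column is one piece of information, such as patient ID, site, etc.).

For downstream analysis, I want to group the cells in different ways based on the information available in the annotation matrix. Do you have any suggestions for how I can do this?

For example, I have this: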

expression_matrix<-matrix(c(1:4), nrow = 4,ncol =4, dimnames = list(c("gene1", "gene2", "gene3", "gene4"),c("cell1","cell2","cell3","cell4")))

annotation_matrix<-matrix(c("1526","1788", "1526","1788","controller","noncontroller","controller","noncontroller","LN","PB","LN","PB"), nrow = 4,ncol =3, dimnames = list(c("cell1","cell2","cell3","cell4"),c("ID","Status","Site")))

It would be very helpful if you could provide an example, along with a sample of your data and the code you have tried so far.

expression_matrix<-matrix(c(1:4), nrow = 4,ncol =4, dimnames = list(c("gene1", "gene2", "gene3", "gene4"),c("cell1","cell2","cell3","cell4")))

#       cell1 cell2 cell3 cell4
# gene1     1     1     1     1
# gene2     2     2     2     2
# gene3     3     3     3     3
# gene4     4     4     4     4

annotation_matrix<-matrix(c("1526","1788", "1526","1788","controller","noncontroller","controller","noncontroller","LN","PB","LN","PB"), nrow = 4,ncol =3, dimnames = list(c("cell1","cell2","cell3","cell4"),c("ID","Status","Site")))

#       ID     Status          Site
# cell1 "1526" "controller"    "LN"
# cell2 "1788" "noncontroller" "PB"
# cell3 "1526" "controller"    "LN"
# cell4 "1788" "noncontroller" "PB"
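library(dplyr)
library(tidyr)  # gather() and spread() come from tidyr, not dplyr

# Reshape the expression matrix to long format: one row per gene/cell pair
expression_df <- expression_matrix %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  mutate(gene = rownames(.)) %>%
  gather(cell, value, -gene)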

#     gene  cell value
# 1  gene1 cell1     1
# 2  gene2 cell1     2
# 3  gene3 cell1     3
# 4  gene4 cell1     4
# 5  gene1 cell2     1
# 6  gene2 cell2     2
# 7  gene3 cell2     3
# 8  gene4 cell2     4
# 9  gene1 cell3     1
# 10 gene2 cell3     2
# 11 gene3 cell3     3
# 12 gene4 cell3     4
# 13 gene1 cell4     1
# 14 gene2 cell4     2
# 15 gene3 cell4     3
# 16 gene4 cell4     4

# Turn the annotation matrix into a data frame with the cell ID as a column
annotation_df <- annotation_matrix %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  mutate(cell = rownames(.))

#     ID        Status Site  cell
# 1 1526    controller   LN cell1
# 2 1788 noncontroller   PB cell2
# 3 1526    controller   LN cell3
# 4 1788 noncontroller   PB cell4

# Keep only cells from site "LN", then attach their expression values
example1 <- annotation_df %>%
  filter(Site == "LN") %>%
  inner_join(expression_df, by = "cell")

#     ID     Status Site  cell  gene value
# 1 1526 controller   LN cell1 gene1     1
# 2 1526 controller   LN cell1 gene2     2
# 3 1526 controller   LN cell1 gene3     3
# 4 1526 controller   LN cell1 gene4     4
# 5 1526 controller   LN cell3 gene1     1
# 6 1526 controller   LN cell3 gene2     2
# 7 1526 controller   LN cell3 gene3     3
# 8 1526 controller   LN cell3 gene4     4
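
Once the annotation columns sit next to the expression values, grouping by any annotation becomes a `group_by()` call. The following is only a sketch under that assumption: it rebuilds the toy data so it runs on its own, and the names `expression_long` and `mean_by_status` are hypothetical, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Toy data from the question
expression_matrix <- matrix(1:4, nrow = 4, ncol = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))
annotation_matrix <- matrix(
  c("1526", "1788", "1526", "1788",
    "controller", "noncontroller", "controller", "noncontroller",
    "LN", "PB", "LN", "PB"),
  nrow = 4, ncol = 3,
  dimnames = list(paste0("cell", 1:4), c("ID", "Status", "Site")))

# Long format: one row per gene/cell pair (hypothetical name)
expression_long <- as.data.frame(expression_matrix) %>%
  mutate(gene = rownames(expression_matrix)) %>%
  gather(cell, value, -gene)

annotation_df <- as.data.frame(annotation_matrix, stringsAsFactors = FALSE) %>%
  mutate(cell = rownames(annotation_matrix))

# Mean expression of each gene within each Status group (hypothetical name)
mean_by_status <- expression_long %>%
  inner_join(annotation_df, by = "cell") %>%
  group_by(Status, gene) %>%
  summarise(mean_value = mean(value), n_cells = n(), .groups = "drop")
```

Replacing `Status` with `Site` or `ID` in `group_by()` gives the other groupings. For a real single-cell dataset the long table can get very large, so this form is probably best kept for subsets of genes.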

# Back to wide format: one row per cell, one column per gene
example2 <- example1 %>%
  spread(gene, value)

#     ID     Status Site  cell gene1 gene2 gene3 gene4
# 1 1526 controller   LN cell1     1     2     3     4
# 2 1526 controller   LN cell3     1     2     3     4
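
If the data should stay in matrix form, the same grouping can be sketched in base R by indexing the expression columns with the annotation rows. This is an alternative to the dplyr approach above, not part of it. It assumes every column name of `expression_matrix` appears as a row name of `annotation_matrix`; the names `ln_expression`, `cells_by_status`, `expression_by_status` and `group_means` are hypothetical.

```r
# Toy data from the question
expression_matrix <- matrix(1:4, nrow = 4, ncol = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4)))
annotation_matrix <- matrix(
  c("1526", "1788", "1526", "1788",
    "controller", "noncontroller", "controller", "noncontroller",
    "LN", "PB", "LN", "PB"),
  nrow = 4, ncol = 3,
  dimnames = list(paste0("cell", 1:4), c("ID", "Status", "Site")))

# Put annotation rows in the same order as the expression columns
annotation_matrix <- annotation_matrix[colnames(expression_matrix), , drop = FALSE]

# Keep only the columns (cells) whose Site is "LN"
ln_expression <- expression_matrix[, annotation_matrix[, "Site"] == "LN", drop = FALSE]

# Split the cell IDs by Status, then build one sub-matrix per group
cells_by_status <- split(rownames(annotation_matrix), annotation_matrix[, "Status"])
expression_by_status <- lapply(cells_by_status, function(cells) {
  expression_matrix[, cells, drop = FALSE]
})

# Per-group gene means, as a genes x groups matrix
group_means <- sapply(expression_by_status, rowMeans)
```

`drop = FALSE` keeps the result a matrix even when a group contains a single cell.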