How to merge multiple matrices of different dimensions in R
I have these matrices of different dimensions. The key.related.sheet column in all of them has some values in common and some unique values. I want to match the common rows and merge all three matrices, but I also want to keep the unique rows. The result should contain only the columns key.related.sheet, sample_B, trace_1, trace_2 and trace_3. Can anyone help me?
aa<-structure(c("S05-F13-P01:S05-F13-P01", "S05-F13-P01:S08-F10-P01",
"S05-F13-P01:S08-F11-P01", "S05-F13-P01:S09-F66-P01", "S05-F13-P01",
"S08-F10-P01", "S08-F11-P01", "S09-F66-P01", "1.25", "0.227",
"-0.183", "-0.217"), .Dim = c(4L, 3L), .Dimnames = list(NULL,
c("key.related.sheet", "sample_B", "trace_1")))
bb<-structure(c("S05-F13-P01:S08-F10-P01", "S05-F13-P01:S08-F11-P01",
"S05-F13-P01:S09-F66-P01", "S05-F13-P01:S09-F67-P01", "S08-F10-P01",
"S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "0.227", "-0.183",
"-0.217", "0.292", "Unknown", "Unknown", "Unknown", "Unknown"
), .Dim = c(4L, 4L), .Dimnames = list(NULL, c("key.related.sheet",
"sample_B", "trace_2", "type")))
cc<-structure(c("S05-F13-P01:S08-F11-P01", "S05-F13-P01:S09-F66-P01",
"S05-F13-P01:S09-F67-P01", "S05-F13-P01:S09-F68-P01", "S05-F13-P01:S09-F01-P01",
"S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "S09-F68-P01", "S09-F01-P01",
"-0.183", "-0.217", "0.292", "-0.314", "0.0418"), .Dim = c(5L,
3L), .Dimnames = list(NULL, c("key.related.sheet", "sample_B",
"trace_3")))
You can convert the matrices to data.frames and join them with the full_join command from the dplyr package:
library(dplyr)
for(i in c("aa","bb", "cc")) assign(i, data.frame(get(i)))
aa %>% full_join(bb, by="key.related.sheet") %>% full_join(cc,
by="key.related.sheet")
key.related.sheet sample_B.x trace_1 sample_B.y trace_2 type sample_B trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01 1.25 <NA> <NA> <NA> <NA> <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01 0.227 S08-F10-P01 0.227 Unknown <NA> <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01 -0.183 S08-F11-P01 -0.183 Unknown S08-F11-P01 -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01 -0.217 S09-F66-P01 -0.217 Unknown S09-F66-P01 -0.217
5 S05-F13-P01:S09-F67-P01 <NA> <NA> S09-F67-P01 0.292 Unknown S09-F67-P01 0.292
6 S05-F13-P01:S09-F68-P01 <NA> <NA> <NA> <NA> <NA> S09-F68-P01 -0.314
7 S05-F13-P01:S09-F01-P01 <NA> <NA> <NA> <NA> <NA> S09-F01-P01 0.0418
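The output above still carries sample_B.x, sample_B.y and type, which the question did not ask for. A minimal sketch of one way to get only the requested columns with dplyr, assuming sample_B is fully determined by key.related.sheet (true for the posted data), is to join on both columns and then select. The helper `prefix` and the vectors `sa`, `sb`, `sc` are made up here just to rebuild the question's tables compactly.

```r
library(dplyr)

# Rebuild the three tables from the question as data frames.
# `prefix`, `sa`, `sb`, `sc` are illustrative helpers, not part of the original post.
prefix <- function(s) paste0("S05-F13-P01:", s)
sa <- c("S05-F13-P01", "S08-F10-P01", "S08-F11-P01", "S09-F66-P01")
sb <- c("S08-F10-P01", "S08-F11-P01", "S09-F66-P01", "S09-F67-P01")
sc <- c("S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "S09-F68-P01", "S09-F01-P01")

aa <- data.frame(key.related.sheet = prefix(sa), sample_B = sa,
                 trace_1 = c("1.25", "0.227", "-0.183", "-0.217"),
                 stringsAsFactors = FALSE)
bb <- data.frame(key.related.sheet = prefix(sb), sample_B = sb,
                 trace_2 = c("0.227", "-0.183", "-0.217", "0.292"),
                 type = "Unknown", stringsAsFactors = FALSE)
cc <- data.frame(key.related.sheet = prefix(sc), sample_B = sc,
                 trace_3 = c("-0.183", "-0.217", "0.292", "-0.314", "0.0418"),
                 stringsAsFactors = FALSE)

# Joining on both shared columns avoids the .x/.y duplicates of sample_B.
keys <- c("key.related.sheet", "sample_B")
result <- aa %>%
  full_join(bb, by = keys) %>%
  full_join(cc, by = keys) %>%
  select(key.related.sheet, sample_B, trace_1, trace_2, trace_3)

print(result)
```

If sample_B could differ between tables for the same key, joining on both columns would produce extra rows instead, so this shortcut only fits data like the example.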
You can also do a full join with base R's merge function by setting all=TRUE. Here the merge is done on all common columns, i.e. key.related.sheet and sample_B, which should be fine in this case because sample_B depends on key.related.sheet.
Using by="key.related.sheet" gives the same output as the dplyr approach in Adam's answer. The merge is then done only with respect to key.related.sheet, and the sample_B columns from both the left and the right join partner appear in the result, i.e. they are duplicated for your data.
Two nested merges, dropping the irrelevant column:
merge(merge(aa,bb[, -4], by=c("key.related.sheet", "sample_B") ,all=TRUE),
cc, by=c("key.related.sheet", "sample_B") ,all=TRUE)
key.related.sheet sample_B trace_1 trace_2 trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01 1.25 <NA> <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01 0.227 0.227 <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01 -0.183 -0.183 -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01 -0.217 -0.217 -0.217
5 S05-F13-P01:S09-F67-P01 S09-F67-P01 <NA> 0.292 0.292
6 S05-F13-P01:S09-F01-P01 S09-F01-P01 <NA> <NA> 0.0418
7 S05-F13-P01:S09-F68-P01 S09-F68-P01 <NA> <NA> -0.314
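To illustrate the remark about by="key.related.sheet", here is a small base R sketch using only the first two tables from the question. It shows the duplicated sample_B columns that appear when merging on the key alone, and one possible way to collapse them. The names `prefix`, `sa`, `sb` and `by_key` are made up for this example.

```r
# Rebuild aa and bb (without the type column) from the question's values.
prefix <- function(s) paste0("S05-F13-P01:", s)
sa <- c("S05-F13-P01", "S08-F10-P01", "S08-F11-P01", "S09-F66-P01")
sb <- c("S08-F10-P01", "S08-F11-P01", "S09-F66-P01", "S09-F67-P01")

aa <- data.frame(key.related.sheet = prefix(sa), sample_B = sa,
                 trace_1 = c("1.25", "0.227", "-0.183", "-0.217"),
                 stringsAsFactors = FALSE)
bb <- data.frame(key.related.sheet = prefix(sb), sample_B = sb,
                 trace_2 = c("0.227", "-0.183", "-0.217", "0.292"),
                 stringsAsFactors = FALSE)

# Merging on the key only: sample_B exists on both sides, so merge() adds suffixes.
by_key <- merge(aa, bb, by = "key.related.sheet", all = TRUE)
print(names(by_key))
# "key.related.sheet" "sample_B.x" "trace_1" "sample_B.y" "trace_2"

# Collapse the two copies: take the left value, fall back to the right one.
by_key$sample_B <- ifelse(is.na(by_key$sample_B.x),
                          by_key$sample_B.y, by_key$sample_B.x)
by_key <- by_key[, c("key.related.sheet", "sample_B", "trace_1", "trace_2")]
print(by_key)
```

Merging on both columns, as in the nested call above, avoids this extra clean-up step when sample_B always agrees between tables.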
This can be done by combining Reduce and merge, as follows:
Reduce(function(x, y) merge(x, y, all=TRUE), list(aa, bb[,-4], cc))
The result is:
key.related.sheet sample_B trace_1 trace_2 trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01 1.25 <NA> <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01 0.227 0.227 <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01 -0.183 -0.183 -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01 -0.217 -0.217 -0.217
5 S05-F13-P01:S09-F67-P01 S09-F67-P01 <NA> 0.292 0.292
6 S05-F13-P01:S09-F01-P01 S09-F01-P01 <NA> <NA> 0.0418
7 S05-F13-P01:S09-F68-P01 S09-F68-P01 <NA> <NA> -0.314
Especially when you have more than three matrices/data frames, using merge with Reduce scales better than nested merge calls.
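As a rough sketch of how the Reduce approach extends to a list of any length, the pattern can be wrapped in a small function. `merge_all` is a hypothetical helper name, not something from base R or the answers above. Because the question's matrices are character matrices, the trace columns come out as character, so the sketch also converts them to numeric, which is an optional extra step and an assumption about what is wanted.

```r
# Hypothetical helper: full outer join over a list of tables on the given key columns.
merge_all <- function(tables, by) {
  Reduce(function(x, y) merge(x, y, by = by, all = TRUE), tables)
}

# Rebuild the question's tables; `prefix`, `sa`, `sb`, `sc` are illustrative only.
prefix <- function(s) paste0("S05-F13-P01:", s)
sa <- c("S05-F13-P01", "S08-F10-P01", "S08-F11-P01", "S09-F66-P01")
sb <- c("S08-F10-P01", "S08-F11-P01", "S09-F66-P01", "S09-F67-P01")
sc <- c("S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "S09-F68-P01", "S09-F01-P01")

aa <- data.frame(key.related.sheet = prefix(sa), sample_B = sa,
                 trace_1 = c("1.25", "0.227", "-0.183", "-0.217"),
                 stringsAsFactors = FALSE)
bb <- data.frame(key.related.sheet = prefix(sb), sample_B = sb,
                 trace_2 = c("0.227", "-0.183", "-0.217", "0.292"),
                 type = "Unknown", stringsAsFactors = FALSE)
cc <- data.frame(key.related.sheet = prefix(sc), sample_B = sc,
                 trace_3 = c("-0.183", "-0.217", "0.292", "-0.314", "0.0418"),
                 stringsAsFactors = FALSE)

# Drop the unwanted column by name, then merge everything in one call.
tables <- list(aa, bb[, names(bb) != "type"], cc)
merged <- merge_all(tables, by = c("key.related.sheet", "sample_B"))

# Optional: turn the character trace columns into numbers.
trace_cols <- grep("^trace_", names(merged))
merged[trace_cols] <- lapply(merged[trace_cols], as.numeric)

str(merged)
```

Adding a fourth or fifth table then only means appending it to `tables`, with no change to the merging code.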
@Ronaksha Please take a look at the expected output.
Is this any different from @Patrick's answer?
The only difference is that the irrelevant column is dropped. I had made some mistakes before I did that, and I didn't see Patrick's answer until after I posted.
Wouldn't using Reduce instead of nested merge calls be more scalable and readable?
That's a good idea. I originally tried a do.call solution, but it failed. Maybe you should post that as a better solution? I think it has merit.