How to merge multiple matrices of different dimensions in R

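I have these matrices of different dimensions. The key.related.sheet column in all of them has some values in common and some unique values. I want to match the common rows and merge all three matrices, but I also want to include the unique rows. The result should contain only the columns key.related.sheet, Sample_B, trace_1, trace_2 and trace_3. Can anyone help me?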

aa<-structure(c("S05-F13-P01:S05-F13-P01", "S05-F13-P01:S08-F10-P01", 
"S05-F13-P01:S08-F11-P01", "S05-F13-P01:S09-F66-P01", "S05-F13-P01", 
"S08-F10-P01", "S08-F11-P01", "S09-F66-P01", "1.25", "0.227", 
"-0.183", "-0.217"), .Dim = c(4L, 3L), .Dimnames = list(NULL, 
    c("key.related.sheet", "sample_B", "trace_1")))

bb<-structure(c("S05-F13-P01:S08-F10-P01", "S05-F13-P01:S08-F11-P01", 
"S05-F13-P01:S09-F66-P01", "S05-F13-P01:S09-F67-P01", "S08-F10-P01", 
"S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "0.227", "-0.183", 
"-0.217", "0.292", "Unknown", "Unknown", "Unknown", "Unknown"
), .Dim = c(4L, 4L), .Dimnames = list(NULL, c("key.related.sheet", 
"sample_B", "trace_2", "type")))

cc<-structure(c("S05-F13-P01:S08-F11-P01", "S05-F13-P01:S09-F66-P01", 
"S05-F13-P01:S09-F67-P01", "S05-F13-P01:S09-F68-P01", "S05-F13-P01:S09-F01-P01", 
"S08-F11-P01", "S09-F66-P01", "S09-F67-P01", "S09-F68-P01", "S09-F01-P01", 
"-0.183", "-0.217", "0.292", "-0.314", "0.0418"), .Dim = c(5L, 
3L), .Dimnames = list(NULL, c("key.related.sheet", "sample_B", 
"trace_3")))

You can convert the matrices to data.frames and join them with the full_join function from the dplyr package:

library(dplyr)
for(i in c("aa","bb", "cc")) assign(i, data.frame(get(i)))
aa %>% full_join(bb, by="key.related.sheet") %>% full_join(cc,
by="key.related.sheet")

        key.related.sheet  sample_B.x trace_1  sample_B.y trace_2    type    sample_B trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01    1.25        <NA>    <NA>    <NA>        <NA>    <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01   0.227 S08-F10-P01   0.227 Unknown        <NA>    <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01  -0.183 S08-F11-P01  -0.183 Unknown S08-F11-P01  -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01  -0.217 S09-F66-P01  -0.217 Unknown S09-F66-P01  -0.217
5 S05-F13-P01:S09-F67-P01        <NA>    <NA> S09-F67-P01   0.292 Unknown S09-F67-P01   0.292
6 S05-F13-P01:S09-F68-P01        <NA>    <NA>        <NA>    <NA>    <NA> S09-F68-P01  -0.314
7 S05-F13-P01:S09-F01-P01        <NA>    <NA>        <NA>    <NA>    <NA> S09-F01-P01  0.0418
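The output above still carries sample_B.x, sample_B.y and type, which the question did not ask for. A minimal sketch of one way to get only the requested columns with dplyr, assuming sample_B always agrees across matrices for a given key.related.sheet: join on both columns and drop type. The small matrices here are hypothetical stand-ins with the same column layout as aa, bb and cc.

```r
library(dplyr)

# Hypothetical stand-in matrices with the same column layout as aa, bb, cc
aa <- cbind(key.related.sheet = c("k1", "k2"),
            sample_B = c("s1", "s2"),
            trace_1  = c("1.25", "0.227"))
bb <- cbind(key.related.sheet = c("k2", "k3"),
            sample_B = c("s2", "s3"),
            trace_2  = c("0.227", "-0.183"),
            type     = c("Unknown", "Unknown"))
cc <- cbind(key.related.sheet = c("k3", "k4"),
            sample_B = c("s3", "s4"),
            trace_3  = c("-0.183", "0.292"))

# Joining on both columns avoids the sample_B.x / sample_B.y duplicates
keys <- c("key.related.sheet", "sample_B")

result <- as.data.frame(aa) %>%
  full_join(as.data.frame(bb), by = keys) %>%
  full_join(as.data.frame(cc), by = keys) %>%
  select(-type)  # drop the column that is not wanted in the result

print(result)
```

If sample_B could differ between matrices for the same key, joining on both columns would produce separate rows for each combination, so this only fits data like the example.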

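You can also do a full join with base R's merge function by setting all=TRUE.

Here the merge is done on all common columns, i.e. key.related.sheet and sample_B. That should be fine here, because sample_B depends on key.related.sheet.

Using by="key.related.sheet" gives the same output as the dplyr approach in Adam's answer. The merge is then done with respect to key.related.sheet only, and the sample_B columns from both the left and right join partners appear in the result, i.e. they are duplicated for your data.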
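To make the difference between the two choices of by concrete, here is a small base R sketch. The two matrices are hypothetical stand-ins with the same layout as aa and bb (without the type column); only the resulting column names matter.

```r
# Hypothetical stand-in matrices with the same layout as aa and bb[, -4]
aa <- cbind(key.related.sheet = c("k1", "k2"),
            sample_B = c("s1", "s2"),
            trace_1  = c("1.25", "0.227"))
bb <- cbind(key.related.sheet = c("k2", "k3"),
            sample_B = c("s2", "s3"),
            trace_2  = c("0.227", "-0.183"))

# Merging on key.related.sheet only: sample_B appears twice, with suffixes
by_key <- merge(aa, bb, by = "key.related.sheet", all = TRUE)
print(names(by_key))

# Merging on both common columns: a single sample_B column
by_both <- merge(aa, bb, by = c("key.related.sheet", "sample_B"), all = TRUE)
print(names(by_both))
```

merge accepts the matrices directly because its default method converts both arguments to data frames first.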

Two nested merges, dropping the unneeded type column:

merge(merge(aa,bb[, -4], by=c("key.related.sheet", "sample_B") ,all=TRUE), 
      cc,  by=c("key.related.sheet", "sample_B") ,all=TRUE)

        key.related.sheet    sample_B trace_1 trace_2 trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01    1.25    <NA>    <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01   0.227   0.227    <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01  -0.183  -0.183  -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01  -0.217  -0.217  -0.217
5 S05-F13-P01:S09-F67-P01 S09-F67-P01    <NA>   0.292   0.292
6 S05-F13-P01:S09-F01-P01 S09-F01-P01    <NA>    <NA>  0.0418
7 S05-F13-P01:S09-F68-P01 S09-F68-P01    <NA>    <NA>  -0.314

This can be done by combining Reduce and merge, as follows:

Reduce(function(x, y) merge(x, y, all=TRUE), list(aa, bb[,-4], cc))

The result is:

        key.related.sheet    sample_B trace_1 trace_2 trace_3
1 S05-F13-P01:S05-F13-P01 S05-F13-P01    1.25    <NA>    <NA>
2 S05-F13-P01:S08-F10-P01 S08-F10-P01   0.227   0.227    <NA>
3 S05-F13-P01:S08-F11-P01 S08-F11-P01  -0.183  -0.183  -0.183
4 S05-F13-P01:S09-F66-P01 S09-F66-P01  -0.217  -0.217  -0.217
5 S05-F13-P01:S09-F67-P01 S09-F67-P01    <NA>   0.292   0.292
6 S05-F13-P01:S09-F01-P01 S09-F01-P01    <NA>    <NA>  0.0418
7 S05-F13-P01:S09-F68-P01 S09-F68-P01    <NA>    <NA>  -0.314

Especially when you have more than three matrices or data frames, using merge with Reduce scales better than nested merge calls.
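As a rough illustration of that scaling, the sketch below merges four matrices without naming the extra column to drop. The matrices, the fourth trace_4 column and the helper keep_wanted are hypothetical; the helper assumes that every wanted value column starts with "trace_". The last step is optional and converts the trace columns to numeric, since they come from character matrices.

```r
# Hypothetical stand-in matrices; dd and trace_4 are not part of the question
aa <- cbind(key.related.sheet = c("k1", "k2"), sample_B = c("s1", "s2"),
            trace_1 = c("1.25", "0.227"))
bb <- cbind(key.related.sheet = c("k2", "k3"), sample_B = c("s2", "s3"),
            trace_2 = c("0.227", "-0.183"), type = c("Unknown", "Unknown"))
cc <- cbind(key.related.sheet = c("k3", "k4"), sample_B = c("s3", "s4"),
            trace_3 = c("-0.183", "0.292"))
dd <- cbind(key.related.sheet = c("k4", "k5"), sample_B = c("s4", "s5"),
            trace_4 = c("0.292", "0.0418"))

keys <- c("key.related.sheet", "sample_B")

# Hypothetical helper: keep the key columns and any trace_* column,
# and return a data frame so every merge step sees the same type
keep_wanted <- function(m) {
  wanted <- colnames(m) %in% keys | grepl("^trace_", colnames(m))
  as.data.frame(m[, wanted, drop = FALSE])
}

# Full outer join across the whole list, two inputs at a time
merged <- Reduce(function(x, y) merge(x, y, by = keys, all = TRUE),
                 lapply(list(aa, bb, cc, dd), keep_wanted))

# Optional: the trace columns are text (or factors in older R versions)
trace_cols <- grep("^trace_", names(merged), value = TRUE)
merged[trace_cols] <- lapply(merged[trace_cols],
                             function(v) as.numeric(as.character(v)))

print(merged)
```

Adding a fifth input only means adding it to the list; the nested form would need another layer of merge calls.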

Comments:

@Ronaksha Please look at the expected output.

Is this any different from @Patrick's answer? The only difference is that the irrelevant column is dropped.

I made some mistakes before I got to this. I did not see Patrick's answer until after I had posted.

Wouldn't using Reduce instead of nested merge calls be more scalable and readable?

That is a good idea. I initially tried a do.call solution, but it failed. Maybe you should post that as a better solution? I think it has merit.