Merge data.frames, summing the values of identically named columns in R


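I have 3 data frames (rows: sites, columns: species names) describing species abundances within sites. The number of rows is identical, but the number of columns differs, because not every species appears in all three data frames. I would like to merge them into a single data frame in which the abundances of identical species are summed. For example:

Data frame 1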

       Sp1  Sp2  Sp3  Sp4
site1   1    2    3    1
site2   0    2    0    1
site3   1    1    1    1

Data frame 2

       Sp1  Sp2  Sp4
 site1  0    1    2
 site2  1    2    0
 site3  1    1    1

Data frame 3

       Sp1  Sp2  Sp5  Sp6
 site1  0    1    1    1     
 site2  1    1    1    5
 site3  2    0    0    0

What I would like to get is:

       Sp1  Sp2  Sp3  Sp4  Sp5  Sp6
 site1  1    4    3    3    1    1
 site2  2    5    0    1    1    5
 site3  4    2    1    2    0    0

I guess I have to use merge, but so far none of my attempts has produced what I want.

Any help is much appreciated.

I would use rbind.fill from plyr, as follows:

library(plyr)

pp <- cbind(names=c(rownames(df1), rownames(df2), rownames(df3)), 
                        rbind.fill(list(df1, df2, df3)))

#   names Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
# 1 site1   1   2   3   1  NA  NA
# 2 site2   0   2   0   1  NA  NA
# 3 site3   1   1   1   1  NA  NA
# 4 site1   0   1  NA   2  NA  NA
# 5 site2   1   2  NA   0  NA  NA
# 6 site3   1   1  NA   1  NA  NA
# 7 site1   0   1  NA  NA   1   1
# 8 site2   1   1  NA  NA   1   5
# 9 site3   2   0  NA  NA   0   0
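
The combined table still has one row per site per source data frame, with NA wherever a species was absent from that source, so a summing step is needed to reach the output asked for in the question. Below is a minimal sketch of one way to do that, assuming the question's df1, df2, and df3. The aggregation call (base R's rowsum with na.rm = TRUE) is an illustrative choice and not necessarily the one the answer's author had in mind.

```r
library(plyr)  # provides rbind.fill

# The three data frames from the question; the first column becomes row names.
df1 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp3 Sp4
site1    1   2   3   1
site2    0   2   0   1
site3    1   1   1   1")
df2 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp4
site1    0   1   2
site2    1   2   0
site3    1   1   1")
df3 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp5 Sp6
site1    0   1   1   1
site2    1   1   1   5
site3    2   0   0   0")

# Stack the data frames, filling missing species with NA,
# and keep the site labels in a 'names' column.
pp <- cbind(names = c(rownames(df1), rownames(df2), rownames(df3)),
            rbind.fill(list(df1, df2, df3)))

# Sum each species column within each site; NA counts as "nothing recorded".
res <- rowsum(pp[-1], group = pp$names, na.rm = TRUE)
res
```

The result has one row per site and one column per species, with the site labels as row names.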

An alternative to Arun's answer: create a "template" data frame containing all the desired columns

Rgames> bbar<-data.frame('one'=rep(0,3),'two'=rep(0,3),'three'=rep(0,3))
Rgames> bbar
  one two three
1   0   0     0
2   0   0     0
3   0   0     0

Then, given each of your data frames, such as

Rgames> bar1<-data.frame('one'=c(1,2,3),'two'=c(4,5,6))
Rgames> bar1
  one two
1   1   4
2   2   5
3   3   6


create an expanded data frame:

Rgames> newbar1<-bbar
Rgames> for (jj in names(bar1) )  newbar1[[jj]]<-bar1[[jj]]
Rgames> newbar1
  one two three
1   1   4     0
2   2   5     0
3   3   6     0

Then sum all of these expanded data frames. Clumsy but simple.
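
Applied to the data from the question, the template idea might look like the sketch below. The helper name expand_to_template is hypothetical, and the code assumes that all data frames list the sites in the same row order, since columns are copied by position.

```r
# The three data frames from the question; the first column becomes row names.
df1 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp3 Sp4
site1    1   2   3   1
site2    0   2   0   1
site3    1   1   1   1")
df2 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp4
site1    0   1   2
site2    1   2   0
site3    1   1   1")
df3 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp5 Sp6
site1    0   1   1   1
site2    1   1   1   5
site3    2   0   0   0")

dfs <- list(df1, df2, df3)

# All species that occur in at least one data frame.
all_species <- sort(Reduce(union, lapply(dfs, names)))

# Zero-filled template with one column per species and one row per site.
template <- as.data.frame(
  matrix(0, nrow = nrow(df1), ncol = length(all_species),
         dimnames = list(rownames(df1), all_species)))

# Hypothetical helper: copy the columns of 'd' into a fresh copy of the template.
# Assumes the rows of 'd' are in the same site order as the template.
expand_to_template <- function(d) {
  out <- template
  for (jj in names(d)) out[[jj]] <- d[[jj]]
  out
}

# Element-wise sum of all expanded data frames.
total <- Reduce(`+`, lapply(dfs, expand_to_template))
total
```

Because every expanded data frame has the same shape, the final step is a plain element-wise addition.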

Another option is to use melt/cast from reshape2. Here's a simple example:

library(reshape2)

df1 <- read.table(header=T, text="
    Sp1  Sp2  Sp3  Sp4
    site1   1    2    3    1
    site2   0    2    0    1
    site3   1    1    1    1")

df2 <- read.table(header=T, text="
       Sp1  Sp2  Sp4
 site1  0    1    2
 site2  1    2    0
 site3  1    1    1")

df3 <- read.table(header=T, text="
       Sp1  Sp2  Sp5  Sp6
 site1  0    1    1    1     
 site2  1    1    1    5
 site3  2    0    0    0")

df1$site <- rownames(df1)
df2$site <- rownames(df2)
df3$site <- rownames(df3)

DF <- rbind(melt(df1,id="site"),melt(df2,id="site"),melt(df3,id="site"))
dcast(data=DF,formula=site ~ variable,fun.aggregate=sum)

   site Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
1 site1   1   4   3   3   1   1
2 site2   2   5   0   1   1   5
3 site3   4   2   1   2   0   0

Maybe aggregate is better than merge?

I had thought of a solution, and I guarantee it would not have been this elegant. +1

@eugenego You can tick the check mark next to the solution that best answers your question.

In addition to the options already available, here are two more that stick with base R.

First option: wide aggregation (sort of)

temp <- cbind(df1, df2, df3)
temp
#       Sp1 Sp2 Sp3 Sp4 Sp1 Sp2 Sp4 Sp1 Sp2 Sp5 Sp6
# site1   1   2   3   1   0   1   2   0   1   1   1
# site2   0   2   0   1   1   2   0   1   1   1   5
# site3   1   1   1   1   1   1   1   2   0   0   0
sapply(unique(colnames(temp)), 
       function(x) rowSums(temp[, colnames(temp) == x, drop = FALSE]))
#       Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
# site1   1   4   3   3   1   1
# site2   2   5   0   1   1   5
# site3   4   2   1   2   0   0
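
If this needs to be reused for an arbitrary number of data frames, the same idea can be wrapped in a small function. This is only a sketch: the name sum_shared_columns is hypothetical, and it assumes every input has the same sites in the same row order (and more than one row, so that sapply returns a matrix).

```r
# Hypothetical helper: bind the inputs side by side, then add up all
# columns that share a name.
sum_shared_columns <- function(...) {
  # unname() keeps cbind from prefixing column names with argument names
  combined <- do.call(cbind, unname(list(...)))
  species <- unique(colnames(combined))
  sums <- sapply(species, function(x) {
    rowSums(combined[, colnames(combined) == x, drop = FALSE])
  })
  as.data.frame(sums)  # sapply returns a matrix; convert back to a data frame
}

# The three data frames from the question; the first column becomes row names.
df1 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp3 Sp4
site1    1   2   3   1
site2    0   2   0   1
site3    1   1   1   1")
df2 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp4
site1    0   1   2
site2    1   2   0
site3    1   1   1")
df3 <- read.table(header = TRUE, text = "
       Sp1 Sp2 Sp5 Sp6
site1    0   1   1   1
site2    1   1   1   5
site3    2   0   0   0")

res <- sum_shared_columns(df1, df2, df3)
res
```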

Second option: transpose, stack into long form, then cross-tabulate with xtabs

> temp1 <- t(cbind(df1, df2, df3))
> # You'll get a warning in the next step
> # Safe to ignore though...
> temp2 <- data.frame(var = rownames(temp1), stack(data.frame(temp1)))
Warning message:
In data.row.names(row.names, rowsi, i) :
  some row.names duplicated: 5,6,7,8,9 --> row.names NOT used
> xtabs(values ~ ind + var, temp2)
       var
ind     Sp1 Sp2 Sp3 Sp4 Sp5 Sp6
  site1   1   4   3   3   1   1
  site2   2   5   0   1   1   5
  site3   4   2   1   2   0   0