Adding multiple data frames with the same column names based on a specific column value in R
I have multiple data frames with the same column names and dimensions:
df1
device_id price tax
1 a 200 5
2 b 100 2
3 c 50 1
df2
device_id price tax
1 b 200 7
2 a 100 3
3 c 50 1
df3
device_id price tax
1 c 50 5
2 b 300 1
3 a 50 2
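What I want to do is create another data frame, df, in which the price and tax values from the three data frames above are added together for matching device_id values. So df would look like: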
df
device_id price tax
1 a 350 10
2 b 600 10
3 c 150 7
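How can I do this? It would also be great if the solution generalized to more data frames than just 3.

First, put all the data frames into a list (called dflist here, defined below). After row-binding the list elements, you can simply use aggregate():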
aggregate(. ~ device_id, do.call(rbind, dflist), sum)
# device_id price tax
# 1 a 350 10
# 2 b 600 10
# 3 c 150 7
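The same aggregate() call works for any number of data frames, because only the list changes. A minimal sketch, assuming the data frames sit in the current environment and follow the df1, df2, ... naming used in the question (the ls() pattern and the names env, df_names, stacked and res are illustrative assumptions, not part of the original answer):

```r
# Example data frames from the question
df1 <- data.frame(device_id = c("a", "b", "c"), price = c(200L, 100L, 50L), tax = c(5L, 2L, 1L))
df2 <- data.frame(device_id = c("b", "a", "c"), price = c(200L, 100L, 50L), tax = c(7L, 3L, 1L))
df3 <- data.frame(device_id = c("c", "b", "a"), price = c(50L, 300L, 50L), tax = c(5L, 1L, 2L))

# Collect every object named df<number> into a named list
# (assumption: no unrelated objects match this pattern)
env <- environment()
df_names <- ls(envir = env, pattern = "^df[0-9]+$")
dflist <- mget(df_names, envir = env)

# Stack all rows, then sum every remaining column within each device_id
stacked <- do.call(rbind, dflist)
res <- aggregate(. ~ device_id, stacked, sum)
print(res)
```

Adding a df4 later only requires that it match the naming pattern; the aggregation line stays the same.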
Alternatively, you can use the data.table package.
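The answer names data.table without showing code here. A minimal sketch of what such a solution could look like, assuming rbindlist() to stack the list and .SD to refer to the non-grouping columns (an illustration under those assumptions, not necessarily the answerer's original code; dflist is rebuilt inline so the block runs on its own):

```r
library(data.table)

# Same data as in the question, rebuilt here so the example is self-contained
dflist <- list(
  df1 = data.frame(device_id = c("a", "b", "c"), price = c(200L, 100L, 50L), tax = c(5L, 2L, 1L)),
  df2 = data.frame(device_id = c("b", "a", "c"), price = c(200L, 100L, 50L), tax = c(7L, 3L, 1L)),
  df3 = data.frame(device_id = c("c", "b", "a"), price = c(50L, 300L, 50L), tax = c(5L, 1L, 2L))
)

# rbindlist() stacks the list into one data.table;
# lapply(.SD, sum) sums each non-grouping column within every device_id
res <- rbindlist(dflist)[, lapply(.SD, sum), by = device_id]
print(res)
```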
Or dplyr:
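library(dplyr)
# Note: summarise_each() and funs() are deprecated in current dplyr releases
bind_rows(dflist) %>%
    group_by(device_id) %>%
    summarise_each(funs(sum))
# Source: local data frame [3 x 3]
#
#   device_id price tax
#
# 1         a   350  10
# 2         b   600  10
# 3         c   150   7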
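In recent dplyr versions the same grouped sum is usually written with across(). A minimal sketch, assuming dplyr 1.0.0 or later (dflist is rebuilt inline so the block runs on its own; res is an illustrative name):

```r
library(dplyr)

# Same data as in the question, rebuilt here so the example is self-contained
dflist <- list(
  df1 = data.frame(device_id = c("a", "b", "c"), price = c(200L, 100L, 50L), tax = c(5L, 2L, 1L)),
  df2 = data.frame(device_id = c("b", "a", "c"), price = c(200L, 100L, 50L), tax = c(7L, 3L, 1L)),
  df3 = data.frame(device_id = c("c", "b", "a"), price = c(50L, 300L, 50L), tax = c(5L, 1L, 2L))
)

# bind_rows() stacks the list; across(everything(), sum) sums every
# non-grouping column within each device_id
res <- bind_rows(dflist) %>%
  group_by(device_id) %>%
  summarise(across(everything(), sum))
print(res)
```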
Data:
dflist <- structure(list(df1 = structure(list(device_id = structure(1:3, .Label = c("a",
"b", "c"), class = "factor"), price = c(200L, 100L, 50L), tax = c(5L,
2L, 1L)), .Names = c("device_id", "price", "tax"), class = "data.frame", row.names = c("1",
"2", "3")), df2 = structure(list(device_id = structure(c(2L,
1L, 3L), .Label = c("a", "b", "c"), class = "factor"), price = c(200L,
100L, 50L), tax = c(7L, 3L, 1L)), .Names = c("device_id", "price",
"tax"), class = "data.frame", row.names = c("1", "2", "3")),
df3 = structure(list(device_id = structure(c(3L, 2L, 1L), .Label = c("a",
"b", "c"), class = "factor"), price = c(50L, 300L, 50L),
tax = c(5L, 1L, 2L)), .Names = c("device_id", "price",
"tax"), class = "data.frame", row.names = c("1", "2", "3"
))), .Names = c("df1", "df2", "df3"))
We can use by from base R after rbinding the data frames, once all the data.frame objects have been placed in a list with mget(paste0("df", 1:3)):
dfN <- do.call(rbind, mget(paste0("df", 1:3)))
do.call(rbind, by(dfN[-1], dfN[1], FUN = colSums))
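The by() approach returns a matrix whose row names carry the device_id values, rather than a data frame shaped like the desired df. A minimal sketch of one way to convert it, assuming the row names are the group labels (the names m and res are illustrative):

```r
# Example data frames from the question
df1 <- data.frame(device_id = c("a", "b", "c"), price = c(200L, 100L, 50L), tax = c(5L, 2L, 1L))
df2 <- data.frame(device_id = c("b", "a", "c"), price = c(200L, 100L, 50L), tax = c(7L, 3L, 1L))
df3 <- data.frame(device_id = c("c", "b", "a"), price = c(50L, 300L, 50L), tax = c(5L, 1L, 2L))

# Stack the rows, then compute column sums per device_id
dfN <- do.call(rbind, list(df1, df2, df3))
m <- do.call(rbind, by(dfN[-1], dfN[1], FUN = colSums))

# Move the group labels from the row names into a regular column
res <- data.frame(device_id = rownames(m), m, row.names = NULL)
print(res)
```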