R: multiple joins from a list of data frames with different lengths and different keys
Suppose I have the following list of data frames:
library(tidyverse)
df_list <- list(data.frame(cheese = c("ex", "ok", "bd"),
                           cheese_val = c(3:1),
                           stringsAsFactors = FALSE),
                data.frame(egg = c("great", "good", "bad", "eww"),
                           egg_val = c(4:1),
                           stringsAsFactors = FALSE),
                data.frame(milk = c("good", "bad"),
                           milk_val = c(2:1),
                           stringsAsFactors = FALSE))
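It runs, but the join is only applied for the milk column, so the only additional column in core_dat is milk_val, whereas I expected to see cheese_val and egg_val as well.
I suspect there is a more suitable option than a for loop here, and I'm looking for suggestions. Note that my actual dataset has many more data frames than this small example.
I would expect the resulting data frame (gg in this case) to contain 6 columns in total (the 3 original key columns plus 3 with a "_val" suffix), so that it looks like the printed version of the following: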
data.frame(cheese = c("ex", "ok", "ok", "bd", "ok"),
           egg = c("great", "bad", "bad", "eww", "great"),
           milk = c("good", "good", "good", "bad", "good"),
           cheese_val = c(3, 2, 2, 1, 2),
           egg_val = c(4, 2, 2, 1, 4),
           milk_val = c(2, 2, 2, 1, 2))
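I've seen many "multiple join" answers here, but none of them quite match what I'm trying to achieve (different key columns, different data lengths).
You can use map to get a list of joined data frames, and then use reduce to join them together: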
map(df_list, right_join, rownames_to_column(core_dat)) %>%
reduce(full_join)
# Joining, by = "cheese"
# Joining, by = "egg"
# Joining, by = "milk"
# Joining, by = c("cheese", "rowname", "egg", "milk")
# Joining, by = c("cheese", "rowname", "egg", "milk")
# cheese cheese_val rowname egg milk egg_val milk_val
# 1 ex 3 1 great good 4 2
# 2 ok 2 2 bad good 2 2
# 3 ok 2 3 bad good 2 2
# 4 bd 1 4 eww bad 1 1
# 5 ok 2 5 great good 4 2
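A possible alternative is to skip the intermediate list and fold the lookup tables into core_dat one at a time, carrying the accumulated result forward so that no join is overwritten by the next one. The sketch below is only an illustration under stated assumptions: core_dat is not shown in this excerpt, so it is rebuilt here from the key columns of the expected output, and the name joined is made up for the example.

```r
library(dplyr)
library(purrr)

# Lookup tables from the question: one key column and one value column each.
df_list <- list(
  data.frame(cheese = c("ex", "ok", "bd"), cheese_val = 3:1,
             stringsAsFactors = FALSE),
  data.frame(egg = c("great", "good", "bad", "eww"), egg_val = 4:1,
             stringsAsFactors = FALSE),
  data.frame(milk = c("good", "bad"), milk_val = 2:1,
             stringsAsFactors = FALSE)
)

# ASSUMPTION: core_dat is not shown in the excerpt; this version is rebuilt
# from the cheese/egg/milk columns of the expected output.
core_dat <- data.frame(
  cheese = c("ex", "ok", "ok", "bd", "ok"),
  egg    = c("great", "bad", "bad", "eww", "great"),
  milk   = c("good", "good", "good", "bad", "good"),
  stringsAsFactors = FALSE
)

# Start from core_dat (.init) and left-join each lookup table in turn.
# Each join uses the single column name the two tables share.
joined <- reduce(df_list, left_join, .init = core_dat)

print(joined)
```

Because every step is a left join against the accumulated result, the row count and row order of core_dat are kept, and no rowname helper column is needed.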
This gives the desired output:
Reduce(merge,c(df_list,list(core_dat)))
cheese egg milk cheese_val egg_val milk_val
1 bd eww bad 1 1 1
2 ex great good 3 4 2
3 ok bad good 2 2 2
4 ok bad good 2 2 2
5 ok great good 2 4 2
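One thing to keep in mind with this form: the lookup tables share no column names with each other, and merge returns the Cartesian product when there are no common columns. The intermediate result therefore grows multiplicatively until core_dat is merged in last, which may become expensive with many tables. A hedged sketch of a variant that starts from core_dat instead is shown below; core_dat is an assumption rebuilt from the expected output, and the names cross and merged are made up for the example.

```r
# Lookup tables from the question: one key column and one value column each.
df_list <- list(
  data.frame(cheese = c("ex", "ok", "bd"), cheese_val = 3:1,
             stringsAsFactors = FALSE),
  data.frame(egg = c("great", "good", "bad", "eww"), egg_val = 4:1,
             stringsAsFactors = FALSE),
  data.frame(milk = c("good", "bad"), milk_val = 2:1,
             stringsAsFactors = FALSE)
)

# ASSUMPTION: core_dat is not shown in the excerpt; this version is rebuilt
# from the cheese/egg/milk columns of the expected output.
core_dat <- data.frame(
  cheese = c("ex", "ok", "ok", "bd", "ok"),
  egg    = c("great", "bad", "bad", "eww", "great"),
  milk   = c("good", "good", "good", "bad", "good"),
  stringsAsFactors = FALSE
)

# Merging two lookup tables directly: no shared columns, so every row of the
# first is paired with every row of the second (3 x 4 = 12 rows).
cross <- merge(df_list[[1]], df_list[[2]])
nrow(cross)

# Using core_dat as the initial value: each merge shares exactly one key
# column with the accumulated result, so the row count stays at 5.
# all.x = TRUE keeps core_dat rows even if a key has no match.
merged <- Reduce(function(acc, lookup) merge(acc, lookup, all.x = TRUE),
                 df_list, core_dat)

print(merged)
```

Note that merge sorts by the key column and moves it to the front, so the row and column order of merged differ from core_dat even though the content is the same.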
Comments:
Try map(df_list, left_join, core_dat) or right_join. Not sure what the expected output is. lapply(df_list, merge, core_dat)
@missue map does not give the desired result: I have updated the question a bit.
Can we see your expected output for this example?
I was confused by the number of records, given that the original core_dat has 5. I added row IDs so that the result has the desired number of rows.
This also worked in my small example and in the 43-variable version. Thanks!
This works in my small example, but with my original data I run out of memory.
How big is your data? This is a base R solution. Try the dplyr solution. I can't write it up, because it has already been given.