
R: append two data frames with different columns


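I want to append dfToAdd to df, where dfToAdd is missing some columns. The important detail is that df has two types of columns. The columns of the first type are related to each other: for example, group = A implies name = Group A and color = Blue. A combination such as Group A with Red cannot occur. The columns of the second type are likewise related to each other: animal = Dog has action = Bark. I want to add a second data frame in which the columns of the first type are missing. These columns should be filled with every existing combination of the first-type columns, as in the following dfResult (the order of the rows does not matter):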

df = data.frame(group = c("A", "A", "A", "B", "B", "B"),
                name = c("Group A", "Group A", "Group A", "Group B", "Group B", "Group B"),
                color = c("Blue", "Blue", "Blue", "Red", "Red", "Red"),
                animal = c("Dog", "Cat", "Mouse", "Dog", "Cat", "Mouse"),
                action = c("Bark", "Meow", "Squeak", "Bark", "Meow", "Squeak")
                )

dfToAdd = data.frame(animal = c("Lion", "Bird"), 
                     action = c("Roar", "Chirp"))

dfResult = data.frame(group = c("A", "A", "A", "B", "B", "B", "A", "A", "B", "B"),
                      name = c("Group A", "Group A", "Group A", "Group B", "Group B", "Group B", "Group A", "Group A", "Group B", "Group B"),
                      color = c("Blue", "Blue", "Blue", "Red", "Red", "Red", "Blue", "Blue", "Red", "Red"),
                      animal = c("Dog", "Cat", "Mouse", "Dog", "Cat", "Mouse", "Lion", "Bird", "Lion", "Bird"),
                      action = c("Bark", "Meow", "Squeak", "Bark", "Meow", "Squeak", "Roar", "Chirp", "Roar", "Chirp"))
> df
  group    name color animal action
1     A Group A  Blue    Dog   Bark
2     A Group A  Blue    Cat   Meow
3     A Group A  Blue  Mouse Squeak
4     B Group B   Red    Dog   Bark
5     B Group B   Red    Cat   Meow
6     B Group B   Red  Mouse Squeak
> dfToAdd
  animal action
1   Lion   Roar
2   Bird  Chirp
> dfResult
   group    name color animal action
1      A Group A  Blue    Dog   Bark
2      A Group A  Blue    Cat   Meow
3      A Group A  Blue  Mouse Squeak
4      B Group B   Red    Dog   Bark
5      B Group B   Red    Cat   Meow
6      B Group B   Red  Mouse Squeak
7      A Group A  Blue   Lion   Roar
8      A Group A  Blue   Bird  Chirp
9      B Group B   Red   Lion   Roar
10     B Group B   Red   Bird  Chirp

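However, the columns of the first type (group, name, color) are not known in advance. I am dealing with an arbitrary number of grouping variables. You can imagine that there may or may not also be a column description = "Group A is a good group" or date = 2020.04.13. Only the columns of the second type are known: animal and action.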

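While writing this question, I came up with the idea of using nesting on both sides of tidyr's complete function, detecting the missing columns manually. Maybe there is a more elegant solution: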

library(magrittr)  # provides %>%
library(rlang)     # provides syms()

# First find all grouping columns
groupCols = colnames(df)[!(colnames(df) %in% colnames(dfToAdd))]
otherCols = colnames(df)[colnames(df) %in% colnames(dfToAdd)]
# Populate the missing columns with the first grouping that appears in df
dfToAdd[groupCols] = df[1, groupCols]
# rbind it to append
dfResult = rbind(df, dfToAdd)
# Some combinations are now obviously missing. tidyr::complete accepts nesting information, so new combinations are generated only between the two sets of columns, never within one set.
dfResult %>% tidyr::complete(tidyr::nesting(!!! syms(otherCols)), tidyr::nesting(!!! syms(groupCols)))
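Edit: I just realized that I was using unknown column names at the end. That does not really work. I need to pass the groupCols character vector to the second nesting call.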


Edit 2: Thanks to akrun's answer, I have now been able to fix this as well.
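The fix mentioned in the edits relies on turning a character vector of column names into bare column references. The following is a minimal sketch of that mechanism, assuming the rlang package is installed; the name groupSyms is only illustrative and does not appear in the original post.

```r
library(rlang)

# Column names known only as strings (taken from the example data)
groupCols <- c("group", "name", "color")

# syms() converts each string into a symbol, i.e. a bare column name
groupSyms <- syms(groupCols)

# !!! splices the list into the call, one argument per symbol.
# expr() only builds the call here; nothing is evaluated.
spliced <- expr(nesting(!!!groupSyms))
print(spliced)
```

The printed call is nesting(group, name, color), which is the call that tidyr::complete sees when the character vector is spliced in this way.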

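We can do this in a single %>% pipeline: slice the first row from 'df', select the columns that are not in 'dfToAdd', repeat that row once per row of 'dfToAdd', bind the columns of 'dfToAdd' to it, then bind the rows to 'df' and use complete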


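I have updated my answer regarding the problem of using unknown column names. It needs to be dynamic. I don't really know group, name, color in advance, but I can have them in a character vector.

@Genom. Which nesting do you want to change?

I know animal and action, so nesting(animal, action) is fine. But nesting(group, name, color) can contain more columns, such as nesting(group, name, color, description, date, location, ...). In fact, all the columns missing from dfToAdd are taken from df.

I get an error: Argument 2 must be length 1, not 2, at bind_cols.

I accepted this because it is the simplest answer. I am updating my answer to use syms as well. Thanks!
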
library(dplyr)
library(tidyr)
library(rlang)
df %>%
  slice(1) %>%                           # take one row of grouping values
  select(-all_of(names(dfToAdd))) %>%    # drop the columns dfToAdd already has
  uncount(nrow(dfToAdd)) %>%             # repeat that row once per row of dfToAdd
  bind_cols(dfToAdd) %>%
  bind_rows(df, .) %>%
  complete(nesting(!!! syms(names(dfToAdd))),
           nesting(!!! syms(setdiff(names(.), names(dfToAdd)))))
# A tibble: 10 x 5
#   animal action group name    color
# * <fct>  <fct>  <fct> <fct>   <fct>
# 1 Cat    Meow   A     Group A Blue 
# 2 Cat    Meow   B     Group B Red  
# 3 Dog    Bark   A     Group A Blue 
# 4 Dog    Bark   B     Group B Red  
# 5 Mouse  Squeak A     Group A Blue 
# 6 Mouse  Squeak B     Group B Red  
# 7 Bird   Chirp  A     Group A Blue 
# 8 Bird   Chirp  B     Group B Red  
# 9 Lion   Roar   A     Group A Blue 
#10 Lion   Roar   B     Group B Red
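
As an alternative that is not part of the original thread, the same result can probably be reached in base R by cross-joining the distinct grouping combinations with dfToAdd. This is only a sketch under the assumptions of the question (the grouping columns are exactly the columns of df that are absent from dfToAdd); the names groups and newRows are illustrative.

```r
# Example data from the question
df <- data.frame(
  group  = c("A", "A", "A", "B", "B", "B"),
  name   = c("Group A", "Group A", "Group A", "Group B", "Group B", "Group B"),
  color  = c("Blue", "Blue", "Blue", "Red", "Red", "Red"),
  animal = c("Dog", "Cat", "Mouse", "Dog", "Cat", "Mouse"),
  action = c("Bark", "Meow", "Squeak", "Bark", "Meow", "Squeak")
)
dfToAdd <- data.frame(animal = c("Lion", "Bird"),
                      action = c("Roar", "Chirp"))

# Grouping columns: everything df has that dfToAdd lacks
groupCols <- setdiff(names(df), names(dfToAdd))

# One row per existing combination of the grouping columns
groups <- unique(df[groupCols])

# by = NULL makes merge() return the Cartesian product of both frames
newRows <- merge(groups, dfToAdd, by = NULL)

# Reorder the columns to match df, then append
dfResult <- rbind(df, newRows[names(df)])
rownames(dfResult) <- NULL
```

With the example data this gives ten rows: the six original ones plus Lion and Bird for each of the two groups. Because the grouping combinations are taken from unique(df[groupCols]), no combination such as Group A with Red is ever created.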