R function does not accept a column as an argument

I have built a function to which I want to pass a data frame and a column of that data frame. For example:

testdf <- structure(list(date = c("2016-04-04", "2016-04-04", "2016-04-04", 
"2016-04-04", "2016-04-04", "2016-04-04"), sensorheight = c(1L, 
16L, 1L, 16L, 1L, 16L), farm = c("McDonald", "McDonald", "McDonald", 
"McDonald", "McDonald", "McDonald"), location = c("4", "4", "5", 
"5", "Outside", "Outside"), Temp = c(122.8875, 117.225, 102.0375, 
98.3625, 88.5125, 94.7)), .Names = c("date", "sensorheight", 
"farm", "location", "Temp"), row.names = c(NA, 6L), class = "data.frame")

> testdf
        date sensorheight     farm location     Temp
1 2016-04-04            1 McDonald        4 122.8875
2 2016-04-04           16 McDonald        4 117.2250
3 2016-04-04            1 McDonald        5 102.0375
4 2016-04-04           16 McDonald        5  98.3625
5 2016-04-04            1 McDonald  Outside  88.5125
6 2016-04-04           16 McDonald  Outside  94.7000

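I would like to know what these error messages mean and how to deal with them.

I tried these solutions, but none of them worked for me.

If I remove the column as an input, no error occurs, but I need the column because I apply the function to several columns of a large data frame.

Desired output: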

        date sensorheight     farm location     Temp
1 2016-04-04            1 McDonald        4  34.3750
2 2016-04-04           16 McDonald        4  22.5250
3 2016-04-04            1 McDonald        5  13.5250
4 2016-04-04           16 McDonald        5   3.6625

The calls below invoke the function DailyInOutDiff, with testdf assigned to df and Temp assigned to variable:

   test <- DailyInOutDiff(testdf, "Temp")
   test <- DailyInOutDiff(testdf, quote(Temp))

If your call is instead

    test <- DailyInOutDiff(testdf, testdf["Temp"])

then variable is a one-column data frame, not a column name.

I could not reproduce the second error, but I can reproduce the first. It seems that `summarise` runs into trouble with `Temp` because it sees it as a `character` object. In other words, you are passing the column's name, not the column itself. If you run the code inside the function line by line instead of using `df$variable`, you will see that it works.

That said, the fix is fairly simple. I just added the line `variable <- as.name(variable)` at the top of the function body. The complete modified function, together with the output of `DailyInOutDiff(testdf, "Temp")` (the four expected rows), is listed further down this page.
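
To make the name-versus-column distinction concrete, here is a minimal base R sketch. The two-row data frame is an illustrative subset of `testdf`, and `via_dollar`, `via_brackets` and `diff_4_outside` are names made up for this example, not part of the original function.

```r
# Two rows taken from testdf (sensorheight 1, locations "4" and "Outside").
testdf_small <- data.frame(
  location = c("4", "Outside"),
  Temp     = c(122.8875, 88.5125),
  stringsAsFactors = FALSE
)

variable <- "Temp"  # the column NAME, passed around as a string

# `$` does not evaluate `variable`; it looks for a column literally
# called "variable", finds none, and returns NULL.
via_dollar <- testdf_small$variable

# `[[` evaluates `variable` first, so it fetches the "Temp" column.
via_brackets <- testdf_small[[variable]]

# The inside-minus-outside difference for this pair of rows.
diff_4_outside <- via_brackets[testdf_small$location == "4"] -
  via_brackets[testdf_small$location == "Outside"]

print(is.null(via_dollar))
print(diff_4_outside)
```

The same idea carries over to dplyr verbs: a string has to be turned into a column reference explicitly before it can be used where a column is expected.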

If you are using dplyr 0.7 or later, you can use the `.data` pronoun to refer to a column by its name as a string. Your function would then become:

DailyInOutDiff <- function (df, variable) {

  DailyInOutDiff04 <- df %>%
    filter(location %in% c(4, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else .data[[variable]][location=="4"] - .data[[variable]][location=='Outside'], 
              location = "4")  %>%
    select(1, 2, 3, 5, 4)

  DailyInOutDiff05 <- df %>%
    filter(location %in% c(5, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else .data[[variable]][location=="5"] - .data[[variable]][location=='Outside'], 
              location = "5")  %>%
    select(1, 2, 3, 5, 4)

  temp.list <- list(DailyInOutDiff04, DailyInOutDiff05)
  final.df = bind_rows(temp.list)
  return(final.df)
}
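
Since the question mentions applying the function to several columns, the following is a rough sketch of how a string-based `.data[[variable]]` interface can be looped over column names. It assumes dplyr 0.7 or later is installed; `mean_by_location`, `toy` and the `Humidity` column are hypothetical and only serve the illustration.

```r
library(dplyr)

# Hypothetical helper: per-location mean of one column chosen by name.
mean_by_location <- function(df, variable) {
  df %>%
    group_by(location) %>%
    summarise(mean_value = mean(.data[[variable]]))
}

# Made-up toy data with two measurement columns.
toy <- data.frame(
  location = c("4", "4", "Outside", "Outside"),
  Temp     = c(10, 20, 1, 3),
  Humidity = c(50, 70, 30, 40),
  stringsAsFactors = FALSE
)

# Because the column is passed as a string, looping over names is direct.
cols <- c("Temp", "Humidity")
results <- lapply(cols, function(col) mean_by_location(toy, col))
names(results) <- cols

print(results$Temp)
```

With this pattern, `DailyInOutDiff` could presumably be called the same way, once per column name, and the results combined afterwards.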

Suggested duplicate. You could also take a look at the package vignette.
Can you provide the output you would like to get?
@beigel, please see the edit.

Modified function with the added `as.name()` line, followed by its output:

DailyInOutDiff <- function (df, variable) {

  variable<- as.name(variable)
  DailyInOutDiff04 <- df %>%
    filter(location %in% c(4, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else variable[location=="4"] - variable[location=='Outside'], 
              location = "4")  %>%
    select(1, 2, 3, 5, 4)

  DailyInOutDiff05 <- df %>%
    filter(location %in% c(5, 'Outside')) %>% 
    group_by(date, sensorheight, farm) %>%
    arrange(sensorheight, farm, location) %>%
    summarise(Diff = if(n()==1) NA else variable[location=="5"] - variable[location=='Outside'], 
              location = "5")  %>%
    select(1, 2, 3, 5, 4)

  temp.list <- list(DailyInOutDiff04, DailyInOutDiff05)
  final.df = bind_rows(temp.list)
  return(final.df)
}
> test <- DailyInOutDiff(testdf, "Temp")
> test
Source: local data frame [4 x 5]
Groups: date, sensorheight [2]

        date sensorheight     farm location    Diff
       <chr>        <int>    <chr>    <chr>   <dbl>
1 2016-04-04            1 McDonald        4 34.3750
2 2016-04-04           16 McDonald        4 22.5250
3 2016-04-04            1 McDonald        5 13.5250
4 2016-04-04           16 McDonald        5  3.6625
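
For readers wondering what `as.name()` changes, here is a small base R sketch. It is not taken from the answer, and `df`, `sym`, `col` and `msg` are illustrative names. It shows that the string becomes a symbol, and that evaluating the symbol against the data frame yields the column. Whether a bare symbol can then be subset directly inside `summarise()` appears to depend on the dplyr version: the output above comes from an older release, and under the tidy evaluation framework introduced in dplyr 0.7 the symbol would typically need to be unquoted with `!!`.

```r
df <- data.frame(Temp = c(122.8875, 88.5125))

variable <- "Temp"         # a character string
sym <- as.name(variable)   # the symbol `Temp`

print(is.character(variable))
print(is.name(sym))

# Evaluating the symbol with the data frame as its environment
# looks up the column of that name.
col <- eval(sym, df)

# Subsetting the symbol itself (rather than the column it names)
# is an error in base R; capture the message instead of stopping.
msg <- tryCatch(sym[1], error = function(e) conditionMessage(e))
print(msg)
```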

The `.data` version of the function shown earlier returns the same result:

DailyInOutDiff(testdf, "Temp")
#> # A tibble: 4 x 5
#> # Groups:   date, sensorheight [2]
#>         date sensorheight     farm location    Diff
#>        <chr>        <int>    <chr>    <chr>   <dbl>
#> 1 2016-04-04            1 McDonald        4 34.3750
#> 2 2016-04-04           16 McDonald        4 22.5250
#> 3 2016-04-04            1 McDonald        5 13.5250
#> 4 2016-04-04           16 McDonald        5  3.6625