Using column names as function arguments in R
Tags: r, function, dplyr
I am trying to create an R function that imputes the mean into a particular column of a data frame.
impute_means <- function(df, group_by, column){
  vals_to_impute <- df %>%
    group_by_at(group_by) %>%
    summarise(x = mean(get(column), na.rm = TRUE))
  df %>%
    filter(is.na(get(column))) %>%
    select(group_by, column) %>%
    left_join(vals_to_impute, by=group_by)
}
impute_means(df = weather_data, group_by = c("year","month","code","type"), column = "temperature")
You can try this function:
impute_means <- function(df, group_by, column){
  df %>%
    group_by_at(group_by) %>%
    # replace the column with its group mean, ignoring missing values
    mutate(across(all_of(column), ~ mean(.x, na.rm = TRUE)))
}
Or, if you need a new column:
impute_means <- function(df, group_by, column){
  df %>%
    group_by_at(group_by) %>%
    # store the group mean under the new name x, ignoring missing values
    mutate(x = across(all_of(column), ~ mean(.x, na.rm = TRUE)))
}
You can do it like this:
library(dplyr)
impute_means <- function(df, group_by, column){
  df %>%
    mutate(val = .data[[column]]) %>%
    group_by(across(all_of(group_by))) %>%
    mutate(!!column := mean(.data[[column]], na.rm = TRUE)) %>%
    filter(is.na(val)) %>%
    select(-val) %>%
    ungroup()
}
impute_means(df = weather_data,
             group_by = c("year","month","code","type"),
             column = "temperature")
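To try the function above without the real data, here is a minimal sketch on a made-up table. The toy weather_data and its values are assumptions for illustration only, not data from the question; the function body is repeated so the block runs on its own.

```r
library(dplyr)

# Hypothetical toy data standing in for weather_data (not from the original post)
weather_data <- tibble(
  year        = c(2020, 2020, 2020, 2021, 2021),
  month       = c(1, 1, 1, 2, 2),
  code        = c("A", "A", "A", "B", "B"),
  type        = c("x", "x", "x", "y", "y"),
  temperature = c(10, NA, 14, NA, 7)
)

impute_means <- function(df, group_by, column) {
  df %>%
    mutate(val = .data[[column]]) %>%                          # remember the original values
    group_by(across(all_of(group_by))) %>%                     # group by the names passed as strings
    mutate(!!column := mean(.data[[column]], na.rm = TRUE)) %>% # overwrite with the group mean
    filter(is.na(val)) %>%                                     # keep only rows that were missing
    select(-val) %>%
    ungroup()
}

result <- impute_means(df = weather_data,
                       group_by = c("year", "month", "code", "type"),
                       column = "temperature")
print(result)
```

With this toy input, only the two rows whose temperature was missing come back, each carrying the mean of the non-missing values in its group.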
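I use mutate to keep the number of rows in the data, instead of summarising the data and performing a join.
If you find it easier to understand, you can replace .data[[column]] with get(column). Both approaches should work the same.

Comments:
I'm not quite sure what you are trying to do, but I think it would be easier to go step by step rather than trying to do everything in one line. Just start with one line, such as stuff_to_calculate_mean
@RonakShah added.
Keep in mind that mean imputation is a suboptimal approach, and there are better alternatives.
I think what I was missing was the := in the mutate function.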
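The points discussed here (.data[[column]] versus get(column), the := operator, and mutate versus summarise) can be sketched on a small example. The data frame df and the grouping column g below are hypothetical names chosen for illustration, not part of the original question.

```r
library(dplyr)

# Hypothetical toy data (not from the original post)
df <- tibble(g = c("a", "a", "b"), temperature = c(1, NA, 5))
column <- "temperature"

# .data[[column]] looks the column up by its string name;
# !!column := ... uses the same string as the name of the output column
with_pronoun <- df %>%
  group_by(g) %>%
  mutate(!!column := mean(.data[[column]], na.rm = TRUE)) %>%
  ungroup()

# get(column) reaches the same column through the data mask
with_get <- df %>%
  group_by(g) %>%
  mutate(!!column := mean(get(column), na.rm = TRUE)) %>%
  ungroup()

# Compare the two results
print(all.equal(with_pronoun$temperature, with_get$temperature))

# summarise() collapses to one row per group, while mutate() keeps every row
summarised <- df %>%
  group_by(g) %>%
  summarise(x = mean(.data[[column]], na.rm = TRUE))

print(nrow(with_pronoun))
print(nrow(summarised))
```

Without := there is no direct way to use the string stored in column as the left-hand side of an assignment inside mutate, which is why a plain = with a fixed name such as x creates a new column instead of overwriting the target one.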