Function to subset a data frame on optional arguments
I have a data frame as shown below:
df1 <- data.frame(
Country = c("France", "England", "India", "America", "England"),
City = c("Paris", "London", "Mumbai", "Los Angeles", "London"),
Order_No = c("1", "2", "3", "4", "5"),
delivered = c("Yes", "no", "Yes", "No", "yes"),
stringsAsFactors = FALSE
)
df1
We can use the missing() function here to check whether the argument was supplied:
select_cols <- function(df, cols) {
  if (missing(cols))
    df
  else
    df[cols]
}
select_cols(df1, c("Country", "City"))
#  Country        City
#1  France       Paris
#2 England      London
#3   India      Mumbai
#4 America Los Angeles
#5 England      London
select_cols(df1)
#  Country        City Order_No delivered
#1  France       Paris        1       Yes
#2 England      London        2        no
#3   India      Mumbai        3       Yes
#4 America Los Angeles        4        No
#5 England      London        5       yes
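As an alternative to missing(), the optional argument can be given a NULL default. The following is only a sketch under assumptions: the name select_cols_default and the wording of the error message are hypothetical and not part of the original answer, and cols is assumed to be a character vector of column names.

```r
# Same sample data as in the question.
df1 <- data.frame(
  Country = c("France", "England", "India", "America", "England"),
  City = c("Paris", "London", "Mumbai", "Los Angeles", "London"),
  Order_No = c("1", "2", "3", "4", "5"),
  delivered = c("Yes", "no", "Yes", "No", "yes"),
  stringsAsFactors = FALSE
)

# Hypothetical variant: cols = NULL means "keep every column".
select_cols_default <- function(df, cols = NULL) {
  if (is.null(cols)) {
    return(df)
  }
  # Assumes cols holds column names; fail early if one of them does not exist.
  unknown <- setdiff(cols, names(df))
  if (length(unknown) > 0) {
    stop("Unknown column(s): ", paste(unknown, collapse = ", "))
  }
  df[cols]
}

all_columns <- select_cols_default(df1)
two_columns <- select_cols_default(df1, c("Country", "City"))
```

With a NULL default, a caller can pass cols = NULL explicitly, which may be convenient when the function is wrapped by other functions.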
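Using vector(...) is what causes the problem here. The ellipsis has to be converted to a list first. So, to end up with a vector built from the three-dots arguments, the seemingly clumsy construct unlist(list(...)) should be used instead of vector(...); see SubsetFunction further down.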
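I assume you don't want to use dplyr::select()? It seems to have all the functionality you are trying to implement.
Just use df[df$Country %in% selection, ], where selection holds the countries you want, e.g. c("France", "England"). For several criteria use, e.g., df[df$Country %in% selection & df$delivered == "no", ]
Please check your example data frame; there is an issue in the "delivered" column.
Why is simply using base::subset not sufficient? Why set up a new function?
@ManuelBickel I am creating a package with custom error log messages, so I am trying to move some parts of the code into functions that stay consistent with the rest of the code structure. Yes, I realize I am just trying to replicate base subset here, but the reason is what I stated above. Maybe I don't need to try this for this case. Thanks.
Thanks for the comment, @Ronak!
Wow @RHertel, thank you very much for the explanation and the code snippet. It solved my problem perfectly.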
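The row-filtering suggestions from the comments above can be sketched as follows. The variable names by_country, not_delivered and via_subset are only illustrative, and the use of tolower() is an assumption about how one might deal with the mixed-case values in the delivered column.

```r
# Same sample data as in the question.
df1 <- data.frame(
  Country = c("France", "England", "India", "America", "England"),
  City = c("Paris", "London", "Mumbai", "Los Angeles", "London"),
  Order_No = c("1", "2", "3", "4", "5"),
  delivered = c("Yes", "no", "Yes", "No", "yes"),
  stringsAsFactors = FALSE
)

selection <- c("France", "England")

# Rows whose Country is one of the selected countries.
by_country <- df1[df1$Country %in% selection, ]

# Two criteria combined; tolower() makes the comparison case-insensitive,
# since the column mixes "Yes", "yes", "No" and "no".
not_delivered <- df1[df1$Country %in% selection &
                       tolower(df1$delivered) == "no", ]

# The same row filter with base::subset, keeping only two columns.
via_subset <- subset(df1, Country %in% selection, select = c(Country, City))
```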
SubsetFunction <- function(inputdf, ...) {
  # Collect the dot arguments into a list, then flatten them into one vector
  params <- unlist(list(...))
  # Keep only the requested columns
  subset.df <- subset(inputdf, select = params)
  return(subset.df)
}
SubsetFunction(df1, "City")
#         City
#1       Paris
#2      London
#3      Mumbai
#4 Los Angeles
#5      London
SubsetFunction(df1, "City", "delivered")
#         City delivered
#1       Paris       Yes
#2      London        no
#3      Mumbai       Yes
#4 Los Angeles        No
#5      London       yes
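To see what unlist(list(...)) actually produces, the small sketch below isolates that step. The helper collect_names is hypothetical and exists only for illustration. The sketch also checks that calling vector() with a column name raises an error, without relying on the exact message.

```r
# Hypothetical helper: gather all dot arguments into one flat vector.
collect_names <- function(...) {
  unlist(list(...))
}

# Separate arguments, a single vector, or a mix all yield a character vector.
separate_args <- collect_names("City", "delivered")
single_vector <- collect_names(c("City", "delivered"))
mixed_args <- collect_names("City", c("Country", "delivered"))

# vector() treats its first argument as a mode, so a column name is rejected.
vector_result <- tryCatch(vector("City"), error = function(e) "error")
```

Because the result is a plain character vector, it can be passed directly to subset(..., select = ) or used as df[params].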