R: accessing columns by position in a conditional filter


I have the following data frame:

df = structure(list(age = c("F", "F", "M", "M", "M", "F", "M", "M", 
"F", "F", "M", "M", "F", "F", "F", "F", "F", "M", "M", "F", "F"
), gender = c(52.8547945205479, 70.617475870193, 47.6986301369863, 
85.4876712328767, 56.0288204261033, 27.0219178082192, 40.8583963494959, 
24.6553462722298, 80.4027397260274, 55.6684931506849, 70.6904109589041, 
64.5095890410959, 45.5397260273973, 78.5909038861022, 42.4219178082192, 
44.0712328767123, 77.7068493150685, 70.5199279905761, 43.7178082191781, 
77.7205479452055, 74.972602739726)), row.names = c(NA, -21L), class = c("tbl_df", 
"tbl", "data.frame"))
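For each gender, I want to keep the rows whose age is greater than that gender's mean age. However, I want to do this by column number rather than by column name.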

So I tried:

df %>% group_by_at(1) %>% filter_at(vars(2) > mean(vars(2))
But this doesn't work.

Any suggestions?

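When the data is grouped by a column, that column is excluded from what the filter sees, so position 1 inside the filter refers to the first non-grouping column. Try this: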

df %>% group_by(across(1)) %>%
  filter(across(1, ~ mean(.) <= .))

# A tibble: 11 x 2
# Groups:   age [2]
   age   gender
   <chr>  <dbl>
 1 F       70.6
 2 M       85.5
 3 M       56.0
 4 F       80.4
 5 M       70.7
 6 M       64.5
 7 F       78.6
 8 F       77.7
 9 M       70.5
10 F       77.7
11 F       75.0
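
In recent dplyr releases, calling across() inside filter() is deprecated in favour of if_all() / if_any(). The following is a minimal sketch of the same idea with if_all(), assuming a dplyr version that provides it; the toy tibble and its column names (sex, age) are hypothetical and are not the question's data.

```r
library(dplyr)

# Hypothetical toy data: grouping column first, numeric column second
toy <- tibble(
  sex = c("F", "F", "F", "M", "M", "M"),
  age = c(50, 70, 30, 40, 80, 60)
)

res <- toy %>%
  group_by(across(1)) %>%               # group by the first column (sex)
  # Inside the verb the grouping column is hidden, so position 1 is now `age`
  filter(if_all(1, ~ .x > mean(.x))) %>%
  ungroup()

print(res)
```

The comparison here is strict (>), as in the question; swap in >= to also keep values equal to the group mean.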

The _at / _all variants have been superseded by across() in dplyr. Here is one approach:

library(dplyr)

df %>% 
   group_by(across(1)) %>% 
   filter(cur_data()[[1]] > mean(cur_data()[[1]])) %>% 
   ungroup()

#   age   gender
#   <chr>  <dbl>
# 1 F       70.6
# 2 M       85.5
# 3 M       56.0
# 4 F       80.4
# 5 M       70.7
# 6 M       64.5
# 7 F       78.6
# 8 F       77.7
# 9 M       70.5
#10 F       77.7
#11 F       75.0
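
Note that cur_data() was deprecated in dplyr 1.1.0 in favour of pick(). The sketch below shows the equivalent call, assuming dplyr >= 1.1.0 is installed; the toy tibble and its column names are hypothetical stand-ins for the question's data.

```r
library(dplyr)  # pick() is assumed to be available (dplyr >= 1.1.0)

# Hypothetical toy data: grouping column first, numeric column second
toy <- tibble(
  sex = c("F", "F", "F", "M", "M", "M"),
  age = c(50, 70, 30, 40, 80, 60)
)

res <- toy %>%
  group_by(across(1)) %>%
  # pick(everything()) returns the current group's non-grouping columns,
  # so [[1]] is the first column after the grouping column is dropped
  filter(pick(everything())[[1]] > mean(pick(everything())[[1]])) %>%
  ungroup()

print(res)
```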

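In filter, [[1]] is used because the grouping column is not included in cur_data().

But what happens if I have multiple columns and only want to filter by the age column?

Then this is not enough… see the edited answer. In effect, the column index is reduced by the number of columns used in the group_by statement.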
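
Because positions shift once columns are grouped, one way to avoid the index arithmetic is to resolve the positions to names on the ungrouped data first and then refer to the column through the .data pronoun. This is only a sketch under stated assumptions: the toy tibble, its column names, and the variables group_pos / filter_pos are hypothetical and are not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data with an extra column after the one being filtered
toy <- tibble(
  sex   = c("F", "F", "F", "M", "M", "M"),
  age   = c(50, 70, 30, 40, 80, 60),
  extra = 1:6
)

group_pos  <- 1  # position of the grouping column in the original data
filter_pos <- 2  # position of the column to filter on in the original data

# Translate positions to names before grouping, so nothing shifts later
group_col  <- names(toy)[group_pos]
filter_col <- names(toy)[filter_pos]

res <- toy %>%
  group_by(across(all_of(group_col))) %>%
  filter(.data[[filter_col]] > mean(.data[[filter_col]])) %>%
  ungroup()

print(res)
```

Resolving names up front keeps the positions tied to the original column order, however many columns end up in group_by.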