R: aggregate function returns NA for an entire column
Forgive me if the answer is obvious; I am new to R. I am trying to aggregate this dataset, but one of the columns keeps returning NA.
> dput(head(DrivingDistance,50))
structure(list(player_name = c("Brian Stuard", "Billy Hurley III",
"Greg Chalmers", "William McGirt", "Russell Knox", "Cody Gribble",
"Tony Finau", "Dustin Johnson", "Justin Thomas", "Vaughn Taylor",
"Jason Day", "Brendan Steele", "Si Woo Kim", "Brandt Snedeker",
"Jason Dufner", "Ryan Moore", "Rod Pampling", "Fabián Gómez",
"Jimmy Walker", "Jim Herman", "Pat Perez", "Daniel Berger", "Patrick Reed",
"James Hahn", "Mackenzie Hughes", "Branden Grace", "Jordan Spieth",
"Hideki Matsuyama", "Charley Hoffman", "Jhonattan Vegas", "Aaron Baddeley",
"Bubba Watson", "J.T. Poston", "Shawn Stefani", "Stewart Cink",
"William McGirt", "Fabián Gómez", "David Lingmerth", "Henrik Norlander",
"Tim Wilkinson", "Gonzalo Fernandez-Castaño", "Daniel Summerhays",
"Webb Simpson", "Peter Malnati", "Jason Bohn", "Vaughn Taylor",
"Daniel Berger", "Zac Blair", "Ryan Brehm", "Chez Reavie"), date = structure(c(17174,
17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174,
17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174,
17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174, 17174,
17174, 17174, 17174, 17174, 17181, 17181, 17181, 17181, 17181,
17181, 17181, 17181, 17181, 17181, 17181, 17181, 17181, 17181,
17181, 17181, 17181, 17181), class = "Date"), DrDis = c("263.1",
"265.4", "266.5", "267.9", "269.3", "270.8", "304.8", "319.6",
"301.6", "269.6", "300.4", "288.5", "271.6", "271.9", "272.0",
"272.6", "275.1", "275.4", "275.6", "276.6", "278.4", "278.5",
"279.3", "279.8", "280.4", "283.3", "283.4", "283.6", "286.0",
"286.3", "287.9", "300.3", "304.3", "304.1", "304.0", "303.9",
"303.5", "303.3", "304.5", "303.0", "301.6", "301.6", "299.6",
"298.9", "297.6", "296.3", "302.6", "295.1", "305.3", "305.5"
)), row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"
))
This is what is returned after attempting to summarize.
player_name date DrDis
<chr> <date> <dbl>
1 A.J. McInerney 2018-02-21 NA
2 Aaron Baddeley 2018-08-01 NA
3 Aaron Rai 2019-06-06 NA
4 Aaron Wise 2018-10-28 NA
5 Abraham Ancer 2019-02-13 NA
6 Adam Bland 2018-03-04 NA
7 Adam Hadwin 2018-08-11 NA
8 Adam Long 2019-09-22 NA
9 Adam Schenk 2019-03-03 NA
10 Adam Scott 2018-08-12 NA
# ... with 551 more rows
There were 50 or more warnings (use warnings() to see the first 50)
Below is the code I used to create DrivingDistance and then summarize the data.
DrivingDistance <-CurrentData[CurrentData$statistic == 'Driving Distance' & CurrentData$variable == 'AVG.',] %>%
select(player_name, date, value) %>%
dplyr::rename(DrDis = value)
DrivingDistance %>%
group_by(player_name) %>%
summarize_all(mean, na.rm = TRUE)
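Try the following solution. DrDis is stored as character (note the quoted values in the dput output), so mean() returns NA with a warning for every group. Convert the column to numeric before summarizing: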
DrivingDistance %>% mutate(DrDis=as.numeric(DrDis)) %>%
group_by(player_name) %>%
summarize_all(mean, na.rm = TRUE)
# A tibble: 46 x 3
player_name date DrDis
<chr> <date> <dbl>
1 Aaron Baddeley 2017-01-08 288.
2 Billy Hurley III 2017-01-08 265.
3 Branden Grace 2017-01-08 283.
4 Brandt Snedeker 2017-01-08 272.
5 Brendan Steele 2017-01-08 288.
6 Brian Stuard 2017-01-08 263.
7 Bubba Watson 2017-01-08 300.
8 Charley Hoffman 2017-01-08 286
9 Chez Reavie 2017-01-15 306.
10 Cody Gribble 2017-01-08 271.
# ... with 36 more rows
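To see why the conversion matters, here is a minimal sketch in base R. The toy data frame `dd` is an assumption made for illustration; its three rows reuse values from the dput sample above. It shows that mean() on a character vector returns NA (with a warning) even when na.rm = TRUE, and that the same aggregation works once the column is numeric.

```r
# Toy data mirroring the structure in the question; DrDis is character.
dd <- data.frame(
  player_name = c("William McGirt", "William McGirt", "Vaughn Taylor"),
  DrDis = c("267.9", "303.9", "269.6"),
  stringsAsFactors = FALSE
)

# Check the column type first: "character" explains the NA results.
class(dd$DrDis)

# mean() on a character vector warns and returns NA; na.rm = TRUE does not help.
bad <- suppressWarnings(mean(dd$DrDis, na.rm = TRUE))

# Convert to numeric, then aggregate per player
# (a base R counterpart to the dplyr pipeline in the answer).
dd$DrDis <- as.numeric(dd$DrDis)
good <- aggregate(DrDis ~ player_name, data = dd, FUN = mean)
good
```

Running str(DrivingDistance) on the real data is a quick way to confirm the column type before aggregating.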
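It is easier to help you if you include a simple example with sample input and the desired output that can be used to test and verify possible solutions. Are there NA values in your data? It sounds like it.
This is quite possibly a duplicate.
Since you appear to be using dplyr, you can use CurrentData %>% filter(statistic == 'Driving Distance', variable == 'AVG.') instead of [,]. You can use dput(head(CurrentData)) to help produce a workable subset of the data. Is there a column named value in your data frame? The sample output does not match the sample command (it has other fields), so it would be useful to see CurrentData rather than what looks like DrivingDistance. I would also avoid using date as a variable name, because it has other meanings.
@beroe Sorry, I should have been more specific. I just updated the original post with the head of CurrentData. value is in CurrentData, but I renamed it to DrDis in DrivingDistance.
@MrFlick I just went back through every value in the dataset and there are no NAs. I also edited the original post so you can see the head of CurrentData.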
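The filter() suggestion from the comments can be sketched as follows. The contents of `CurrentData` here are hypothetical: only the column names (player_name, date, statistic, variable, value) come from the question, and the "OTHER" row is a made-up placeholder to show the filter dropping something. The sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for CurrentData; only the column names are from the question.
CurrentData <- data.frame(
  player_name = c("Brian Stuard", "Brian Stuard", "Tony Finau"),
  date = as.Date(c("2017-01-08", "2017-01-08", "2017-01-08")),
  statistic = c("Driving Distance", "Driving Distance", "Driving Distance"),
  variable = c("AVG.", "OTHER", "AVG."),  # "OTHER" is a made-up placeholder
  value = c("263.1", "999", "304.8"),
  stringsAsFactors = FALSE
)

# filter() replaces the [ , ] subsetting; select() renames value to DrDis;
# mutate() converts the character column to numeric before any aggregation.
DrivingDistance <- CurrentData %>%
  filter(statistic == "Driving Distance", variable == "AVG.") %>%
  select(player_name, date, DrDis = value) %>%
  mutate(DrDis = as.numeric(DrDis))

DrivingDistance
```

Doing the numeric conversion at the point where DrivingDistance is created means later summaries do not need to repeat it.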