How to group rows with duplicate names in R?
I am very new to R and am struggling with subsetting a dataset. This is where the dataset comes from and how I cleaned it:
board_game_original <- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project1_bgdataviz/board_game_raw.csv")
# tidy up the mechanic and category columns with the cSplit function
library(splitstackshape)
# assumption: board_game in the original post refers to the data frame read above
mechanic <- board_game_original$mechanic
board_game_tidy <- cSplit(board_game_original, splitCols = c("mechanic", "category"), sep = ",", direction = "long")
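To see why category names repeat after this cleaning step, here is a minimal sketch on made-up data. The data frame `toy` and its values are hypothetical and only illustrate what `cSplit()` with `direction = "long"` does to a comma-separated column.

```r
library(splitstackshape)

# Made-up data: each game lists its categories in one comma-separated string
toy <- data.frame(
  name = c("Game A", "Game B"),
  category = c("Adventure,Dice", "Dice"),
  stringsAsFactors = FALSE
)

# direction = "long" gives every category its own row,
# so "Game A" now appears once per category it belongs to
toy_long <- cSplit(toy, splitCols = "category", sep = ",", direction = "long")
toy_long
```

Because each game is repeated once per category, any per-category statistic has to be computed by grouping on `category` rather than by subsetting individual rows.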
You're very close. You need dplyr::summarise():
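complexity_top_5_category <- board_game_tidy %>%
  group_by(category) %>%
  dplyr::summarise(mean_average_complexity = mean(average_complexity, na.rm = TRUE)) %>%
  top_n(5, mean_average_complexity)
  # select(average_complexity) %>% # you don't need this
  # filter(category == c("Abstract Strategy Action / Dexterity", "Adventure", "Age of Reason", "American Civil War "))
complexity_top_5_category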
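You don't have to write dplyr:: before summarise(). However, some other commonly used packages (plyr, for example) have their own version of summarise(), so it is safer to name the package explicitly.
You can use top_n() to select the top n categories automatically, instead of using filter().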
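As a rough illustration of the two points above, the sketch below runs the same pipeline on a tiny made-up data frame. The name `games` and its values are hypothetical; only `group_by()`, `summarise()` and `top_n()` come from the answer.

```r
library(dplyr)

# Hypothetical toy data: one row per (game, category) pair
games <- data.frame(
  category = c("A", "A", "B", "B", "C", "C"),
  average_complexity = c(1, 2, 2, NA, 4, 5)
)

top_2 <- games %>%
  group_by(category) %>%
  # dplyr:: makes sure another package's summarise() is not picked up by mistake
  dplyr::summarise(mean_average_complexity = mean(average_complexity, na.rm = TRUE)) %>%
  # keep the 2 categories with the highest mean; no manual filter() needed
  top_n(2, mean_average_complexity)

top_2
```

In dplyr 1.0.0 and later, `top_n()` is superseded by `slice_max()`, so `slice_max(mean_average_complexity, n = 2)` would be the more current spelling of the last step.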
filter the rows that belong to the top 5 categories, then group by category and take the mean of average_complexity:
library(dplyr)
board_game_tidy %>%
  filter(category %in% names(top_5_category)) %>%
  group_by(category) %>%
  summarise(average_complexity = mean(average_complexity))
#   category           average_complexity
#   <fct>                           <dbl>
# 1 Abstract Strategy               0.844
# 2 Action / Dexterity              0.469
# 3 Adventure                       1.25
# 4 Age of Reason                   1.95
# 5 American Civil War              1.68
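Maybe filter(category %in% c(…))?
Hi, thank you for your answer! I tried your code and it shows: Error: Problem with `filter()` input `..1`. x object 'top_5_category' not found ℹ Input `..1` is `category %in% names(top_5_category)`.
@harperzhu top_5_category appears in your post. Did you run it?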
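The "object not found" error means `top_5_category` does not exist in the current session, so it has to be created before the filter step runs. Its definition is not shown in this thread; the sketch below assumes it is a named table of category counts, which is what `names()` in the answer suggests. The toy data, `board_game_toy` and `top_2_category` are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for board_game_tidy
board_game_toy <- data.frame(
  category = c("Adventure", "Adventure", "Adventure", "Dice", "Dice", "Trivia"),
  average_complexity = c(1.0, 1.5, 2.0, 1.0, 3.0, 2.5)
)

# Assumed definition: count rows per category, sort, keep the most frequent ones
# (2 here instead of 5 because the toy data is small)
top_2_category <- head(sort(table(board_game_toy$category), decreasing = TRUE), 2)

# The object now exists, so names(top_2_category) can be used in filter()
result <- board_game_toy %>%
  filter(category %in% names(top_2_category)) %>%
  group_by(category) %>%
  summarise(average_complexity = mean(average_complexity))

result
```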