How do I group rows with duplicate names in R?

r dataframe

I'm very new to R and am struggling to subset a dataset. This is where the dataset comes from and how I cleaned it:

board_game_original<- read.csv("https://raw.githubusercontent.com/bryandmartin/STAT302/master/docs/Projects/project1_bgdataviz/board_game_raw.csv")

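# Tidy up the mechanic and category columns with the cSplit function
library(splitstackshape)
# Note: board_game is not defined in the snippet as posted; it is presumably derived from board_game_original
mechanic <- board_game$mechanic
board_game_tidy <- cSplit(board_game, splitCols = c("mechanic", "category"), sep = ",", direction = "long")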

You were very close. You need dplyr::summarise():

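complexity_top_5_category <- board_game_tidy %>%
  group_by(category) %>%
  dplyr::summarise(mean_average_complexity = mean(average_complexity, na.rm = TRUE)) %>%
  top_n(5, mean_average_complexity)
  # select(average_complexity) %>% # you don't need this
  # filter(category == c("Abstract Strategy Action / Dexterity", "Adventure", "Age of Reason", "American Civil War "))
complexity_top_5_category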

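You don't have to put dplyr:: in front of summarise(). However, some other commonly used packages have their own version of summarise(), so it is safer to name the specific package.
You can use top_n() to pick the top n categories automatically instead of listing them by hand in filter().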
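As a rough illustration of the two points above, here is a minimal sketch on a made-up data frame (the `toy` data and its values are assumptions, not the board game dataset). It computes a per-category mean with the namespace-qualified `dplyr::summarise()` and then keeps the highest two means with `top_n()`. In dplyr 1.0.0 and later, `slice_max()` is the newer equivalent and also sorts the result.

```r
library(dplyr)

# Hypothetical toy data standing in for board_game_tidy
toy <- data.frame(
  category = c("Adventure", "Adventure", "Card Game", "Card Game", "Dice", "Dice"),
  average_complexity = c(2.0, 3.0, 1.0, NA, 1.5, 2.5)
)

# One row per category; na.rm = TRUE skips the missing value
category_means <- toy %>%
  group_by(category) %>%
  dplyr::summarise(mean_average_complexity = mean(average_complexity, na.rm = TRUE))

# Keep the 2 categories with the highest mean (rows stay in their original order)
top_2 <- category_means %>%
  top_n(2, mean_average_complexity)

# Newer equivalent (dplyr >= 1.0.0); rows come back sorted from highest to lowest
top_2_sorted <- category_means %>%
  slice_max(mean_average_complexity, n = 2)

print(top_2)
print(top_2_sorted)
```

Qualifying the call as `dplyr::summarise()` only matters when another attached package masks the name; with dplyr alone attached, plain `summarise()` behaves the same.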
Filter the values for the top 5 categories, then group by category and take the mean of average_complexity:
library(dplyr)

board_game_tidy %>% 
  filter(category %in% names(top_5_category)) %>%
  group_by(category) %>%
  summarise(average_complexity = mean(average_complexity))

# category           average_complexity
#  <fct>                           <dbl>
#1 Abstract Strategy               0.844
#2 Action / Dexterity              0.469
#3 Adventure                       1.25 
#4 Age of Reason                   1.95 
#5 American Civil War              1.68 
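The code above relies on an object named top_5_category whose names are the categories to keep. How that object was built is not shown here, so the following is only a sketch under an assumption: that it is a named frequency table, such as one produced by `table()`. The `toy` data and the `top_2_category` name are hypothetical.

```r
library(dplyr)

# Hypothetical toy data standing in for board_game_tidy
toy <- data.frame(
  category = c("Adventure", "Adventure", "Adventure", "Card Game", "Card Game", "Dice"),
  average_complexity = c(2, 3, 4, 1, 2, 5)
)

# Assumed stand-in for top_5_category: the 2 most frequent categories as a named table
top_2_category <- head(sort(table(toy$category), decreasing = TRUE), 2)

# Keep only rows whose category is one of the names, then average per category
result <- toy %>%
  filter(category %in% names(top_2_category)) %>%
  group_by(category) %>%
  summarise(average_complexity = mean(average_complexity))

print(result)
```

If the lookup object does not exist in the current session, `filter()` stops with an "object not found" error, so it has to be created before this pipeline runs.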

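Maybe filter(category %in% c(...))?
Hi, thank you for your answer! I tried your code and it shows: Error: Problem with filter() input ..1. x object 'top_5_category' not found ℹ Input ..1 is category %in% names(top_5_category)
@harperzhu top_5_category appears in your post. Did you run it?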
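To show why the first comment suggests %in% rather than ==, here is a small base R sketch with made-up values. Comparing a column to a vector with == recycles the vector and compares position by position, while %in% tests each element for membership in the set.

```r
# Hypothetical values; not taken from the board game dataset
category <- c("Adventure", "Dice", "Adventure", "Dice")
wanted <- c("Dice", "Adventure")

# == recycles `wanted` and compares element by element:
# "Adventure" vs "Dice", "Dice" vs "Adventure", and so on
eq_match <- category == wanted

# %in% asks, for each element, whether it appears anywhere in `wanted`
in_match <- category %in% wanted

print(eq_match)
print(in_match)
```

Here every row belongs to a wanted category, yet the == comparison matches none of them because the positions never line up.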