R: group, summarise, and reshape a data frame

I have raw data as follows:

raw_data <- data.frame(
  name = c("Ronak", "Bob", "Moh"),
  l_name = c("Shah", "Marly", "Salah"),
  R_programming = c(1, 5, 2),
  football = c(2, 4, 6),
  snooker = c(6, 3, 2),
  Python = c(3, 2, 6),
  location = c("Maly", "US", "Maly")
)
I then want to calculate a score and a scale from the averages of the columns above, and reshape the data.

My expected output is:

output <- data.frame(
  location = c("US", "Maly"),
  indicator = rep(c("programming", "sport"), each = 2),
  average = c(3, 3.5, 3.5, 4),
  score = c(0.6, 0.7, 0.7, 0.8),
  scale = c("Moderate high", "high", "high", "high")
)

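Group by location and take the mean of the programming columns and of the sport columns. Then pivot the data to long format and create new columns for score and scale.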

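library(dplyr)

raw_data %>%
  # one summary row per location
  group_by(location) %>%
  # pool both columns of each topic into a single mean
  summarise(programming_average = mean(c(R_programming, Python), na.rm = TRUE),
            sport_average = mean(c(football, snooker), na.rm = TRUE)) %>%
  # "programming_average" becomes indicator = "programming" plus a value column "average"
  tidyr::pivot_longer(cols = -location,
                      names_to = c('indicator', '.value'),
                      names_sep = '_') %>%
  # divide the average by 5 to get the score, then label it
  mutate(score = average / 5,
         scale = case_when(score <= 0.6 ~ 'moderate high',
                           score > 0.6 ~ 'high'))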

#  location indicator   average score scale
#  <chr>    <chr>         <dbl> <dbl> <chr>
#1 Maly     programming     3     0.6 moderate high
#2 Maly     sport           4     0.8 high
#3 US       programming     3.5   0.7 high
#4 US       sport           3.5   0.7 high
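
The reshaping step relies on the special ".value" entry in names_to, which can be hard to follow inside a long pipeline. The sketch below isolates that step on a small table. The data frame `wide` is a hypothetical stand-in for what summarise() returns above, typed in by hand from the output shown. It is only an illustration of how the column names are split, not part of the original answer.

```r
library(tidyr)

# Hypothetical intermediate table: one row per location,
# one "<indicator>_average" column per topic.
wide <- data.frame(
  location = c("Maly", "US"),
  programming_average = c(3, 3.5),
  sport_average = c(4, 3.5)
)

# Each column name is split at "_":
#   the first piece goes into the new "indicator" column,
#   the second piece (".value") names the column that receives the values.
long <- pivot_longer(
  wide,
  cols = -location,
  names_to = c("indicator", ".value"),
  names_sep = "_"
)

print(long)
```

Because every column name ends in "average", the result has a single value column called average, with one row per location and indicator. If the summarised columns carried a second suffix (for example a hypothetical "programming_sd"), the same call would produce an additional sd column alongside average.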