Calculate summary statistics of measurements and pivot them into columns in R

I have a data frame like this:

Step <- c("1","1","4","3","2","2","3","4","4","3","1","3","2","4","3","1","2")
Length <- c(0.1,0.5,0.7,0.8,0.2,0.1,0.3,0.8,0.9,0.15,0.25,0.27,0.28,0.61,0.15,0.37,0.18)
Breadth <- c(0.13,0.35,0.87,0.38,0.52,0.71,0.43,0.8,0.9,0.15,0.45,0.7,0.8,0.11,0.11,0.47,0.28)
Height <- c(0.31,0.35,0.37,0.38,0.32,0.51,0.53,0.48,0.9,0.15,0.35,0.32,0.22,0.11,0.17,0.27,0.38)
Width <- c(0.21,0.25,0.27,0.8,0.2,0.21,0.3,0.28,0.29,0.65,0.55,0.37,0.26,0.31,0.5,0.7,0.8)

df <- data.frame(Step,Length,Breadth,Height,Width) 
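
I tried computing the summary statistics this way, but it is not efficient, because every column and every statistic has to be written out by hand: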

library(dplyr)
df1 <- df %>%
  group_by(Step) %>%
  summarise(Length_Mean = mean(Length),
            Breadth_Mean = mean(Breadth),
            Height_Mean = mean(Height),
            Width_Mean = mean(Width))

How can I get the desired output efficiently, with as little code as possible? Can someone point me in the right direction?

You can use a scoped variant of summarise() to calculate the same summary statistics for several columns at once. From ?scoped:

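The variants suffixed with _if, _at or _all apply an expression (sometimes several) to all variables within a specified subset. This subset can contain all variables (_all variants), a vars() selection (_at variants), or variables selected with a predicate (_if variants).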
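
To make the three suffixes concrete, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical and are not part of the question's data; only the dplyr verbs themselves come from the package.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration only
toy <- data.frame(
  Step   = c("1", "1", "2", "2"),
  Length = c(0.1, 0.5, 0.2, 0.1),
  Width  = c(0.21, 0.25, 0.2, 0.21)
)

# _all: apply mean() to every non-grouping column
by_all <- toy %>% group_by(Step) %>% summarise_all(mean)

# _at: apply mean() only to the columns selected with vars()
by_at <- toy %>% group_by(Step) %>% summarise_at(vars(Length), mean)

# _if: apply mean() only to columns for which the predicate returns TRUE
by_if <- toy %>% group_by(Step) %>% summarise_if(is.numeric, mean)
```

Here by_at keeps only Step and Length, while by_all and by_if both summarise Length and Width, because every non-grouping column in toy is numeric.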

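Here, summarize_all() is probably a good choice; it selects all variables except those used for grouping. You can also supply several functions to be computed for each variable in the selection.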

library(tidyverse)

# Calculate the summary statistics
sums <- df %>% 
  group_by(Step) %>% 
  summarize_all(funs(max, min, mean, median, sd))

sums
#> # A tibble: 4 x 21
#>   Step  Length_max Breadth_max Height_max Width_max Length_min Breadth_min
#>   <fct>      <dbl>       <dbl>      <dbl>     <dbl>      <dbl>       <dbl>
#> 1 1           0.5         0.47       0.35      0.7        0.1         0.13
#> 2 2           0.28        0.8        0.51      0.8        0.1         0.28
#> 3 3           0.8         0.7        0.53      0.8        0.15        0.11
#> 4 4           0.9         0.9        0.9       0.31       0.61        0.11
#> # ... with 14 more variables: Height_min <dbl>, Width_min <dbl>,
#> #   Length_mean <dbl>, Breadth_mean <dbl>, Height_mean <dbl>,
#> #   Width_mean <dbl>, Length_median <dbl>, Breadth_median <dbl>,
#> #   Height_median <dbl>, Width_median <dbl>, Length_sd <dbl>,
#> #   Breadth_sd <dbl>, Height_sd <dbl>, Width_sd <dbl>
Created on 2018-05-24 by the reprex package (v0.2.0).
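
Note for newer dplyr versions: funs() has been deprecated since dplyr 0.8.0, and since dplyr 1.0.0 the scoped verbs are superseded by across(). The following is a sketch of the same summary under that assumption (dplyr >= 1.0.0), not the code that produced the output above; the name sums_across is only illustrative.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  Step    = c("1","1","4","3","2","2","3","4","4","3","1","3","2","4","3","1","2"),
  Length  = c(0.1,0.5,0.7,0.8,0.2,0.1,0.3,0.8,0.9,0.15,0.25,0.27,0.28,0.61,0.15,0.37,0.18),
  Breadth = c(0.13,0.35,0.87,0.38,0.52,0.71,0.43,0.8,0.9,0.15,0.45,0.7,0.8,0.11,0.11,0.47,0.28),
  Height  = c(0.31,0.35,0.37,0.38,0.32,0.51,0.53,0.48,0.9,0.15,0.35,0.32,0.22,0.11,0.17,0.27,0.38),
  Width   = c(0.21,0.25,0.27,0.8,0.2,0.21,0.3,0.28,0.29,0.65,0.55,0.37,0.26,0.31,0.5,0.7,0.8)
)

# A named list of functions replaces funs(); the list names become
# the suffixes of the result columns (e.g. Length_max)
sums_across <- df %>%
  group_by(Step) %>%
  summarise(across(everything(),
                   list(max = max, min = min, mean = mean,
                        median = median, sd = sd)))
```

The column names follow the same Measurement_stat pattern, but across() groups them by measurement (Length_max, Length_min, ...) rather than by statistic, so the column order differs from the funs() output.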

sums %>% 
  # Reshape to long format
  gather(col, val, -Step) %>% 
  # Separate the measurement and the summary statistic
  separate(col, into = c("Measurement", "stat")) %>% 
  arrange(Step) %>% 
  # Create the desired column headings
  unite(col, stat, Step) %>% 
  # Need to use factors to preserve order
  mutate_at(vars(col, Measurement), fct_inorder) %>% 
  # Reshape back to wide format
  spread(col, val)
#> # A tibble: 4 x 21
#>   Measurement max_1 min_1 mean_1 median_1   sd_1 max_2 min_2 mean_2
#>   <fct>       <dbl> <dbl>  <dbl>    <dbl>  <dbl> <dbl> <dbl>  <dbl>
#> 1 Length       0.5   0.1   0.305    0.31  0.171   0.28  0.1   0.19 
#> 2 Breadth      0.47  0.13  0.35     0.4   0.156   0.8   0.28  0.578
#> 3 Height       0.35  0.27  0.32     0.330 0.0383  0.51  0.22  0.358
#> 4 Width        0.7   0.21  0.428    0.4   0.237   0.8   0.2   0.368
#> # ... with 12 more variables: median_2 <dbl>, sd_2 <dbl>, max_3 <dbl>,
#> #   min_3 <dbl>, mean_3 <dbl>, median_3 <dbl>, sd_3 <dbl>, max_4 <dbl>,
#> #   min_4 <dbl>, mean_4 <dbl>, median_4 <dbl>, sd_4 <dbl>
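
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The sketch below is one possible alternative under that assumption (it also assumes dplyr >= 1.0.0 for the .groups argument). It reshapes to long format first, summarises per Step and Measurement, and then widens; the names wide and val are illustrative only.

```r
library(dplyr)
library(tidyr)

# Data from the question
df <- data.frame(
  Step    = c("1","1","4","3","2","2","3","4","4","3","1","3","2","4","3","1","2"),
  Length  = c(0.1,0.5,0.7,0.8,0.2,0.1,0.3,0.8,0.9,0.15,0.25,0.27,0.28,0.61,0.15,0.37,0.18),
  Breadth = c(0.13,0.35,0.87,0.38,0.52,0.71,0.43,0.8,0.9,0.15,0.45,0.7,0.8,0.11,0.11,0.47,0.28),
  Height  = c(0.31,0.35,0.37,0.38,0.32,0.51,0.53,0.48,0.9,0.15,0.35,0.32,0.22,0.11,0.17,0.27,0.38),
  Width   = c(0.21,0.25,0.27,0.8,0.2,0.21,0.3,0.28,0.29,0.65,0.55,0.37,0.26,0.31,0.5,0.7,0.8)
)

wide <- df %>%
  # Long format: one row per Step / Measurement / value
  pivot_longer(-Step, names_to = "Measurement", values_to = "val") %>%
  # One row per Step / Measurement, one column per statistic
  group_by(Step, Measurement) %>%
  summarise(max = max(val), min = min(val), mean = mean(val),
            median = median(val), sd = sd(val), .groups = "drop") %>%
  # Wide format: one column per statistic / Step pair, e.g. "max_1"
  pivot_wider(names_from = Step,
              values_from = c(max, min, mean, median, sd))
```

This gives the same 4 x 21 shape with columns such as max_1 and sd_4, but the columns come out grouped by statistic rather than by step, and the Measurement rows are sorted by the grouping rather than kept in their original order. If the exact layout shown above matters, reorder afterwards.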