计算度量值的汇总统计信息,并将其透视到R中的列
我有一个这样的数据帧计算度量值的汇总统计信息,并将其透视到R中的列,r,dataframe,dplyr,data.table,reshape2,R,Dataframe,Dplyr,Data.table,Reshape2,我有一个这样的数据帧 Step <- c("1","1","4","3","2","2","3","4","4","3","1","3","2","4","3","1","2") Length <- c(0.1,0.5,0.7,0.8,0.2,0.1,0.3,0.8,0.9,0.15,0.25,0.27,0.28,0.61,0.15,0.37,0.18) Breadth <- c(0.13,0.35,0.87,0.38,0.52,0.71,0.43,0.8,0.9,0.15,0
Step <- c("1","1","4","3","2","2","3","4","4","3","1","3","2","4","3","1","2")
Length <- c(0.1,0.5,0.7,0.8,0.2,0.1,0.3,0.8,0.9,0.15,0.25,0.27,0.28,0.61,0.15,0.37,0.18)
Breadth <- c(0.13,0.35,0.87,0.38,0.52,0.71,0.43,0.8,0.9,0.15,0.45,0.7,0.8,0.11,0.11,0.47,0.28)
Height <- c(0.31,0.35,0.37,0.38,0.32,0.51,0.53,0.48,0.9,0.15,0.35,0.32,0.22,0.11,0.17,0.27,0.38)
Width <- c(0.21,0.25,0.27,0.8,0.2,0.21,0.3,0.28,0.29,0.65,0.55,0.37,0.26,0.31,0.5,0.7,0.8)
df <- data.frame(Step,Length,Breadth,Height,Width)
我试图用这种方法来计算汇总统计数据,但这不是一种有效的方法
library(dplyr)
df1 <- df %>%
group_by(Step) %>%
summarise(Length_Mean = mean(Length),
Breadth_Mean = mean(Breadth),
Height_Mean = mean(Height),
Width_Mean = mean(Width))
库(dplyr)
df1%
分组依据(步进)%>%
总结(长度=平均值(长度),
宽度\平均值=平均值(宽度),
高度\平均值=平均值(高度),
宽度\平均值=平均值(宽度))
如何以最少的代码高效地完成所需的输出?有人能给我指出正确的方向吗?你可以使用一个版本的summary
来计算相同的摘要
同时统计多个列。从?范围限定的:
后缀为_-if、_-at或_的变体都应用一个表达式
(有时几个)指定子集内的所有变量。这
子集可以包含所有变量(_allvariants),一个vars()选择
(_atvariants)或使用谓词选择的变量(_ifvariants)
在这里,summary\u all
可能是一个不错的选择;它将选择除
用于分组列。您还可以为
计算选择中的每个变量
library(tidyverse)
# Calculate the summary statistics
sums <- df %>%
group_by(Step) %>%
summarize_all(funs(max, min, mean, median, sd))
sums
#> # A tibble: 4 x 21
#> Step Length_max Breadth_max Height_max Width_max Length_min Breadth_min
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.5 0.47 0.35 0.7 0.1 0.13
#> 2 2 0.28 0.8 0.51 0.8 0.1 0.28
#> 3 3 0.8 0.7 0.53 0.8 0.15 0.11
#> 4 4 0.9 0.9 0.9 0.31 0.61 0.11
#> # ... with 14 more variables: Height_min <dbl>, Width_min <dbl>,
#> # Length_mean <dbl>, Breadth_mean <dbl>, Height_mean <dbl>,
#> # Width_mean <dbl>, Length_median <dbl>, Breadth_median <dbl>,
#> # Height_median <dbl>, Width_median <dbl>, Length_sd <dbl>,
#> # Breadth_sd <dbl>, Height_sd <dbl>, Width_sd <dbl>
由(v0.2.0)于2018年5月24日创建。您可以使用版本的汇总
计算相同的汇总
同时统计多个列。从?范围限定的:
后缀为_-if、_-at或_的变体都应用一个表达式
(有时几个)指定子集内的所有变量。这
子集可以包含所有变量(_allvariants),一个vars()选择
(_atvariants)或使用谓词选择的变量(_ifvariants)
在这里,summary\u all
可能是一个不错的选择;它将选择除
用于分组列。您还可以为
计算选择中的每个变量
library(tidyverse)
# Calculate the summary statistics
sums <- df %>%
group_by(Step) %>%
summarize_all(funs(max, min, mean, median, sd))
sums
#> # A tibble: 4 x 21
#> Step Length_max Breadth_max Height_max Width_max Length_min Breadth_min
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0.5 0.47 0.35 0.7 0.1 0.13
#> 2 2 0.28 0.8 0.51 0.8 0.1 0.28
#> 3 3 0.8 0.7 0.53 0.8 0.15 0.11
#> 4 4 0.9 0.9 0.9 0.31 0.61 0.11
#> # ... with 14 more variables: Height_min <dbl>, Width_min <dbl>,
#> # Length_mean <dbl>, Breadth_mean <dbl>, Height_mean <dbl>,
#> # Width_mean <dbl>, Length_median <dbl>, Breadth_median <dbl>,
#> # Height_median <dbl>, Width_median <dbl>, Length_sd <dbl>,
#> # Breadth_sd <dbl>, Height_sd <dbl>, Width_sd <dbl>
由(v0.2.0)于2018年5月24日创建
sums %>%
# Reshape to long format
gather(col, val, -Step) %>%
# Separate the measurement and the summary statistic
separate(col, into = c("Measurement", "stat")) %>%
arrange(Step) %>%
# Create the desired column headings
unite(col, stat, Step) %>%
# Need to use factors to preserve order
mutate_at(vars(col, Measurement), fct_inorder) %>%
# Reshape back to wide format
spread(col, val)
#> # A tibble: 4 x 21
#> Measurement max_1 min_1 mean_1 median_1 sd_1 max_2 min_2 mean_2
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Length 0.5 0.1 0.305 0.31 0.171 0.28 0.1 0.19
#> 2 Breadth 0.47 0.13 0.35 0.4 0.156 0.8 0.28 0.578
#> 3 Height 0.35 0.27 0.32 0.330 0.0383 0.51 0.22 0.358
#> 4 Width 0.7 0.21 0.428 0.4 0.237 0.8 0.2 0.368
#> # ... with 12 more variables: median_2 <dbl>, sd_2 <dbl>, max_3 <dbl>,
#> # min_3 <dbl>, mean_3 <dbl>, median_3 <dbl>, sd_3 <dbl>, max_4 <dbl>,
#> # min_4 <dbl>, mean_4 <dbl>, median_4 <dbl>, sd_4 <dbl>