R: computing the mean, standard deviation, and standard error from a large data frame


Suppose I have a data frame named "Data" that looks like this:

View(Data)
Ball Day Expansion
Red  1   5
Red  1   8
Red  1   3
Red  2   7
Red  2   9
Blue 1   5
Blue 1   3
Blue 2   7
Blue 2   5
Blue 2   4
...

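From this data I want to compute, for each Ball and Day, the mean, the standard deviation (SD), and the standard error of the mean (SE), so that the final result looks like this: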

# Note: 'Expansion' below is the group mean; 'X' and 'Y' stand for the SE and SD of that group

Ball Day Expansion SE SD
Red  1    7        X  Y
Red  2    5        X  Y
Red  3    6        X  Y
Red  4    5        X  Y
Blue 1    4        X  Y
Blue 2    8        X  Y
Blue 3    6        X  Y
...

Does anyone know how to do this?

I hope this is what you have in mind:

library(dplyr)

df %>%
  group_by(Ball, Day) %>%
  summarise(across(Expansion, list(Mean = mean, 
                                SD = sd, 
                                SE = function(x) sqrt(var(x)/length(x))), 
                   .names = "{.fn}.{.col}"))

# A tibble: 4 x 5
# Groups:   Ball [2]
  Ball    Day Mean.Expansion SD.Expansion SE.Expansion
  <chr> <dbl>          <dbl>        <dbl>        <dbl>
1 Blue      1           4            1.41        1    
2 Blue      2           5.33         1.53        0.882
3 Red       1           5.33         2.52        1.45 
4 Red       2           8            1.41        1

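Data: df is the sample data frame from the question.

Here is one approach. We can use the dplyr package for this kind of calculation.
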
library(dplyr)

Data2 <- Data %>%
  group_by(Ball, Day) %>%
  summarize(Mean = mean(Expansion),
            SE = sd(Expansion)/sqrt(n()),
            SD = sd(Expansion)) %>%
  rename(Expansion = Mean) %>%
  ungroup() 

Data2
# # A tibble: 4 x 5
#   Ball    Day Expansion    SE    SD
#   <chr> <int>     <dbl> <dbl> <dbl>
# 1 Blue      1      4    1      1.41
# 2 Blue      2      5.33 0.882  1.53
# 3 Red       1      5.33 1.45   2.52
# 4 Red       2      8    1      1.41
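
The two answers write the standard error differently, sqrt(var(x)/length(x)) in the first and sd(x)/sqrt(n()) in the second, but both are the same quantity. A minimal check, assuming we take the Blue / Day 2 values from the question as a plain vector (the names x, se_a and se_b are only for illustration):

```r
# Expansion values of the Blue / Day 2 group from the question
x <- c(7, 5, 4)

# Form used in the first answer
se_a <- sqrt(var(x) / length(x))

# Form used in the second answer; inside summarize(), n() is the group size
se_b <- sd(x) / sqrt(length(x))

all.equal(se_a, se_b)  # TRUE
round(se_a, 3)         # 0.882, matching the Blue / Day 2 row above
```

Note that both forms count NA values in the group size, so data with missing values would probably need sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x))) instead.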


Data
Data <- read.table(
  text = "Ball Day Expansion
Red  1   5
Red  1   8
Red  1   3
Red  2   7
Red  2   9
Blue 1   5
Blue 1   3
Blue 2   7
Blue 2   5
Blue 2   4", header = TRUE
)
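
If dplyr is not available, the same grouped summary can probably be obtained in base R. This is only a sketch of an alternative, not what either answer proposes; the helper se() and the name res are made up for illustration:

```r
Data <- read.table(
  text = "Ball Day Expansion
Red  1   5
Red  1   8
Red  1   3
Red  2   7
Red  2   9
Blue 1   5
Blue 1   3
Blue 2   7
Blue 2   5
Blue 2   4", header = TRUE
)

# Hypothetical helper: standard error of the mean
se <- function(x) sd(x) / sqrt(length(x))

# aggregate() returns the three statistics as one matrix column ...
res <- aggregate(Expansion ~ Ball + Day, data = Data,
                 FUN = function(x) c(Mean = mean(x), SE = se(x), SD = sd(x)))

# ... so flatten it into ordinary columns:
# Expansion.Mean, Expansion.SE, Expansion.SD
res <- do.call(data.frame, res)
res
```

The row order differs from the dplyr output because aggregate() sorts by Day before Ball, but the values per group should be the same.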

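Judging from the output the OP provided, each combination of Ball and Day appears only once, so I guess the OP wants to group by Ball and Day and use summarize rather than mutate. But I may be wrong, because the OP gave neither a clear description nor a reproducible example.

Yes, I think you are right! I first grouped by Ball only, then saw your code and changed mine, so thank you for that. I believe it is up to the OP to adapt the code however she/he wants. That said, although summarize produces very tidy output, I do think the desired output is closer to what mutate gives.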
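
To make the summarize versus mutate distinction concrete, here is a rough sketch of the mutate variant, assuming the same sample data as above. The name Data_mut is hypothetical. Unlike summarize, mutate keeps every original row and repeats the group statistics on each one:

```r
library(dplyr)

Data <- read.table(
  text = "Ball Day Expansion
Red  1   5
Red  1   8
Red  1   3
Red  2   7
Red  2   9
Blue 1   5
Blue 1   3
Blue 2   7
Blue 2   5
Blue 2   4", header = TRUE
)

Data_mut <- Data %>%
  group_by(Ball, Day) %>%
  # Same statistics as in the summarize version, added as new columns
  mutate(Mean = mean(Expansion),
         SE = sd(Expansion) / sqrt(n()),
         SD = sd(Expansion)) %>%
  ungroup()

nrow(Data_mut)  # 10, one row per original observation
```

Which of the two fits better depends on whether the OP wants one row per Ball and Day or the statistics attached to every observation.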