R: nesting a sum and a mean in one aggregate to get the average score per group


I couldn't find a dataset similar to my problem, so I modified the iris dataset (built into R) to resemble it. It is close enough.

data = iris
data$type = gl(5,30,150,labels=c("group1","group2","group3","group4","group5"))
data$ID = gl(30,5,150)

Then I used the following code:

xtabs(Sepal.Length ~ Species + type, aggregate(Sepal.Length ~ Species + type + ID, data, mean))

which resulted in:

            type
Species      group1 group2 group3 group4 group5
  setosa      30.16  19.90   0.00   0.00   0.00
  versicolor   0.00  12.20  35.88  11.28   0.00
  virginica    0.00   0.00   0.00  26.24  39.64

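My understanding is that my code sums Sepal.Length for each ID and then takes the mean of those values for each Species and type.

Is this correct?

If not, how do I get this?

Also, if my data were such that each ID had multiple types, how would I get this information? (I'm not sure how to construct this in R.)

Actually, just to be clear:

What I want is code that sums Sepal.Length for each ID and type, then takes the mean of those sums across all IDs, and reports the mean Sepal.Length by type and Species.

With data.table: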

library(data.table)
setDT(data)

# sum of Sepal.Length for each ID and type
data[, id_type_sum := sum(Sepal.Length), by = .(ID, type)]

# mean of this variable by type and species
data[, mean(id_type_sum), by = .(type, Species)]

#      type    Species       V1
# 1: group1     setosa 25.13333
# 2: group2     setosa 24.87500
# 3: group2 versicolor 30.50000
# 4: group3 versicolor 29.90000
# 5: group4 versicolor 28.20000
# 6: group4  virginica 32.80000
# 7: group5  virginica 33.03333
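On the follow-up about each ID having multiple types: the `:=` step above keeps every original row, so the later `mean()` is taken over rows rather than over IDs. In the iris example every ID has exactly five rows, so the two agree. A minimal sketch of a collapse-first variant follows, on a hypothetical toy table (`toy`, `value`, and its numbers are invented for illustration and are not from the question):

```r
library(data.table)

# Hypothetical toy data: each ID appears under several types,
# with unequal numbers of rows per ID/type pair
toy <- data.table(
  ID      = c(1, 1, 1, 1, 2, 2),
  type    = c("group1", "group1", "group1", "group2", "group1", "group2"),
  Species = "setosa",
  value   = c(1, 2, 3, 4, 10, 20)
)

# Collapse to one row per ID/type/Species, holding the sum
sums <- toy[, .(id_type_sum = sum(value)), by = .(ID, type, Species)]

# Average those sums across IDs: every ID counts once
per_id <- sums[, .(mean_sum = mean(id_type_sum)), by = .(type, Species)]
# group1: (6 + 10) / 2 = 8, group2: (4 + 20) / 2 = 12

# Row-weighted variant using := as in the answer above
toy[, id_type_sum := sum(value), by = .(ID, type)]
per_row <- toy[, .(mean_sum = mean(id_type_sum)), by = .(type, Species)]
# group1: (6 + 6 + 6 + 10) / 4 = 7, group2: (4 + 20) / 2 = 12
```

Which of the two is appropriate depends on whether each ID or each row should carry equal weight; when every ID/type pair has the same number of rows, they give the same result.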

If you want it in table format, you can use data.table's dcast method:

library(magrittr) # for the %>% operator
data[, mean(id_type_sum), by = .(type, Species)] %>%
  dcast(Species ~ type, value.var = "V1") # name the value column explicitly

Result:

      Species   group1 group2 group3 group4   group5
1:     setosa 25.13333 24.875     NA     NA       NA
2: versicolor       NA 30.500   29.9   28.2       NA
3:  virginica       NA     NA     NA   32.8 33.03333
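As for whether the original one-liner does what was described: splitting it into its two steps suggests the order is the reverse of the stated understanding. `aggregate(..., mean)` takes the mean per Species/type/ID, and `xtabs` then sums those means within each Species/type cell. The base R sketch below shows that, followed by a sum-then-mean version that needs no extra packages. The intermediate names (`id_means`, `cell_sums`, `id_sums`, `wide`) are made up for illustration, and it assumes each ID belongs to a single Species, as in the modified iris data.

```r
data <- iris
data$type <- gl(5, 30, 150, labels = paste0("group", 1:5))
data$ID <- gl(30, 5, 150)

# The question's code, step 1: MEAN of Sepal.Length per Species/type/ID
id_means <- aggregate(Sepal.Length ~ Species + type + ID, data, mean)

# Step 2: xtabs SUMS the left-hand side within each Species/type cell
cell_sums <- xtabs(Sepal.Length ~ Species + type, id_means)
# setosa/group1 is 30.16, as in the question's table

# Reversed order: SUM per Species/type/ID first ...
id_sums <- aggregate(Sepal.Length ~ Species + type + ID, data, sum)

# ... then take the MEAN of those sums within each Species/type cell
wide <- tapply(id_sums$Sepal.Length, id_sums[c("Species", "type")], mean)
# empty Species/type combinations come out as NA
```

The `wide` matrix here should line up with the dcast result above, for example about 25.13 for setosa in group1.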

I used this code on my actual data and the results look just as I expected! Thank you very much, this is great.