R: how to average several rows based on specific numbers

My data looks like this:

df<- structure(list(data1 = c(20171205L, 20171205L, 20171205L, 20171205L, 
20171205L, 20171205L, 20171205L, 20171205L, 20171205L, 20171205L, 
20171205L, 20171205L, 20171205L, 20171205L, 20171205L, 20171205L, 
20171205L, 20171205L, 20171205L, 20171205L), data2 = c(0.00546273, 
0.00552377, 0.00549325, 0.00550851, 0.00556954, 0.00560006, 0.00555428, 
0.00560006, 0.0055848, 0.00561532, 0.00555428, 0.0055848, 0.00552377, 
0.00549325, 0.00550851, 0.00556954, 0.00560006, 0.00555428, 0.00560006, 
0.0055848), data3 = c(0.00546273, 0.00552377, 0.00549325, 0.00550851, 
0.00556954, 0.00560006, 0.00555428, 0.00560006, 0.0055848, 0.00561532, 
0.00555428, 0.0055848, 0.00552377, 0.00549325, 0.00550851, 0.00556954, 
0.00560006, 0.00555428, 0.00560006, 0.0055848), mydf = structure(1:20, .Label = c("B02", 
"B03", "B04", "B05", "B06", "C02", "C03", "C04", "C05", "C06", 
"D02", "D03", "D04", "D05", "D06", "E02", "E03", "E04", "E05", 
"E06"), class = "factor")), .Names = c("data1", "data2", "data3", 
"mydf"), class = "data.frame", row.names = c(NA, -20L))
2- Also put the following rows in a new data frame and take the mean of each column

B04
B05
B06
C04
C05
C06
D04
D05
D06
E04
E05
E06
So each column ends up with two values (the mean of the first group and the mean of the second group).


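I was thinking of taking these values from the mydf column and then splitting it somehow, but I could not figure it out.

A solution with dplyr: group_by is used to define the grouping variable, and summarise_at is used to compute the mean of all columns except mydf, which is excluded by vars(-mydf).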

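library(dplyr)

df2 <- df %>%
  # Group 1: mydf ends in 02 or 03; group 2: mydf ends in 04, 05 or 06
  group_by(Group = case_when(
    grepl("02$|03$", mydf)       ~ 1L,
    grepl("04$|05$|06$", mydf)   ~ 2L,
    TRUE                         ~ NA_integer_
  )) %>%
  # mean is passed directly; funs() is deprecated as of dplyr 0.8.0
  summarise_at(vars(-mydf), mean)
df2
# # A tibble: 2 x 4
#   Group    data1       data2       data3
#   <int>    <dbl>       <dbl>       <dbl>
# 1     1 20171205 0.005556190 0.005556190
# 2     2 20171205 0.005553013 0.005553013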
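
For readers on dplyr 1.0.0 or later, where summarise_at is superseded by across(), the same grouping and averaging could be sketched as below. This is not part of the original answer. The compact rebuild of df and the name vals are assumptions made only so the snippet runs on its own; the values are copied from the dput output in the question.

```r
library(dplyr)

# Compact rebuild of the question's data (hypothetical helper name: vals)
vals <- c(0.00546273, 0.00552377, 0.00549325, 0.00550851, 0.00556954,
          0.00560006, 0.00555428, 0.00560006, 0.0055848, 0.00561532,
          0.00555428, 0.0055848, 0.00552377, 0.00549325, 0.00550851,
          0.00556954, 0.00560006, 0.00555428, 0.00560006, 0.0055848)
df <- data.frame(
  data1 = rep(20171205L, 20),
  data2 = vals,
  data3 = vals,
  mydf  = factor(paste0(rep(c("B", "C", "D", "E"), each = 5),
                        rep(c("02", "03", "04", "05", "06"), times = 4)))
)

df2 <- df %>%
  # Label each row by the suffix of mydf
  mutate(Group = case_when(
    grepl("0[23]$", mydf)  ~ 1L,
    grepl("0[456]$", mydf) ~ 2L,
    TRUE                   ~ NA_integer_
  )) %>%
  group_by(Group) %>%
  # across() skips the grouping column; -mydf drops the label column
  summarise(across(-mydf, mean))
df2
```

The result should have one row per group and agree with the tibble printed above.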

In base R, you can use grepl to split the rows into groups based on their suffix, and then summarise each group:

# Add a group column (1 where mydf has a 02 or 03 suffix, 2 where it has 04, 05 or 06)
df$group <- ifelse(grepl("0[23]$", df$mydf), 1, 2)
df
#>       data1      data2      data3 mydf group
#> 1  20171205 0.00546273 0.00546273  B02     1
#> 2  20171205 0.00552377 0.00552377  B03     1
#> 3  20171205 0.00549325 0.00549325  B04     2
#> 4  20171205 0.00550851 0.00550851  B05     2
#> 5  20171205 0.00556954 0.00556954  B06     2
#> 6  20171205 0.00560006 0.00560006  C02     1
#> 7  20171205 0.00555428 0.00555428  C03     1
#> 8  20171205 0.00560006 0.00560006  C04     2
#> 9  20171205 0.00558480 0.00558480  C05     2
#> 10 20171205 0.00561532 0.00561532  C06     2
#> 11 20171205 0.00555428 0.00555428  D02     1
#> 12 20171205 0.00558480 0.00558480  D03     1
#> 13 20171205 0.00552377 0.00552377  D04     2
#> 14 20171205 0.00549325 0.00549325  D05     2
#> 15 20171205 0.00550851 0.00550851  D06     2
#> 16 20171205 0.00556954 0.00556954  E02     1
#> 17 20171205 0.00560006 0.00560006  E03     1
#> 18 20171205 0.00555428 0.00555428  E04     2
#> 19 20171205 0.00560006 0.00560006  E05     2
#> 20 20171205 0.00558480 0.00558480  E06     2

# Take the column means by group
aggregate(x = df[, 1:3], by = list(group = df$group), FUN = mean)
#>   group    data1       data2       data3
#> 1     1 20171205 0.005556190 0.005556190
#> 2     2 20171205 0.005553013 0.005553013
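
Closer to how the question is worded (put the listed rows in a new data frame, then average each column), a minimal base R sketch could subset on the labels directly and call colMeans. It assumes the first group is simply every row not in the listed set, since the first list does not survive in the question as shown. The names vals, second, in_second and means are illustrative only.

```r
# Compact rebuild of the question's data (values copied from the dput output)
vals <- c(0.00546273, 0.00552377, 0.00549325, 0.00550851, 0.00556954,
          0.00560006, 0.00555428, 0.00560006, 0.0055848, 0.00561532,
          0.00555428, 0.0055848, 0.00552377, 0.00549325, 0.00550851,
          0.00556954, 0.00560006, 0.00555428, 0.00560006, 0.0055848)
df <- data.frame(
  data1 = rep(20171205L, 20),
  data2 = vals,
  data3 = vals,
  mydf  = factor(paste0(rep(c("B", "C", "D", "E"), each = 5),
                        rep(c("02", "03", "04", "05", "06"), times = 4)))
)

# Labels listed in the question for the second group
second <- c("B04", "B05", "B06", "C04", "C05", "C06",
            "D04", "D05", "D06", "E04", "E05", "E06")
in_second <- df$mydf %in% second

# Assumption: the first group is every remaining row (the 02 and 03 suffixes)
means <- rbind(
  first  = colMeans(df[!in_second, 1:3]),
  second = colMeans(df[in_second, 1:3])
)
means
```

Here means is a 2 x 3 matrix with one row per group, and its values should match the aggregate() output above. Listing labels explicitly is handy when the groups do not follow a pattern that a regular expression can capture.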

I liked your answer, since I can't accept two answers.