R: create a single column that is the mean of multiple columns sharing the same name


I have a data frame with 400 columns and about 100 rows. Here is a top-level view of the data frame:

           MX      MX      ID      BR      MX      FR      BR      MX      ES      FR      ES      ES      MX      FR
2/19/2015  111.45  122.46   98.16  101.20   98.60  100.74   93.15   98.61  110.69  102.28  143.21  135.32  103.30   98.50
2/12/2015  110.71  123.50   98.60  100.97   98.00  100.67   93.18   99.84  110.57  102.33  141.50  136.04  102.63   99.25
2/5/2015   111.51  125.27   99.25  101.27   97.75  100.83   93.38  101.09  111.62  102.30  145.76  137.74  102.50   96.75
1/29/2015  111.00  122.25   99.63  101.25   99.20  100.63   93.06   98.69  111.59  102.47  142.75  138.61  101.88   96.25
1/22/2015  111.39  124.00   98.13  100.55   98.92  100.52   93.00  100.21  108.99  102.46  140.96  134.14  101.75   95.75
1/15/2015  111.11  121.37   97.38  100.35   99.75  100.66   93.00  101.11  109.50  102.48  143.03  131.35  101.50   95.45

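I need to create, for each column name, one column that is the average of all the columns with that name. So I would have a single column "MX" holding, for 2/19/2015 and for every other date, the average of all the MX columns, and likewise for ID, BR, FR, and so on.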

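If you are used to referring to the columns of a data frame with the $ notation, this format can look daunting. Remember, though, that every column can still be referenced unambiguously by its index.

For your particular case, you can use names to find out which columns have a given name, and pass those indices to rowMeans: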

df$MX.mean <- rowMeans(df[which(names(df) == "MX")])
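
The one-liner above handles a single name. As a minimal sketch of how the same index lookup could be repeated for every distinct name, the loop below works on a small toy subset of the question's data. The names toy, orig_names and out are made up for illustration and are not part of the original answer.

```r
# Toy subset of the question's data (first four columns, first two rows).
# check.names = FALSE keeps the duplicated column names intact.
toy <- data.frame(
  MX = c(111.45, 110.71), MX = c(122.46, 123.50),
  ID = c(98.16, 98.60),   BR = c(101.20, 100.97),
  check.names = FALSE,
  row.names = c("2/19/2015", "2/12/2015")
)

orig_names <- names(toy)   # column names before any mean columns are added
out <- toy
for (nm in unique(orig_names)) {
  idx <- which(orig_names == nm)                     # positions of all columns called nm
  out[[paste0(nm, ".mean")]] <- rowMeans(toy[idx])   # per-row mean over those columns
}

# Show only the newly added mean columns
out[paste0(unique(orig_names), ".mean")]
```

Computing the means from the untouched toy rather than from out keeps the lookup independent of the columns the loop adds.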

You can use mapply together with rowMeans:

  nm1 <- unique(names(df1))
  res <-  mapply(function(x,y) rowMeans(df1[x == y], na.rm=TRUE),
                        list(names(df1)), nm1)
  colnames(res) <- paste0(nm1, '.Mean')
  res
  #         MX.Mean ID.Mean BR.Mean   FR.Mean  ES.Mean
  #2/19/2015 106.884   98.16  97.175 100.50667 129.7400
  #2/12/2015 106.936   98.60  97.075 100.75000 129.3700
  #2/5/2015  107.624   99.25  97.325  99.96000 131.7067
  #1/29/2015 106.604   99.63  97.155  99.78333 130.9833
  #1/22/2015 107.254   98.13  96.775  99.57667 128.0300
  #1/15/2015 106.968   97.38  96.675  99.53000 127.9600
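
The mapply call above compares the full vector of names against each unique name in turn. A related base R idiom, shown here only as a sketch on a toy subset of the question's data (toy and res2 are illustrative names), is to split the columns by name with split.default and apply rowMeans to each group. One difference to be aware of: the groups come out in alphabetical order of the names, not in order of first appearance.

```r
# Toy subset of the question's data; duplicated names kept via check.names = FALSE
toy <- data.frame(
  MX = c(111.45, 110.71), MX = c(122.46, 123.50),
  ID = c(98.16, 98.60),   BR = c(101.20, 100.97),
  check.names = FALSE,
  row.names = c("2/19/2015", "2/12/2015")
)

# split.default() treats the data frame as a list of columns and groups
# them by name; each group is itself a data frame
groups <- split.default(toy, names(toy))

# One column of row means per group; columns are ordered BR, ID, MX
res2 <- sapply(groups, rowMeans)
res2
```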
Or convert the data from wide format to long format, and then reshape it back to wide:

 library(reshape2)
 library(splitstackshape)
 res <- dcast.data.table(getanID(melt(as.matrix(df1)), 1:2)[,
      Var2:=paste0(Var2, '.Mean')], Var1~Var2, value.var='value', mean)
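
For readers without reshape2 and splitstackshape installed, the same wide to long to wide idea can be sketched in base R. This is not the answer's code: it swaps melt and dcast for a hand-built long table and tapply, and it runs on a toy subset of the question's data (toy, long and wide are illustrative names).

```r
# Toy subset of the question's data; duplicated names kept via check.names = FALSE
toy <- data.frame(
  MX = c(111.45, 110.71), MX = c(122.46, 123.50),
  ID = c(98.16, 98.60),   BR = c(101.20, 100.97),
  check.names = FALSE,
  row.names = c("2/19/2015", "2/12/2015")
)

m <- as.matrix(toy)   # keeps the duplicated column names as colnames

# Long format: one row per (date, column name, value) triple.
# row(m) and col(m) give the row/column index of every cell, in the same
# column-major order that as.vector(m) uses.
long <- data.frame(
  date  = rownames(m)[row(m)],
  name  = colnames(m)[col(m)],
  value = as.vector(m)
)

# Back to wide: average the values within each date/name cell
wide <- tapply(long$value, list(long$date, long$name), mean)
wide
```

Note that tapply sorts both the dates (as plain strings) and the names alphabetically, so the row and column order differs from the input.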

The df1 used in the mapply and reshape examples above is:

 df1 <- structure(list(MX = c(111.45, 110.71, 111.51, 111, 111.39, 
 111.11
 ), MX = c(122.46, 123.5, 125.27, 122.25, 124, 121.37), ID = c(98.16, 
 98.6, 99.25, 99.63, 98.13, 97.38), BR = c(101.2, 100.97, 101.27, 
 101.25, 100.55, 100.35), MX = c(98.6, 98, 97.75, 99.2, 98.92, 
 99.75), FR = c(100.74, 100.67, 100.83, 100.63, 100.52, 100.66
 ), BR = c(93.15, 93.18, 93.38, 93.06, 93, 93), MX = c(98.61, 
 99.84, 101.09, 98.69, 100.21, 101.11), ES = c(110.69, 110.57, 
 111.62, 111.59, 108.99, 109.5), FR = c(102.28, 102.33, 102.3, 
 102.47, 102.46, 102.48), ES = c(143.21, 141.5, 145.76, 142.75, 
 140.96, 143.03), ES = c(135.32, 136.04, 137.74, 138.61, 134.14, 
 131.35), MX = c(103.3, 102.63, 102.5, 101.88, 101.75, 101.5), 
 FR = c(98.5, 99.25, 96.75, 96.25, 95.75, 95.45)), .Names = c("MX", 
 "MX", "ID", "BR", "MX", "FR", "BR", "MX", "ES", "FR", "ES", "ES", 
 "MX", "FR"), class = "data.frame", row.names = c("2/19/2015", 
 "2/12/2015", "2/5/2015", "1/29/2015", "1/22/2015", "1/15/2015"))

Using vapply():

vapply(
    unique(names(df)), 
    function(x) rowMeans(df[grepl(x, names(df), fixed = TRUE)]),
    double(nrow(df))
)
#                MX    ID     BR        FR       ES
# 2/19/2015 106.884 98.16 97.175 100.50667 129.7400
# 2/12/2015 106.936 98.60 97.075 100.75000 129.3700
# 2/5/2015  107.624 99.25 97.325  99.96000 131.7067
# 1/29/2015 106.604 99.63 97.155  99.78333 130.9833
# 1/22/2015 107.254 98.13 96.775  99.57667 128.0300
# 1/15/2015 106.968 97.38 96.675  99.53000 127.9600

where df is

df <- read.table(check.names = FALSE, header = TRUE, text = "MX      MX      ID      BR      MX      FR      BR      MX      ES      FR      ES      ES      MX      FR
2/19/2015  111.45  122.46   98.16  101.20   98.60  100.74   93.15   98.61  110.69  102.28  143.21  135.32  103.30   98.50
2/12/2015  110.71  123.50   98.60  100.97   98.00  100.67   93.18   99.84  110.57  102.33  141.50  136.04  102.63   99.25
2/5/2015   111.51  125.27   99.25  101.27   97.75  100.83   93.38  101.09  111.62  102.30  145.76  137.74  102.50   96.75
1/29/2015  111.00  122.25   99.63  101.25   99.20  100.63   93.06   98.69  111.59  102.47  142.75  138.61  101.88   96.25
1/22/2015  111.39  124.00   98.13  100.55   98.92  100.52   93.00  100.21  108.99  102.46  140.96  134.14  101.75   95.75
1/15/2015  111.11  121.37   97.38  100.35   99.75  100.66   93.00  101.11  109.50  102.48  143.03  131.35  101.50   95.45")

Maybe you can use colMeans (rowMeans is its row-wise counterpart, which is what the answers above rely on).
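
One caveat about the vapply() call shown earlier: grepl(x, names(df), fixed = TRUE) matches x as a literal substring, so it also picks up any column whose name merely contains x. None of the names in the question overlap that way, so the posted output is unaffected. The sketch below uses a made-up column name, MXN, purely to illustrate the difference and to show an exact-match variant. The names toy, nm and exact are hypothetical as well.

```r
# Hypothetical data: "MXN" is an invented name that contains "MX"
toy <- data.frame(MX = c(1, 2), MXN = c(10, 20), MX = c(3, 4),
                  check.names = FALSE)
nm <- names(toy)

grepl("MX", nm, fixed = TRUE)   # substring match: TRUE TRUE TRUE
nm == "MX"                      # exact match:     TRUE FALSE TRUE

# Same vapply() pattern, but selecting columns by exact name equality
exact <- vapply(
  unique(nm),
  function(x) rowMeans(toy[nm == x]),
  double(nrow(toy))
)
exact
```

With exact comparison, the MX column of the result averages only the two MX columns, and MXN is reported on its own.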