R: applying a function of two vectors across the rows of a data frame


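I am trying to apply the Hmisc wtd.mean function across the rows of a data frame. It normally takes two vectors, one holding the means and one holding the weights for those means. I have been looking for a dplyr/tidyr/purrr solution but haven't quite worked it out.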

library(Hmisc)

# build data frame with 20 weight columns and 20 mean columns
set.seed(10)
w = matrix(runif(200,0,1),ncol = 20)
w = w/rowSums(w)
m = matrix(runif(200,50,100),ncol = 20)
df <- as.data.frame(cbind(w,m))
colnames(df) <- c(paste0("weight",seq(1,20,1)),paste0("mean",seq(1,20,1)))

# calculate weighted means for each row
for (i in seq_len(nrow(df))) {
  df$weighted.means[i] <- wtd.mean(x = as.numeric(df[i, 21:40]), weights = as.numeric(df[i, 1:20]))
}
> df$weighted.means
 [1] 70.74705 82.85015 82.40826 73.35798 70.02986 74.05543 73.64709 77.12899 72.56236 84.74055

You can do either of the following:
library(dplyr)  # provides %>% and mutate()

df %>%
  mutate(weighted.means = apply(df, 1, function(x) wtd.mean(x = as.numeric(x[21:40]),
                                                            weights = as.numeric(x[1:20]))))

Or use this (long...) tidyverse solution:
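library(tidyverse)  # loads dplyr, tidyr and tibble, which the pipeline below uses

df %>%
  rownames_to_column("group") %>%                 # keep the row id as a column
  gather(name, value, -group) %>%                 # wide to long
  extract(name, into = c("weight_mean", "number"), regex = "([[:alpha:]]+)(\\d+)") %>%
  spread(weight_mean, value) %>%                  # one "mean" and one "weight" column
  group_by(group = as.numeric(group)) %>%         # numeric so rows sort 1..10
  summarise(weighted.means = wtd.mean(x = mean, weights = weight))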

# A tibble: 10 x 2
#    group weighted.means
#    <dbl>          <dbl>
#  1 1               70.7
#  2 2               82.9
#  3 3               82.4
#  4 4               73.4
#  5 5               70.0
#  6 6               74.1
#  7 7               73.6
#  8 8               77.1
#  9 9               72.6
# 10 10              84.7
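
If loading Hmisc is not essential, the same numbers can probably be obtained without any row-wise iteration. The following is a minimal sketch, assuming there are no missing values and that wtd.mean is called with its default arguments, in which case it reduces to sum(weights * x) / sum(weights). The names weight_cols, mean_cols, W and M are introduced here for illustration only.

```r
# Rebuild the example data from the question (20 weight columns, 20 mean columns).
set.seed(10)
w <- matrix(runif(200, 0, 1), ncol = 20)
w <- w / rowSums(w)
m <- matrix(runif(200, 50, 100), ncol = 20)
df <- as.data.frame(cbind(w, m))
colnames(df) <- c(paste0("weight", 1:20), paste0("mean", 1:20))

# Select the two column blocks by name rather than by position.
weight_cols <- grep("^weight[0-9]+$", names(df), value = TRUE)
mean_cols   <- grep("^mean[0-9]+$", names(df), value = TRUE)
W <- as.matrix(df[weight_cols])
M <- as.matrix(df[mean_cols])

# Weighted mean per row, sum(w * x) / sum(w), computed for all rows at once.
# This relies on weight1..weight20 and mean1..mean20 being in matching order.
df$weighted.means <- unname(rowSums(W * M) / rowSums(W))
```

Because the result is an ordinary numeric vector, the last expression could equally be placed inside dplyr::mutate().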

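You can do apply(df, 1, function(x) wtd.mean(x = as.numeric(x[21:40]), weights = as.numeric(x[1:20]))), but it's not pretty and not tidy.
I want to dplyr::mutate the weighted.means column that I show at the bottom, without using a for loop.
Your solution does work. Thanks. I had to add a few lines at the end to convert "group" to numeric and arrange the rows to get the means in the right order.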
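
For the dplyr::mutate form asked about in the comments, one possible approach is rowwise() combined with c_across(), available in dplyr 1.0.0 and later. This is a sketch rather than the answerer's method. It assumes the column names follow the weight1..weight20 / mean1..mean20 pattern from the question, and the name result is illustrative.

```r
library(dplyr)
library(Hmisc)

# Rebuild the example data from the question (20 weight columns, 20 mean columns).
set.seed(10)
w <- matrix(runif(200, 0, 1), ncol = 20)
w <- w / rowSums(w)
m <- matrix(runif(200, 50, 100), ncol = 20)
df <- as.data.frame(cbind(w, m))
colnames(df) <- c(paste0("weight", 1:20), paste0("mean", 1:20))

# rowwise() makes mutate() evaluate its expression once per row;
# c_across() collects the selected columns of that row into a plain vector.
# matches() with anchored patterns avoids picking up a pre-existing
# "weighted.means" column, which starts_with("weight") would also select.
result <- df %>%
  rowwise() %>%
  mutate(weighted.means = wtd.mean(
    x       = c_across(matches("^mean[0-9]+$")),
    weights = c_across(matches("^weight[0-9]+$"))
  )) %>%
  ungroup()
```

This keeps the original row order, so no conversion of a group column or extra arrange() step should be needed. It is likely slower than a vectorised computation on large data, since the function is still called once per row.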