R:构建循环函数以基于variablename/index聚合变量
我的数据具有以下结构。数据集称为“w12”。这只是数据的摘录R:构建循环函数以基于variablename/index聚合变量,r,loops,for-loop,indexing,apply,R,Loops,For Loop,Indexing,Apply,我的数据具有以下结构。数据集称为“w12”。这只是数据的摘录 ID ### w1_panas1 ### w1_panas2 ### w1_panas4 1 ######## 5 ########## 3 ########### 2 2 ######## 4 ########## 3 ########### 1 3
ID ### w1_panas1 ### w1_panas2 ### w1_panas4
1 ######## 5 ########## 3 ########### 2
2 ######## 4 ########## 3 ########### 1
3 ######## 3 ########## 2 ########### 4
我为每个ID构建了一个新变量,其中包含三项的平均值(w1_panas1
-w1_panas3
):
w12$w**1**_panas_host <- apply (cbind(w12$w**1**_panas1, w12$w**1**_panas2, w12$w**1**_panas3), 1, mean, na.rm= "T")
结果变量应命名为
w**2**_panas
w**3**_panas
我不想写12次以上的函数,但想使用一个循环函数,它只通过将w1
更改为w2
更改为w3
更改为w4
自动构建变量
有人能帮忙吗?您的数据应该是这样的(或者您可以很容易地以这种格式获取):
set.seed(1)
df%汇总(平均值=平均值)
总目(平均数)
id周变量平均值
1巴拿马1 133
2 1巴拿马2 298
3 1巴拿马4 215
4 1 2巴拿马1 496
5 1 2 panas2 167
6 1 2巴拿马4 526
结果并不完全符合您想要的格式,但这是一个快速紧凑的解决方案,它显示了您想要知道的一切。首先,一些测试数据:
set.seed(123)
df <- data.frame(ID = rep(1:3, each = 3),
w1_panas1 = runif(9),
w1_panas2 = runif(9),
w1_panas3 = runif(9),
w2_panas1 = runif(9),
w2_panas2 = runif(9),
w2_panas3 = runif(9),
w3_panas1 = runif(9),
w3_panas2 = runif(9),
w3_panas3 = runif(9))
df
# ID w1_panas1 w1_panas2 w1_panas3 w2_panas1 w2_panas2 w2_panas3 w3_panas1 w3_panas2 w3_panas3
#1 1 0.2875775 0.45661474 0.3279207 0.59414202 0.7584595 0.13880606 0.56094798 0.2743836 0.7101824014
#2 1 0.7883051 0.95683335 0.9545036 0.28915974 0.2164079 0.23303410 0.20653139 0.8146400 0.0006247733
#3 1 0.4089769 0.45333416 0.8895393 0.14711365 0.3181810 0.46596245 0.12753165 0.4485163 0.4753165741
#4 2 0.8830174 0.67757064 0.6928034 0.96302423 0.2316258 0.26597264 0.75330786 0.8100644 0.2201188852
#5 2 0.9404673 0.57263340 0.6405068 0.90229905 0.1428000 0.85782772 0.89504536 0.8123895 0.3798165377
#6 2 0.0455565 0.10292468 0.9942698 0.69070528 0.4145463 0.04583117 0.37446278 0.7943423 0.6127710033
#7 3 0.5281055 0.89982497 0.6557058 0.79546742 0.4137243 0.44220007 0.66511519 0.4398317 0.3517979092
#8 3 0.8924190 0.24608773 0.7085305 0.02461368 0.3688455 0.79892485 0.09484066 0.7544752 0.1111354243
#9 3 0.5514350 0.04205953 0.5440660 0.47779597 0.1524447 0.12189926 0.38396964 0.6292211 0.2436194727
您应该花一些时间创建一个可复制的示例。以后的问题请看如何回答。非常感谢!稍后我需要一个分层数据集,所以格式正是我所需要的。非常感谢!下次我在我的问题中创建类似您的数据集时。刚从R开始。我能再问一个问题吗?操作符%>%是什么意思?这是一个“管道”操作符。您可以使用它创建一系列(管道)操作,这些操作一个接一个,并将数据从一个传递到下一个。有关更多信息,请参阅。
set.seed(1)
df <- data.frame(id = rep(1:10, 12), panas1 = sample(1:500, 120),
panas2 = sample(1:300, 120), panas4 = sample(1:700, 120),
week = rep(1:12, each=10))
head(df)
id panas1 panas2 panas4 week
1 1 133 298 215 1
2 2 186 149 405 1
3 3 286 145 636 1
4 4 452 52 100 1
5 5 101 224 289 1
6 6 445 134 147 1
library(reshape2); library(dplyr)
dfmeans <- melt(df, id.vars = c("id","week")) %>%
group_by(id, week, variable) %>% summarise(avg = mean(value))
head(dfmeans)
id week variable avg
1 1 1 panas1 133
2 1 1 panas2 298
3 1 1 panas4 215
4 1 2 panas1 496
5 1 2 panas2 167
6 1 2 panas4 526
set.seed(123)
df <- data.frame(ID = rep(1:3, each = 3),
w1_panas1 = runif(9),
w1_panas2 = runif(9),
w1_panas3 = runif(9),
w2_panas1 = runif(9),
w2_panas2 = runif(9),
w2_panas3 = runif(9),
w3_panas1 = runif(9),
w3_panas2 = runif(9),
w3_panas3 = runif(9))
df
# ID w1_panas1 w1_panas2 w1_panas3 w2_panas1 w2_panas2 w2_panas3 w3_panas1 w3_panas2 w3_panas3
#1 1 0.2875775 0.45661474 0.3279207 0.59414202 0.7584595 0.13880606 0.56094798 0.2743836 0.7101824014
#2 1 0.7883051 0.95683335 0.9545036 0.28915974 0.2164079 0.23303410 0.20653139 0.8146400 0.0006247733
#3 1 0.4089769 0.45333416 0.8895393 0.14711365 0.3181810 0.46596245 0.12753165 0.4485163 0.4753165741
#4 2 0.8830174 0.67757064 0.6928034 0.96302423 0.2316258 0.26597264 0.75330786 0.8100644 0.2201188852
#5 2 0.9404673 0.57263340 0.6405068 0.90229905 0.1428000 0.85782772 0.89504536 0.8123895 0.3798165377
#6 2 0.0455565 0.10292468 0.9942698 0.69070528 0.4145463 0.04583117 0.37446278 0.7943423 0.6127710033
#7 3 0.5281055 0.89982497 0.6557058 0.79546742 0.4137243 0.44220007 0.66511519 0.4398317 0.3517979092
#8 3 0.8924190 0.24608773 0.7085305 0.02461368 0.3688455 0.79892485 0.09484066 0.7544752 0.1111354243
#9 3 0.5514350 0.04205953 0.5440660 0.47779597 0.1524447 0.12189926 0.38396964 0.6292211 0.2436194727
library(dplyr) # load the
library(tidyr) # required packages
df <- df %>%
group_by(ID) %>% # group the data by ID
mutate(n = row_number()) %>% # for each ID, create and index n
ungroup() # ungroup the data
df %>%
gather(panas, value, -c(ID, n)) %>% # reshape the data to long format
separate(panas, into = c("week", "number"), sep = "_") %>% # split the column "panas" into two columns based on the "_"
group_by(ID, week, n) %>% # group the data
summarise(mean = mean(value)) %>% # calculate mean values for each group
ungroup() %>% # ungroup..
spread(week, mean) %>% # reshape from long to wide format
left_join(df, ., by = c("ID", "n")) %>% # perform a join with the original data by ID and n so that all data is in one table
select(-n) # drop column "n"
#Source: local data frame [9 x 13]
#
# ID w1_panas1 w1_panas2 w1_panas3 w2_panas1 w2_panas2 w2_panas3 w3_panas1 w3_panas2 w3_panas3 w1 w2 w3
#1 1 0.2875775 0.45661474 0.3279207 0.59414202 0.7584595 0.13880606 0.56094798 0.2743836 0.7101824014 0.3573710 0.4971359 0.5151713
#2 1 0.7883051 0.95683335 0.9545036 0.28915974 0.2164079 0.23303410 0.20653139 0.8146400 0.0006247733 0.8998807 0.2462006 0.3405987
#3 1 0.4089769 0.45333416 0.8895393 0.14711365 0.3181810 0.46596245 0.12753165 0.4485163 0.4753165741 0.5839501 0.3104190 0.3504549
#4 2 0.8830174 0.67757064 0.6928034 0.96302423 0.2316258 0.26597264 0.75330786 0.8100644 0.2201188852 0.7511305 0.4868742 0.5944970
#5 2 0.9404673 0.57263340 0.6405068 0.90229905 0.1428000 0.85782772 0.89504536 0.8123895 0.3798165377 0.7178692 0.6343089 0.6957505
#6 2 0.0455565 0.10292468 0.9942698 0.69070528 0.4145463 0.04583117 0.37446278 0.7943423 0.6127710033 0.3809170 0.3836943 0.5938587
#7 3 0.5281055 0.89982497 0.6557058 0.79546742 0.4137243 0.44220007 0.66511519 0.4398317 0.3517979092 0.6945454 0.5504639 0.4855816
#8 3 0.8924190 0.24608773 0.7085305 0.02461368 0.3688455 0.79892485 0.09484066 0.7544752 0.1111354243 0.6156791 0.3974613 0.3201504
#9 3 0.5514350 0.04205953 0.5440660 0.47779597 0.1524447 0.12189926 0.38396964 0.6292211 0.2436194727 0.3791869 0.2507133 0.4189367