Reshaping with multiple "key" columns in R


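I would like to reshape a long-format data frame of longitudinal clinical trial data into a wide-format data frame. Here is an example of the long format I want to change: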

structure(list(study = structure(c(2L, 2L, 1L, 1L, 1L), .Label = c("Jones, 1996",
"Smith, 1999"), class = "factor"), group_allocation = 
structure(c(2L, 1L, 2L, 3L, 1L), .Label = c("control", "intervention_1", 
"intervention_2"), class = "factor"), outcome = structure(c(2L, 2L, 1L, 
1L, 1L), .Label = c("anxiety", "depression"), class = "factor"), bl_mean = 
c(6.5, 4.5, 3.7, 4.2, 5.3), fu_timepoint = c(6L, 6L, 12L, 12L, 12L), 
fu_mean = c(5.2, 7.5, 2.5, 2.7, 6.3), mean_diff = c(-2.3, NA, -3.8, -3.6, 
NA)), class = "data.frame", row.names = c(NA, -5L))

  study       group_allocation outcome bl_mean fu_timepoint fu_mean mean_diff
1 Smith, 1999 intervention_1 depression  6.5            6     5.2      -2.3
2 Smith, 1999 control        depression  4.5            6     7.5       NA
3 Jones, 1996 intervention_1 anxiety     3.7           12     2.5      -3.8
4 Jones, 1996 intervention_2 anxiety     4.2           12     2.7      -3.6
5 Jones, 1996 control        anxiety     5.3           12     6.3       NA
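My problem is that, for each study, I need only one observation/row per intervention group (labelled "intervention_1" and "intervention_2" in the group_allocation column), and I need the control group data (labelled "control" in the group_allocation column) moved into separate columns on the same row as each intervention group, so that I can run analyses comparing the intervention and control groups across the data frame. Here is what I want: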

structure(list(study = structure(c(2L, 1L, 1L), .Label = c("Jones, 1996", 
"Smith, 1999"), class = "factor"), ig_group_allocation = structure(c(1L, 
1L, 2L), .Label = c("intervention_1", "intervention_2"), class = 
"factor"), outcome = structure(c(2L, 1L, 1L), .Label = c("anxiety", 
"depression"), class = "factor"), ig_bl_mean = c(6.5, 3.7, 4.2), 
fu_timepoint = c(6L, 12L, 12L), ig_fu_mean = c(5.2, 2.5, 2.7), mean_diff = 
c(-2.3, -3.8, -3.6), cg_group_allocation = structure(c(1L, 1L, 1L), .Label 
= "control", class = "factor"), cg_bl_mean = c(4.5, 5.3, 5.3), cg_fu_mean 
= c(7.5, 6.3, 6.3)), class = "data.frame", row.names = c(NA, -3L))

        study ig_group_allocation    outcome ig_bl_mean fu_timepoint ig_fu_mean mean_diff cg_group_allocation cg_bl_mean cg_fu_mean
1 Smith, 1999      intervention_1 depression        6.5            6        5.2      -2.3             control        4.5        7.5
2 Jones, 1996      intervention_1    anxiety        3.7           12        2.5      -3.8             control        5.3        6.3
3 Jones, 1996      intervention_2    anxiety        4.2           12        2.7      -3.6             control        5.3        6.3
I have read many other data reshaping questions on Stack Overflow, but I have not found a solution to a problem like mine.

Thanks, everyone!

Split the data into two data frames, one for the controls and one for the interventions, then merge them back together:

df
        study group_allocation    outcome bl_mean fu_timepoint fu_mean mean_diff
1 Smith, 1999   intervention_1 depression     6.5            6     5.2      -2.3
2 Smith, 1999          control depression     4.5            6     7.5        NA
3 Jones, 1996   intervention_1    anxiety     3.7           12     2.5      -3.8
4 Jones, 1996   intervention_2    anxiety     4.2           12     2.7      -3.6
5 Jones, 1996          control    anxiety     5.3           12     6.3        NA

 interventions<-df[grep("intervention", df$group_allocation),]

 interventions
        study group_allocation    outcome bl_mean fu_timepoint fu_mean mean_diff
1 Smith, 1999   intervention_1 depression     6.5            6     5.2      -2.3
3 Jones, 1996   intervention_1    anxiety     3.7           12     2.5      -3.8
4 Jones, 1996   intervention_2    anxiety     4.2           12     2.7      -3.6


 controls<-df[grep("control", df$group_allocation),]

 controls
        study group_allocation    outcome bl_mean fu_timepoint fu_mean mean_diff
2 Smith, 1999          control depression     4.5            6     7.5        NA
5 Jones, 1996          control    anxiety     5.3           12     6.3        NA

 names(controls)<-paste0("cg_", names(controls)) #add cg prefix to colnames

 new_df<-merge(interventions, controls, by.x="study", by.y="cg_study", all.x=TRUE)

 new_df
        study group_allocation    outcome bl_mean fu_timepoint fu_mean mean_diff cg_group_allocation cg_outcome cg_bl_mean cg_fu_timepoint cg_fu_mean cg_mean_diff
1 Jones, 1996   intervention_1    anxiety     3.7           12     2.5      -3.8             control    anxiety        5.3              12        6.3           NA
2 Jones, 1996   intervention_2    anxiety     4.2           12     2.7      -3.6             control    anxiety        5.3              12        6.3           NA
3 Smith, 1999   intervention_1 depression     6.5            6     5.2      -2.3             control depression        4.5               6        7.5           NA
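
The merge above matches on study alone and keeps the duplicated cg_outcome, cg_fu_timepoint and cg_mean_diff columns. If a study could report more than one outcome or follow-up timepoint, matching on several key columns at once may be safer. The following is only a sketch under that assumption: the choice of study, outcome and fu_timepoint as keys, the helper add_prefix, and the names keys, arm_cols and wide are illustrative and not part of the original answer.

```r
# Rebuild the long-format example from the question (plain character columns).
df <- data.frame(
  study = c("Smith, 1999", "Smith, 1999", "Jones, 1996", "Jones, 1996", "Jones, 1996"),
  group_allocation = c("intervention_1", "control", "intervention_1",
                       "intervention_2", "control"),
  outcome = c("depression", "depression", "anxiety", "anxiety", "anxiety"),
  bl_mean = c(6.5, 4.5, 3.7, 4.2, 5.3),
  fu_timepoint = c(6L, 6L, 12L, 12L, 12L),
  fu_mean = c(5.2, 7.5, 2.5, 2.7, 6.3),
  mean_diff = c(-2.3, NA, -3.8, -3.6, NA),
  stringsAsFactors = FALSE
)

# Assumption: these columns together identify which control row
# belongs with which intervention row.
keys <- c("study", "outcome", "fu_timepoint")
# Columns whose values differ between the intervention and control arms.
arm_cols <- c("group_allocation", "bl_mean", "fu_mean")

# Hypothetical helper: add a prefix to the selected column names only.
add_prefix <- function(d, cols, prefix) {
  idx <- match(cols, names(d))
  names(d)[idx] <- paste0(prefix, cols)
  d
}

is_control <- df$group_allocation == "control"

# Intervention rows keep every column; control rows keep only the keys
# and the arm-specific columns, so nothing is duplicated after merging.
interventions <- add_prefix(df[!is_control, ], arm_cols, "ig_")
controls <- add_prefix(df[is_control, c(keys, arm_cols)], arm_cols, "cg_")

# Left join on all key columns: one row per intervention arm.
wide <- merge(interventions, controls, by = keys, all.x = TRUE)
wide
```

Note that merge() puts the key columns first and sorts the result by them, so the column and row order differ from the desired output in the question; reorder with wide[, desired_order] if that matters.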

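In cases like this, it is best to include both what the data currently looks like and the expected output. That way people can think it through before copying the data!

Sorry, I assumed you could tell the current and expected output from the dput() command. How exactly would you like me to show what the data looks like?

Just run it and copy and paste the head of each. Leave the dput in the question too!

OK, thanks for the clarification. I have updated the question as you suggested.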