Permutation test in R, and seeing which assignments lead to larger outcomes
I have a table like this:
Index  Treatment  Y(0)  Y(1)
1      0          10    ?
2      0          20    ?
3      0          15    ?
4      1          ?     5
5      1          ?     9
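I want to permute over all assignment mechanisms in which 3 units are assigned to control and 2 to treatment. In other words, I don't want sets that are all 1s or all 0s, or four 1s and one 0 (or the reverse). I want every permutation to have three 0s and two 1s, with different units in each group. Then I'd like to see which versions of these assignments (e.g., unit 1 assigned to treatment, unit 2 to control, unit 3 to treatment, unit 4 to control, and unit 5 to treatment) lead to an outcome as extreme as the one observed. How would I go about doing this in R?

This is pretty simple: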
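# assumes df has an outcome column Y and an assignment column Treatment
n <- 1000 # number of random shuffles
# shuffle the outcomes, then take the difference in group means (group 1 minus group 0)
replicate(n, diff( by(df$Y[sample(nrow(df))],
                      df$Treatment,
                      mean) ) )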
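To turn those shuffled differences into a p-value, they have to be compared with the observed difference. Here is a minimal sketch of that step, assuming the observed outcomes from the question are stored in a single column Y. The data frame df, the helper diff_in_means and the small numerical tolerance are my own illustrative choices, not part of the snippet above.

```r
set.seed(1)  # only so the sketch is reproducible

# Observed outcomes from the question, with the observed assignment
df <- data.frame(Y = c(10, 20, 15, 5, 9),
                 Treatment = c(0, 0, 0, 1, 1))

# Difference in means, treated minus control (hypothetical helper)
diff_in_means <- function(y, z) mean(y[z == 1]) - mean(y[z == 0])

# Observed test statistic
obs <- diff_in_means(df$Y, df$Treatment)

# Shuffle the outcomes against the fixed labels n times
n <- 1000
sims <- replicate(n, diff_in_means(sample(df$Y), df$Treatment))

# Two-sided Monte Carlo p-value: share of shuffles at least as extreme as observed
p_value <- mean(abs(sims) >= abs(obs) - 1e-9)
p_value
```

With only 5 units the random shuffles revisit the same few assignments many times, so this approximates what a full enumeration would give exactly.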
I assume you want to exhaustively check all 120 permutations:
library(data.table)
library(reshape2)
y <- c(10,20,15,5,9)
#getting all combinations
allwduplicate <- data.table(expand.grid(p1 = 1:5, p2 = 1:5, p3 = 1:5, p4 = 1:5, p5 = 1:5) )
perms <- allwduplicate[(p1+p2+p3+p4+p5 == sum(1:5)) & (p1*p2*p3*p4*p5 == prod(1:5))]
#melting dataset into easier structure
perms[,permid := 1:prod(1:5)]
perms <- data.table(melt(perms, id.vars = 'permid'))
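# looking up the outcome value of the unit placed in each position
perms[,yvalue := y[value]]
# assigning whether treated or not: positions p1 and p2 are untreated, the rest treated
# (the question asks for 2 treated and 3 control; swap the 0/1 labels to match it)
perms[,treated := 1]
perms[variable %in% c('p1','p2'),treated := 0]
# calculating means of the 3 treated vs. the 2 non-treated units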
perms <- merge(
perms[treated == 1,list(yvmean1 = mean(yvalue)), by = c('permid')],
perms[treated == 0,list(yvmean0 = mean(yvalue)), by = c('permid')],
by = 'permid'
)
# treatementdiff is the value you want, I think
perms[,treatementdiff := yvmean1 - yvmean0]
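Only the choice of which two units are treated affects the difference in means, so the 120 orderings collapse into choose(5, 2) = 10 distinct assignments. Below is a rough base-R sketch of that shorter enumeration, which also flags the assignments that are at least as extreme as the observed one (units 4 and 5 treated). The names diff_in_means, assignments and results are illustrative, and the tolerance is only a guard against floating-point noise.

```r
y <- c(10, 20, 15, 5, 9)       # observed outcomes for units 1..5
observed_treated <- c(4, 5)    # units treated in the observed data

# Difference in means (treated minus control) for a given set of treated units
diff_in_means <- function(treated) mean(y[treated]) - mean(y[-treated])

# Every way to pick 2 treated units out of 5: a 2 x 10 matrix, one column each
assignments <- combn(length(y), 2)

est <- apply(assignments, 2, diff_in_means)
obs <- diff_in_means(observed_treated)

results <- data.frame(
  treated    = apply(assignments, 2, paste, collapse = ","),
  diff       = est,
  as_extreme = abs(est) >= abs(obs) - 1e-9,
  stringsAsFactors = FALSE
)

# Assignments whose difference is at least as extreme as the observed one
results[results$as_extreme, ]

# Exact two-sided p-value
p_value <- mean(results$as_extreme)
p_value
```

The flagged rows answer the "which assignments" part of the question directly, and the share of flagged rows is the exact two-sided p-value.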
You can do this nicely with the ri2 package, which you can install with install.packages("ri2"):
library(ri2)
dat <- data.frame(Y = c(10, 20, 15, 5, 9),
Z = c(0, 0, 0, 1, 1))
declaration <- declare_ra(N = 5, m = 2)
# All 10 possibilities
obtain_permutation_matrix(declaration)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 0 0 0 0 0 0 1 1 1 1
#> [2,] 0 0 0 1 1 1 0 0 0 1
#> [3,] 0 1 1 0 0 1 0 0 1 0
#> [4,] 1 0 1 0 1 0 0 1 0 0
#> [5,] 1 1 0 1 0 0 1 0 0 0
# Do randomization inference
ri_out <- conduct_ri(formula = Y ~ Z, declaration = declaration, data = dat)
# check out the 10 possibilities
ri_out$sims_df
#> est_sim est_obs coefficient
#> Z.1 -8.0000000 -8 Z
#> Z.2 0.3333333 -8 Z
#> Z.3 -3.0000000 -8 Z
#> Z.4 4.5000000 -8 Z
#> Z.5 1.1666667 -8 Z
#> Z.6 9.5000000 -8 Z
#> Z.7 -3.8333333 -8 Z
#> Z.8 -7.1666667 -8 Z
#> Z.9 1.1666667 -8 Z
#> Z.10 5.3333333 -8 Z
# Get a p-value
summary(ri_out)
#> coefficient estimate two_tailed_p_value null_ci_lower null_ci_upper
#> 1 Z -8 0.2 -7.8125 8.5625
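To see which of the 10 assignments are the ones counted as extreme, the simulated estimates can be lined up with the columns of the permutation matrix. This is only a sketch: it assumes that row i of ri_out$sims_df corresponds to column i of the permutation matrix, which matches the output shown above but is not something I have confirmed in the package documentation. The tolerance is my own addition to guard against floating-point noise.

```r
library(ri2)

dat <- data.frame(Y = c(10, 20, 15, 5, 9),
                  Z = c(0, 0, 0, 1, 1))
declaration <- declare_ra(N = 5, m = 2)
ri_out <- conduct_ri(formula = Y ~ Z, declaration = declaration, data = dat)

perm_mat <- obtain_permutation_matrix(declaration)
sims <- ri_out$sims_df

# Assumption: row i of sims_df was computed from column i of perm_mat
extreme <- abs(sims$est_sim) >= abs(sims$est_obs) - 1e-8

# Each remaining column is one assignment (1 = treated) as extreme as the observed one
perm_mat[, extreme, drop = FALSE]
```

The share of flagged columns is the same quantity that summary(ri_out) reports as the two-tailed p-value.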