R: ANOVA output as a matrix of p-values for x and y variables

I have many X and Y variables (about 500 x 500). The following is just a small example dataset:

yvars <- data.frame (Yv1 = rnorm(100, 5, 3), Y2 = rnorm (100, 6, 4),
  Yv3 = rnorm (100, 14, 3))
xvars <- data.frame (Xv1 = sample (c(1,0, -1), 100, replace = T),
 X2 = sample (c(1,0, -1), 100, replace = T), 
 Xv3 = sample (c(1,0, -1), 100, replace = T), 
 D = sample (c(1,0, -1), 100, replace = T))
Here is my attempt to loop over the process:

prob = NULL
   anova.pmat <- function (x) {
            mydata <- data.frame(yvar = yvars[, x], xvars)
            for (i in seq(length(xvars))) {
              prob[[i]] <- anova(lm(yvar ~ mydata[, i + 1],
              data = mydata))$`Pr(>F)`[1]
              }
              }
    sapply (yvars,anova.pmat)
    Error in .subset(x, j) : only 0's may be mixed with negative subscripts
What could be the solution?
Here is a solution that generates all combinations of Y and X variables to test (we cannot use combn for this) and runs a linear model for each pair:

dfrm <- data.frame(y=gl(ncol(yvars), ncol(xvars), labels=names(yvars)),
                   x=gl(ncol(xvars), 1, labels=names(xvars)), pval=NA)
## little helper function to create formula on the fly
fm <- function(x) as.formula(paste(unlist(x), collapse="~"))
## merge both datasets
full.df <- cbind.data.frame(yvars, xvars)
## apply our LM row-wise
dfrm$pval <- apply(dfrm[,1:2], 1, 
                   function(x) anova(lm(fm(x), full.df))$`Pr(>F)`[1])
## arrange everything in a rectangular matrix of p-values
res <- matrix(dfrm$pval, ncol=ncol(yvars), dimnames=list(levels(dfrm$x), levels(dfrm$y)))
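The same idea of enumerating every Y and X pair can also be sketched with expand.grid(), which builds the cross product that combn does not. This is only an illustrative variant, not the answer's own code: the names combos, pval_one and res2 and the fixed seed are assumptions made for the example.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
yvars <- data.frame(Yv1 = rnorm(100, 5, 3), Y2 = rnorm(100, 6, 4),
                    Yv3 = rnorm(100, 14, 3))
xvars <- data.frame(Xv1 = sample(c(1, 0, -1), 100, replace = TRUE),
                    X2  = sample(c(1, 0, -1), 100, replace = TRUE),
                    Xv3 = sample(c(1, 0, -1), 100, replace = TRUE),
                    D   = sample(c(1, 0, -1), 100, replace = TRUE))
full.df <- cbind(yvars, xvars)

# Every (x, y) pair; x varies fastest, so the pairs are ordered y by y
combos <- expand.grid(x = names(xvars), y = names(yvars),
                      stringsAsFactors = FALSE)

# Hypothetical helper: p-value of the single-predictor model y ~ x
pval_one <- function(x, y) {
  fit <- lm(reformulate(x, response = y), data = full.df)
  anova(fit)$`Pr(>F)`[1]
}
combos$pval <- mapply(pval_one, combos$x, combos$y)

# Fill column by column: rows are x variables, columns are y variables
res2 <- matrix(combos$pval, nrow = ncol(xvars), ncol = ncol(yvars),
               dimnames = list(names(xvars), names(yvars)))
```

Because the matrix dimensions come from ncol(xvars) and ncol(yvars), the sketch should not need changes when the number of variables grows.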

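Here is an approach that uses plyr to loop over the columns of the xvars and yvars data frames (treating each data frame as a list), return the appropriate p-value for each pair, and arrange the results into a matrix. Adding the row and column names is just an extra.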

library("plyr")

probs <- laply(xvars, function(x) {
    laply(yvars, function(y) {
        anova(lm(y~x))$`Pr(>F)`[1]
    })
})
rownames(probs) <- names(xvars)
colnames(probs) <- names(yvars)
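If plyr is not available, the same nested loop over columns can be sketched in base R with two sapply() calls. This is an assumed equivalent rather than part of the answer; the name probs2 and the fixed seed are made up for the example.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
yvars <- data.frame(Yv1 = rnorm(100, 5, 3), Y2 = rnorm(100, 6, 4),
                    Yv3 = rnorm(100, 14, 3))
xvars <- data.frame(Xv1 = sample(c(1, 0, -1), 100, replace = TRUE),
                    X2  = sample(c(1, 0, -1), 100, replace = TRUE),
                    Xv3 = sample(c(1, 0, -1), 100, replace = TRUE),
                    D   = sample(c(1, 0, -1), 100, replace = TRUE))

# Outer sapply walks the y columns, inner sapply walks the x columns.
# Each inner call returns a named vector (one p-value per x), and the
# outer call binds those vectors as columns, so names come for free.
probs2 <- sapply(yvars, function(y) {
  sapply(xvars, function(x) anova(lm(y ~ x))$`Pr(>F)`[1])
})
```

The result has one row per x variable and one column per y variable, matching the layout of probs.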

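It is usually best to start by getting the function to work with a single instance of the loop variable. Then you can determine whether the loop body makes sense. At the moment I can't tell what you are attempting, but you should try calling anova.pmat with one "x" at a time. (The effort might give you a better appreciation of the false discovery rate in multiple testing.) I have difficulty understanding what you are trying to do with mydata.
@DWin Please see my recent edits.
@mac This will select one yvar and the other X variables (see edits), so different y variables can be passed at different times, I believe.
@nilS. I see you made edits, but you replaced the function with a for loop. That does not help you get the function working. You need to call the function with a single value of "x". Then you will find that setting "prob" to NULL has no effect inside the function. I think initializing it to a list or vector of the correct length would be more helpful.
(+1) Cleaner than my solution!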
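The advice in these comments can be sketched as follows. In the question, sapply(yvars, anova.pmat) appears to hand the function each column's values rather than a column index, so yvars[, x] is subscripted with a mix of positive and negative numbers, which would explain the error. The version below is an assumed rewrite, not the asker's final code: it takes a single column index, pre-allocates prob, and returns it explicitly. The last two lines show a Benjamini-Hochberg adjustment as one possible way to address the false discovery rate mentioned above. The names one, pmat and padj and the fixed seed are made up for the example.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
yvars <- data.frame(Yv1 = rnorm(100, 5, 3), Y2 = rnorm(100, 6, 4),
                    Yv3 = rnorm(100, 14, 3))
xvars <- data.frame(Xv1 = sample(c(1, 0, -1), 100, replace = TRUE),
                    X2  = sample(c(1, 0, -1), 100, replace = TRUE),
                    Xv3 = sample(c(1, 0, -1), 100, replace = TRUE),
                    D   = sample(c(1, 0, -1), 100, replace = TRUE))

# Takes the index j of ONE y column; returns one p-value per x column
anova.pmat <- function(j) {
  mydata <- data.frame(yvar = yvars[, j], xvars)
  prob <- numeric(ncol(xvars))        # pre-allocated to the right length
  names(prob) <- names(xvars)
  for (i in seq_along(xvars)) {
    fit <- lm(reformulate(names(xvars)[i], response = "yvar"), data = mydata)
    prob[i] <- anova(fit)$`Pr(>F)`[1]
  }
  prob                                # return the vector explicitly
}

one <- anova.pmat(1)                  # check a single y column first
pmat <- sapply(seq_along(yvars), anova.pmat)  # then loop over indices
colnames(pmat) <- names(yvars)

# Benjamini-Hochberg adjustment over all tests, keeping the matrix shape
padj <- pmat
padj[] <- p.adjust(pmat, method = "BH")
```

Calling the function on one index before wrapping it in sapply() makes it easy to see whether the loop body does what was intended.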
for (j in seq(length (yvars))){
        prob <- NULL
        mydata <- data.frame(yvar = yvars[, j], xvars)
         for (i in seq(length(xvars))) {
                  prob[[i]] <- anova(lm(yvar ~ mydata[, i + 1],
                  data = mydata))$`Pr(>F)`[1]
                  }
}

Gives the same result as above!
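Note that the loop above resets prob to NULL on every pass of j, so only the p-values for the last y column remain when it finishes. A minimal sketch that keeps every result, assuming a pre-allocated matrix named pmat (a name chosen for this example, as is the fixed seed), could look like this:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
yvars <- data.frame(Yv1 = rnorm(100, 5, 3), Y2 = rnorm(100, 6, 4),
                    Yv3 = rnorm(100, 14, 3))
xvars <- data.frame(Xv1 = sample(c(1, 0, -1), 100, replace = TRUE),
                    X2  = sample(c(1, 0, -1), 100, replace = TRUE),
                    Xv3 = sample(c(1, 0, -1), 100, replace = TRUE),
                    D   = sample(c(1, 0, -1), 100, replace = TRUE))

# One cell per (x, y) pair, allocated once before the loops start
pmat <- matrix(NA_real_, nrow = ncol(xvars), ncol = ncol(yvars),
               dimnames = list(names(xvars), names(yvars)))

for (j in seq_along(yvars)) {
  for (i in seq_along(xvars)) {
    # [[ ]] extracts a single column as a plain vector
    pmat[i, j] <- anova(lm(yvars[[j]] ~ xvars[[i]]))$`Pr(>F)`[1]
  }
}
```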