R: How to use apply to run chisq.test in a loop


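I am new to R. For my project I need to run a chi-squared test on one hundred thousand entries.

I taught myself for a few days and wrote some code that runs chisq.test in a loop. The code:

This code may have several problems, but it works.

However, it runs very slowly.

I tried to improve it by using apply.

I planned to use apply twice instead of for.

However, I get an error saying that matrix is not a function. Also, the output of chisq.test is a list, so I cannot use write.table to output the data.

The data looks like this: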

SN0001 and 9 numbers
           cohort_1_AA cohort_1_AB cohort_1_BB cohort_2_AA cohort_2_AB cohort_2_BB cohort_3_AA cohort_3_AB cohort_3_BB
SN0001     197         964        1088       877      858      168     351    435      20
....
....

I have been working on this day and night. I hope someone can help me.

Thank you very much.

One for loop means one apply, not two.

Something like this:

result=apply(the.data, 1, function(data.row) {
   ## Your code using data.row
})

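If the result is more readable than the loop, use it. Otherwise, stick with what you have. Switching to apply will not make a noticeable difference in speed, in either direction.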
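To make the equivalence concrete, here is a minimal sketch of the same per-row computation written both ways. The helper row_p_value and the matrix contents are made up for illustration: only the first row comes from the question, the other two rows are invented counts.

```r
# Hypothetical example data: first row from the question, the others invented.
the.data <- rbind(
  SN0001 = c(197, 964, 1088, 877, 858, 168, 351, 435, 20),
  SN0002 = c(300, 900, 1000, 310, 880, 990, 120, 400, 300),
  SN0003 = c(500, 500, 500, 450, 520, 530, 200, 210, 190)
)

# Hypothetical helper: p-value of the chi-squared test on one row,
# reshaped into a 3 x 3 table (one table row per cohort).
row_p_value <- function(data.row) {
  chisq.test(matrix(data.row, nrow = 3, byrow = TRUE))$p.value
}

# Version 1: a single for loop with a preallocated result vector.
p_loop <- numeric(nrow(the.data))
for (i in seq_len(nrow(the.data))) {
  p_loop[i] <- row_p_value(the.data[i, ])
}

# Version 2: a single apply over the rows (MARGIN = 1).
p_apply <- apply(the.data, 1, row_p_value)

# Both versions compute the same numbers; apply also keeps the row names.
all.equal(p_loop, unname(p_apply))
```

The loop body and the function passed to apply do exactly the same work, which is why neither version should be expected to run meaningfully faster than the other.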

To use the apply family of functions, it is easiest to first define our own function and then apply it. Let's do that.

## First define the function to apply
Chsq <- function(x) {
  ## Input is one row of the data (9 counts).
  ## Reshape the row into a 3 x 3 table, one table row per cohort.
  x <- matrix(x, byrow = TRUE, nrow = 3)
  ## Return only the p-value of the test
  chisq.test(x)$p.value
}
## Now apply this function to every row
data <- read.table("test_chisq_allelefrq.txt", header = TRUE, sep = "\t", row.names = 1)
## as.vector converts the output into a plain vector
P_Values <- as.vector(apply(data, 1, Chsq))
## Name the ID column so that the header written below lines up with the data
result <- cbind(ID = rownames(data), P_Values)
write.table(result, file = "chisq-test_output.txt", append = FALSE, quote = FALSE, sep = "\t", eol = "\n", na = "NA", dec = ".", row.names = FALSE, col.names = TRUE)
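The question also notes that chisq.test returns a list, which write.table cannot write directly. One possible way around this, sketched below, is to pull the scalar components you need out of each result and assemble them into a data frame. The helper name chsq_fields and the second example row are hypothetical; statistic, parameter and p.value are components of the object that chisq.test returns.

```r
# Hypothetical helper: return the scalar pieces of one test as a named vector.
chsq_fields <- function(x) {
  test <- chisq.test(matrix(x, nrow = 3, byrow = TRUE))
  c(statistic = unname(test$statistic),
    df        = unname(test$parameter),
    p.value   = test$p.value)
}

# Example input: first row from the question, second row invented.
counts <- rbind(
  SN0001 = c(197, 964, 1088, 877, 858, 168, 351, 435, 20),
  SN0002 = c(300, 900, 1000, 310, 880, 990, 120, 400, 300)
)

# apply returns one column per input row here, so transpose the result.
fields <- t(apply(counts, 1, chsq_fields))
out <- data.frame(id = rownames(fields), fields, row.names = NULL)

# A data frame of plain columns can be written with write.table.
# tempfile() is used only to keep the sketch free of side effects.
write.table(out, file = tempfile(fileext = ".txt"),
            quote = FALSE, sep = "\t", row.names = FALSE)
```

Returning a fixed-length named vector from the helper keeps the output rectangular, so adding another field later only means adding one more element to that vector.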

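apply is not a silver bullet. The only overhead that for adds is growing the result vector, but that is small compared to the tests themselves, and you can avoid it by preallocating. apply should be used when it makes the expression easier to read, not for speed (because it will not be faster).

Thank you for your suggestion. Then how can I improve the speed? I just searched, and people mention the plyr package. Would plyr help?

@user3766160 Probably not. Learn how to profile your code and look for bottlenecks. You can start here:

Thank you, 库迪. The code works. I just realized that apply does not improve the speed; it is basically the same as for. I can make it work with for, but my goal is to make it faster, because there are one hundred thousand chisq tests. Do you have any suggestions?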
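One possible direction for the speed question, which is not taken from the answers in this thread, is to skip the per-row call to chisq.test and compute the Pearson statistic for all rows at once with matrix arithmetic. The sketch below assumes every row is a 3 x 3 table in the column order shown in the question and that no cohort or genotype total is zero. The function name fast_chisq_p and the simulated counts are hypothetical. For tables larger than 2 x 2, chisq.test applies no continuity correction, so the two approaches should agree.

```r
# Vectorised Pearson chi-squared test for many 3 x 3 tables at once.
# Each row of m holds 9 counts laid out as in the question:
# cohort 1 (AA, AB, BB), cohort 2 (AA, AB, BB), cohort 3 (AA, AB, BB).
fast_chisq_p <- function(m) {
  m <- as.matrix(m)
  n <- rowSums(m)  # grand total of each table
  # Cohort totals of every table, one column per cohort.
  cohort_tot <- cbind(rowSums(m[, 1:3, drop = FALSE]),
                      rowSums(m[, 4:6, drop = FALSE]),
                      rowSums(m[, 7:9, drop = FALSE]))
  # Genotype totals of every table, one column per genotype.
  geno_tot <- m[, 1:3, drop = FALSE] + m[, 4:6, drop = FALSE] +
              m[, 7:9, drop = FALSE]
  # Expected counts in the same 9-column layout as m.
  expected <- cohort_tot[, rep(1:3, each = 3), drop = FALSE] *
              geno_tot[, rep(1:3, times = 3), drop = FALSE] / n
  stat <- rowSums((m - expected)^2 / expected)
  # A 3 x 3 table has (3 - 1) * (3 - 1) = 4 degrees of freedom.
  pchisq(stat, df = 4, lower.tail = FALSE)
}

# Simulated counts (not real data) to compare against chisq.test.
set.seed(1)
sim <- matrix(rpois(9 * 1000, lambda = 200), ncol = 9)
p_fast <- fast_chisq_p(sim)
p_ref  <- apply(sim, 1, function(x) {
  chisq.test(matrix(x, nrow = 3, byrow = TRUE))$p.value
})
all.equal(p_fast, p_ref)
```

Wrapping each of the two computations in system.time() on your own data is a simple way to check whether the difference matters in practice before committing to either version.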
