如何在R中执行pslda，显示列表中几个元素的错误率？_R_List_Loops_Lapply

如何在R中执行pslda，显示列表中几个元素的错误率？

r list loops

如何在R中执行pslda，显示列表中几个元素的错误率？,r,list,loops,lapply,R,List,Loops,Lapply,我在R中对10个数据帧（10个研究区域的数据）执行splsda模型，存储为列表（datalist）。所有这些数据帧都是相似的，具有相同的变量，但只是不同的值。我使用micromics库来做这件事这是第一个研究区域的负责人。它根据不同范围的TPI值，比较是否存在湿地（因子变量-wetl或无wetl） > head(datalist[[1]]) OID POINTID WETLAND TPI200 TPI350 TPI500 TPI700 TPI900 TPI1000 TPI2000

我在R中对10个数据帧（10个研究区域的数据）执行splsda模型，存储为列表（

datalist

）。所有这些数据帧都是相似的，具有相同的变量，但只是不同的值。我使用micromics库来做这件事

这是第一个研究区域的负责人。它根据不同范围的TPI值，比较是否存在湿地（因子变量-wetl或无wetl）

> head(datalist[[1]])
  OID POINTID WETLAND TPI200 TPI350 TPI500 TPI700 TPI900 TPI1000 TPI2000 TPI3000 TPI4000 TPI5000 TPI2500
1  -1       1 no wetl     70     67     55     50     48      46      53      47      49      63      48
2  -1       2 no wetl     37     42     35     29     32      16      17      35      49      63      26
3  -1       3 no wetl     45     55     45     39     41      41      53      47      49      63      48
4  -1       4 no wetl     46     58     51     43     46      36      54      47      49      62      49
5  -1       5 no wetl     58     55     53     49     47      46      54      47      49      62      49
6  -1       6 no wetl     56     53     51     49     46      46      54      47      49      61      49

我使用以下代码完成了交叉验证步骤：

library(mixOmics)

for (i in 1: length(model_list))
{
  myperf_plsda <- perf(model_list[[i]], validation = "Mfold", folds = 10, 
                            progressBar = FALSE, nrepeat = 10, auc = TRUE)
  save(myperf_plsda, file="performancePLSDA.RData")
}

因此，首先，我试图绘制prinipal组件函数中的错误（=plot，上面一个研究区域的第一个代码）。结果会像我想要的pdf格式一样
第二，我想知道总体错误率和每门课的错误率，上面提到的一个研究领域的代码就是从中得出的。一个研究区域的结果如下所示，例如：

总错误率：

每类错误率：

我尝试了一些方法，在
for
循环中，或使用
lappy
，以获得10个研究区域的这些结果
，例如：

### To see how many PCs is best ### pdf('overallerrorrate_wetlall_small.pdf') for (i in 1:length(myperf_plsda)) { plot(model_list[[i]], col = color.mixo(5:7), sd = TRUE, legend.position = "horizontal") } dev.off()
或
或
或

但所有这些代码都不起作用！如何为列表中的多个元素运行代码？非常感谢
根据您的输出，您必须创建一个新列表并将结果保存在其中。使用myperf_plsda可以覆盖循环中的每个步骤。另外，您需要的大多数度量值都是列表，因此我添加了一些处理函数来访问数据帧。我使用了下一个虚拟数据：

library(mixOmics) #Function custom_splsda <- function(datalist, ncomp, keepX, ..., Xcols, Ycol){ Y <- datalist[[Ycol]] X <- datalist[Xcols] res <- splsda(X, Y, ncomp = ncomp, keepX = keepX, ...) res } #Data datalist <- list(df1 = structure(list(OID = c(-1, -1, -1, -1, -1, -1), POINTID = c(1, 2, 3, 4, 5, 6), WETLAND = c("no wetl", "no wetl", "no wetl", "wetl", "wetl", "wetl"), TPI200 = c(70, 37, 45, 46, 58, 56), TPI350 = c(67, 42, 55, 58, 55, 53), TPI500 = c(55, 35, 45, 51, 53, 51), TPI700 = c(50, 29, 39, 43, 49, 49), TPI900 = c(48, 32, 41, 46, 47, 46), TPI1000 = c(46, 16, 41, 36, 46, 46), TPI2000 = c(53, 17, 53, 54, 54, 54), TPI3000 = c(47, 35, 47, 47, 47, 47), TPI4000 = c(49, 49, 49, 49, 49, 49), TPI5000 = c(63, 63, 63, 62, 62, 61), TPI2500 = c(48, 26, 48, 49, 49, 49)), row.names = c(NA, 6L), class = "data.frame"), df2 = structure(list(OID = c(-1, -1, -1, -1, -1, -1), POINTID = c(1, 2, 3, 4, 5, 6), WETLAND = c("no wetl", "no wetl", "no wetl", "wetl", "wetl", "wetl"), TPI200 = c(70, 37, 45, 46, 58, 56), TPI350 = c(67, 42, 55, 58, 55, 53), TPI500 = c(55, 35, 45, 51, 53, 51), TPI700 = c(50, 29, 39, 43, 49, 49), TPI900 = c(48, 32, 41, 46, 47, 46), TPI1000 = c(46, 16, 41, 36, 46, 46), TPI2000 = c(53, 17, 53, 54, 54, 54), TPI3000 = c(47, 35, 47, 47, 47, 47), TPI4000 = c(49, 49, 49, 49, 49, 49), TPI5000 = c(63, 63, 63, 62, 62, 61), TPI2500 = c(48, 26, 48, 49, 49, 49)), row.names = c(NA, 6L), class = "data.frame"))
在前面的代码中，您将得到三个数据帧，其中包含根据
型号列表
的名称识别的值，您可以通过vars
id1
、
id2
和
genid
导航以查看度量值、组件和数据集：

error.rate.df max.dist centroids.dist mahalanobis.dist id1 id2 genid 1 0.2222222 0.2222222 0.2222222 overall comp1 df1 2 0.2777778 0.3888889 0.2777778 overall comp2 df1 3 0.2222222 0.2222222 0.2222222 BER comp1 df1 4 0.2777778 0.3888889 0.2777778 BER comp2 df1 5 0.2222222 0.2222222 0.2222222 overall comp1 df2 6 0.2777778 0.3333333 0.2777778 overall comp2 df2 7 0.2222222 0.2222222 0.2222222 BER comp1 df2 8 0.2777778 0.3333333 0.2777778 BER comp2 df2 error.rate.class.df comp1 comp2 id1 id2 genid 1 0.3333333 0.3333333 max_dist no wetl df1 2 0.1111111 0.2222222 max_dist wetl df1 3 0.3333333 0.6666667 centroids_dist no wetl df1 4 0.1111111 0.1111111 centroids_dist wetl df1 5 0.3333333 0.3333333 mahalanobis_dist no wetl df1 6 0.1111111 0.2222222 mahalanobis_dist wetl df1 7 0.3333333 0.3333333 max_dist no wetl df2 8 0.1111111 0.2222222 max_dist wetl df2 9 0.3333333 0.5555556 centroids_dist no wetl df2 10 0.1111111 0.1111111 centroids_dist wetl df2 11 0.3333333 0.3333333 mahalanobis_dist no wetl df2 12 0.1111111 0.2222222 mahalanobis_dist wetl df2 auc.df x id1 id2 genid 1 0.62966667 AUC_mean comp1 df1 2 0.06414361 AUC_sd comp1 df1 3 0.81483333 AUC_mean comp2 df1 4 0.06414361 AUC_sd comp2 df1 5 0.62966667 AUC_mean comp1 df2 6 0.06414361 AUC_sd comp1 df2 7 0.77780000 AUC_mean comp2 df2 8 0.11110000 AUC_sd comp2 df2
最后，对于绘图，您可以使用下一个代码（我已将数据集的名称指定给x标签，以便您可以在绘图中识别它）：

#打印并保存 #指定名称姓名（myperf_plsda）你能说说什么不起作用吗？在我看来，你似乎在试图策划。你认为结果会是什么样子？@RomanLuštrik，我在帖子中添加了关于预期结果的额外信息。现在更清楚了吗？嗨，鸭子，这有帮助！这已经是你第三次帮我了！非常感谢你，英雄！！ for (i in 1:length(myperf_plsda)) {myperf_plsda[[1]]error.rate myperf_plsda[[1]]error.rate.class myperf_plsda[[i]]auc } lapply(myperf_plsda, [[, 'error.rate')` library(mixOmics) #Function custom_splsda <- function(datalist, ncomp, keepX, ..., Xcols, Ycol){ Y <- datalist[[Ycol]] X <- datalist[Xcols] res <- splsda(X, Y, ncomp = ncomp, keepX = keepX, ...) res } #Data datalist <- list(df1 = structure(list(OID = c(-1, -1, -1, -1, -1, -1), POINTID = c(1, 2, 3, 4, 5, 6), WETLAND = c("no wetl", "no wetl", "no wetl", "wetl", "wetl", "wetl"), TPI200 = c(70, 37, 45, 46, 58, 56), TPI350 = c(67, 42, 55, 58, 55, 53), TPI500 = c(55, 35, 45, 51, 53, 51), TPI700 = c(50, 29, 39, 43, 49, 49), TPI900 = c(48, 32, 41, 46, 47, 46), TPI1000 = c(46, 16, 41, 36, 46, 46), TPI2000 = c(53, 17, 53, 54, 54, 54), TPI3000 = c(47, 35, 47, 47, 47, 47), TPI4000 = c(49, 49, 49, 49, 49, 49), TPI5000 = c(63, 63, 63, 62, 62, 61), TPI2500 = c(48, 26, 48, 49, 49, 49)), row.names = c(NA, 6L), class = "data.frame"), df2 = structure(list(OID = c(-1, -1, -1, -1, -1, -1), POINTID = c(1, 2, 3, 4, 5, 6), WETLAND = c("no wetl", "no wetl", "no wetl", "wetl", "wetl", "wetl"), TPI200 = c(70, 37, 45, 46, 58, 56), TPI350 = c(67, 42, 55, 58, 55, 53), TPI500 = c(55, 35, 45, 51, 53, 51), TPI700 = c(50, 29, 39, 43, 49, 49), TPI900 = c(48, 32, 41, 46, 47, 46), TPI1000 = c(46, 16, 41, 36, 46, 46), TPI2000 = c(53, 17, 53, 54, 54, 54), TPI3000 = c(47, 35, 47, 47, 47, 47), TPI4000 = c(49, 49, 49, 49, 49, 49), TPI5000 = c(63, 63, 63, 62, 62, 61), TPI2500 = c(48, 26, 48, 49, 49, 49)), row.names = c(NA, 6L), class = "data.frame")) #Create model_list, you must have the object created model_list <- lapply(datalist, custom_splsda, ncomp = 2, keepX = c(5, 5), Xcols = 4:8, Ycol = "WETLAND") #Create empty list myperf_plsda <- list() #Loop for objects and saving for (i in 1: length(model_list)) { myperf_plsda[[i]] <- perf(model_list[[i]], validation = "Mfold", folds = 3, progressBar = FALSE, nrepeat = 3, auc = TRUE) object <- myperf_plsda[[i]] save(object,file = paste0("performancePLSDA.",i,".RData")) } #Process the object myperf_plsda #First function to get elements extract1 <- function(x) { #Object error.rate <- x$error.rate error.rate <- lapply(error.rate, as.data.frame) #Process O1 <- do.call(rbind,error.rate) #Separate vars O1$id <- rownames(O1) rownames(O1) <- NULL O1$id1 <- gsub("\\..*","", O1$id ) O1$id2 <- gsub(".*\\.","", O1$id ) O1$id <- NULL return(O1) } #Function 2 extract2 <- function(x) { #Object error.rate.class <- x$error.rate.class names(error.rate.class) <- gsub('.','_',names(error.rate.class),fixed = T) error.rate.class <- lapply(error.rate.class, as.data.frame) #Process O2 <- do.call(rbind,error.rate.class) #Separate vars O2$id <- rownames(O2) rownames(O2) <- NULL O2$id1 <- gsub("\\..*","", O2$id ) O2$id2 <- gsub(".*\\.","", O2$id ) O2$id <- NULL return(O2) } #Function 3 extract3 <- function(x) { #Object auc <- x$auc #Modify for dataframe change <- function(x) { y <- as.data.frame(x) y$id1 <- rownames(y) rownames(y)<-NULL y$id1 <- gsub('.','_',y$id1,fixed = T) return(y) } auc <- lapply(auc, change) #Process O3 <- do.call(rbind,auc) #Separate vars O3$id2 <- rownames(O3) rownames(O3) <- NULL O3$id2 <- gsub("\\..*","", O3$id2 ) return(O3) } #Apply functions and save in lists for late process L1 <- lapply(myperf_plsda,extract1) L2 <- lapply(myperf_plsda,extract2) L3 <- lapply(myperf_plsda,extract3) #Assign the same names from model_list names(L1) <- names(model_list) names(L2) <- names(model_list) names(L3) <- names(model_list) #Bind the data #Error rate error.rate.df <- do.call(rbind,L1) error.rate.df$genid <- gsub("\\..*","", rownames(error.rate.df) ) rownames(error.rate.df) <- NULL #Error rate class error.rate.class.df <- do.call(rbind,L2) error.rate.class.df$genid <- gsub("\\..*","", rownames(error.rate.class.df) ) rownames(error.rate.class.df) <- NULL #Auc auc.df <- do.call(rbind,L3) auc.df$genid <- gsub("\\..*","", rownames(auc.df) ) rownames(auc.df) <- NULL error.rate.df max.dist centroids.dist mahalanobis.dist id1 id2 genid 1 0.2222222 0.2222222 0.2222222 overall comp1 df1 2 0.2777778 0.3888889 0.2777778 overall comp2 df1 3 0.2222222 0.2222222 0.2222222 BER comp1 df1 4 0.2777778 0.3888889 0.2777778 BER comp2 df1 5 0.2222222 0.2222222 0.2222222 overall comp1 df2 6 0.2777778 0.3333333 0.2777778 overall comp2 df2 7 0.2222222 0.2222222 0.2222222 BER comp1 df2 8 0.2777778 0.3333333 0.2777778 BER comp2 df2 error.rate.class.df comp1 comp2 id1 id2 genid 1 0.3333333 0.3333333 max_dist no wetl df1 2 0.1111111 0.2222222 max_dist wetl df1 3 0.3333333 0.6666667 centroids_dist no wetl df1 4 0.1111111 0.1111111 centroids_dist wetl df1 5 0.3333333 0.3333333 mahalanobis_dist no wetl df1 6 0.1111111 0.2222222 mahalanobis_dist wetl df1 7 0.3333333 0.3333333 max_dist no wetl df2 8 0.1111111 0.2222222 max_dist wetl df2 9 0.3333333 0.5555556 centroids_dist no wetl df2 10 0.1111111 0.1111111 centroids_dist wetl df2 11 0.3333333 0.3333333 mahalanobis_dist no wetl df2 12 0.1111111 0.2222222 mahalanobis_dist wetl df2 auc.df x id1 id2 genid 1 0.62966667 AUC_mean comp1 df1 2 0.06414361 AUC_sd comp1 df1 3 0.81483333 AUC_mean comp2 df1 4 0.06414361 AUC_sd comp2 df1 5 0.62966667 AUC_mean comp1 df2 6 0.06414361 AUC_sd comp1 df2 7 0.77780000 AUC_mean comp2 df2 8 0.11110000 AUC_sd comp2 df2 #Plot and save #Assign names names(myperf_plsda) <- names(model_list) pdf('example.pdf') for (i in 1:length(myperf_plsda)) { plot(myperf_plsda[[i]], col = color.mixo(5:7), sd = TRUE, legend.position = "horizontal",xlab = paste0(names(myperf_plsda)[i],' (Comp)')) } dev.off()