R: lapply breaks on blank files
I am trying to check the completion percentage of common elements (BGC**) across different sample files. My input files look like this:
file1.txt
-----------
contig SRR5947942_idxstats.txt
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 1
BGC0000972 0
BGC0000972 0
file2.txt
----------
contig SRR5947963_idxstats.txt
BGC0000581 0
BGC0000581 22
BGC0000581 60
BGC0000581 0
BGC0000972 14
BGC0000972 24
file1.txt
-----------
contig SRR5947942_idxstats.txt
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
I keep them in a single directory and run the script as follows:
filenames <- list.files(full.names = FALSE, pattern = ".txt")
output <- lapply(filenames, function(i) {
  t <- read.csv(i, header = TRUE, check.names = FALSE, sep = " ")
  t$gene_count <- 1
  t[, 2][t[, 2] > 0] <- 1
  presence_absence_df <- aggregate(. ~ contig, t, sum)
  presence_absence_df$sample_name <- names(t[2])
  colnames(presence_absence_df) <- c("BGC_Accession", "Gene_presence", "Gene_count", "Sample_name")
  presence_absence_df$Percentage <- (presence_absence_df$Gene_presence / presence_absence_df$Gene_count) * 100
  presence_absence_df <- presence_absence_df[presence_absence_df$Percentage != 0, ]
  presence_absence_df$tp_step2_100_percent <- length(presence_absence_df$Percentage[presence_absence_df$Percentage >= 100])
  presence_absence_df <- presence_absence_df[presence_absence_df$Percentage >= 100, ]
  presence_absence_df <- data.frame(presence_absence_df)
  presence_absence_df <- subset(presence_absence_df, select = -c(Gene_presence, Gene_count, Percentage))
  colnames(presence_absence_df) <- c("BGC_name", "Sample", "BGCs_step2_100_percent")
  presence_absence_df <- presence_absence_df[c("Sample", "BGCs_step2_100_percent", "BGC_name")]
})
Step2_results2_100 <- do.call(rbind, output)
Then I get:
Error in `$<-.data.frame`(`*tmp*`, "tp_step2_100_percent", value = 0L) :
replacement has 1 row, data has 0
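The message can be reproduced in isolation: assigning a length-one value to a column of a data frame that has no rows triggers it. The following is a minimal sketch, not part of the original script; the data frame `df` and the column name `n_complete` are hypothetical and exist only for illustration.

```r
# Hypothetical two-row data frame, used only to illustrate the error.
df <- data.frame(Percentage = c(0, 0))

# Filtering out zero percentages leaves a data frame with no rows.
df <- df[df$Percentage != 0, , drop = FALSE]

# Assigning a length-one value to a column of a zero-row data frame fails;
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch({
  df$n_complete <- length(df$Percentage[df$Percentage >= 100])
  "ok"
}, error = function(e) conditionMessage(e))

print(res)
```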
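The error occurs when every value in the second column is 0: after the `Percentage != 0` filter the data frame has no rows, so `$<-` cannot assign a length-one value to it. Return NULL for such files: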
output <- lapply(filenames, function(i) {
  t <- read.csv(i, header = TRUE, check.names = FALSE, sep = " ")
  # Skip files in which every value of the second column is 0
  if (all(t[[2]] == 0)) return(NULL)
  t$gene_count <- 1
  t[, 2][t[, 2] > 0] <- 1
  #Rest of the code
})
Step2_results2_100 <- do.call(rbind, output)
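For reference, here is one way the guard could fit into a complete, self-contained run. This is only a sketch under assumptions: `summarise_file()` is a hypothetical helper, it uses `tapply()` rather than the `aggregate()` steps from the question, and the two demo files are invented data written to a temporary directory.

```r
# Sketch only: summarise_file() is a hypothetical helper, not code from the thread.
summarise_file <- function(path) {
  dat <- read.csv(path, header = TRUE, check.names = FALSE, sep = " ")

  # Skip samples in which every count is 0.
  if (all(dat[[2]] == 0)) return(NULL)

  sample_name <- names(dat)[2]
  present <- dat[[2]] > 0

  # Percentage of genes with at least one hit, per BGC.
  pct <- tapply(present, dat$contig, mean) * 100
  complete <- names(pct)[pct >= 100]

  data.frame(
    Sample = rep(sample_name, length(complete)),
    BGCs_step2_100_percent = rep(length(complete), length(complete)),
    BGC_name = complete,
    stringsAsFactors = FALSE
  )
}

# Invented demo input: one all-zero file and one file with hits.
demo_dir <- tempfile("bgc_demo")
dir.create(demo_dir)
writeLines(c("contig S1_idxstats.txt", "BGC0000972 0", "BGC0000972 0"),
           file.path(demo_dir, "empty.txt"))
writeLines(c("contig S2_idxstats.txt", "BGC0000581 0", "BGC0000581 22",
             "BGC0000972 14", "BGC0000972 24"),
           file.path(demo_dir, "hits.txt"))

files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
output <- lapply(files, summarise_file)

# rbind() drops the NULL entries returned for all-zero files.
result <- do.call(rbind, output)
print(result)
```

If every file is all zeros, `do.call(rbind, output)` returns NULL rather than a data frame, so downstream code may need to check for that case.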
Debug to find out which line causes the error.
Thanks for your help. When I run your modification, I get blank output.
Sorry, I had a typo. Can you try the latest answer?