R: lapply breaks on blank files


I am trying to check the percentage completeness of common elements (BGC**) across different sample files. My input files look like this:

file1.txt
-----------

contig SRR5947942_idxstats.txt
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 1
BGC0000972 0
BGC0000972 0

file2.txt
----------
contig SRR5947963_idxstats.txt
BGC0000581 0
BGC0000581 22
BGC0000581 60
BGC0000581 0
BGC0000972 14
BGC0000972 24
file1.txt
-----------

contig SRR5947942_idxstats.txt
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
BGC0000972 0
I saved them in one directory and ran the script as follows:

filenames <- list.files(full.names=F, pattern=".txt")
output <-lapply(filenames,function(i){
  t<-read.csv(i, header=T, check.names = F, sep = " ")
  t$gene_count<-1
  t[,2][t[,2]>0]<-1
  presence_absence_df<-aggregate(. ~ contig, t, sum)
  presence_absence_df$sample_name<-names(t[2])
  colnames(presence_absence_df)<-c("BGC_Accession","Gene_presence", "Gene_count", "Sample_name")
  presence_absence_df$Percentage<-(presence_absence_df$Gene_presence/presence_absence_df$Gene_count)*100
  presence_absence_df<-presence_absence_df[presence_absence_df$Percentage != 0, ]
  presence_absence_df$tp_step2_100_percent<-length(presence_absence_df$Percentage[presence_absence_df$Percentage>=100])
  presence_absence_df<-presence_absence_df[presence_absence_df$Percentage >= 100, ]
  presence_absence_df<-data.frame(presence_absence_df)
  presence_absence_df <- subset(presence_absence_df, select = -c(Gene_presence, Gene_count, Percentage) )
  colnames(presence_absence_df)<-c("BGC_name", "Sample", "BGCs_step2_100_percent")
  presence_absence_df <- presence_absence_df [c("Sample", "BGCs_step2_100_percent", "BGC_name")]
})
Step2_results2_100<-do.call(rbind,output)
Then I get:

Error in `$<-.data.frame`(`*tmp*`, "tp_step2_100_percent", value = 0L) : 
  replacement has 1 row, data has 0

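The error in `$<-` occurs when all the values in the second column are 0: every row is filtered out, so the data frame has 0 rows when the new column is assigned. In that case, return `NULL`: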

output <-lapply(filenames,function(i) {
  t <- read.csv(i, header=T, check.names = F, sep = " ")
  if(all(t[[2]] == 0)) return(NULL)
  t$gene_count<-1
  t[,2][t[,2]>0]<-1
  #Rest of the code
})

Step2_results2_100 <- do.call(rbind,output)
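
For reference, here is a minimal self-contained sketch of the same idea on in-memory data frames. It is an alternative, not the code from the question: it uses `tapply` instead of `aggregate`, the helper name `complete_bgcs` is hypothetical, and the sample tables are small stand-ins modelled on the files shown above. It relies on `rbind` skipping `NULL` entries, so samples without any complete BGC simply drop out of the result.

```r
# Hypothetical helper: column 1 holds the BGC accession, column 2 holds the
# counts and is named after the sample (as in the idxstats files above).
complete_bgcs <- function(t) {
  sample_name <- names(t)[2]
  present <- as.integer(t[[2]] > 0)            # 1 if the gene has any reads
  pct <- tapply(present, t[[1]], mean) * 100   # % of genes present per BGC
  complete <- names(pct)[pct >= 100]           # BGCs with every gene present
  if (length(complete) == 0) return(NULL)      # nothing to report: skip sample
  data.frame(Sample = sample_name,
             BGCs_step2_100_percent = length(complete),
             BGC_name = complete,
             stringsAsFactors = FALSE)
}

# Stand-in tables (assumed data, shaped like the example files)
samples <- list(
  data.frame(contig = rep("BGC0000972", 6),
             SRR5947942_idxstats.txt = c(0, 0, 0, 0, 0, 0)),
  data.frame(contig = c(rep("BGC0000581", 4), rep("BGC0000972", 2)),
             SRR5947963_idxstats.txt = c(0, 22, 60, 0, 14, 24))
)

output <- lapply(samples, complete_bgcs)
Step2_results2_100 <- do.call(rbind, output)   # NULL entries are ignored
print(Step2_results2_100)
```

Checking the result of the filter, rather than the raw counts, also covers an input file that has a header but no data rows.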

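Debug `output` to detect which line causes the error.

Thanks for your help. When I run your modification, I get blank output.

Sorry, I had a typo. Can you try the latest answer?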
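
One possible way to find out which input triggers the error is to wrap the per-file work in `tryCatch`, so that a failure is reported instead of stopping the whole `lapply`. This is only a sketch: `process_one` is a hypothetical stand-in for the real per-file code, and the two tables are made-up examples. It fails in the same way as the original script, by assigning a column to a data frame that has 0 rows.

```r
# Hypothetical stand-in for the per-file processing
process_one <- function(t) {
  kept <- t[t[[2]] > 0, , drop = FALSE]   # keep genes with at least one read
  kept$n_present <- nrow(kept)            # errors when 'kept' has 0 rows
  kept
}

# Assumed example inputs; 'all_zero' mimics a file whose counts are all 0
samples <- list(
  some_reads = data.frame(contig = "BGC0000972", count = c(0, 1)),
  all_zero   = data.frame(contig = "BGC0000972", count = c(0, 0))
)

output <- lapply(names(samples), function(nm) {
  tryCatch(
    process_one(samples[[nm]]),
    error = function(e) {
      # Report the offending input and carry on with the next one
      message("Failed on ", nm, ": ", conditionMessage(e))
      NULL
    }
  )
})

# Inputs that returned NULL are the ones that failed
failed <- names(samples)[vapply(output, is.null, logical(1))]
print(failed)
```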