R for loop to extract information from files and add it to a tibble?

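I'm not very comfortable with tidyverse, so forgive me if this is a simple question. I have a bunch of files containing data that I need to extract and add to different columns of a tibble I created.

I want each row to start with the file ID I created: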

library(tidyverse)

filelist <- list.files(pattern = "\\.txt$") # Gives me the filenames in the current directory.
# The filenames are something like AA1230.report.txt for example

file_ID <- trimws(filelist, whitespace = "\\..*") # Gives me the ID, which is the part before ".report.txt"

metadata <- as_tibble(file_ID[1:181]) # create a tibble with the IDs in its first column ("value") for 181 files.
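
For reference, the ID step can be tried without any files on disk. This is only a minimal sketch: the two file names are made up to follow the AA1230.report.txt pattern, and sub() is swapped in for trimws() because it states the intent (drop everything from the first dot onwards) a bit more directly.

```r
library(tibble)

# Hypothetical file names following the AA1230.report.txt pattern from the question
filelist <- c("AA1230.report.txt", "AB0566.report.txt")

# Drop everything from the first dot onwards, keeping only the ID
file_ID <- sub("\\..*$", "", filelist)

# Tibbles have no row names, so the IDs go into an ordinary column
metadata <- tibble(file_ID = file_ID)
metadata
```

Naming the column file_ID rather than relying on the default "value" makes it easier to join or bind the counts onto it later.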

You can also do the following:
library(tidyverse)
filelist <- list.files(pattern = "\\.txt$") # escape the dot so only names ending in .txt match
nms <- c("Percentage", "Num_reads_root", "Num_reads_taxon", "Rank", "NCBI_ID", "Name")

# Name the vector by file name so that .id records which file each row came from
set_names(filelist) %>%
  map_dfr(read_table, col_names = nms, .id = 'file_ID') %>%
  filter(Rank == 'D') %>%                      # keep the domain-level rows only
  select(file_ID, Name, Num_reads_root) %>%
  pivot_wider(id_cols = file_ID, names_from = Name, values_from = Num_reads_root) %>%
  mutate(file_ID = str_remove(file_ID, '\\..*$')) # AA1230.report.txt -> AA1230

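I find that a for loop is sometimes nice because it keeps the progress made so far in case you run into an error. You can then find the problem file and debug it, or use try() but throw a warning().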
I tried this but got the following error:
Column specification
cols(Percentage = col_character(), Num_reads_root = col_character(), Num_reads_taxon = col_character(), Rank = col_character(), NCBI_ID = col_character(), Name = col_character())
Error: Can't combine `Num_reads_taxon` and `Num_reads_taxon`. Run `rlang::last_error()` to see where the error occurred.
@CuriousDude, so it seems you have different column types. Will edit it in a minute.
@CuriousDude change read_table to read.table and col_names to col.names.
I made the changes, but got the following error: Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 1 did not have 7 elements
@CuriousDude can you try running map_dfr(filelist, read.table, col.names = nms, sep = '\t', header = FALSE, .id = 'file_ID')? Does this cause an error?
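
One way to sidestep the "Can't combine" error discussed in these comments is to stop readr from guessing column types per file and declare them up front. The sketch below is not the answerer's final code: it assumes the reports are tab-separated with no header row, and it writes two small made-up files to a temporary directory so that it runs on its own.

```r
library(readr)
library(purrr)
library(dplyr)
library(tidyr)

nms <- c("Percentage", "Num_reads_root", "Num_reads_taxon", "Rank", "NCBI_ID", "Name")

# Two made-up report files (assumption: tab-separated, no header row)
tmp <- tempfile("reports")
dir.create(tmp)
writeLines(c("75.9\t60533\t28\tD\t2\tBacteria",
             "0.48\t386\t0\tD\t2759\tEukaryota"),
           file.path(tmp, "AA1230.report.txt"))
writeLines(c("80.1\t500\t3\tD\t2\tBacteria",
             "0.02\t19\t0\tD\t10239\tViruses"),
           file.path(tmp, "AB0566.report.txt"))

filelist <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# "diicic" = double, integer, integer, character, integer, character.
# With fixed types every file yields the same column classes, so the
# row-binding inside map_dfr() has nothing to reconcile.
wide <- set_names(filelist, basename(filelist)) %>%
  map_dfr(read_tsv, col_names = nms, col_types = "diicic", .id = "file_ID") %>%
  filter(Rank == "D") %>%
  select(file_ID, Name, Num_reads_root) %>%
  pivot_wider(id_cols = file_ID, names_from = Name, values_from = Num_reads_root) %>%
  mutate(file_ID = sub("\\..*$", "", file_ID))
wide
```

Domains that are absent from a file come out as NA in that file's row, which may or may not be what you want; pivot_wider() has a values_fill argument if zeros are preferred.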
Percentage Num_Reads_Root Num_Reads_Taxon Rank  NCBI_ID Name     
       <dbl>          <int>           <int> <fct>   <int> <fct>    
1      75.9           60533              28 D           2 Bacteria 
2       0.48            386               0 D        2759 Eukaryota
3       0.01              4               0 D        2157 Archaea  
4       0.02             19               0 D       10239 Viruses  

> metadata
value     Bacteria_Counts    Eukaryota_Counts    Viruses_Counts     Archaea_Counts
<chr>     <int>              <int>               <int>               <int>
 1 AA1230  60533             386                 19                   4 
 2 AB0566
 3 AA1231
 4 AB0567
 5 BC1148
 6 AW0001
 7 AW0002
 8 BB1121
 9 BC0001
10 BC0002
....with 171 more rows

for (file in filelist) {
  # get_domains: placeholder for the step that pulls the domain-level (Rank "D") counts out of this file
}

library(tidyverse)
filelist <- list.files(pattern = "\\.txt$") # list files
nms <- c("Percentage", "Num_reads_root", "Num_reads_taxon", "Rank", "NCBI_ID", "Name")

tmp_list <- list()
for (i in seq_along(filelist)) {
  # It looks like your files are all .tsv's without a header row, so supply the column names here
  my_table <- read_tsv(filelist[i], col_names = nms) %>%
    filter(Rank == "D") %>%
    # use `=` inside mutate(); `<-` would not create a column called file_ID
    mutate(file_ID = trimws(filelist[i], whitespace = "\\..*")) %>%
    select(file_ID, everything())
  tmp_list[[i]] <- my_table # results so far survive if a later file fails
}
out <- bind_rows(tmp_list)
out
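
The try()/warning() idea mentioned in this answer could look roughly like the sketch below. It uses tryCatch() rather than try(), which is a substitution on my part, and it makes several assumptions: the files are tab-separated with no header row, the column types are as declared, and the second file name is deliberately one that does not exist, so the failure path is exercised.

```r
library(readr)
library(dplyr)

nms <- c("Percentage", "Num_reads_root", "Num_reads_taxon", "Rank", "NCBI_ID", "Name")

# One made-up report file, plus a path that intentionally does not exist
tmp <- tempfile("reports")
dir.create(tmp)
writeLines(c("75.9\t60533\t28\tD\t2\tBacteria",
             "0.48\t386\t0\tD\t2759\tEukaryota"),
           file.path(tmp, "AA1230.report.txt"))
filelist <- c(file.path(tmp, "AA1230.report.txt"),
              file.path(tmp, "missing.report.txt"))

tmp_list <- list()
failed <- character()
for (f in filelist) {
  my_table <- tryCatch(
    read_tsv(f, col_names = nms, col_types = "diicic") %>%
      filter(Rank == "D") %>%
      mutate(file_ID = sub("\\..*$", "", basename(f))) %>%
      select(file_ID, everything()),
    error = function(e) {
      # Report the problem file but keep the loop going
      warning("Skipping ", f, ": ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
  if (is.null(my_table)) {
    failed <- c(failed, f)                       # remember it for debugging later
  } else {
    tmp_list[[length(tmp_list) + 1]] <- my_table
  }
}
out <- bind_rows(tmp_list)
out
failed
```

After the loop, failed lists the files that need a closer look, while out holds everything that was read successfully.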