Automatically loading the files in a directory into a Python script

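I have about 125 files in a directory on my Linux machine. I have a script called annotate.py that takes a single file and adds features to its columns. Basically, I can type in the name of one of the 125 files and run annotate.py on it, but that is not efficient programming.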

All 125 files have a similar format in terms of column names and number of columns. Can someone tell me how to run annotate.py on all 125 files?

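annotate.py merges two files on the chromosome and position columns. However, I want input_file1 to be each of the 125 files, read in one at a time and merged with input_file2. The output should be separate files, each named after its original input_file1.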

#!/usr/bin/python
#python snp_search.py  input_file1 input_file2
import numpy as np
import pandas as pd

snp_f=pd.read_table('input_file1.txt', sep="\t", header=None)#input_file1
snp_f.columns=['chr','pos']
lsnp_f=pd.read_table('input2_snpsearch.txt', sep="\t", header=0)#input_file2, first row is the header
lsnp_f.columns=['snpid','chr','pos']
final_snp=pd.merge(snp_f,lsnp_f, on=['chr','pos'])
final_snp.to_csv('input_file1_annotated.txt', index=False,sep='\t')

Please help!

Thanks

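The os module is your friend. The basic idea is to import os and use

os.listdir()

to get a list of the files in the directory you are interested in. Something like the following should work: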

import numpy as np
import pandas as pd
import os


input_file2 = 'input2_snpsearch.txt' #same file name as in the question
input_dir = './' #or any other path
files = os.listdir(input_dir) #listdir will give the file names

#you probably don't want to merge your input_file2 with itself and
#in this case it's in the same directory as the other files so
#filter it out.
files_of_interest = (f for f in files if f != input_file2)

for f in files_of_interest:
    full_name = os.path.join(input_dir, f) #necessary if input_dir is not './'
    snp_f=pd.read_table(full_name, sep="\t", header=None)#input_file1
    snp_f.columns=['chr','pos']
    lsnp_f=pd.read_table(input_file2, sep="\t", header=0)#input_file2, header must be a row number, not True
    lsnp_f.columns=['snpid','chr','pos']
    final_snp=pd.merge(snp_f,lsnp_f, on=['chr','pos'])
    new_fname = os.path.splitext(f)[0] + '_annotated.txt' #drops only the last extension
    final_snp.to_csv(os.path.join(input_dir, new_fname), index=False,sep='\t')
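
The same loop can also be wrapped in a function, which makes it easier to rerun and to point at another directory. The version below is only a sketch under a few assumptions: the input files end in .txt, the chromosome and position columns have the same type in every file, and the names annotate_directory, pattern and suffix are made up for illustration. It reads input_file2 once instead of once per file, and it skips files that already carry the _annotated suffix so a second run does not annotate its own output.

```python
from pathlib import Path

import pandas as pd


def annotate_directory(input_dir, reference_file, pattern="*.txt", suffix="_annotated"):
    """Merge every file matching `pattern` in `input_dir` with `reference_file`.

    Returns the list of output paths that were written.
    """
    input_dir = Path(input_dir)
    reference_file = Path(reference_file)

    # Read the reference table (input_file2) once; its header row is
    # replaced with the column names used for the merge.
    lsnp_f = pd.read_csv(reference_file, sep="\t", header=0)
    lsnp_f.columns = ["snpid", "chr", "pos"]

    written = []
    # sorted() materialises the listing before any output file is created.
    for path in sorted(input_dir.glob(pattern)):
        # Skip the reference file itself and outputs from an earlier run.
        if path.resolve() == reference_file.resolve() or path.stem.endswith(suffix):
            continue
        # The per-sample files (input_file1) have no header row.
        snp_f = pd.read_csv(path, sep="\t", header=None, names=["chr", "pos"])
        final_snp = pd.merge(snp_f, lsnp_f, on=["chr", "pos"])
        out_path = path.with_name(path.stem + suffix + path.suffix)
        final_snp.to_csv(out_path, index=False, sep="\t")
        written.append(out_path)
    return written


if __name__ == "__main__":
    for out in annotate_directory("./", "input2_snpsearch.txt"):
        print("wrote", out)
```

Calling annotate_directory('./', 'input2_snpsearch.txt') should write one <name>_annotated.txt file next to each input file, which matches the naming asked for in the question.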