Renaming interleaved FASTQ headers with Biopython

For ease of use and compatibility with another downstream pipeline, I am trying to use Biopython to change the names of FASTQ sequence IDs. For example, going from headers that look like this:

@D00602:32:H3LN7BCXX:1:1101:1205:2112 OP:i:1
@D00602:32:H3LN7BCXX:1:1101:1205:2112 OP:i:2
@D00602:32:H3LN7BCXX:1:1101:1182:2184 OP:i:1
@D00602:32:H3LN7BCXX:1:1101:1182:2184 OP:i:2

to headers that look like this:

@000000000000001  OP:i:1
@000000000000001  OP:i:2
@000000000000002  OP:i:1
@000000000000002  OP:i:2

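I have some code, but I can't seem to get the headers to count up in pairs (i.e. 1, 1, 2, 2, 3, 3, and so on).

Any help would be greatly appreciated. Thanks.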

from Bio import SeqIO
import sys

FILE = sys.argv[1]

#Initialize numbering system at one
COUNT = 1

#Create a new dictionary for new sequence IDs
new_records=[]

for seq_record in SeqIO.parse(FILE, "fastq"):
        header = '{:0>15}'.format(COUNT)
        COUNT += 1
        print(header)
        seq_record.description = seq_record.description.replace(seq_record.id, "")
        seq_record.id = header
        new_records.append(seq_record)
SeqIO.write(new_records, FILE, "fastq")

*seq_record does not include the "OP:i:1" information

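Assuming you want every label to be repeated, all you have to do is divide the count by the number of repeats and round the result down (integer division), as shown below: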

from Bio import SeqIO
import sys

FILE = sys.argv[1]

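# Start the record counter at zero; COUNT // 2 + 1 then yields 1, 1, 2, 2, ...
COUNT = 0

# Collect the renamed records in a list
new_records = []

for seq_record in SeqIO.parse(FILE, "fastq"):
    # Both reads of a pair share the same zero-padded, 15-digit ID
    header = '{:0>15}'.format(COUNT // 2 + 1)
    COUNT += 1
    print(header)
    # Strip the old ID from the description, keeping the rest (e.g. "OP:i:1")
    seq_record.description = seq_record.description.replace(seq_record.id, "")
    seq_record.id = header
    new_records.append(seq_record)

# Note: this overwrites the input file with the renamed records
SeqIO.write(new_records, FILE, "fastq")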
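
To make the integer-division step easier to check in isolation, here is a minimal sketch that uses only the standard library and works on header strings rather than on Biopython records. The names paired_id and rename_headers are hypothetical helpers introduced for illustration, not part of Biopython, and the sketch assumes the file is strictly interleaved (each read immediately followed by its mate). Unlike the target headers in the question, it puts a single space between the new ID and the rest of the header.

```python
def paired_id(index, copies=2, width=15):
    """Return the zero-padded ID for the record at 0-based position `index`.

    Integer division groups consecutive records: with copies=2,
    positions 0 and 1 map to 1, positions 2 and 3 map to 2, and so on.
    """
    return "{:0>{width}}".format(index // copies + 1, width=width)


def rename_headers(headers, copies=2, width=15):
    """Replace the first word of each header with a paired ID, keeping the rest."""
    renamed = []
    for index, header in enumerate(headers):
        # Drop the leading "@" and the original ID; keep what follows (e.g. "OP:i:1").
        _, _, rest = header.lstrip("@").partition(" ")
        new_header = "@{} {}".format(paired_id(index, copies, width), rest)
        # rstrip() removes the trailing space when a header has no description.
        renamed.append(new_header.rstrip())
    return renamed


if __name__ == "__main__":
    example = [
        "@D00602:32:H3LN7BCXX:1:1101:1205:2112 OP:i:1",
        "@D00602:32:H3LN7BCXX:1:1101:1205:2112 OP:i:2",
        "@D00602:32:H3LN7BCXX:1:1101:1182:2184 OP:i:1",
        "@D00602:32:H3LN7BCXX:1:1101:1182:2184 OP:i:2",
    ]
    for line in rename_headers(example):
        print(line)
```

Starting the index at zero is what keeps the formula simple: with a counter that starts at one, the equivalent expression would be (COUNT + 1) // 2.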