Creating as many files in Python as there are items in two lists


Consider the file testbam.txt:

/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bai
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bai
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bai
and the file testbai.txt:

/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bam
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg001G.GRCh38DH.target.bai
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg002G.GRCh38DH.target.bai
/groups/cgsd/alexandre/gatk-workflows/src/exomesinglesample_out/bam/pfg014G.GRCh38DH.target.bai
They always have the same length, and I wrote a function to find it:

def file_len(fname):
    with open(fname) as f:
        for i,l in enumerate(f):
            pass
        return i+1

n = file_len('/groups/cgsd/alexandre/python_code/src/testbai.txt')
print(n)
3
Then I created two lists by opening the files and processing their lines:

content = []
with open('/groups/cgsd/alexandre/python_code/src/testbam.txt') as bams:
    for line in bams:
        content.append(line.strip().split())

print(content)

content2 = []
with open('/groups/cgsd/alexandre/python_code/src/testbai.txt') as bais:
    for line in bais:
        content2.append(line.strip().split())

print(content2)
Now I have a JSON file named mutec.json, in which I want to replace certain parts with the items from the lists:

{
    "Mutect2.gatk_docker": "broadinstitute/gatk:4.1.4.1",
    "Mutect2.intervals": "/groups/cgsd/alexandre/gatk-workflows/src/interval_list/Basic_Core_xGen_MSI_TERT_HPV_EBV_hg38.interval_list",
    "Mutect2.scatter_count": 30,
    "Mutect2.m2_extra_args": "--downsampling-stride 20 --max-reads-per-alignment-start 6 --max-suspicious-reads-per-alignment-start 6",
    "Mutect2.filter_funcotations": true,
    "Mutect2.funco_reference_version": "hg38",
    "Mutect2.run_funcotator": true,
    "Mutect2.make_bamout": true,
    "Mutect2.funco_data_sources_tar_gz": "/groups/cgsd/alexandre/gatk-workflows/mutect2/inputs/funcotator_dataSources.v1.6.20190124s.tar.gz",
    "Mutect2.funco_transcript_selection_list": "/groups/cgsd/alexandre/gatk-workflows/mutect2/inputs/transcriptList.exact_uniprot_matches.AKT1_CRLF2_FGFR1.txt",
  
    "Mutect2.ref_fasta": "/groups/cgsd/alexandre/gatk-workflows/src/ref_Homo38_HPV/Homo_sapiens_assembly38_chrHPV.fasta",
    "Mutect2.ref_fai": "/groups/cgsd/alexandre/gatk-workflows/src/ref_Homo38_HPV/Homo_sapiens_assembly38_chrHPV.fasta.fai",
    "Mutect2.ref_dict": "/groups/cgsd/alexandre/gatk-workflows/src/ref_Homo38_HPV/Homo_sapiens_assembly38_chrHPV.dict",
    
    "Mutect2.tumor_reads": "<<<N_item_of_list_content>>>",
    "Mutect2.tumor_reads_index": "<<<N_item_of_list_content2>>>",
  }
Note that this section:

   "Mutect2.tumor_reads": "<<<N_item_of_list_content>>>",
   "Mutect2.tumor_reads_index": "<<<N_item_of_list_content2>>>",
should be replaced with the corresponding items from the lists, and I want to write the result of each substitution to a new file.

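The final result would be 3 files: mutect1.json, with the first item from testbam.txt and the first item from testbai.txt; mutect2.json, with the second item from testbam.txt and the second item from testbai.txt; and a third file built by the same reasoning.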


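Note that the <<< >>> markers I wrote are not necessarily hard-coded in the file; I added them myself only to make clear what I want to replace.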

First, even though it is unrelated to the question, some of your code is not really Pythonic:

def file_len(fname):
    with open(fname) as f:
        for i,l in enumerate(f):
            pass
        return i+1
You loop with for over enumerate when you could simply do:

def file_len(fname):
    with open(fname) as f:
        return sum(1 for _ in f)  # a file object has no len(), so count its lines
because f is an iterator over the lines of the file.
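One practical difference between the two ways of counting may be worth a quick sketch: consuming the iterator also works on an empty file, whereas the enumerate loop never assigns i and therefore raises. The names file_len_enumerate, count_lines and make_file, and the throwaway demo files, are assumptions made for this illustration only.

```python
import os
import tempfile


def file_len_enumerate(fname):
    """The loop from the question, unchanged apart from the name."""
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
        return i + 1


def count_lines(fname):
    """Count lines by consuming the file iterator (hypothetical helper name)."""
    with open(fname) as f:
        return sum(1 for _ in f)


def make_file(text):
    """Write text to a throwaway file and return its path (demo helper)."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
    return tmp.name


three_lines = make_file("a.bam\nb.bam\nc.bam\n")
empty = make_file("")

n_three = count_lines(three_lines)
n_three_enum = file_len_enumerate(three_lines)
n_empty = count_lines(empty)

try:
    file_len_enumerate(empty)
    empty_ok = True
except UnboundLocalError:
    # i was never assigned because the loop body never ran
    empty_ok = False

print(n_three, n_three_enum, n_empty, empty_ok)

os.remove(three_lines)
os.remove(empty)
```

On non-empty files both versions agree, so the choice mostly matters when an input list might be empty.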

Now for your question. You want to replace some elements of a file with data taken from two other files.

In your original question, the strings to replace are enclosed in triple angle brackets.

I would use:

import re

rx = re.compile(r'<<<.*?>>>')        # how to identify what is to replace

with open('.../testbam.txt') as bams, open('.../testbai.txt') as bais, \
     open('.../mutect.json') as src:
    for i, reps in enumerate(zip(bams, bais), 1): # gets a pair of replacement strings at each step
        reps = [rep.strip() for rep in reps]  # drop the trailing newline from each path
        src.seek(0)                  # rewind src file
        with open(f'mutect{i}.json', 'w') as fdout:  # open the output file
            rep_index = 0            # will first use rep string from first file
            for line in src:
                if rx.search(line):  # is the string to replace there?
                    line = rx.sub(reps[rep_index], line)
                    rep_index = 1 - rep_index    # next time will use the other string
                fdout.write(line)
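If the template is valid JSON, the same result could also be reached without any markers by loading it with the json module and assigning the two keys directly. The following is only a sketch under stated assumptions: the template must parse (the listing in the question ends with a trailing comma, which json.load would reject), the keys are Mutect2.tumor_reads and Mutect2.tumor_reads_index as shown in the question, and the function name write_inputs and all demo file names and paths are hypothetical.

```python
import json
import os
import tempfile


def write_inputs(template_path, bam_list_path, bai_list_path, out_dir):
    """Write mutect1.json, mutect2.json, ... with one BAM/BAI pair each."""
    with open(template_path) as src:
        template = json.load(src)  # assumes the template is valid JSON

    written = []
    with open(bam_list_path) as bams, open(bai_list_path) as bais:
        # zip pairs line N of one list with line N of the other
        for i, (bam, bai) in enumerate(zip(bams, bais), 1):
            inputs = dict(template)  # shallow copy, the template stays untouched
            inputs["Mutect2.tumor_reads"] = bam.strip()
            inputs["Mutect2.tumor_reads_index"] = bai.strip()
            out_path = os.path.join(out_dir, f"mutect{i}.json")
            with open(out_path, "w") as out:
                json.dump(inputs, out, indent=4)
            written.append(out_path)
    return written


# Demo with made-up paths in a temporary directory.
demo_dir = tempfile.mkdtemp()
template_file = os.path.join(demo_dir, "template.json")
bam_file = os.path.join(demo_dir, "testbam.txt")
bai_file = os.path.join(demo_dir, "testbai.txt")
with open(template_file, "w") as f:
    json.dump({"Mutect2.scatter_count": 30,
               "Mutect2.tumor_reads": "",
               "Mutect2.tumor_reads_index": ""}, f)
with open(bam_file, "w") as f:
    f.write("/demo/a.bam\n/demo/b.bam\n")
with open(bai_file, "w") as f:
    f.write("/demo/a.bai\n/demo/b.bai\n")

outputs = write_inputs(template_file, bam_file, bai_file, demo_dir)
print([os.path.basename(p) for p in outputs])
```

Assigning keys by name avoids relying on the order in which the placeholders appear in the file, at the cost of rewriting the whole file, so blank lines and the original layout are not preserved.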

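Your question lacks focus!! There is too much in one question. You could ask about one thing: the point where you ran into a problem or error.
Why not? I did my best to provide all the details.
Look at the first piece of code above. It opens a for loop and all it does is pass. What is that?
Then let me know what I should remove from the question, please.
@AvenDesta: The function is not really Pythonic, but it returns the expected result...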