Python: write the lines that match IDs listed in another file

Tags: python, python-2.7

Essentially, I want to write out the lines of a document that match a list of IDs referenced in my code.

nodeIDs.txt:

10000
10023
1017
1019
1021
1026
1027
1029
... (417 items in total)

Adherens junction.txt:

4301: AFDN; afadin, adherens junction formation factor
1496: CTNNA2; catenin alpha 2
283106: CSNK2A3; casein kinase 2 alpha 3
2241: FER; FER tyrosine kinase
60: ACTB; actin beta
1956: EGFR; epidermal growth factor receptor
56288: PARD3; par-3 family cell polarity regulator
10458: BAIAP2; BAI1 associated protein 2
51176: LEF1; lymphoid enhancer binding factor 1
... (73 lines in total)

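I am trying to have the program go through the list of IDs line by line and, if the characters at the start of a line match any ID in the list, write that line to a new document. I looked into data sets, but I am not sure they would work here.

My code so far: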

ids = []
with open('nodeIDs.txt', 'r') as n:
    for line in n:
        ids.append(line)
n.close()

# Import data from the pathway file and turn into a list
g = []
with open('Adherens junction.txt', 'r') as a:
    for line in a:
        g.append(line)
a.close()

aj = open('Adherens.txt', 'a')
for line in a:
    if ids[i] in line:
    aj.write(line)
aj.close()

Can you help me get this working?

Here is some code that I think does what you are asking for.

Code:

# read ids file into a set
with open('file1', 'r') as f:
    # create a set comprehension
    ids = {line.strip() for line in f}

# read the pathway file and turn into a list
with open('file2', 'r') as f:
    # create a list comprehension
    pathways = [line for line in f]

# output matching lines
with open('file3', 'a') as f:

    # loop through each of the pathways
    for pathway in pathways:

        # get the number in front of the ':'
        start_of_line = pathway.split(':', 1)[0]

        # if this is in 'ids' output the line
        if start_of_line.strip() in ids:
            f.write(pathway)
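
The same idea can also be wrapped in a function, which makes it easier to try out on a few lines before running it on the real files. This is only a sketch: the name `filter_lines_by_id` and its parameters are made up for illustration and are not part of the answer above. It accepts any iterables of lines, so small in-memory lists stand in for the two files here.

```python
def filter_lines_by_id(id_lines, pathway_lines):
    """Return the pathway lines whose leading ID appears in id_lines.

    Hypothetical helper written for illustration only.
    """
    # Strip newlines so that '2241\n' compares equal to '2241';
    # blank lines in the ID file are skipped.
    ids = {line.strip() for line in id_lines if line.strip()}

    matches = []
    for line in pathway_lines:
        # Everything before the first ':' is treated as the ID.
        start_of_line = line.split(':', 1)[0].strip()
        if start_of_line in ids:
            matches.append(line)
    return matches


# Sample data in the same format as the question's files.
sample_ids = ['10000\n', '56288\n', '2241\n']
sample_pathways = [
    '4301: AFDN; afadin, adherens junction formation factor\n',
    '2241: FER; FER tyrosine kinase\n',
    '60: ACTB; actin beta\n',
    '56288: PARD3; par-3 family cell polarity regulator\n',
]
matched = filter_lines_by_id(sample_ids, sample_pathways)
```

With real files, the two open file objects can be passed in directly, and the returned list can be written out with `writelines`. Comparing the whole ID rather than testing `id in line` also avoids false matches such as ID 60 matching a line that merely contains "60" somewhere.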
Results:

2241: FER; FER tyrosine kinase 
56288: PARD3; par-3 family cell polarity regulator 
file1:

10000
56288
2241
file2:

4301: AFDN; afadin, adherens junction formation factor 
1496: CTNNA2; catenin alpha 2 
283106: CSNK2A3; casein kinase 2 alpha 3 
2241: FER; FER tyrosine kinase 
60: ACTB; actin beta 
1956: EGFR; epidermal growth factor receptor 
56288: PARD3; par-3 family cell polarity regulator 
10458: BAIAP2; BAI1 associated protein 2 
51176: LEF1; lymphoid enhancer binding factor 1 
What is a set comprehension?

This:

ids = {line.strip() for line in f}

is the same as:

# create a set
ids = set()
for line in f:
    ids.add(line.strip())
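
To check the equivalence without touching any files, the two forms can be run side by side. The sketch below assumes `io.StringIO` as a stand-in for the opened ID file; the sample text is made up.

```python
import io

text = u'10000\n56288\n2241\n'

# Form 1: set comprehension
f = io.StringIO(text)
ids_comprehension = {line.strip() for line in f}

# Form 2: explicit loop that builds the same set
f = io.StringIO(text)
ids_loop = set()
for line in f:
    ids_loop.add(line.strip())

same = ids_comprehension == ids_loop
```

Both forms strip the trailing newline from each line, which is what allows a later test such as `'2241' in ids` to succeed.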

This question would be greatly improved by an example: specifically, data that actually works, not just data that illustrates the format of the input and the expected output.

This worked great, and thanks for the formatting too! Could you explain a bit more about what is happening in the 'line for line in f' and 'for pathway in pathways' parts of the code?

'for line in f' is standard Python iteration. Many objects (for example, list) implement an __iter__ method, and file objects are among them. So the read essentially happens as the loop runs: the for loop body executes once per line of the file, one line at a time. Isn't Python fun? You may also not be familiar with comprehensions; I updated the post to explain the two comprehensions. See:
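
As a rough illustration of the iteration described in the last comment, the sketch below steps through a file-like object by hand, which is approximately what a for loop does behind the scenes. `io.StringIO` stands in for a real file here; that substitution is an assumption made to keep the example self-contained.

```python
import io

f = io.StringIO(u'first\nsecond\n')

# A for loop calls iter() on the object once, then next() repeatedly
# until StopIteration is raised.
iterator = iter(f)
first = next(iterator)
second = next(iterator)

try:
    next(iterator)
    exhausted = False
except StopIteration:
    # No lines left: this is the signal that ends a for loop.
    exhausted = True
```

Each call to `next` returns one line, including its trailing newline, so only one line needs to be held at a time.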