How do I count the occurrences of a string across multiple string pairs in Python?

I have two text files here. I want to count how many times each string in "Textfile_1" occurs in the string pairs given in "Textfile_2".

Textfile_1:

1763_0M73
2610_0M63
7529_12M64
7529_18M64
0091_00M56

Textfile_2:

1763_0M73, 2610_0M63
2610_0M63, 7529_12M64
7529_18M64, 0091_00M56
0091_00M56, 7529_12M64
0267_12M64, 0091_00M56

Expected output:

1763_0M73, 1
2610_0M63, 2
7529_12M64, 2
7529_18M64, 1
0091_00M56, 3

I tried the following script, but it does not produce the expected output:

with open('Textfile_2.txt') as f1:
    lookup = dict([x.strip() for x in line.split(',')] for line in f1)
print(lookup)

with open('Output.txt', 'w') as out:
    with open('Textfile_1.txt') as f2:
        for line in f2:
            k = line.strip()
            n = lookup[k]
            print(n)

Does anyone know how to achieve this in Python? I am fairly new to Python programming.

A few things in your code are not done correctly. Here is a version that uses comprehensions:

#Step 1: Read Textfile_1 and store each line as a dictionary key
#strip out the \n as you read through each record from the file
#the value (count) of each key is initialised to 0
with open('Textfile_1.txt','r') as f1:
    txt1 = {_.rstrip("\n"):0 for _ in f1}

#Step 2: Read the Textfile_2 and strip out the \n. This will give you two values
#Then split the values into a list. You will get [[str1,str2],[str3,str4]...]
with open('Textfile_2.txt','r') as f2:
    txt2 = [z.rstrip("\n").split(',') for z in f2]

#Step 3: The strings in the list of lists may have leading or trailing spaces
#as you iterate thru them, remove the leading/trailing spaces
#then check for that value in the dictionary
#if found, increment the value by 1
for i in [y.strip() for x in txt2 for y in x]:
    if i in txt1: txt1[i] += 1

#Step 4: print the final dictionary as it now contains the counts
print (txt1)


#Step 5: If you want to write this into a file, then use the below code
#Open file in write mode. Iterate thru the dictionary using txt1.items()
#for each key and value, write to file. Include \n to have a newline
with open('Textfile_3.txt','w') as f3:
    for k,v in txt1.items():
        t3 = k + ', ' + str(v) + '\n'
        f3.write(t3)

With the input files shown in the question, the output of this is:

{'1763_0M73': 1, '2610_0M63': 2, '7529_12M64': 2, '7529_18M64': 1, '0091_00M56': 3}

The contents of Textfile_3 are:

1763_0M73, 1
2610_0M63, 2
7529_12M64, 2
7529_18M64, 1
0091_00M56, 3
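As a side note on why the attempt in the question does not work, here is a minimal sketch using the sample pairs from the question as an in-memory list (the name pair_lines is only for illustration). Building a dict from the pairs maps the first string of each pair to the second one, so the values are partner strings rather than counts, and a string that only ever appears second is not a key at all, which is why lookup[k] can raise KeyError.

```python
# Sample pairs copied from the question (held in a list instead of a file).
pair_lines = [
    "1763_0M73, 2610_0M63",
    "2610_0M63, 7529_12M64",
    "7529_18M64, 0091_00M56",
    "0091_00M56, 7529_12M64",
    "0267_12M64, 0091_00M56",
]

# Same construction as in the question: each [first, second] pair becomes key -> value.
lookup = dict([x.strip() for x in line.split(",")] for line in pair_lines)

# The stored value is the partner string, not a count.
print(lookup["1763_0M73"])     # prints 2610_0M63

# A string that never appears in the first position is not a key.
print("7529_12M64" in lookup)  # prints False
```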

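Assuming your text files look like the ones you showed:

1 - Open your file

with open('Textfile_2.txt') as f:
    my_list = f.read().replace(',', ' ').split()  # split on commas and whitespace

2 - The simplest way to count the frequency of the elements in a list is as follows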

g = ["1763_0M73 ",
"2610_0M63",
"2610_0M63 ",
"7529_12M64",
"7529_18M64",
"0091_00M56",
"0091_00M56",
"7529_12M64",
"0267_12M64",
"0091_00M56"]

from collections import Counter

# strip stray whitespace first so "2610_0M63 " and "2610_0M63" count as the same string
counter_g = Counter(s.strip() for s in g)
counter_g.most_common()

[('0091_00M56', 3), ('2610_0M63', 2), ('7529_12M64', 2), ('1763_0M73', 1), ('7529_18M64', 1), ('0267_12M64', 1)]
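Note that Counter counts every string it sees, including 0267_12M64, which is not in Textfile_1. One possible way to get exactly the expected output is to count everything and then report only the strings from Textfile_1, in their original order. The sketch below assumes both files have already been read into lists; count_in_pairs, targets and pair_lines are hypothetical names, not part of the answer above.

```python
from collections import Counter

def count_in_pairs(targets, pair_lines):
    """Return (target, count) tuples in the order the targets were given."""
    counts = Counter(
        token.strip()
        for line in pair_lines
        for token in line.split(",")
        if token.strip()
    )
    # Looking up a missing key in a Counter yields 0 instead of raising KeyError.
    return [(t, counts[t]) for t in targets]

# Sample data copied from the question.
targets = ["1763_0M73", "2610_0M63", "7529_12M64", "7529_18M64", "0091_00M56"]
pair_lines = [
    "1763_0M73, 2610_0M63",
    "2610_0M63, 7529_12M64",
    "7529_18M64, 0091_00M56",
    "0091_00M56, 7529_12M64",
    "0267_12M64, 0091_00M56",
]

for name, n in count_in_pairs(targets, pair_lines):
    print(f"{name}, {n}")
```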

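Please provide the expected minimal, reproducible example (MRE). Show where the intermediate results deviate from the ones you expect. We should be able to paste a single block of your code into a file, run it, and reproduce your problem. This also lets us test any suggestions in your context.

2610_0M63 in Textfile_1 is different from 2610_00M63 in Textfile_2, so how do you get a count of 2? Is this a typo?

@Joe Ferndz Sorry... that was a typo. The text files contain a large number of lines...

"A large number of lines" means different things to different people. To me, it is about 300 to 800 million lines.

Since you have a list, and Counter() is optimized for counting list elements, give it a try. It should be efficient.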
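If Textfile_2 really is very large, one option is to avoid loading it into a list at all and read it line by line instead. The following is only a sketch under the assumption that Textfile_1 is small enough to fit in memory; count_from_files is a hypothetical helper name, and no claim is made here about how fast it runs on hundreds of millions of lines.

```python
def count_from_files(targets_path, pairs_path):
    """Count how often each line of targets_path occurs among the pairs in pairs_path."""
    # Textfile_1 is assumed small; a dict keeps its strings in file order.
    with open(targets_path) as f:
        counts = {line.strip(): 0 for line in f if line.strip()}

    # Textfile_2 is read one line at a time, so it is never fully held in memory.
    with open(pairs_path) as f:
        for line in f:
            for token in line.split(","):
                token = token.strip()
                if token in counts:
                    counts[token] += 1
    return counts
```

Calling count_from_files('Textfile_1.txt', 'Textfile_2.txt') returns a dict that can be written out the same way as in Step 5 above.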