Make Python treat words with commas the same as words without commas in a dictionary

python python-3.x

I am writing a program that reads a file and builds a dictionary showing how many times each word is used:

filename = 'for_python.txt'
with open(filename) as file:
    contents = file.read().split()
dict = {}
for word in contents:
    if word not in dict:
        dict[word] = 1
    else:
        dict[word] += 1
    
dict = sorted(dict.items(), key=lambda x: x[1], reverse=True)

for i in dict:
    print(i[0], i[1])

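It works, but it treats a word followed by a comma as a different word from the same word without one, which I don't want. Is there a simple and efficient way to fix this?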

Remove all the commas before splitting:

filename = 'for_python.txt'
with open(filename) as file:
    contents = file.read().replace(",", "").split()
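
If other punctuation should be removed as well, one option is to delete every character in string.punctuation with str.translate before splitting. This is only a minimal sketch and not part of the answer above; the sample text is made up and stands in for the file contents.

```python
import string

# Hypothetical sample text standing in for file.read().
text = "one, two; one. two! three"

# Translation table that deletes every ASCII punctuation character.
drop_punctuation = str.maketrans("", "", string.punctuation)

words = text.translate(drop_punctuation).split()
print(words)  # ['one', 'two', 'one', 'two', 'three']
```

Note that this also deletes punctuation inside a word, so "don't" becomes "dont"; whether that is acceptable depends on the text.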

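I suggest you strip() the various punctuation characters from each word as you use it. Also, don't use the built-in name dict for your variable; it is the dictionary constructor.
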
import string
words = {}
for word in contents:
    word = word.strip(string.punctuation)
    if word not in words:
        words[word] = 1
    else:
        words[word] += 1
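
One thing to watch for with this approach: a token made up only of punctuation (for example "--") becomes an empty string after strip() and would be counted as a word. The sketch below wraps the loop in a function and skips such tokens; the name count_words and the sample string are hypothetical, chosen only for illustration.

```python
import string

def count_words(tokens):
    """Count tokens after stripping leading/trailing ASCII punctuation."""
    counts = {}
    for token in tokens:
        word = token.strip(string.punctuation)
        if not word:  # the token was pure punctuation, e.g. "--"
            continue
        counts[word] = counts.get(word, 0) + 1
    return counts

sample = "apple, apple -- (banana) banana.".split()
print(count_words(sample))  # {'apple': 2, 'banana': 2}
```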


Also note that collections.Counter exists for exactly this task:

import string
from collections import Counter

filename = 'test.txt'
with open(filename) as file:
    contents = file.read().split()

words = Counter(word.strip(string.punctuation) for word in contents)

for k, v in words.most_common():  # All words, in decreasing order of occurrence count
    print(k, v)
for k, v in words.most_common(5):  # Only the 5 most common words
    print(k, v)
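
For a large file, the same idea could be applied one line at a time with Counter.update instead of reading everything into memory first. The following is a sketch under that assumption; count_words_in_lines is a hypothetical helper name, and io.StringIO stands in for an open file here so the example runs on its own.

```python
import io
import string
from collections import Counter

def count_words_in_lines(lines):
    """Count words from any iterable of lines, such as an open text file."""
    counts = Counter()
    for line in lines:
        stripped = (token.strip(string.punctuation) for token in line.split())
        # Drop tokens that were pure punctuation and are now empty.
        counts.update(word for word in stripped if word)
    return counts

fake_file = io.StringIO("red, green, red\nblue. red!\n")  # stand-in for open(filename)
counts = count_words_in_lines(fake_file)
print(counts.most_common(1))  # [('red', 3)]
```

With a real file, the call would look like `with open(filename) as file: counts = count_words_in_lines(file)`.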

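You are splitting the whole text using " " (whitespace) as the delimiter, but you are not doing the same for commas. You can split each of those words further on commas. Here is how: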
...
for word in contents:
    new_words = word.split(',')
    for new_word in new_words:
        if not new_word:  # skip the empty string left by a trailing comma, e.g. "word,"
            continue
        if new_word not in dict:
            dict[new_word] = 1
        else:
            dict[new_word] += 1
...
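
The two splitting steps can also be combined into one with re.split, treating any run of commas and whitespace as a single delimiter. This is a sketch of an alternative, not the answer's own code; the sample text is invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the file contents.
text = "alpha,beta gamma, alpha\nbeta,,gamma"

# Split on any run of commas and/or whitespace; drop empty pieces that
# appear when the text starts or ends with a delimiter.
tokens = [token for token in re.split(r"[,\s]+", text) if token]
print(tokens)  # ['alpha', 'beta', 'gamma', 'alpha', 'beta', 'gamma']

counts = Counter(tokens)
print(counts["alpha"], counts["beta"], counts["gamma"])  # 2 2 2
```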

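Do you want to avoid all punctuation?

Does this answer your question?

Note that this won't work for things like word1-word2, word3-word4. The middle part will still be treated as word2, word3.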
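
The limitation mentioned in the last comment can be seen in a small example: strip() only removes punctuation at the two ends of a token, so punctuation in the middle stays. If such tokens should be broken apart, one possible approach (an assumption here, not something proposed in the thread) is to extract runs of word characters with re.findall.

```python
import re
import string

token = "word1-word2,"

# strip() only touches the ends, so the hyphen in the middle remains.
stripped = token.strip(string.punctuation)
print(stripped)  # word1-word2

# \w+ matches runs of letters, digits and underscores, so the hyphen splits the token.
parts = re.findall(r"\w+", token)
print(parts)  # ['word1', 'word2']
```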