Making Python treat words with a comma the same as words without one in a dictionary

I'm writing a program that reads a file and builds a dictionary showing how many times each word is used:

filename = 'for_python.txt'
with open(filename) as file:
    contents = file.read().split()
dict = {}
for word in contents:
    if word not in dict:
        dict[word] = 1
    else:
        dict[word] += 1
    
dict = sorted(dict.items(), key=lambda x: x[1], reverse=True)

for i in dict:
    print(i[0], i[1])

It works, but it treats a word with a comma attached as a different word, which I don't want. Is there a simple and efficient way to handle this?

Remove all the commas before splitting:

filename = 'for_python.txt'
with open(filename) as file:
    contents = file.read().replace(",", "").split()
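
One caveat worth checking before relying on this: deleting commas outright also joins words that are separated only by a comma. The following is a minimal sketch on an in-memory string (the sample text and the variable names are made up for illustration, they are not from the question):

```python
# Sample text held in memory (hypothetical; stands in for the file contents).
text = "apple, banana apple,banana"

# Deleting commas merges "apple," with "apple", but it also glues
# "apple,banana" into the single token "applebanana".
glued = text.replace(",", "").split()

# Replacing each comma with a space keeps the two words apart.
separated = text.replace(",", " ").split()

print(glued)
print(separated)
```

If the file never has a comma without a following space, both variants should give the same result.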

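I suggest you strip() the various punctuation characters off each word before using it. Also, don't use the built-in name dict as a variable name; it is the dictionary constructor.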

import string
words = {}
for word in contents:
    word = word.strip(string.punctuation)
    if word not in words:
        words[word] = 1
    else:
        words[word] += 1
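
To see what strip() does and does not remove, here is a small self-contained sketch. The helper name count_words and the sample sentence are assumptions for illustration only. It also skips tokens that consist entirely of punctuation, which would otherwise be counted as an empty string:

```python
import string

def count_words(text):
    """Count words after stripping punctuation from both ends of each token."""
    counts = {}
    for token in text.split():
        word = token.strip(string.punctuation)
        if not word:  # tokens such as "--" strip down to an empty string
            continue
        counts[word] = counts.get(word, 0) + 1
    return counts

sample = "Hello, world! (hello) -- don't, world."
result = count_words(sample)
print(result)
```

strip() only touches the ends of a token, so the apostrophe inside "don't" survives. The count is still case-sensitive: "Hello" and "hello" are separate keys.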

Also note that collections.Counter exists and does exactly this task:

import string
from collections import Counter

filename = 'test.txt'
with open(filename) as file:
    contents = file.read().split()

words = Counter(word.strip(string.punctuation) for word in contents)

for k, v in words.most_common():  # all words, in decreasing order of count
    print(k, v)
for k, v in words.most_common(5):  # only the 5 most common words
    print(k, v)
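
For trying this without a file on disk, the same idea can be run on an in-memory string. This is only a sketch; the sample sentence is invented for illustration:

```python
import string
from collections import Counter

# Hypothetical sample text standing in for the file contents.
sample = "red, green red blue, red green."
words = Counter(token.strip(string.punctuation) for token in sample.split())

# most_common(n) returns a list of (word, count) tuples, highest count first.
top_two = words.most_common(2)
print(top_two)

# A Counter returns 0 for a word it has never seen instead of raising KeyError.
print(words["yellow"])
```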

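You are splitting the whole data on whitespace as the separator, but you are not doing the same for commas. You can split these words further on commas. Here is how: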

...
for word in contents:
    # "word," splits into ["word", ""], so empty pieces are skipped below
    new_words = word.split(',')
    for new_word in new_words:
        if not new_word:
            continue
        if new_word not in dict:
            dict[new_word] = 1
        else:
            dict[new_word] += 1
...
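
A possible alternative to the nested loop, not taken from the answer above, is to split on commas and whitespace in one pass with re.split. The sketch below assumes an invented sample string, and it filters out the empty strings that re.split yields when the text starts or ends with a separator:

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the file contents.
sample = "one,two, three one ,two,"

# Split on any run of commas and/or whitespace, dropping empty pieces.
tokens = [t for t in re.split(r"[,\s]+", sample) if t]
counts = Counter(tokens)

print(tokens)
print(counts.most_common())
```

This handles only commas and whitespace. Other punctuation would still stay attached to the words.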

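Do you want to ignore all punctuation?

Does this answer your question?

Note that this won't work for things like word1-word2, word3-word4. The middle part will still be treated as word2, word3.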