How to remove punctuation from a dictionary in Python

python python-3.x dictionary

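As PM 2Ring pointed out, a Counter object is useful here, or simply a defaultdict from the collections library. We can use the regular-expression package re to get a more powerful re.split() or a simple re.findall():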

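from re import findall, IGNORECASE
from operator import itemgetter
from collections import defaultdict

# Missing keys start at 0, so each word can be incremented directly.
wordcount = defaultdict(int)

# The with block closes the file once it has been read.
with open("license.txt") as file:
    # Match runs of letters only, so punctuation never becomes part of a key.
    for vocab in findall(r"[A-Z]+", file.read(), flags=IGNORECASE):
        wordcount[vocab.lower()] += 1

# Print the words from most to least frequent.
for word, number in sorted(wordcount.items(), key=itemgetter(1), reverse=True):
    print(word, number)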
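
The Counter mentioned at the start of this answer could stand in for the defaultdict. The following is only a sketch of that variant: the helper name count_words and the sample sentence are made up for illustration, while the letters-only pattern is the one used in the answer.

```python
from collections import Counter
from re import findall, IGNORECASE


def count_words(text):
    """Return a Counter mapping each lowercased word in text to its frequency."""
    # Runs of ASCII letters only, so punctuation never ends up in a key.
    return Counter(word.lower() for word in findall(r"[A-Z]+", text, flags=IGNORECASE))


# Hypothetical sample text; in the answer the text comes from license.txt.
counts = count_words("The license, the software; the End.")

# most_common(n) returns the n most frequent (word, count) pairs.
for word, number in counts.most_common(2):
    print(word, number)
```

Counter.most_common() takes over the job of the sorted(..., key=itemgetter(1), reverse=True) call, so the operator import is no longer needed in this variant.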

There is always a tradeoff: you may want to fine-tune the pattern to allow hyphens or apostrophes, depending on your application.
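
One possible way to tune the pattern is sketched below. The pattern itself, the name WORD_PATTERN, and the sample sentence are assumptions for illustration, not part of the original answer: it accepts runs of letters joined by a single internal apostrophe or hyphen, and still drops quotes or dashes that stand on their own.

```python
from re import findall

# Hypothetical pattern: letters, optionally joined by single internal
# apostrophes or hyphens (e.g. "don't", "third-party").
WORD_PATTERN = r"[a-z]+(?:['-][a-z]+)*"


def extract_words(text):
    """Return the lowercased words of text, keeping internal ' and -."""
    return findall(WORD_PATTERN, text.lower())


words = extract_words("Don't use the third-party code -- it's 'as-is'.")
print(words)
```

Because the group is non-capturing, findall() returns whole matches rather than the contents of the group.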

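Reading the entire file in one go and then processing it is fine if the input file is relatively small. If it is not, use readline() in a loop to read the file line by line and process each line in turn.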
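
A minimal sketch of the line-by-line approach, assuming the same letters-only pattern as in the answer. The function name count_words_by_line is hypothetical, and io.StringIO stands in for a real file so the example runs on its own.

```python
import io
from collections import defaultdict
from re import findall, IGNORECASE


def count_words_by_line(stream):
    """Count words from a file-like object, reading one line at a time."""
    wordcount = defaultdict(int)
    line = stream.readline()
    while line:  # readline() returns "" once the end of the file is reached
        for vocab in findall(r"[A-Z]+", line, flags=IGNORECASE):
            wordcount[vocab.lower()] += 1
        line = stream.readline()
    return wordcount


# Hypothetical in-memory stand-in for an open text file.
sample = io.StringIO("The license.\nThe software, the end!\n")
counts = count_words_by_line(sample)
print(counts["the"])
```

With a real file, the call would look like `with open("license.txt") as file: counts = count_words_by_line(file)`. Iterating directly with `for line in file:` reads line by line in the same way.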

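What is the question?

How do I remove punctuation from the dictionary when printing? When I print it, a lot of words have punctuation at the end.

Maybe you should strip the punctuation before you put the words in the dictionary.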
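
The suggestion in the last comment could look roughly like the sketch below, which assumes the words come from splitting on whitespace (the asker's original code is not shown here). The function name count_stripped_words and the sample sentence are hypothetical.

```python
import string
from collections import defaultdict


def count_stripped_words(text):
    """Count whitespace-separated words after trimming punctuation from both ends."""
    wordcount = defaultdict(int)
    for token in text.split():
        # str.strip() with string.punctuation removes leading and trailing
        # ASCII punctuation but leaves internal characters untouched.
        word = token.strip(string.punctuation).lower()
        if word:  # skip tokens that consisted only of punctuation
            wordcount[word] += 1
    return wordcount


counts = count_stripped_words('This "software" is free; the software, however...')
print(counts["software"])
```

Unlike the regular-expression approach, this keeps internal apostrophes and hyphens as they are, and it only handles the ASCII punctuation listed in string.punctuation.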

Output of the defaultdict and findall script from the answer, saved as test.py:

> python3 test.py
the 77
or 54
of 48
to 47
software 44
and 36
any 36
for 23
license 22
you 20
this 19
agreement 18
be 17
by 16
in 16
other 14
may 13
use 11
not 10
that 10
...