Counting repeated words in Python

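I have a problem where I have to count the repeated words in a sentence in Python (v3.4.1) and print them. I used a counter, but I don't know how to get the output in the following order. The input is: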

mysentence = As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality
I turned this into a list and sorted it.

The output is supposed to look like this:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.
So far I have got to this point:

x=input ('Enter your sentence :')
y=x.split()
y.sort()
for y in sorted(y):
    print (y)

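I can see where you are going with sorting: in a sorted list you can reliably tell when you have reached a new word, and so keep a count for each unique word. However, what you really want is a hash (dictionary) to keep track of the counts, because dictionary keys are unique. For example: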

words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1
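Now that will give you a dictionary where the key is the word and the value is the number of times it appears. You can also use collections.defaultdict(int), so that you can simply add to the value: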

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1
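But there is something even better than that: collections.Counter, which takes your list of words and turns it into a dictionary (actually an extension of dictionary) containing the counts.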

counts = collections.Counter(words)
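From there, you want the words in sorted order together with their counts so that you can print them. items() gives you the (word, count) tuples, and sorted sorts them by the first item of each tuple (the word in this case) by default, which is exactly what you want.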

import collections
sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
Output

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

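As a quick check of the explanation above, here is a minimal sketch that builds the counts in the three ways described (plain dictionary, collections.defaultdict(int), collections.Counter) and compares them. The variable names plain, by_default and by_counter are chosen here for illustration only and do not come from the answer.

```python
import collections

sentence = ("As far as the laws of mathematics refer to reality they are not "
            "certain as far as they are certain they do not refer to reality")
words = sentence.split()

# 1. Plain dictionary with an explicit membership test.
plain = {}
for word in words:
    if word not in plain:
        plain[word] = 0
    plain[word] += 1

# 2. defaultdict(int): a missing key starts at 0 automatically.
by_default = collections.defaultdict(int)
for word in words:
    by_default[word] += 1

# 3. Counter: does the counting in a single call.
by_counter = collections.Counter(words)

# All three should map every word to the same count.
print(plain == dict(by_default) == dict(by_counter))
```

Since the three results are interchangeable for this task, the choice is mostly about how much bookkeeping you want to write by hand.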
Here is a very bad example of doing this without using anything other than lists:

x = "As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"
words = x.split(" ")
words.sort()

words_copied = x.split(" ")
words_copied.sort()

for word in words:
    count = 0
    while(True):
        try:
            index = words_copied.index(word)
            count += 1
            del words_copied[index]
        except ValueError:
            if count != 0:
                print(word + " is repeated " + str(count) + " times.")
            break
Edit: here is a much better way to do it:

x = "As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"
words = x.split(" ")
words.sort()

last_word = ""
for word in words:
    if word != last_word:
        count = [i for i, w in enumerate(words) if w == word]
        print(word + " is repeated " + str(len(count)) + " times.")
    last_word = word
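
The list comprehension above rescans the whole list for every distinct word. Because the list is sorted, equal words sit next to each other, so a single pass is enough. A minimal sketch of that idea follows; the helper name count_runs is hypothetical, and the output format is borrowed from the question rather than from this answer.

```python
def count_runs(sorted_words):
    """Return (word, count) pairs from an already sorted list of words."""
    runs = []
    for word in sorted_words:
        if runs and runs[-1][0] == word:
            # Same word as the previous one: extend the current run.
            runs[-1][1] += 1
        else:
            # A different word starts a new run.
            runs.append([word, 1])
    return [(word, count) for word, count in runs]


x = ("As far as the laws of mathematics refer to reality they are not "
     "certain as far as they are certain they do not refer to reality")
for word, count in count_runs(sorted(x.split())):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
```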

To print the duplicated words from a string in sorted order:

from itertools import groupby 

mysentence = ("As far as the laws of mathematics refer to reality "
              "they are not certain as far as they are certain "
              "they do not refer to reality")
words = mysentence.split() # get a list of whitespace-separated words
for word, duplicates in groupby(sorted(words)): # sort and group duplicates
    count = len(list(duplicates)) # count how many times the word occurs
    print('"{word}" is repeated {count} time{s}'.format(
            word=word, count=count,  s='s'*(count > 1)))
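Output

"As" is repeated 1 time
"are" is repeated 2 times
"as" is repeated 3 times
"certain" is repeated 2 times
"do" is repeated 1 time
"far" is repeated 2 times
"laws" is repeated 1 time
"mathematics" is repeated 1 time
"not" is repeated 2 times
"of" is repeated 1 time
"reality" is repeated 2 times
"refer" is repeated 2 times
"the" is repeated 1 time
"they" is repeated 3 times
"to" is repeated 2 times
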
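The sorted() call in the groupby answer is doing real work: itertools.groupby only merges items that are adjacent. A small sketch on a shortened, made-up input shows the difference:

```python
from itertools import groupby

words = "as far as".split()

# Unsorted: the two "as" entries are not adjacent, so they form separate groups.
unsorted_groups = [(word, len(list(group))) for word, group in groupby(words)]
print(unsorted_groups)  # [('as', 1), ('far', 1), ('as', 1)]

# Sorted first: equal words become adjacent and are grouped together.
sorted_groups = [(word, len(list(group))) for word, group in groupby(sorted(words))]
print(sorted_groups)  # [('as', 2), ('far', 1)]
```
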
Hey, I have tried this on Python 2.7 (Mac), since that is the version I have, so try to follow the logic:

from collections import Counter

mysentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""

mysentence = dict(Counter(mysentence.split()))
for i in sorted(mysentence.keys()):
    print ('"'+i+'" is repeated '+str(mysentence[i])+' time.')
I hope this is what you are looking for. If it is not, let me know; I am happy to learn new things.

"As" is repeated 1 time.
"are" is repeated 2 time.
"as" is repeated 3 time.
"certain" is repeated 2 time.
"do" is repeated 1 time.
"far" is repeated 2 time.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 time.
"of" is repeated 1 time.
"reality" is repeated 2 time.
"refer" is repeated 2 time.
"the" is repeated 1 time.
"they" is repeated 3 time.
"to" is repeated 2 time.

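The loop in the last answer always prints "time." even when the count is greater than one, so its output differs slightly from what the question asks for. A possible variant, written for Python 3 and using a hypothetical lines list to collect the output, picks the suffix from the count:

```python
from collections import Counter

mysentence = ("As far as the laws of mathematics refer to reality they are not "
              "certain as far as they are certain they do not refer to reality")

counts = Counter(mysentence.split())
lines = []
for word in sorted(counts):
    # Add the plural "s" only when the word occurs more than once.
    suffix = "s" if counts[word] > 1 else ""
    lines.append('"{}" is repeated {} time{}.'.format(word, counts[word], suffix))
print("\n".join(lines))
```
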
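You should look at the collections.Counter class. It is very relevant to your use case.
@ChrisArena: He says "I used counter" in the first sentence…
Why are all of your variables called y? Are you trying to make your code confusing, or are most of your other keys broken?
@abarnert I am not sure he meant the official "Counter"; I think he meant that he was trying to count. It is clearly not in his code.
@ChrisArena: You may be right. It is hard to tell from such a vague question.
I think it is better to use split() rather than split(" "), because the latter adds '' and '\n' to the list, and we do not want to treat those as "words". split() splits on spaces and newlines.
Nice explanation! One small thing: when (pre-)explaining the equivalent functionality of Counter, it is better to use word not in counts rather than not counts.get(word). Besides being more idiomatic, and more correct with falsy values in other (non-Counter) cases, it also makes it clearer that you are checking for a new word that has not been seen before.
@PiotrDabkowski: Good point. But the input the OP posted has no newlines. If his real input does, I bet it also has punctuation, which means we need more than just str.split (whether that is re.findall, re.split, str.split plus str.translate, …)
Thank you for the explanation. Sorry, I am new to programming :)
+1 for sorted(Counter(words)).
I have provided one. This does not matter for the OP, but here is one.
Probably overkill… but any chance for anyone to learn groupby is a good thing. :)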
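
The points raised in the comments about split(" ") and punctuation can be illustrated with a short sketch. The sample text and the regular expression below are assumptions made for this example (the pattern keeps runs of ASCII letters and apostrophes); real input may need a different tokenizer.

```python
import re
from collections import Counter

# Made-up input with a double space, a newline and punctuation.
text = "As far as  the laws,\nas far as they are certain."

# split(" ") splits on single spaces only: the double space produces an empty
# string, and the newline stays inside one of the tokens.
print(text.split(" "))

# split() with no argument splits on any run of whitespace.
print(text.split())

# Assumed tokenizer: keep runs of letters and apostrophes, dropping punctuation.
words = re.findall(r"[A-Za-z']+", text)
counts = Counter(words)
for word, count in sorted(counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
```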