Python: counting the sentences that have the same number of words


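I have the following problem:

Given a text, count how many sentences have each number of words. For example, the input

s = 'I am in grade 12. I want to go to Harvard.'
should give
{5:1,6:1}

Here is my attempt: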

self.sentence_lengths = {}
for i in range(len(s)):
    if s[i] == '.':
        sentence = s[:i]
        l = len(sentence.split(' '))
        if l in self.sentence_lengths:
            self.sentence_lengths[l] += 1
        else:
            self.sentence_lengths[l] = 1

This gives me the (wrong) result
{5:1,11:1}

First, define a general-purpose function that counts the words in a sentence:

def word_count(sentence):
    return len(sentence.split(' '))
And another one that splits the text into sentences:

def sentences_from(text):
    stripped = (s.strip() for s in text.split('.'))
    return [s for s in stripped if s]
Then (renaming your `s` to the clearer `text`), the mapping you need is simply a tally of how many sentences share each word count.
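The snippet that originally followed appears to be missing from this copy. As a minimal sketch, assuming the intended mapping is a tally built with `collections.Counter`, it could look like this (the two helpers are repeated so the block runs on its own; `length_counts` is a name made up for illustration):

```python
from collections import Counter


def word_count(sentence):
    return len(sentence.split(' '))


def sentences_from(text):
    stripped = (s.strip() for s in text.split('.'))
    return [s for s in stripped if s]


text = 'I am in grade 12. I want to go to Harvard.'

# Assumption: the missing mapping counts how many sentences share each word count.
length_counts = dict(Counter(word_count(s) for s in sentences_from(text)))
print(length_counts)  # {5: 1, 6: 1}
```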


Obviously, just like your code, these functions will still miss many important edge cases.
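To make that concrete, here are a few inputs on which the two helpers go wrong. The example strings are invented for illustration and are not from the question:

```python
def word_count(sentence):
    return len(sentence.split(' '))


def sentences_from(text):
    stripped = (s.strip() for s in text.split('.'))
    return [s for s in stripped if s]


# Two consecutive spaces produce an empty "word" when splitting on ' '.
double_space = word_count('I  am')  # 3, not 2

# A period inside an abbreviation is treated as a sentence boundary.
abbreviation = sentences_from('Mr. Smith left.')  # ['Mr', 'Smith left']

# Other terminators such as '?' are not recognised at all.
question = sentences_from('Who left? Mr Smith.')  # ['Who left? Mr Smith']
```

Calling `split()` with no argument would likely take care of the first case; the other two need a smarter sentence splitter.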

You could try something like this:

from itertools import count

s = 'I am in grade 12. I want to go to Harvard.'
sentences = list(filter(None, s.split('.')))  # filter removes empty string
# sentences holds ['I am in grade 12', ' I want to go to Harvard']

# number the sentences from 1 and pair each number with its word count
lengths = dict(zip(count(1), (len(x.split()) for x in sentences)))
print(lengths)  # {1: 5, 2: 6}: first sentence has 5 words, second has 6 words

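Note: the sentence length is stored as the dictionary value rather than the key, because with the length as the key, several sentences of the same length would overwrite one another.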
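A small sketch of the overwriting the note describes. It assumes a third sentence, which is not in the original example, so that two sentences have five words. The names `by_position`, `by_length` and `tally` are hypothetical:

```python
from itertools import count

# Assumption: a third five-word sentence is appended to the original example.
s = 'I am in grade 12. I want to go to Harvard. My sister is in Yale.'
sentences = list(filter(None, s.split('.')))
lengths = [len(x.split()) for x in sentences]  # [5, 6, 5]

# Length as the value: every sentence keeps its own entry.
by_position = dict(zip(count(1), lengths))  # {1: 5, 2: 6, 3: 5}

# Length as the key: the third sentence overwrites the first.
by_length = dict(zip(lengths, count(1)))  # {5: 3, 6: 2}

# Tallying instead of overwriting gives the shape the question asks for.
tally = {}
for n in lengths:
    tally[n] = tally.get(n, 0) + 1
# tally is now {5: 2, 6: 1}
```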

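Print `sentence` right after the line `sentence = s[:i]` and you will see the problem: `s[:i]` is everything before the current index, including all the previous sentences. I suggest first splitting the string on the period (.) into sentences and then processing them one by one.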
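If you would rather keep the index-based loop, a minimal sketch of a fix is to remember where the current sentence starts. A plain `sentence_lengths` variable stands in for `self.sentence_lengths` here, since the surrounding class is not shown, and `start` is a name introduced for illustration:

```python
s = 'I am in grade 12. I want to go to Harvard.'

sentence_lengths = {}
start = 0  # index where the current sentence begins
for i in range(len(s)):
    if s[i] == '.':
        sentence = s[start:i]  # only the text since the previous period
        start = i + 1
        n = len(sentence.split())  # split() with no argument ignores the leading space
        sentence_lengths[n] = sentence_lengths.get(n, 0) + 1

print(sentence_lengths)  # {5: 1, 6: 1}
```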