Python 2.7: get the most frequent word in a sentence and, on a tie, return the word that comes first alphabetically
I wrote the code below. It runs, but the problem I am facing is that when two words in a sentence are repeated the same number of times, the code does not return the one that comes first alphabetically. Can anyone suggest an alternative? The code will be evaluated in Python 2.7.

"""Quiz: Most Frequent Word"""

def most_frequent(s):
    """Return the most frequently occurring word in s."""

    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""
    word_list_array = s.split()

    """Step 2 - sort the list alphabetically"""
    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
       create another array containing the word and the frequency in which it is repeated"""
    wordfreq = []
    freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
        freq_wordsort = zip(wordfreq, word_sort)

    """Step 4 - output the array having the maximum first index variable and output the word in that array"""
    max_word = max(freq_wordsort)
    word = max_word[-1]
    result = word
    return result

def test_run():
    """Test most_frequent() with some inputs."""
    print most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady") # expected output: 'down'
    print most_frequent("betty bought a bit of butter but the butter was bitter") # expected output: 'butter'

if __name__ == '__main__':
    test_run()
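The tie-breaking problem appears to come from how Python compares tuples: max over (count, word) pairs compares the counts first and, when they are equal, falls back to comparing the words, so it picks the alphabetically last of the tied words. A minimal sketch of that behaviour follows; the sample pairs are made up for illustration and are not produced by the code above.

```python
# Hypothetical (count, word) pairs with a tie on the count.
pairs = [(4, "down"), (4, "falling"), (2, "bridge")]

# Tuples compare element by element: equal counts fall through to the word,
# so max() returns the pair with the alphabetically LAST word.
largest = max(pairs)

# Negating the count in the key lets min() select the highest count and,
# among ties, the alphabetically FIRST word.
preferred = min(pairs, key=lambda p: (-p[0], p[1]))

print(largest)    # (4, 'falling')
print(preferred)  # (4, 'down')
```

The key function is only one possible way to express the rule "highest count, then alphabetical order"; it runs unchanged on Python 2.7 and Python 3.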
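I found that a neat solution is possible with the list index method, without changing your code much. Once you have the highest frequency (max_word), call the index method on wordfreq with max_word as the argument; it returns the position of the first occurrence of that value in the list. Because word_sort is sorted alphabetically, that first position belongs to the alphabetically first of the tied words, so you can return the word at that index in word_sort.

The code is shown below (I removed the zip call, since it is no longer needed, and added two simpler examples):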
"""Quiz: Most Frequent Word"""

def most_frequent(s):
    """Return the most frequently occurring word in s."""

    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""
    word_list_array = s.split()

    """Step 2 - sort the list alphabetically"""
    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
       create another array containing the word and the frequency in which it is repeated"""
    wordfreq = []
    # freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
        # freq_wordsort = zip(wordfreq, word_sort)

    """Step 4 - output the array having the maximum first index variable and output the word in that array"""
    max_word = max(wordfreq)
    word = word_sort[wordfreq.index(max_word)] # <--- solution!
    result = word
    return result

def test_run():
    """Test most_frequent() with some inputs."""
    print(most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady")) # output: 'down'
    print(most_frequent("betty bought a bit of butter but the butter was bitter")) # output: 'butter'
    print(most_frequent("a a a a b b b b")) # output: 'a'
    print(most_frequent("z z j j z j z j")) # output: 'j'

if __name__ == '__main__':
    test_run()
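For comparison, here is a sketch of an alternative that counts each word once with collections.Counter (available since Python 2.7) instead of calling count for every word. This is not the approach from the answer above. The function name most_frequent_counter is hypothetical, and the sketch lowercases the input, which is an assumption based on the "convert all letters into lower case" note in Step 1.

```python
from collections import Counter


def most_frequent_counter(s):
    """Return the most frequent word in s; ties go to the alphabetically first word."""
    # Assumptions: whitespace-delimited words, no punctuation, case-insensitive.
    counts = Counter(s.lower().split())
    # Key: highest count first (negated), then the word itself to break ties.
    # Note: min() raises ValueError if s contains no words.
    return min(counts, key=lambda w: (-counts[w], w))


print(most_frequent_counter("london bridge is falling down falling down falling down london bridge is falling down my fair lady"))  # down
print(most_frequent_counter("betty bought a bit of butter but the butter was bitter"))  # butter
print(most_frequent_counter("z z j j z j z j"))  # j
```

Counting once keeps the work roughly linear in the number of words, whereas calling count inside the loop rescans the list for every word.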