Python 2.7: Get the most frequent word in a sentence; on a tie, return the word that comes first alphabetically

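I wrote the code below. It works, but I have one problem: if two words in a sentence occur the same number of times, the code does not return the one that comes first alphabetically. Can anyone suggest a different approach? This code will be evaluated in Python 2.7.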

"""Quiz: Most Frequent Word"""

def most_frequent(s):
    """Return the most frequently occurring word in s."""

    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""


    word_list_array = s.split()


    """Step 2 - sort the list alphabetically"""

    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
                create another array containing the word and the frequency in which it is repeated"""

    wordfreq = []
    freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
        freq_wordsort = zip(wordfreq, word_sort)


    """Step 4 - output the array having the maximum first index variable and output the word in that array"""

    max_word = max(freq_wordsort)
    word = max_word[-1]


    result = word

    return result


def test_run():
    """Test most_frequent() with some inputs."""
    print most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady") # expected: 'down' (this version returns 'falling')
    print most_frequent("betty bought a bit of butter but the butter was bitter") # output: 'butter'


if __name__ == '__main__':
    test_run()
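
For context, the tie most likely goes to the wrong word because `max` compares the `(count, word)` tuples element by element: when the counts are equal, it falls back to comparing the words and keeps the larger one, which is the alphabetically last. A minimal sketch, using hypothetical sample pairs rather than the real output of the function:

```python
# Hypothetical (count, word) pairs like those produced by zip(wordfreq, word_sort).
pairs = [(2, "bridge"), (4, "down"), (4, "falling")]

# Tuples compare element by element: the counts tie at 4,
# so the words are compared and the larger string wins.
picked = max(pairs)

print(picked)  # (4, 'falling')
```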

I found that a good solution is possible using the list `index` method, without changing your code much.

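Once you have the highest frequency (`max_word`), call the `index` method on `wordfreq` with `max_word` as the argument to get its position in the list, then return the word at that same index in `word_sort`.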
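
The reason this handles ties is that `list.index` returns the position of the first match, and `word_sort` is already in alphabetical order. A small sketch of that behavior, with a hypothetical word list standing in for the real input:

```python
# Hypothetical sample: words already sorted alphabetically.
word_sort = ["down", "down", "falling", "falling", "lady"]

# Frequency of each word, kept in the same order as word_sort.
wordfreq = [word_sort.count(w) for w in word_sort]  # [2, 2, 2, 2, 1]

highest = max(wordfreq)               # 2
first_pos = wordfreq.index(highest)   # 0, because index() stops at the first match
tied_winner = word_sort[first_pos]    # "down", the alphabetically first of the tied words
```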

The modified code is below (I removed the `zip` call, since it is no longer needed, and added two simpler test cases):

"""Quiz: Most Frequent Word"""



def most_frequent(s):
    """Return the most frequently occurring word in s."""

    """ Step 1 - The following assumptions have been made:
        - Space is the default delimiter
        - There are no other punctuation marks that need removing
        - Convert all letters into lower case"""


    word_list_array = s.split()


    """Step 2 - sort the list alphabetically"""

    word_sort = sorted(word_list_array, key=str.lower)

    """Step 3 - count the number of times word has been repeated in the word_sort array.
                create another array containing the word and the frequency in which it is repeated"""

    wordfreq = []
    # freq_wordsort = []
    for w in word_sort:
        wordfreq.append(word_sort.count(w))
        # freq_wordsort = zip(wordfreq, word_sort)


    """Step 4 - output the array having the maximum first index variable and output the word in that array"""

    max_word = max(wordfreq)
    word = word_sort[wordfreq.index(max_word)] # <--- solution!


    result = word

    return result


def test_run():
    """Test most_frequent() with some inputs."""
    print(most_frequent("london bridge is falling down falling down falling down london bridge is falling down my fair lady")) # output: 'down'
    print(most_frequent("betty bought a bit of butter but the butter was bitter")) # output: 'butter'
    print(most_frequent("a a a a b b b b")) #output: 'a'
    print(most_frequent("z z j j z j z j")) #output: 'j'


if __name__ == '__main__':
    test_run()
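
As a possible alternative, the same result can be sketched with `collections.Counter` (available since Python 2.7), which avoids calling `count` once per word. This is not the approach from the answer above; the function name `most_frequent_counter` is made up for illustration, and lowercasing the whole input is an assumption based on the "convert all letters into lower case" note in Step 1.

```python
from collections import Counter


def most_frequent_counter(s):
    """Return the most frequent word in s; ties go to the alphabetically first word."""
    # Assumption: words are separated by whitespace and compared case-insensitively.
    counts = Counter(s.lower().split())
    # The smallest key wins: a higher count gives a smaller negated value,
    # and equal counts fall back to alphabetical order of the word.
    return min(counts, key=lambda word: (-counts[word], word))


print(most_frequent_counter("z z j j z j z j"))  # j
```

Note that this sketch raises `ValueError` on an empty string, because `min` has nothing to choose from.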