Python exercise with URL and string counting


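I have a small problem with an exercise I have to do: basically, the task is to open a URL, convert it to a given format, and count the occurrences of given strings in the text.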

import urllib2 as ul 

def word_counting(url, code, words):
    page = ul.urlopen(url)
    text = page.read()
    decoded = ext.decode(code)
    result = {}

    for word in words:
        count = decoded.count(word)
        counted = str(word) + ":" + " " + str(count)
        result.append(counted)

    return finale

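The result I should get looks like "word1: x, word2: y, word3: z", where x, y, z are the numbers of occurrences. But I seem to get only a single number: when I run the test program, I get just 9 for the first case, 14 for the second, and 5 for the third, with the other occurrences and the overall counts missing.

What am I doing wrong? Thanks in advance.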

You are not adding to the dictionary correctly (a dict has no append method).

The correct way is

result[key] = value

So for your loop it would be

for word in words:
  count = decoded.count(word)
  result[word] = str(count)
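
Putting the fix together, a minimal sketch of the whole function might look like the following. It assumes Python 3, where urllib2 has become urllib.request, and it stores the counts as integers. The helper name count_words is made up here so that the counting part can be tried without a network connection.

```python
from urllib.request import urlopen  # Python 3 counterpart of urllib2 (assumption)


def count_words(text, words):
    """Return a dict mapping each word to its number of occurrences in text."""
    result = {}
    for word in words:
        # str.count counts non-overlapping substring matches
        result[word] = text.count(word)
    return result


def word_counting(url, code, words):
    """Fetch url, decode the bytes with the encoding `code`, count the words."""
    with urlopen(url) as page:
        raw = page.read()
    decoded = raw.decode(code)
    return count_words(decoded, words)


# Offline check on a made-up sample string
sample = "spam eggs spam ham spam"
print(count_words(sample, ["spam", "eggs", "bacon"]))
# {'spam': 3, 'eggs': 1, 'bacon': 0}
```

Keeping the download separate from the counting makes it easier to see which of the two steps is misbehaving.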

Not with decoding, but using

.count()

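Don't forget list and dictionary comprehensions. They can handle larger datasets very efficiently (especially when analysing large web pages, as in your example). And even if your dataset is small, you could argue that the dict comprehension syntax is cleaner, more Pythonic, and so on.

So in this case, I would use something like: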

result = {word : decoded.count(word) for word in words}
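
To get from that dictionary to the "word1: x, word2: y, word3: z" string the question asks for, the pairs can be joined afterwards. A small sketch, where the sample text and the name format_counts are invented for illustration:

```python
def format_counts(counts):
    """Render a dict of counts as 'word1: x, word2: y'."""
    return ", ".join("{}: {}".format(word, n) for word, n in counts.items())


decoded = "the cat and the hat and the bat"  # stand-in for the decoded page
words = ["the", "and", "cat"]

# Same dict comprehension as above, applied to the sample text
result = {word: decoded.count(word) for word in words}
print(format_counts(result))
# the: 3, and: 2, cat: 1
```

The order of the output follows the order of words, since dictionaries keep insertion order in Python 3.7 and later.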

You can also use collections.Counter:

>>> from collections import Counter
>>> words = ['apple', 'apple', 'pear', 'banana']
>>> Counter(words)
Counter({'apple': 2, 'pear': 1, 'banana': 1})
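
Note that Counter tallies the items of an iterable, so to apply it to the page text, the text first has to be split into tokens. The sketch below does that with str.split and then looks up only the requested words; the sample text is made up. Unlike str.count, this counts whole whitespace-separated tokens rather than substrings, so the two approaches can give different numbers.

```python
from collections import Counter

decoded = "the cat sat on the other mat"  # stand-in for the decoded page
words = ["the", "cat", "dog"]

# Count every whitespace-separated token once, then pick out the wanted words.
# A Counter returns 0 for keys it has never seen.
token_counts = Counter(decoded.split())
result = {word: token_counts[word] for word in words}
print(result)
# {'the': 2, 'cat': 1, 'dog': 0}

# str.count also matches the "the" inside "other", so it reports one more.
print(decoded.count("the"))
# 3
```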
