How do I iterate over all the keys in a Python dictionary?

python dictionary python-3.x machine-learning

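I need to count how often every key of a dictionary d occurs across all the files in the folder "individual-articles". The folder contains about 20,000 txt files named 1, 2, 3, 4, and so on. For example, d["Britain"] = [5, 76, 289] must return the number of times Britain appears in the files 5.txt, 76.txt and 289.txt in that folder, and I also need its frequency across all the files in the same folder.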

import collections
import sys
import os
import re
sys.stdout=open('dictionary.txt','w')
from collections import Counter
from glob import glob


folderpath='d:/individual-articles'
counter=Counter()


filepaths = glob(os.path.join(folderpath,'*.txt'))

def words_generator(fileobj):
    for line in fileobj:
        for word in line.split():
            yield word
word_count_dict = {}
for file in filepaths:
    f = open(file,"r")
    words = words_generator(f)
    for word in words:
        if word not in word_count_dict:
            word_count_dict[word] = {"total": 0}
        if file not in word_count_dict[word]:
            word_count_dict[word][file] = 0
        word_count_dict[word][file] += 1
        word_count_dict[word]["total"] += 1
for k in word_count_dict.keys():
    for filename in word_count_dict[k]:
        if filename == 'total': continue
        counter.update(filename)

for k in word_count_dict.keys():
    for count in counter.most_common():
        print('{}  {}'.format(word_count_dict[k],count))

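How can I find the frequency of Britain in just those files that are listed as the value of that key in the dictionary?

I need to store these values in another dictionary, d2. For the same example, d2 must contain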

(Britain, 26, 1200) (Spain, 526795) (France, 45568)

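where 26 is the frequency of the word Britain in the files 5.txt, 76.txt and 289.txt, and 1200 is the frequency of the word Britain across all files. The same goes for Spain and France.

I am using a Counter here, and I think that is the flaw, because so far everything works fine except my final loop.

I am a Python newbie and have not tried much yet! Please help.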

word_count_dict["Britain"] is a regular dictionary. Just loop over it:

for filename in word_count_dict["Britain"]:
    if filename == 'total': continue
    print("Britain appears in {} {} times".format(filename, word_count_dict["Britain"][filename]))

Or retrieve all the keys with:

word_count_dict["Britain"].keys()

Note that this dictionary contains one special key, total.
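As a small illustration of the points above, the sketch below iterates over one per-word dictionary and separates the special total key from the per-file entries. The counts and file paths are made-up sample data, not values from the question.

```python
# Hypothetical per-word dictionary in the shape built by the question's code.
britain_counts = {
    "total": 7,
    "d:/individual-articles/5.txt": 4,
    "d:/individual-articles/76.txt": 3,
}

# Keep only the per-file entries by dropping the special "total" key.
per_file = {name: n for name, n in britain_counts.items() if name != "total"}

# Iterating over a dict yields its keys; .keys() would behave the same here.
for filename in per_file:
    print("Britain appears in {} {} times".format(filename, per_file[filename]))

# The per-file counts should add up to the stored total.
assert sum(per_file.values()) == britain_counts["total"]
```

Filtering once up front avoids repeating the `if filename == 'total': continue` check in every loop.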

It may just be that the indentation is off, but it looks like you are not counting the file entries correctly:

if file not in word_count_dict[word]:
    word_count_dict[word][file] = 0
    word_count_dict[word][file] += 1              
    word_count_dict[word]["total"] += 1

As written, a word is only counted (+= 1) when the file has not been seen before in that word's dictionary. Corrected:

if file not in word_count_dict[word]:
    word_count_dict[word][file] = 0
word_count_dict[word][file] += 1              
word_count_dict[word]["total"] += 1
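The same counting can be written without any membership checks. The following is only a sketch under a few assumptions: the helper name count_words is hypothetical, files are keyed by their name without the .txt extension, and totals are kept in a separate Counter rather than under a "total" key.

```python
import os
from collections import Counter, defaultdict
from glob import glob


def count_words(folderpath):
    """Return (per_file, totals) word counts for every .txt file in folderpath."""
    per_file = defaultdict(Counter)  # word -> Counter({file stem: count})
    totals = Counter()               # word -> count across all files
    for path in glob(os.path.join(folderpath, "*.txt")):
        # Key by the file name without its extension, e.g. "5" for "5.txt".
        stem = os.path.splitext(os.path.basename(path))[0]
        with open(path, "r") as f:   # the with block closes the file afterwards
            for line in f:
                for word in line.split():
                    # Missing entries start at 0, so no "if ... not in" checks.
                    per_file[word][stem] += 1
                    totals[word] += 1
    return per_file, totals
```

Calling `per_file, totals = count_words('d:/individual-articles')` would then give `per_file["Britain"]["5"]` for a single file and `totals["Britain"]` for all files.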

To extend this to arbitrary words, loop over the outer word_count_dict:

# .items() is the Python 3 spelling; Python 2 used .iteritems()
for word, counts in word_count_dict.items():
    print('Total counts for word {}: {}'.format(word, counts['total']))
    for filename, count in counts.items():
        if filename == 'total': continue
        print("{} appears in {} {} times".format(word, filename, count))
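For the d2 the question asks about, one possible approach is sketched below. It assumes d maps each word to a list of file numbers and that word_count_dict has the shape built by the question's code (full file paths as keys, plus "total"). The helper name build_d2 and the sample data are hypothetical. Note also that counter.update(filename) in the question counts the individual characters of the path, because Counter.update treats a string as an iterable of characters, so the Counter is probably not needed at all.

```python
import os


def build_d2(d, word_count_dict):
    """Map each word in d to (count in its listed files, count in all files)."""
    d2 = {}
    for word, file_numbers in d.items():
        # Words that never occurred get an empty entry with a total of 0.
        counts = word_count_dict.get(word, {"total": 0})
        wanted = {str(n) for n in file_numbers}
        in_listed = sum(
            count
            for path, count in counts.items()
            if path != "total"
            # Compare on the file name without directory or extension.
            and os.path.splitext(os.path.basename(path))[0] in wanted
        )
        d2[word] = (in_listed, counts["total"])
    return d2


# Hypothetical sample data in the shape produced by the question's code.
word_count_dict = {
    "Britain": {
        "total": 6,
        "d:/individual-articles/5.txt": 2,
        "d:/individual-articles/76.txt": 3,
        "d:/individual-articles/9.txt": 1,
    },
}
d = {"Britain": [5, 76, 289]}
print(build_d2(d, word_count_dict))  # {'Britain': (5, 6)}
```

Here 9.txt contributes to the total but not to the first number, because 9 is not listed in d["Britain"].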

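Suppose I have several words, such as "Britain", "France" and "Spain". Would this also work: for k in word_count_dict.keys():

@radhika: Exactly. word_count_dict[k] is itself a dictionary mapping filenames to counts.

So is this correct? for k in word_count_dict.keys(): for filename in word_count_dict[k]: if filename == 'total': continue; print(k + " appears in {} {} times".format(filename, word_count_dict[k][filename]))

@radhika: I would say so, yes. :-)

I have edited the question. Could you take a look at the final loop? Thanks in advance.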