Python: iterating over multiple values in a dictionary?
Tags: python, list, python-2.7, dictionary

I have a list of words and a dictionary:

word_list = ["it's","they're","there's","he's"]
and a dictionary that records how often each word in word_list appears in several documents:

dict = [('document1',{"it's": 0,"they're": 2,"there's": 5,"he's": 1}),
('document2',{"it's": 4,"they're": 2,"there's": 3,"he's": 0}),
('document3',{"it's": 7,"they're": 0,"there's": 4,"he's": 1})]
I would like to build a data structure (possibly a DataFrame) that looks like this:

file       word       count
document1  it's        0
document1  they're     2
document1  there's     5
document1  he's        1
document2  it's        4
document2  they're     2
document2  there's     3
document2  he's        0
document3  it's        7
document3  they're     0
document3  there's     4
document3  he's        1
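I am trying to find the most frequently used words in these documents. I have over 900 documents.

I was thinking of something like the following: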

res = {}
for i in words_list:
    count = 0
    for j in dict.items():
         if i == j:
              count = count + 1
              res[i,j] = count

Where can I go from here?

How about something like this:

word_list = ["it's","they're","there's","he's"]

frequencies = [('document1',{"it's": 0,"they're": 2,"there's": 5,"he's": 1}),
('document2',{"it's": 4,"they're": 2,"there's": 3,"he's": 0}),
('document3',{"it's": 7,"they're": 0,"there's": 4,"he's": 1})]

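# Build one {"file", "word", "count"} record per (document, word) pair.
result = []
for document in frequencies:
    # document[0] is the file name, document[1] is its word -> count dict
    for word in word_list:
        result.append({"file":document[0], "word":word,"count":document[1][word]})

# The parenthesised form prints the same thing on Python 2.7 and Python 3.
print(result)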
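Since the stated goal is to find the most frequently used words across all documents, the per-document counts can also be summed before ranking. The following is only a sketch under the assumption that the data has the same shape as frequencies above; collections.Counter is a swapped-in technique that the answer itself does not use, and the names totals and top_two are introduced purely for illustration.

```python
from collections import Counter

frequencies = [
    ("document1", {"it's": 0, "they're": 2, "there's": 5, "he's": 1}),
    ("document2", {"it's": 4, "they're": 2, "there's": 3, "he's": 0}),
    ("document3", {"it's": 7, "they're": 0, "there's": 4, "he's": 1}),
]

# Sum each word's count over every document.
totals = Counter()
for _name, counts in frequencies:
    totals.update(counts)  # update() adds the counts from a mapping

# most_common(n) returns the n (word, total) pairs with the highest totals.
top_two = totals.most_common(2)
print(top_two)
```

With 900+ documents the loop is the same; only the length of frequencies changes.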

First of all, your dict is not a dict. It should be built like this:

d = {'document1':{"it's": 0,"they're": 2,"there's": 5,"he's": 1},
    'document2':{"it's": 4,"they're": 2,"there's": 3,"he's": 0},
    'document3':{"it's": 7,"they're": 0,"there's": 4,"he's": 1}}
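If the data already exists as a list of (name, counts) tuples, as in the question, it probably does not need to be retyped: the built-in dict() accepts a sequence of key/value pairs. A small sketch, assuming the list is called frequencies (as in the other answer) rather than dict, because a variable named dict would shadow the built-in that is called here.

```python
frequencies = [
    ("document1", {"it's": 0, "they're": 2, "there's": 5, "he's": 1}),
    ("document2", {"it's": 4, "they're": 2, "there's": 3, "he's": 0}),
    ("document3", {"it's": 7, "they're": 0, "there's": 4, "he's": 1}),
]

# dict() turns a sequence of (key, value) pairs into a mapping.
d = dict(frequencies)

# The nested counts are now reachable by document name, then by word.
print(d["document2"]["it's"])
```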
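Now that we have a dictionary, we can use pandas to build a DataFrame. To get the layout you want, we first have to build a list of lists from the dictionary. Then we create the DataFrame, label the columns, and sort.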

import pandas as pd

d = {'document1':{"it's": 0,"they're": 2,"there's": 5,"he's": 1},
    'document2':{"it's": 4,"they're": 2,"there's": 3,"he's": 0},
    'document3':{"it's": 7,"they're": 0,"there's": 4,"he's": 1}}

# Flatten the nested dict into [file, word, count] rows and label the columns.
d = pd.DataFrame([[k,k1,v1] for k,v in d.items() for k1,v1 in v.items()], columns = ['File','Words','Count'])
# Python 2.7 / older pandas: DataFrame.sort was removed in later pandas
# releases, where sort_values is the method to use.
print d.sort(['File','Count'], ascending=[1,1])

         File    Words  Count
1   document1     it's      0
0   document1     he's      1
3   document1  they're      2
2   document1  there's      5
4   document2     he's      0
7   document2  they're      2
6   document2  there's      3
5   document2     it's      4
11  document3  they're      0
8   document3     he's      1
10  document3  there's      4
9   document3     it's      7
If you want the top n matches, you can use groupby() together with the sort, followed by head() or tail():

d = d.sort(['File','Count'], ascending=[1,1]).groupby('File').head(2)

         File    Words  Count
1   document1     it's      0
0   document1     he's      1
4   document2     he's      0
7   document2  they're      2
11  document3  they're      0
8   document3     he's      1
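Note that head(2) on an ascending sort returns the two least frequent words in each file. The following is a hedged sketch of the same idea for Python 3 and a current pandas release, sorting Count from high to low so that head(n) yields the most frequent words instead. The names rows, df, ordered and top_n are illustrative and not part of the answer.

```python
import pandas as pd

d = {
    "document1": {"it's": 0, "they're": 2, "there's": 5, "he's": 1},
    "document2": {"it's": 4, "they're": 2, "there's": 3, "he's": 0},
    "document3": {"it's": 7, "they're": 0, "there's": 4, "he's": 1},
}

# Flatten the nested dict into [file, word, count] rows.
rows = [[name, word, count] for name, counts in d.items() for word, count in counts.items()]
df = pd.DataFrame(rows, columns=["File", "Words", "Count"])

# sort_values is the replacement for the older DataFrame.sort;
# File is sorted ascending, Count descending.
ordered = df.sort_values(["File", "Count"], ascending=[True, False])

# groupby(...).head(n) keeps the first n rows of each group, in sorted order.
top_n = ordered.groupby("File").head(2)
print(top_n)
```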
The list comprehension returns a list of lists like this:

d = [['document1', "he's", 1], ['document1', "it's", 0], ['document1', "there's", 5], ['document1', "they're", 2], ['document2', "he's", 0], ['document2', "it's", 4], ['document2', "there's", 3], ['document2', "they're", 2], ['document3', "he's", 1], ['document3', "it's", 7], ['document3', "there's", 4], ['document3', "they're", 0]]
To build the dictionary properly, you would simply use something like this:

d['document1']['it\'s'] = 1
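That assignment assumes d['document1'] already exists; on an empty dict it raises KeyError. One possible way to build the nested structure incrementally is collections.defaultdict, sketched below. The observations list is a hypothetical input shape, filled with a few values copied from the question, and is not something the answer prescribes.

```python
from collections import defaultdict

# Hypothetical (file, word, count) observations, e.g. produced by a counting loop.
observations = [
    ("document1", "it's", 0),
    ("document1", "they're", 2),
    ("document2", "it's", 4),
]

# A missing outer key gets a fresh inner dict automatically.
d = defaultdict(dict)
for name, word, count in observations:
    d[name][word] = count

# Convert back to a plain dict for display.
print(dict(d))
```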
If for some reason you are set on using a list of (str, dict) tuples, you can use this list comprehension instead:

[[i[0],k1,v1] for i in d for k1,v1 in i[1].items()]
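With the list-of-tuples form, the per-document ranking can also be done without pandas. The following is a minimal sketch that uses heapq.nlargest from the standard library, which is a swapped-in technique and not the answer's method. The list is named frequencies instead of dict to avoid shadowing the built-in, and n and top_rows are illustrative names.

```python
import heapq

frequencies = [
    ("document1", {"it's": 0, "they're": 2, "there's": 5, "he's": 1}),
    ("document2", {"it's": 4, "they're": 2, "there's": 3, "he's": 0}),
    ("document3", {"it's": 7, "they're": 0, "there's": 4, "he's": 1}),
]

n = 2  # how many words to keep per document
top_rows = []
for name, counts in frequencies:
    # nlargest returns the n (word, count) pairs with the highest count,
    # ordered from largest to smallest.
    for word, count in heapq.nlargest(n, counts.items(), key=lambda item: item[1]):
        top_rows.append([name, word, count])

for row in top_rows:
    print(row)
```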

You should use the Python Pandas library to create the kind of DataFrame you show in your post.

Where do I start? Are there any methods I should look into?

Creating a variable named dict makes the built-in dict function inaccessible. You should rename it to something else. Also, it is not a dict; it is a list of tuples, each holding a string and a dict.

I get the following error: TypeError: string indices must be integers, not str. I cannot index by the word itself.

Did you run the code with the same data as I did? The only place it could fail is document[1][word], and all the keys in document[1] are strings in the data provided, so it should not fail. Edit: on second thought, that error means you are trying to index into a string with another string. Does your frequencies contain any raw strings?

I don't think so. It is literally like that, although much simplified compared with the actual data I am using. It follows exactly the same syntactic structure, but frequencies is easier to talk about. Yes, the code works fine without "count":document[1][word].

Run the following: [x for x in frequencies if type(x[1]) is str] and see if anything shows up.

Nice answer. One question: d.sort(['File','Count'], ascending=[1,1]) also changes the index. Is there any particular reason you did that?

@JoeR I just changed it so that File goes from low to high, and then Count from low to high. It is not necessary, but I think it looks much better.
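The TypeError discussed in the comments can be reproduced when the second element of an entry is a string instead of a dict. Below is a sketch with a deliberately malformed entry; the 'document2' row is invented here only to trigger the error and is not from the asker's data. It uses isinstance(..., dict) rather than the type(x[1]) is str check suggested in the comments, so it flags any non-dict value, not just strings.

```python
frequencies = [
    ("document1", {"it's": 0, "they're": 2}),
    ("document2", "it's: 4, they're: 2"),  # malformed on purpose: a str, not a dict
]

# Indexing a str with a str key raises TypeError.
try:
    frequencies[1][1]["it's"]
    raised = False
except TypeError:
    raised = True

# Collect every entry whose second element is not a dict.
bad_entries = [entry for entry in frequencies if not isinstance(entry[1], dict)]

print(raised)
print([entry[0] for entry in bad_entries])
```

If bad_entries is non-empty for the real data, those documents are the ones to inspect before building the rows.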