Python: How do I fix the following problems when writing a dictionary to CSV?
Hello, I am working with sklearn and using kmeans for natural language processing. I used kmeans to create clusters from comments, and then built a dictionary with the cluster number as the key and the list of comments as the associated value, as follows:
dict_clusters = {}
for i in range(0, len(kmeans.labels_)):
    #print(kmeans.labels_[i])
    #print(listComments[i])
    if not kmeans.labels_[i] in dict_clusters:
        dict_clusters[kmeans.labels_[i]] = []
    dict_clusters[kmeans.labels_[i]].append(listComments[i])
print("dictionary constructed")
I would like to write this dictionary to a CSV file. I tried:
Out = open("dictionary.csv", "wb")
w = csv.DictWriter(Out,dict_clusters.keys())
w.writerows(dict_clusters)
Out.close()
However, I am not sure what is wrong. I get the following error, and I am not sure whether it is related to numpy, since kmeans.labels_ contains multiple values:
Traceback (most recent call last):
  File "C:/Users/CleanFile.py", line 133, in <module>
    w.writerows(dict_clusters)
  File "C:\Program Files\Anaconda3\lib\csv.py", line 156, in writerows
    return self.writer.writerows(map(self._dict_to_list, rowdicts))
  File "C:\Program Files\Anaconda3\lib\csv.py", line 146, in _dict_to_list
    wrong_fields = [k for k in rowdict if k not in self.fieldnames]
TypeError: 'numpy.int32' object is not iterable
After getting feedback here, I tried:
with open("dictionary.csv", mode="wb") as out_file:
    writer = csv.DictWriter(out_file, headers=dict_clusters.keys())
    writer.writerow(dict_clusters)
I got:
Traceback (most recent call last):
  File "C:/Users/CleanFile.py", line 129, in <module>
    writer = csv.DictWriter(out_file, headers=dict_clusters.keys())
TypeError: __init__() missing 1 required positional argument: 'fieldnames'
I also tried passing a list containing the dictionary, w.writerows([dict_clusters]), with the file still opened in "wb" mode. Output:
Traceback (most recent call last):
  File "C:/Users/CleanFile.py", line 130, in <module>
    w.writerows([dict_clusters])
  File "C:\Program Files\Anaconda3\lib\csv.py", line 156, in writerows
    return self.writer.writerows(map(self._dict_to_list, rowdicts))
TypeError: a bytes-like object is required, not 'str'
The Python version I am using is:
3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
3.5.2
After many attempts, I decided to build my dictionary in a better way, as follows:
from collections import defaultdict
pairs = zip(y_pred, listComments)
dict_clusters2 = defaultdict(list)
for num, comment in pairs:
    dict_clusters2[num].append(comment)
However, some characters seem to make the creation of the CSV file fail:
import csv
with open('dict.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in dict_clusters2.items():
        writer.writerow([key, value])
Output:
Traceback (most recent call last):
  File "C:/Users/CleanFile.py", line 146, in <module>
    writer.writerow([key, value])
  File "C:\Program Files\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f609' in position 6056: character maps to <undefined>
The result I get is:
1 ['hello this is','the car is red',....'performing test']
2 ['we already have','another comment',...'strings strings']
.
.
19 ['we have',' comment music',...'strings strings dance']
My dictionary maps each key to a list of several comments, and I would like a CSV like this:
key1, value
key2, value
.
.
.
keyN, value

For example:
1,'hello this is','the car is red',....'performing test'
2,'we already have','another comment',...'strings strings'
.
.
19,'we have',' comment music',...'strings strings dance'
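However, some characters do not seem to be mapped correctly and everything fails. I would appreciate any help; thank you for your support.

You must provide a list of dictionaries to writerows: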
Out = open("dictionary.csv", "wb")
w = csv.DictWriter(Out,dict_clusters.keys())
w.writerows([dict_clusters])
Out.close()
You may be looking for writerow, which accepts a single dictionary object:
Out = open("dictionary.csv", "wb")
w = csv.DictWriter(Out,dict_clusters.keys())
w.writerow(dict_clusters)
Out.close()
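To make the difference concrete, here is a minimal sketch of why passing the dictionary itself to writerows fails, and of what a list of row dictionaries can look like. The sample data and the column names "cluster" and "comment" are assumptions made for illustration, not taken from the question, and plain ints stand in for the numpy.int32 labels. An in-memory buffer is used so that no file is needed.

```python
import csv
import io

# Hypothetical sample data standing in for dict_clusters.
dict_clusters = {0: ["we already have", "another comment"], 1: ["hello this is"]}

# Passing the dict itself: writerows iterates over its keys and treats each
# key as if it were a row dictionary, which raises an exception
# (the exact exception type depends on the Python version).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(dict_clusters.keys()))
try:
    writer.writerows(dict_clusters)
    failed = False
except (TypeError, AttributeError):
    failed = True

# One dictionary per row: each row holds a cluster number and one comment.
rows = [
    {"cluster": key, "comment": comment}
    for key, comments in dict_clusters.items()
    for comment in comments
]
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["cluster", "comment"])
writer.writeheader()
writer.writerows(rows)
output = buf.getvalue()
print(output)
```

This layout gives one line per comment rather than one line per cluster, so it may or may not match the CSV you are after.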
Aside: you may also want to consider using open as a context manager in a with block, to ensure the file is closed properly:
with open("dictionary.csv", mode="wb") as out_file:
    writer = csv.DictWriter(out_file, headers=dict_clusters.keys())
    writer.writerow(dict_clusters)
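On Python 3 the snippets above run into the two errors reported in the question: the csv module writes str, so the file has to be opened in text mode rather than "wb", and DictWriter takes fieldnames, not headers. The following is a sketch of a variant that should run on Python 3; the sample dictionary and the temporary output path are assumptions for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for dict_clusters.
dict_clusters = {0: ["we already have", "another comment"], 1: ["hello this is"]}

# Temporary location used only so the sketch does not touch the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "dictionary.csv")

# Text mode ("w") with newline="" is what the csv module expects on Python 3.
with open(out_path, mode="w", newline="", encoding="utf-8") as out_file:
    writer = csv.DictWriter(out_file, fieldnames=list(dict_clusters.keys()))
    writer.writeheader()
    writer.writerow(dict_clusters)

with open(out_path, newline="", encoding="utf-8") as in_file:
    written = in_file.read()
print(written)
```

Note that this writes each cluster as a column, with the whole list of comments rendered as a single cell, which is probably not the row-per-cluster layout the question asks for.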
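Your special character renders as follows in a Py3 IPython session:
In [31]: '\U0001f609'
Out[31]: '😉'

Unrelated to the question: you may want to look into enumerate. The first code block could be written as for i, label in enumerate(kmeans.labels_): dict_clusters.setdefault(label, []).append(listComments[i]), although it is better split across several lines.

Beyond enumerate, in this case you may want to check out zip, to loop over listComments and kmeans.labels_ at the same time.

As an alternative to dict.setdefault, you can use collections.defaultdict. I usually prefer defaultdict over dict.setdefault, but they both achieve the same thing.

You opened the file for writing bytes ("wb"), but the csv module is trying to write strings, so just change it to "w".

Wait, I think you are putting the data into the wrong format for csv.DictWriter. Can you provide a small basic example of what the data you start with looks like and what the CSV output should look like? I think you need to create a list of dictionaries where each one corresponds to a row, rather than a single dictionary with a list for each column.

@Trey Hunner, I tried three times but could not get the desired CSV. I am not sure what is going on and would appreciate support. Thank you very much for your attention; my question is updated with the new attempts.

@Tadhg McDonald Jensen, thanks. This breaks the generation of the CSV file. Do you know how to avoid this exception? Thank you very much for your support.

I added more details about how I build my dictionary. If you need other details to help me, please let me know. Thank you very much for helping me get through this.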
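Putting the suggestions from the comments together, the following is a minimal sketch of one way to get a row per cluster. It assumes that the UnicodeEncodeError comes from the default cp1252 file encoding on Windows (as the traceback suggests) and that UTF-8 output is acceptable. The labels and list_comments values are hypothetical stand-ins for kmeans.labels_ and listComments, and the output path is a temporary file.

```python
import csv
import os
import tempfile
from collections import defaultdict

# Hypothetical stand-ins for kmeans.labels_ and listComments.
labels = [1, 0, 1, 0]
list_comments = [
    "hello this is",
    "we already have",
    "the car is red \U0001f609",
    "another comment",
]

# enumerate + dict.setdefault, as suggested in the comments.
clusters_a = {}
for i, label in enumerate(labels):
    clusters_a.setdefault(int(label), []).append(list_comments[i])

# zip + defaultdict; int() also turns numpy integer labels into plain ints.
clusters_b = defaultdict(list)
for label, comment in zip(labels, list_comments):
    clusters_b[int(label)].append(comment)

# Write one row per cluster: the key first, then each comment in its own cell.
# encoding="utf-8" lets characters outside cp1252 (such as emoji) be written.
path = os.path.join(tempfile.mkdtemp(), "dict.csv")
with open(path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file)
    for key, comments in sorted(clusters_b.items()):
        writer.writerow([key] + comments)

# Read the file back to check what was written.
with open(path, newline="", encoding="utf-8") as csv_file:
    rows = list(csv.reader(csv_file))
print(rows)
```

Writing [key] + comments instead of [key, value] puts each comment in its own cell, so the list is not rendered as a single bracketed string. If the file is meant to be opened in Excel, encoding="utf-8-sig" may display better, though that depends on your setup.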