Saving a UTF-8 CSV with Python

python csv utf-8

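I've been struggling with this and have read a lot of posts, but I can't seem to get it working. I need to save a UTF-8 CSV file.

First, here is my super-simple approach: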

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv
import sys
import codecs

f = codecs.open("output.csv", "w", "utf-8-sig")
writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
cells = ["hello".encode("utf-8"), "nǐ hǎo".encode("utf-8"), "你好".encode("utf-8")]
writer.writerow(cells)

This results in an error:

Traceback (most recent call last):
  File "./makesimplecsv.py", line 10, in <module>
    cells = ["hello".encode("utf-8"), "nǐ hǎo".encode("utf-8"), "你好".encode("utf-8")]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc7 in position 1: ordinal not in range(128)

A second script, makesimplecsvwithunicodewriter.sh, results in the same error:

Traceback (most recent call last):
  File "./makesimplecsvwithunicodewriter.sh", line 40, in <module>
    cells = ["hello".encode("utf-8"), "nǐ hǎo".encode("utf-8"), "你好".encode("utf-8")]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc7 in position 1: ordinal not in range(128)

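I think I've gone through the checklist I found in other similar questions:

My file has an encoding declaration
I'm opening the file for writing with UTF-8
I'm encoding the individual strings as UTF-8 before passing them to the CSV writer
I've tried both adding and omitting the UTF-8 BOM, but from what I've read that neither makes a difference nor is critical

Any idea what I'm doing wrong?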

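You are writing encoded byte strings to your CSV file. There is little point in doing so when you are expecting Unicode objects. In Python 2, "nǐ hǎo" is already a byte string, so calling .encode() on it first decodes it implicitly with the ASCII codec, and that implicit decode is what raises the UnicodeDecodeError.
Don't encode, decode: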

cells = ["hello".decode("utf-8"), "nǐ hǎo".decode("utf-8"), "你好".decode("utf-8")]
or use u'...' unicode string literals:

cells = [u"hello", u"nǐ hǎo", u"你好"]
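
To make the failure concrete, here is a minimal sketch that reproduces the implicit ASCII decode explicitly. It assumes a UTF-8 source file; the variable names (raw, error, fixed) are illustrative only. It builds the bytes that a Python 2 "nǐ hǎo" literal would hold, so it also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# The bytes a Python 2 "nǐ hǎo" literal holds in a UTF-8 encoded source file.
raw = u"nǐ hǎo".encode("utf-8")

try:
    # Python 2 performs this decode implicitly when .encode() is called
    # on a byte string; here it is written out explicitly.
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the codec the bytes were actually written in works.
fixed = raw.decode("utf-8")
```

The failing byte is the first byte of the two-byte UTF-8 sequence for "ǐ", which matches the byte and position reported in the traceback.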

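You cannot use a codecs.open() file object with the Python 2 csv module. Either use the UnicodeWriter approach (with a regular file object) and pass in Unicode objects, or encode your cells to byte strings and use a csv.writer() object directly (again with a regular file object). The latter is what UnicodeWriter does internally: it passes encoded byte strings to the csv.writer() object.
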
Update - Solution
Thanks to the accepted answer, I was able to get this working. Here is the complete working example, for future reference:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv
import sys
import codecs
import cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

f = open("output.csv", "w")
writer = UnicodeWriter(f)
cells = ["hello".decode("utf-8"), "nǐ hǎo".decode("utf-8"), "你好".decode("utf-8")]
writer.writerow(cells)
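
The example above is Python 2 only (it relies on cStringIO and str.decode()). For readers on Python 3, which is an assumption since the question targets Python 2, the csv module accepts text strings directly, so a wrapper class should not be needed. The sketch below uses a hypothetical helper, write_utf8_csv, and a temporary output path chosen for illustration. The "utf-8-sig" codec prepends the BOM and "utf-8" omits it.

```python
import csv
import os
import tempfile


def write_utf8_csv(path, rows, bom=True):
    """Write rows of text strings to `path` as UTF-8 CSV (Python 3 only).

    Hypothetical helper for illustration; not part of the original thread.
    """
    encoding = "utf-8-sig" if bom else "utf-8"
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        writer = csv.writer(f, delimiter=",", quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        writer.writerows(rows)


# Write the same cells as in the question to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "output.csv")
write_utf8_csv(path, [["hello", "nǐ hǎo", "你好"]])

# Read the raw bytes back to inspect what was written.
with open(path, "rb") as f:
    raw = f.read()
```

Passing bom=False writes plain UTF-8. Whether the BOM is wanted depends on the program that will read the file.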

Thanks! I was able to get it working based on your feedback. I used the UnicodeWriter approach, switched the encode() calls to decode(), and used the standard open() function to get the file object to write to. I'll update the question with the solution for future reference.

@antun: if you feel the need, add your own solution as a new answer; the question should remain just a question.

OK, I'll add it as an answer.