Incorrect representation of strings in a CSV file on Windows

I'm on Win7 with Python 2.7 and I have a string. Original view: A. P. Møller Mærsk. As UTF-8:
s = 'A. P. M\xc3\xb8ller M\xc3\xa6rsk'
I need to write it to a CSV file.
I tried this:
import csv

with open('14.09 Anbefalte aksjer.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow([s])
I got this:
A. P. MГёller MГ¦rsk
Then I tried a UnicodeWriter:
import codecs
import csv
from StringIO import StringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

s = 'A. P. M\xc3\xb8ller M\xc3\xa6rsk'.decode('utf8')
with open('14.09 Anbefalte aksjer.csv', 'w') as csvfile:
    writer = UnicodeWriter(csvfile)
    writer.writerow([s])
Got the same thing again:
A. P. MГёller MГ¦rsk
Tried the following:
And again:
A. P. MГёller MГ¦rsk
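What's wrong? How can I write it correctly?

What you see is mojibake: bytes that represent Unicode text encoded in one character encoding are being displayed in another, incompatible character encoding.

If ''.decode('utf8') does not raise AttributeError, then you are not on Python 3 (whatever your question says). On Python 2, csv does not support Unicode directly, so you have to encode manually: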
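#!/usr/bin/env python2
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import csv

text = "A. P. Møller Mærsk"
# Python 2's csv module writes byte strings, so open the file in binary
# mode and encode each Unicode value explicitly.
with open('14.09 Anbefalte aksjer.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow([text.encode('utf-8')])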
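
To make the mojibake concrete, here is a minimal sketch that reproduces it. It assumes the viewer decoded the file as cp1251 (the Cyrillic Windows code page). That code page is a guess based on the garbled characters shown in the question, not something the question states. The variable names are illustrative, and the non-ASCII characters are written as \u escapes so the snippet does not depend on how the source file is saved.

```python
from __future__ import unicode_literals

# The intended text, with the two non-ASCII letters written as escapes.
original = 'A. P. M\u00f8ller M\u00e6rsk'

# What actually lands in the file: each of those letters becomes two bytes.
utf8_bytes = original.encode('utf-8')

# Assumption: the viewer interprets those bytes as cp1251 instead of UTF-8,
# so every byte is shown as a separate (wrong) character.
garbled = utf8_bytes.decode('cp1251')

# The bytes themselves are intact, so the mistake can be undone.
repaired = garbled.encode('cp1251').decode('utf-8')
```

The point of the round trip is that the file content is correct UTF-8 all along; only the viewer's choice of decoding is wrong.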
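If text contains uncorrupted data, then UnicodeWriter and the unicodecsv module should work too.

Windows tools such as Notepad or Excel assume the encoding of the default Windows locale, so for UTF-8 a byte order mark (BOM, U+FEFF) must be encoded at the start of the file. Python provides an encoding for this, utf-8-sig. Note: by using #coding:utf8 and saving the source file in UTF-8, you can declare the string directly as a Unicode string. Finally, files used with the csv module should be opened in wb mode on Python 2.7; otherwise you will see problems with the newlines written on Windows.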
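
As a small illustration of what utf-8-sig does, the sketch below compares it with plain utf-8 and shows that an incremental encoder emits the BOM only once. This matters when a writer encodes row by row. The sample strings are made up for the example.

```python
from __future__ import unicode_literals
import codecs

text = 'M\u00f8ller'

# utf-8-sig is plain UTF-8 with the three BOM bytes EF BB BF in front.
plain = text.encode('utf-8')
with_bom = text.encode('utf-8-sig')

# An incremental encoder keeps state between calls, so only the first
# chunk gets the BOM and later chunks are plain UTF-8.
encoder = codecs.getincrementalencoder('utf-8-sig')()
first_chunk = encoder.encode('row 1\r\n')
second_chunk = encoder.encode('row 2\r\n')
```

Calling text.encode('utf-8-sig') once per row instead would put a BOM in front of every row, which is why the incremental encoder is the safer choice here.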
#coding:utf8
import csv
from StringIO import StringIO
import codecs

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """
    # Use utf-8-sig encoding here.
    def __init__(self, f, dialect=csv.excel, encoding="utf-8-sig", **kwds):
        # Redirect output to a queue
        self.queue = StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

s = u'A. P. Møller Mærsk'  # declare as Unicode string.
with open('14.09 Anbefalte aksjer.csv', 'wb') as csvfile:
    writer = UnicodeWriter(csvfile)
    writer.writerow([s])
Output:
A. P. Møller Mærsk
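
On the wb advice: the csv writer already terminates each row with \r\n, and a Python 2 text-mode file on Windows expands every \n to \r\n again on the way to disk. The sketch below only simulates that expansion with a string replace, so it can be run on any platform. The in-memory buffer and the sample row are assumptions made for the example.

```python
import csv
import io
import sys

# Python 2's csv module writes byte strings and Python 3's writes text,
# so pick an in-memory buffer of the matching kind.
buf = io.BytesIO() if sys.version_info[0] == 2 else io.StringIO()
csv.writer(buf).writerow(['a', 'b'])
raw = buf.getvalue()  # the writer has already ended the row with \r\n

# Simulate a Windows text-mode file ('w') on Python 2:
# each \n is expanded to \r\n, which doubles the carriage return.
as_text_mode = raw.replace('\n', '\r\n')
```

The doubled carriage return is what typically shows up as an empty line after every row when the file is opened in a spreadsheet.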
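What are you opening your CSV file with? Excel is very bad at handling Unicode/UTF-8 encoded CSV data when you open the file with a simple double-click. Try the program's import dialog; I think it lets you choose which character encoding to assume. If that doesn't work, try Calc from LibreOffice/OpenOffice, which also offers that option. Or at least open it in Notepad++ to see whether the encoding is correct.

Thank you, CBroe. Changing the encoding in Excel did the trick.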
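
If no suitable editor is at hand, the same check can be sketched in Python by looking at the raw bytes of the file. The helper name inspect_csv_bytes is hypothetical and exists only for this example. It reports whether the data starts with a UTF-8 BOM and whether it decodes cleanly as UTF-8.

```python
import codecs


def inspect_csv_bytes(data):
    """Return (has_bom, is_valid_utf8) for the raw bytes of a file."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        data.decode('utf-8')
        is_valid_utf8 = True
    except UnicodeDecodeError:
        is_valid_utf8 = False
    return has_bom, is_valid_utf8


# Bytes as written with utf-8-sig, versus the same word in a legacy
# single-byte encoding (0xF8 on its own is not valid UTF-8).
good = inspect_csv_bytes(b'\xef\xbb\xbfM\xc3\xb8ller\r\n')
bad = inspect_csv_bytes(b'M\xf8ller\r\n')
```

For a real file, read it in binary mode first, for example with open(path, 'rb'), and pass the result of read() to the helper. If the bytes are valid UTF-8, any remaining garbling comes from the program used to view the file.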