Decoding UTF-8 in Python csv.reader
I have been trying to read data from a CSV file and put it into a DB table created in sqlite3. I have tried a million different approaches; most recently I used a UTF-8 decoder, assuming the CSV is UTF-8 encoded, but I get the following error:
  File "/Users/yanhu/Documents/Python/Practice/DataScience_lec02_inclass_csv.py", line 15, in unicode_csv_reader
    yield [unicode(cell, 'utf-8') for cell in row]
UnicodeDecodeError: 'utf8' codec can't decode byte 0xaa in position 1: invalid start byte
Here is my code:
import csv
import unicodecsv
import sqlite3

#define a decoder to decode UTF-8 to unicode
def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(unicode_csv_data,
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

conn = sqlite3.connect("example.db")
c = conn.cursor()
c.execute("CREATE TABLE GDPNEW (Code text, Ranking int, Country text, GDP int)")
c.execute("DELETE FROM GDPNEW")

with open('GDP.csv', 'rU') as csvfile:
    readercsv=unicode_csv_reader(csvfile)
    row_count = 0
    for row in readercsv:
        if row_count !=0:
            row[1]=int(row[1])
            row[3]=int(row[3])
            print row[2]
            c.execute("INSERT INTO GDPNEW VALUES (?,?,?,?)", (row[0], row[1], row[2], row[3]))
            conn.commit()
        row_count += 1

results = c.execute("SELECT * FROM GDPNEW")
for row in results.fetchall():
    print row
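Just use this package:

Your input data is not UTF-8 encoded. It contains the byte \xaa, which is not valid UTF-8 at that position (it cannot start a character). Are you sure the data was not saved in some other encoding?

@SimeonVisser: But that won't magically make the input data UTF-8.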
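To illustrate the point that the file is probably in some other encoding, here is a minimal sketch of one way to find an encoding that fits and then load the rows. It assumes Python 3 (the question uses Python 2), and the candidate list `utf-8`, `cp1252`, `latin-1` is only a guess, since the real encoding of `GDP.csv` is not known from the question. The helper names `detect_encoding` and `load_gdp` are hypothetical. A successful decode only means the bytes are legal in that encoding, not that the characters are right, so the decoded text should still be checked by eye.

```python
import csv
import io
import sqlite3

# Assumed candidates, tried in order. latin-1 accepts every byte value,
# so it has to come last as a catch-all.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def detect_encoding(raw, candidates=CANDIDATE_ENCODINGS):
    """Return the first candidate encoding that decodes `raw` without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("none of the candidate encodings fit the data")


def load_gdp(raw, conn):
    """Decode raw CSV bytes and insert the data rows into GDPNEW."""
    text = raw.decode(detect_encoding(raw))
    # newline="" leaves line endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader, None)  # skip the header row
    rows = [(r[0], int(r[1]), r[2], int(r[3])) for r in reader if r]
    conn.executemany("INSERT INTO GDPNEW VALUES (?,?,?,?)", rows)
    conn.commit()
    return len(rows)


# The byte from the traceback is rejected by UTF-8 but accepted by cp1252.
print(detect_encoding(b"C\xaate"))  # -> cp1252
```

With a real file, the bytes would come from something like `open('GDP.csv', 'rb').read()`, and `conn` would be the `sqlite3` connection on which the `GDPNEW` table was created. If the correct encoding is already known, passing it straight to `open(..., encoding=..., newline='')` avoids the guessing step entirely.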