Uploading rows to an existing BigQuery table with the Python client library: utf8 can't decode

Tags: python, python-2.7, google-bigquery

I am trying to upload a CSV file to an existing table in BigQuery. According to the BigQuery documentation, you can do this:

ROWS_TO_INSERT = [
    (u'Phred Phlyntstone', 32),
    (u'Wylma Phlyntstone', 29),
]

table.insert_data(ROWS_TO_INSERT)
Here is my code:

from google.cloud import bigquery

# enter credentials
bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('my_dataset')
table = dataset.table('my_table')

# open csv file and get a list of rows in the form of tuples
with open('my_data.csv') as f:
    content = f.readlines()
ROWS_TO_INSERT = [tuple(x.split(",")) for x in content]
table.reload()

# everything above worked well, but the below line got a utf8 can't decode error
table.insert_data(ROWS_TO_INSERT)
Here is the traceback:

Traceback (most recent call last):
  File "<input>", line 6, in <module>
  File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/bigquery/table.py", line 770, in insert_data
    data=data)
  File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/_http.py", line 294, in api_request
    data = json.dumps(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 68: invalid continuation byte

I have spent a lot of time looking at similar questions and tried many encode/decode solutions, but nothing has worked so far. What should I do?

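Possible duplicate. I came across that post and tried UnicodeReader(file), but got the same error. I also tried io.open(file, errors='replace') and, strangely, got 101 rows instead of 26 (26 is the correct count).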
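One direction that may be worth trying is to decode the file contents to text explicitly before building the rows, so that json.dumps never has to guess an encoding for raw byte strings. The sketch below is only an illustration under assumptions: the helper name rows_from_bytes and the sample data are hypothetical, and Latin-1 is a guess based on byte 0xd9 being a valid character in that encoding; the real encoding of my_data.csv has to be checked. It splits on "\n" only, which mirrors what readlines() does on a file opened with plain open() in Python 2, whereas io.open uses universal newlines by default and also breaks lines on a bare "\r". That difference might explain the 101 rows versus 26.

```python
import json


def rows_from_bytes(raw, encoding="latin-1"):
    """Decode raw CSV bytes and split each line into a tuple of text fields.

    `encoding` is an assumption; it must match how the file was actually saved.
    """
    text = raw.decode(encoding)
    # Split on "\n" only, so a bare "\r" inside a field does not start a new row.
    lines = [line.rstrip("\r") for line in text.split("\n") if line.strip()]
    # Naive comma split, as in the question; quoted fields are not handled.
    return [tuple(line.split(",")) for line in lines]


# Hypothetical sample data: 0xd9 followed by "l" is not valid UTF-8,
# but every byte is a valid Latin-1 character.
sample = b"Phred Phlyntstone,32\nWylma \xd9lyntstone,29\n"

rows = rows_from_bytes(sample)

# Every field is now text rather than raw bytes, so JSON encoding succeeds.
payload = json.dumps(rows)
```

With a real file, the bytes could come from open('my_data.csv', 'rb').read(), and the resulting list could then be passed to table.insert_data as in the question. Note that every field stays a string here (for example "32" rather than 32), so a conversion step may still be needed depending on the table schema.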