Python "UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80" when loading a pickle file with PyDrive on Google Colaboratory


python utf-8 google-colaboratory


I am new to Google Colaboratory (Colab) and PyDrive. I am trying to load the data in "CAS_num_strings", which I wrote to a pickle file in a specific directory of my Google Drive from Colab as follows:

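import pickle

# Write the pickle locally; the with-block closes the file before the upload.
with open('CAS_num_strings.p', 'wb') as f:
    pickle.dump(CAS_num_strings, f)
# `drive` is an already authenticated pydrive GoogleDrive instance.
dump_meta = {'title': 'CAS.pkl', 'parents': [{'id': '1UEqIADV_tHic1Le0zlT25iYB7T6dBpBj'}]}
pkl_dump = drive.CreateFile(dump_meta)
pkl_dump.SetContentFile('CAS_num_strings.p')
pkl_dump.Upload()
print(pkl_dump.get('id'))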

Here 'id': '1UEqIADV_tHic1Le0zlT25iYB7T6dBpBj' makes sure the file gets a specific parent folder, the one identified by that id. The last print command gives the output:

'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'


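So I am able to create and upload the pickle file, which has the id '1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'. Now I want to load this pickle file in another Colab script for a different purpose. To load it, I use these commands: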

cas_strings = drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'})
print('title: %s, mimeType: %s' % (cas_strings['title'], cas_strings['mimeType']))
print('Downloaded content "{}"'.format(cas_strings.GetContentString()))

This gives me the output:

'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'

title: CAS.pkl, mimeType: text/x-pascal

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-9-a80d9de0fecf> in <module>()
     30 cas_strings = drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'})
     31 print('title: %s, mimeType: %s' % (cas_strings['title'], cas_strings['mimeType']))
---> 32 print('Downloaded content "{}"'.format(cas_strings.GetContentString()))
     33 
     34 

/usr/local/lib/python3.6/dist-packages/pydrive/files.py in GetContentString(self, mimetype, encoding, remove_bom)
    192                     self.has_bom == remove_bom:
    193       self.FetchContent(mimetype, remove_bom)
--> 194     return self.content.getvalue().decode(encoding)
    195 
    196   def GetContentFile(self, filename, mimetype=None, remove_bom=False):

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte


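As you can see, it finds the file CAS.pkl but cannot decode the data. I would like to resolve this error. As I understand it, an ordinary pickle dump and load with the 'wb' and 'rb' options works without any encoding trouble. In this case, however, after the dump I cannot load the pickle file back from the Google Drive file created in the previous step. The error seems to come from my not being able to specify how the data is decoded at "return self.content.getvalue().decode(encoding)". I have not been able to find which keyword or metadata tag to modify. Any help is appreciated. Thanks.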

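The problem is that GetContentString only works if the content is a valid UTF-8 string, and a pickle is not.

Unfortunately you need to do a little extra work, because there is no GetContentBytes: you have to save the content to a file and then read it back.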
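To see why the decode step fails, here is a small sketch that needs no Drive access and uses only the standard library. The list `cas_num_strings` is made-up sample data standing in for the real one. Pickle protocol 2 and later start the stream with the byte 0x80, which cannot begin a UTF-8 sequence, so the decode breaks at position 0 exactly as in the traceback.

```python
import pickle

# Hypothetical sample data standing in for the real CAS_num_strings list.
cas_num_strings = ["50-00-0", "64-17-5", "7732-18-5"]

# Serialise to bytes, the same bytes pickle.dump would write to a 'wb' file.
payload = pickle.dumps(cas_num_strings)

# Protocol 2 and later begin with the PROTO opcode, byte 0x80.
first_byte = payload[0]

# Treating the bytes as UTF-8 text fails right at that first byte.
try:
    payload.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Treating the bytes as bytes round-trips without trouble.
restored = pickle.loads(payload)
```

The takeaway is that the pickle has to stay binary from download to `pickle.load`; any API that returns `str` will hit this error.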

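In fact, with help from friends I found an elegant answer. Instead of GetContentString I used GetContentFile, which is the counterpart of SetContentFile. It downloads the file into the current working directory, and from there I can read it like any other pickle file. In the end the data loads cleanly into cas_nums.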

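import pickle
cas_strings = drive.CreateFile({'id':'1ZgZfEaKgqGnuBD40CY8zg0MCiqKmi1vH'})
print('title: %s, mimeType: %s' % (cas_strings['title'], cas_strings['mimeType']))
# Download into the working directory under the file's Drive title (CAS.pkl).
cas_strings.GetContentFile(cas_strings['title'])
# Read it back in binary mode; the with-block closes the handle.
with open(cas_strings['title'], 'rb') as f:
    cas_nums = pickle.load(f)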

For more details, see the "Download file content" section of the PyDrive documentation.
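The download-then-load pattern can be wrapped in a helper so the working directory stays clean. The sketch below is one possible shape, not PyDrive's own API: the function names and the `FakeDriveFile` class are hypothetical, and the fake exists only so the example runs without Drive credentials. The second helper assumes that `FetchContent()` leaves the raw bytes in a BytesIO-like `content` attribute, which is what the traceback in the question suggests; that is an internal detail and may differ between PyDrive versions.

```python
import io
import os
import pickle
import tempfile


def load_pickle_via_file(drive_file):
    """Download a Drive file to a temporary path and unpickle it."""
    fd, tmp_path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)  # Only the path is needed; the download reopens it.
    try:
        drive_file.GetContentFile(tmp_path)
        with open(tmp_path, "rb") as fh:
            return pickle.load(fh)
    finally:
        os.remove(tmp_path)


def load_pickle_in_memory(drive_file):
    """Unpickle straight from the downloaded bytes, skipping any decode.

    Assumption: FetchContent() fills drive_file.content with a BytesIO-like
    object, as the traceback in the question suggests.
    """
    drive_file.FetchContent()
    return pickle.loads(drive_file.content.getvalue())


class FakeDriveFile:
    """Hypothetical stand-in for a PyDrive file so the sketch runs offline."""

    def __init__(self, payload):
        self._payload = payload
        self.content = None

    def GetContentFile(self, filename):
        with open(filename, "wb") as fh:
            fh.write(self._payload)

    def FetchContent(self):
        self.content = io.BytesIO(self._payload)


# Made-up sample data in place of the real CAS list.
fake = FakeDriveFile(pickle.dumps(["50-00-0", "64-17-5"]))
from_file = load_pickle_via_file(fake)
from_memory = load_pickle_in_memory(fake)
```

With a real file you would pass the object returned by `drive.CreateFile({'id': ...})` in place of `fake`. The temporary-file variant relies only on the documented GetContentFile call, so it is the safer of the two.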

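I see that you call pickle.dump but never pickle.load. Could that be the problem?

That's a good point. I wanted to try it but did not know where in the sequence it belongs. It will not work without PyDrive as well, because only PyDrive can retrieve the file from Google Drive by its id.

Thanks, my answer is almost exactly the same as yours!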