Reading unicode from xls in Python

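I'm trying to read a .xls file into Python. The file contains several non-ASCII characters (namely äöü). I tried both openpyxls and xlrd (I had high hopes for xlrd, since it is supposed to read everything in as unicode), but neither works.

I found plenty of answers about encoding/decoding errors that occur when printing data from an xls, but I can't even get that far. The script fails as soon as it tries to read the file: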

import xlrd
workbook = xlrd.open_workbook('export_data.xls')
results in:

Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\tufinderxlstoxml\tufinderxlstoxml2.py", line 2, in <module>
    workbook = xlrd.open_workbook('export_data.xls')
  File "C:\Python27_32\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 119, in open_workbook_xls
    bk.get_sheets()
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 705, in get_sheets
    self.get_sheet(sheetno)
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 696, in get_sheet
    sh.read(self)
  File "C:\Python27_32\lib\site-packages\xlrd\sheet.py", line 796, in read
    strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
  File "C:\Python27_32\lib\site-packages\xlrd\biffh.py", line 269, in unpack_string
    return unicode(data[pos:pos+nchars], encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 55: ordinal not in range(128)
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero
*** No CODEPAGE record, no encoding_override: will use 'ascii'
*** No CODEPAGE record, no encoding_override: will use 'ascii'

With encoding_override="utf-8" the result is:

Traceback (most recent call last):
  File "C:\Users\Administrator\workspace\tufinderxlstoxml\tufinderxlstoxml2.py", line 2, in <module>
    workbook = xlrd.open_workbook('export_data.xls', encoding_override="utf-8")
  File "C:\Python27_32\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 119, in open_workbook_xls
    bk.get_sheets()
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 705, in get_sheets
    self.get_sheet(sheetno)
  File "C:\Python27_32\lib\site-packages\xlrd\book.py", line 696, in get_sheet
    sh.read(self)
  File "C:\Python27_32\lib\site-packages\xlrd\sheet.py", line 796, in read
    strg = unpack_string(data, 6, bk.encoding or bk.derive_encoding(), lenlen=2)
  File "C:\Python27_32\lib\site-packages\xlrd\biffh.py", line 269, in unpack_string
    return unicode(data[pos:pos+nchars], encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 55: invalid start byte
WARNING *** OLE2 inconsistency: SSCS size is 0 but SSAT size is non-zero

I'm running this with Python 2.7 on a Windows Server 2008 machine.
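Both error messages can be reproduced without the workbook, which may help in reading the tracebacks. The following is a minimal sketch, written in Python 3 syntax, on a hypothetical byte string (the real cell contents are not shown in the post). It only assumes that the offending byte is 0x92, as both messages report; cp1252 is included purely as one example of a single-byte Windows code page that does define that byte.

```python
# Hypothetical sample data: only the byte 0x92 is taken from the error messages.
sample = b"it\x92s"


def try_decode(data, encoding):
    """Return the decoded text, or the decoder's error reason if it fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: %s" % exc.reason


encodings = ("ascii", "utf-8", "cp1252")
results = {enc: try_decode(sample, enc) for enc in encodings}

for enc in encodings:
    print(enc, "->", repr(results[enc]))
```

0x92 is outside the ASCII range, and in UTF-8 it can only appear as a continuation byte, never at the start of a character, so both codecs reject it. That suggests the strings in the file are stored in some single-byte code page rather than in UTF-8.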

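From my reading of the OOo documentation, xls uses the utf_16_le flavor of unicode rather than utf8 (that is, it uses only two bytes per character, stored little endian), so try utf_16_le as the encoding.

(See page 17 of that documentation.)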

A bit late, but I hope you tried the encoding.

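Thanks everyone for the feedback.

I ended up fixing it with the encoding override option. I couldn't find Microsoft documentation stating which cp code corresponds to German characters, so I simply tried all of them. Eventually I hit cp1251, which worked: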

workbook = xlrd.open_workbook(path, encoding_override="cp1251")
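A code page that loads without raising an error is not necessarily the one that yields the intended characters, so it may be worth checking the decoded text as well. The sketch below (Python 3 syntax) tries a few candidate encodings on hypothetical sample bytes and keeps those that produce the expected string. The function name `matching_encodings`, the sample bytes, and the candidate list are all assumptions made for illustration and do not come from the thread.

```python
def matching_encodings(raw, expected, candidates):
    """Return the candidate encodings under which raw decodes to expected."""
    matches = []
    for enc in candidates:
        try:
            if raw.decode(enc) == expected:
                matches.append(enc)
        except UnicodeDecodeError:
            pass  # this codec cannot decode the bytes at all
    return matches


# Hypothetical sample: the bytes a Western European Windows code page
# would use for the three umlauts mentioned in the question.
raw = b"\xe4\xf6\xfc"
candidates = ["ascii", "utf-8", "cp1251", "cp1252"]

print(matching_encodings(raw, "\u00e4\u00f6\u00fc", candidates))
print(repr(raw.decode("cp1251")))  # decodes without error, but to Cyrillic letters
```

cp1251 is the Windows Cyrillic code page and cp1252 the Western European one. Both map 0x92 to the same right single quotation mark, which would explain why the error disappears with either. If the umlauts come out as Cyrillic letters after loading with cp1251, cp1252 is the likelier match for German text.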