Python: error opening a gzip file from a URL

I am trying to retrieve a gzipped CSV from our server and load it into a DataFrame.

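According to the pandas documentation, pandas.read_csv accepts valid URL schemes such as http, ftp, s3 and file. The link I am using is https, and I don't think that should cause a problem.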

I have tried two ways to get this working.

Method 1:

import pandas as pd


print "Downloading file" 
link = 'https://myserver/logfile.csv.gz'

df = pd.read_csv(link, compression='gzip', header=0, sep=',', quotechar='"')

print df
This does not work; I get the following error:

Traceback (most recent call last):
  File "download.py", line 14, in <module>
    df = pd.read_csv(link, compression='gzip', header=0, sep=',', quotechar='"')
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 470, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 246, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 562, in __init__
    self._make_engine(self.engine)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 699, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 1066, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 509, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4722)
  File "pandas/parser.pyx", line 624, in pandas.parser.TextReader._get_header (pandas/parser.c:6111)
  File "pandas/parser.pyx", line 820, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8142)
  File "pandas/parser.pyx", line 1758, in pandas.parser.raise_parser_error (pandas/parser.c:20728)
pandas.parser.CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
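
Method 2: decompress the downloaded content myself with gzip.GzipFile and StringIO. I get the following error: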

Traceback (most recent call last):
  File "download.py", line 12, in <module>
    gz = gzip.GzipFile(StringIO.StringIO(r.content))
  File "/usr/lib/python2.7/gzip.py", line 89, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
TypeError: coercing to Unicode: need string or buffer, instance found

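StringIO creates a file object, but the first positional argument of GzipFile is a filename. Pass the file object through the fileobj keyword instead: gzip.GzipFile(fileobj=StringIO.StringIO(r.content)).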
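
To illustrate the point about GzipFile, here is a minimal sketch in Python 3 that uses only the standard library. The function name rows_from_gzipped_csv and the sample data are hypothetical, and the compressed bytes are built locally as a stand-in for the downloaded r.content so that the example runs without a server.

```python
import csv
import gzip
import io


def rows_from_gzipped_csv(payload):
    """Decompress gzip-compressed CSV bytes in memory and return the parsed rows."""
    # Wrap the raw bytes in a file-like object and hand it to GzipFile through
    # the `fileobj` keyword. The first positional argument is a filename,
    # which is what triggered the TypeError in the question.
    with gzip.GzipFile(fileobj=io.BytesIO(payload), mode="rb") as gz:
        text = gz.read().decode("utf-8")
    # Parse the decompressed text as CSV.
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical stand-in for the downloaded bytes (r.content in the question).
sample = gzip.compress(b"id,status\n1,ok\n2,fail\n")
rows = rows_from_gzipped_csv(sample)
print(rows)
```

With pandas, the open GzipFile object could probably be passed to pd.read_csv in place of the csv step, since read_csv accepts file-like objects. In Python 2, as used in the question, StringIO.StringIO(r.content) would take the place of io.BytesIO.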