Python: error opening a gzip file from a URL
I am trying to retrieve a gzipped CSV from our server and load the file into a DataFrame.

According to the pandas documentation, pandas.read_csv accepts valid URL schemes such as http, ftp, s3, and file. The link I am using is https, which I don't think should cause a problem.

I have tried two approaches to get this working.

Method 1:
import pandas as pd
print "Downloading file"
link = 'https://myserver/logfile.csv.gz'
df = pd.read_csv(link, compression='gzip', header=0, sep=',', quotechar='"')
print df
This does not work, and I get the following error:
Traceback (most recent call last):
File "download.py", line 14, in <module>
df = pd.read_csv(link, compression='gzip', header=0, sep=',', quotechar='"')
File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 470, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 246, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 562, in __init__
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 699, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/local/lib/python2.7/dist-packages/pandas-0.16.0_79_g9e4e447-py2.7-linux-x86_64.egg/pandas/io/parsers.py", line 1066, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 509, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4722)
File "pandas/parser.pyx", line 624, in pandas.parser.TextReader._get_header (pandas/parser.c:6111)
File "pandas/parser.pyx", line 820, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8142)
File "pandas/parser.pyx", line 1758, in pandas.parser.raise_parser_error (pandas/parser.c:20728)
pandas.parser.CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
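With the second method, which passes the downloaded content to gzip.GzipFile through StringIO, I get the following error: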
Traceback (most recent call last):
File "download.py", line 12, in <module>
gz = gzip.GzipFile(StringIO.StringIO(r.content))
File "/usr/lib/python2.7/gzip.py", line 89, in __init__
fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
TypeError: coercing to Unicode: need string or buffer, instance found
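StringIO creates a file object, but GzipFile expects a filename as its first positional argument. Pass the file object with the fileobj keyword instead: gzip.GzipFile(fileobj=StringIO.StringIO(r.content)).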
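As a minimal sketch of that fix, the snippet below wraps gzip-compressed bytes in an in-memory buffer and hands the buffer to GzipFile through fileobj. It is written for Python 3, where io.BytesIO takes the place of StringIO.StringIO for binary data. The sample CSV and the names raw_csv, content, and buffer are made up for illustration. content stands in for r.content so that the example runs without the server.

```python
import gzip
import io

# Stand-in for r.content: gzip-compressed CSV bytes built locally
# (hypothetical sample data, not the real log file).
raw_csv = b"id,status\n1,ok\n2,failed\n"
content = gzip.compress(raw_csv)

# Wrap the bytes in a file-like object. io.BytesIO is the Python 3
# counterpart of StringIO.StringIO for binary data.
buffer = io.BytesIO(content)

# Pass the buffer by keyword: the first positional argument of
# GzipFile is a filename, which is what triggered the TypeError.
with gzip.GzipFile(fileobj=buffer) as gz:
    text = gz.read().decode("utf-8")

print(text)
```

Instead of calling gz.read(), the open gz object can also be passed straight to pd.read_csv(gz, header=0, sep=',', quotechar='"'). In that case compression='gzip' should be dropped, because GzipFile has already decompressed the data.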