Python: Loading a CSV file stored in a zip archive from the web

I installed Python 3.5 (against some of your recommendations) and can't work out why the new code fails to load a zip file from a URL:

import pandas as pd
import numpy as np
from io import StringIO
from zipfile import ZipFile
from urllib.request import urlopen
url = urlopen("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip")

#Download Zipfile and create pandas DataFrame
zipfile = ZipFile(StringIO(url.read()))
FFdata = pd.read_csv(zipfile.open('F-F_Research_Data_Factors.CSV'), 
                     header = 0, names = ['Date','MKT-RF','SMB','HML','RF'], 
                     skiprows=3)
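I believe it is failing in the urlopen call, but replacing the urlopen result with the plain URL string does not work either.

Does anyone know what is going on? Thanks, everyone!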

When I run your program, I get this error:

Traceback (most recent call last):
  File "c.py", line 9, in <module>
    zipfile = ZipFile(StringIO(url.read()))
TypeError: initial_value must be str or None, not bytes
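The fix is simple: use an io.BytesIO object instead. In Python 3, url.read() returns bytes, and StringIO only accepts str. This is a common mistake, because StringIO would have worked in Python 2, and a lot of examples are still based on 2.x.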

import pandas as pd
import numpy as np
from io import BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
url = urlopen("http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip")

#Download Zipfile and create pandas DataFrame
zipfile = ZipFile(BytesIO(url.read()))
FFdata = pd.read_csv(zipfile.open('F-F_Research_Data_Factors.CSV'), 
                     header = 0, names = ['Date','MKT-RF','SMB','HML','RF'], 
                     skiprows=3)
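
To see the difference without touching the network, here is a minimal sketch that builds a tiny zip archive in memory and then reads it back. The member name factors.csv and its rows are made up for illustration; they are not the contents of the real Ken French file. It uses only the standard library, so the csv module stands in for pd.read_csv.

```python
import csv
import io
import zipfile

# Build a small zip archive in memory to stand in for the downloaded file.
# The member name and the rows are hypothetical, for illustration only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("factors.csv", "Date,MKT-RF\n200001,1.0\n200002,2.0\n")
payload = buffer.getvalue()  # bytes, like the result of urlopen(...).read()

# StringIO only accepts str, so passing bytes reproduces the TypeError.
try:
    io.StringIO(payload)
    string_io_error = None
except TypeError as exc:
    string_io_error = str(exc)

# BytesIO accepts bytes and gives ZipFile the seekable binary file it needs.
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    with archive.open("factors.csv") as member:
        # Members are opened in binary mode; wrap them to decode the text.
        rows = list(csv.reader(io.TextIOWrapper(member, encoding="utf-8")))

print(string_io_error)
print(rows)
```

The same pattern applies to the downloaded data: wrap the bytes from url.read() in BytesIO, open the archive, and pass the opened member to the CSV reader of your choice.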

You should be getting a traceback that tells you where the error is. Post it!