What is the Pythonic way to download an hdf5 file from an HTTP server?


I am trying to download an hdf5 file from an HTTP server. I can do this with Python's subprocess module and wget, but I feel like I am cheating.

    # wget solution
    import subprocess
    url = 'http://url/to/file.h5'
    subprocess.call(['wget', '--proxy=off', url])

I can also download images with the urllib and requests modules, like this:

    # requests solution
    import requests
    url2 = 'http://url/to/image.png'
    r = requests.get(url2)
    with open('image.png', 'wb') as img:
        img.write(r.content)

    # urllib solution (Python 2; in Python 3 this is urllib.request.urlretrieve)
    import urllib
    urllib.urlretrieve(url2, 'outfile.png')

However, when I download the hdf5 file with this approach and run the shell command "file" on the result, I get:

    >file test.h5 
    >test.h5: HTML document, ASCII text, with very long lines
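The same check can be done from Python without shelling out. The sketch below is an illustration, not part of the original post: the helper name `looks_like_hdf5` is hypothetical, and it only inspects offset 0, so an HDF5 file that carries a user block before its superblock would be reported as a non-match.

```python
# The 8-byte signature that starts an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` begins with the HDF5 signature.

    Hypothetical helper: checks offset 0 only, so files with a user block
    (signature at offset 512, 1024, ...) are not recognised.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

An HTML error page saved as test.h5 fails this check right away, which makes it a cheap guard to run after any download.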

Here are the headers from requests.get() (not sure whether they are useful).

Should I use wget in a subprocess, or is there a Pythonic solution?

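Solution: The cause of the problem was that I had not disabled the proxy before trying to download the file, so the transfer was intercepted. This code worked: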

    import urllib2
    proxy_handler = urllib2.ProxyHandler({})
    opener = urllib2.build_opener(proxy_handler)
    urllib2.install_opener(opener)

    url = 'http://url/to/file.h5'

    req = urllib2.Request(url)
    r = opener.open(req)
    result = r.read()

    with open('my_file.h5', 'wb') as f:
        f.write(result)
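The code above targets Python 2, where urllib2 exists. A rough Python 3 equivalent might look like the sketch below. It is an assumption-laden illustration rather than the original author's code: the function name `download_without_proxy` and the `chunk_size` parameter are made up here, and the body is streamed to disk in chunks instead of being read into memory at once, which may matter for large hdf5 files.

```python
import shutil
import urllib.request


def download_without_proxy(url, dest_path, chunk_size=64 * 1024):
    """Download `url` to `dest_path`, bypassing any configured proxy.

    Hypothetical helper for illustration only.
    """
    # An empty mapping tells ProxyHandler to use no proxies at all,
    # instead of picking them up from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response, open(dest_path, "wb") as out:
        # Copy the response body to disk chunk by chunk.
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path
```

Usage would be along the lines of `download_without_proxy('http://url/to/file.h5', 'my_file.h5')`, with the URL being the same placeholder used in the post.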

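Try using geturl (a method on the response object that urllib's urlopen returns) to get the real URL after any redirects, then pass that URL to urllib's urlretrieve.

Have you actually looked at the file? It sounds more like the request was intercepted and you are downloading an HTML document rather than the png you are looking for. Maybe you can find a better link?

You are right, the request was being intercepted! After I disabled the proxy and did a few other things, I finally got the correct file. This is my first time posting like this. Is it okay to include my full solution in the original question? Should I edit the question's title?

Yes, you can edit your question and then post your answer.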
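For reference, the geturl suggestion from the comments could be sketched in Python 3 roughly as follows. The function name `inspect_download` is hypothetical and the snippet is only an illustration: it reports where the request actually ended up and what content type the server declared, which is often enough to spot an intercepting proxy that returns an HTML page.

```python
import urllib.request


def inspect_download(url):
    """Return (final_url, content_type) for `url` without saving the body.

    Hypothetical helper for illustration only.
    """
    with urllib.request.urlopen(url) as response:
        # geturl() gives the URL actually retrieved, after any redirects.
        final_url = response.geturl()
        # The declared MIME type; an intercepted request often shows text/html.
        content_type = response.headers.get_content_type()
    return final_url, content_type
```

If the reported content type is text/html for a URL that should serve an hdf5 file, the response is most likely not the file itself.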
