Download a file with Python, without Selenium, the way Chrome's "Save link as" does


python selenium google-chrome


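I can download the zip file in Chrome using the "Save link as" option, but when I copy the link and open it in the browser, it returns 403 Forbidden. I tried saving the file with the requests library, but that also gets a Forbidden response.

I don't know how Chrome manages to download it, but I can't download it with the requests library.

How can I download this file without using the Selenium web driver? That would be overkill for such a simple task.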

I'd suggest using requests for this. Here is a simple example that fetches the first file:

import requests
url = 'https://www.nseindia.com/content/historical/EQUITIES/2003/DEC/cm01DEC2003bhav.csv.zip'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36', 'Referer': 'https://www.nseindia.com/'}
r = requests.get(url, allow_redirects=True, headers=headers)
r.raise_for_status()  # fail on 403 instead of saving the error response as a .zip
with open('cm01DEC2003bhav.csv.zip', 'wb') as f:
    f.write(r.content)

The site checks the Referer header and rejects the request if the Referer does not match the site itself.

Use urllib.request.urlretrieve with a custom Referer header, as @Douglas pointed out:

>>> import urllib.request
>>> opener = urllib.request.build_opener()
>>> opener.addheaders = [('Referer', 'https://www.nseindia.com/')]
>>> urllib.request.install_opener(opener)
>>> source = 'https://www.nseindia.com/content/historical/EQUITIES/2001/JAN/cm01JAN2001bhav.csv.zip'
>>> destination = 'destination.csv.zip'  # Path to destination.
>>> urllib.request.urlretrieve(source, destination)
('destination.csv.zip', <http.client.HTTPMessage object at 0x10ce20208>)

This downloads your file to the specified file path.
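Note that install_opener changes the default opener for every later urllib.request call in the process. If that is not wanted, the same Referer idea can be applied to a single request instead. The following is only a minimal sketch under that assumption: the function name download_with_referer and its timeout parameter are made up for illustration and are not part of the answers above.

```python
import shutil
import urllib.request


def download_with_referer(url, destination, referer, timeout=30):
    """Download url to destination, sending a Referer header on this request only."""
    request = urllib.request.Request(url, headers={"Referer": referer})
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as 403.
    # The output file is opened only after the request has succeeded, so a
    # refused request does not leave an empty file behind.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(destination, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
    return destination


# Example call (performs a real network request, so it is left commented out):
# download_with_referer(
#     "https://www.nseindia.com/content/historical/EQUITIES/2001/JAN/cm01JAN2001bhav.csv.zip",
#     "destination.csv.zip",
#     "https://www.nseindia.com/",
# )
```

Because the header is attached to the Request object, other code that uses urllib.request in the same program is unaffected.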

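The above works for me. What version of Python are you using?

Did you check the status code of the request? If you check, it will be 403, and you won't be able to open the zip file even though it was downloaded, because there is nothing in it.

That's what I get for writing out a blank zip file without checking r.content first. You're right. I'm editing my original script to see if I can get you to the finish line.

@JagjeetSingh We need to add a Referer for the site itself in the headers.

Good answer, but I don't think he needs a third-party library; urllib.request handles this perfectly well.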
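The comments above mention ending up with a zip file that cannot be opened after a 403. One way to catch that early is to check the saved file before using it. This is a rough sketch using only the standard zipfile module; the helper name looks_like_zip is hypothetical and not from the thread.

```python
import zipfile


def looks_like_zip(path):
    """Return True if path is a readable ZIP archive whose members pass a CRC check."""
    # is_zipfile returns False for empty files, HTML error pages and missing paths.
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

Calling looks_like_zip('destination.csv.zip') right after the download would flag a saved error page or an empty file before any later step tries to extract it.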