How to download all zip files from a website using Python

python python-3.x beautifulsoup zip

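I am trying to download all of the zip files from this webpage.

Full disclosure: I am not a professional programmer, so please forgive me if I have made some silly mistakes.

Here is my code: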

from bs4 import BeautifulSoup            
import requests

url = "https://www.google.com/googlebooks/uspto-patents-grants-text.html"
html = requests.get(url)
soup = BeautifulSoup(html.text, "html.parser")

for link in soup.find_all('a', href=True):
    href = link['href']

    if any(href.endswith(x) for x in ['.zip']):
    #if any(href.endswith('.zip')):
        print("Downloading '{}'".format(href))
        remote_file = requests.get(url + href)

        with open(href, 'wb') as f:
            for chunk in remote_file.iter_content(chunk_size=1024): 
                if chunk: 
                    f.write(chunk)

The error I get when I run the code is:

File "C:/Users/#USER#/#FILEPATH#/Python/patentzipscrawer2.py", line 16, in <module>
    with open(href, 'wb') as f:
OSError: [Errno 22] Invalid argument: '

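However, when I type the address into my browser, I can download the zip file. My guess is that this has something to do with the format of the zip files, and that I cannot necessarily download or open them directly, but I am not sure what the problem is. The code I based mine on was written for files that can be downloaded directly (such as .txt).

Any help on how to download these zip files would be greatly appreciated.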

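Implement something like the following in your code:

import urllib.request  # "import urllib" alone does not load the request submodule

# Download the archive at the given URL and save it under a local file name.
# urlretrieve is used here because URLopener has been deprecated since Python 3.3.
urllib.request.urlretrieve("http://yoursite.com/file.zip", "file.zip")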
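The snippet above fetches a single file. As a rough sketch of how the same call might be applied to a whole list of links, the helper below saves each URL under its own base name. The names `local_name` and `download_all` and the `dest_dir` parameter are made up for illustration and do not come from the original answer.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def local_name(file_url):
    """Return the last path component of a URL, for example 'file.zip'."""
    return posixpath.basename(urlparse(file_url).path)


def download_all(file_urls, dest_dir="."):
    """Save every URL into dest_dir under its base name and return the paths written."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for file_url in file_urls:
        target = os.path.join(dest_dir, local_name(file_url))
        # urlretrieve copies the remote resource to the given local path.
        urllib.request.urlretrieve(file_url, target)
        saved.append(target)
    return saved
```

Under these assumptions, `download_all(["http://yoursite.com/file.zip"])` would write `file.zip` into the current directory. For very large archives a streamed download, like the `iter_content` loop in the question, may be preferable.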

Do you want to download all the data from 1976-…?

You are trying to create a file named 'http://storage.googleapis.com/patents/grant_full_text/2015/ipg150106.zip' for writing. Presumably open doesn't like that name. Another oddity is remote_file = requests.get(url + href), because url + href evaluates to "https://www.google.com/googlebooks/uspto-patents-grants-text.htmlhttp://storage.googleapis.com/patents/grant_full_text/2015/ipg150106.zip". Shouldn't it be remote_file = requests.get(href)? Or use with open(os.path.basename(href), 'wb') as f: to write to 'ipg150106.zip'.

Hi, I have tried all of these changes but still get the same error (except when I use os.path.basename, in which case I get an error: with open(os.path.basename(href), 'wb') as f: NameError: name 'os' is not defined). Should the file be written some other way?
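The points raised in these comments can be checked without touching the network. The sketch below is only an illustration: the helper names `resolve_link` and `local_name` are hypothetical, and `urljoin` is swapped in for the plain `requests.get(href)` suggested above so that relative links would also work. It also shows the `import os` line whose absence produces the NameError.

```python
import os  # without this import, os.path.basename raises NameError
from urllib.parse import urljoin, urlparse

PAGE_URL = "https://www.google.com/googlebooks/uspto-patents-grants-text.html"
HREF = "http://storage.googleapis.com/patents/grant_full_text/2015/ipg150106.zip"


def resolve_link(page_url, href):
    """Return href unchanged if it is absolute, else resolve it against page_url."""
    return urljoin(page_url, href)


def local_name(file_url):
    """Return only the last path component, which is usable as a file name."""
    return os.path.basename(urlparse(file_url).path)


# Plain concatenation glues two complete URLs together.
broken = PAGE_URL + HREF

download_url = resolve_link(PAGE_URL, HREF)
file_name = local_name(download_url)

print(broken)
print(download_url)
print(file_name)
```

In the question's loop, this would amount to calling `requests.get(download_url)` and `open(file_name, 'wb')` in place of `requests.get(url + href)` and `open(href, 'wb')`. Passing the whole URL to `open` is a likely cause of the `OSError`, since characters such as `:` are not allowed in Windows file names.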