Python: How do I put the image files I scraped with Beautiful Soup into a list?
Tags: python, web-scraping, beautifulsoup, python-requests, reddit
Here is the code I use to grab all the pictures from r/pics on reddit and put them into a directory. I want to be able to put the actual files in that directory into a list. I have been trying to figure out how to do it.
import requests
from bs4 import BeautifulSoup as bs
import os
url = "https://www.reddit.com/r/pics/"
r = requests.get(url)
data = r.text
soup = bs(data,'lxml')
image_tags = soup.findAll('img')
if not os.path.exists('direct'):
    os.makedirs('direct')
os.chdir('direct')
x = 0
for image in image_tags:
    try:
        url = image['src']
        source = requests.get(url)
        if source.status_code == 200:
            img_path = 'direct-' + str(x) +'.jpg'
            with open(img_path, 'wb') as f:
                f.write(requests.get(url).content)
                f.close()
                x+=1
    except:
        pass
Edit: here is the updated code, but I am still running into the problem.
import requests
from bs4 import BeautifulSoup as bs
import os
url = "https://www.reddit.com/r/drawing"
r = requests.get(url)
data = r.text
soup = bs(data,'lxml')
image_tags = soup.findAll('img')
if not os.path.exists('directory'):
    os.makedirs('directory')
os.chdir('directory')
x = 0
mylist = []
for image in image_tags:
    url = image['src']
    source = requests.get(url)
    if source.status_code == 200:
        img_path = 'direct-' + str(x) +'.jpg'
        with open(img_path, 'wb') as f:
            f.write(requests.get(url).content)
            mylist.append(img_path)
            f.close()
            x += 1
print(mylist)
Create a list at the beginning of your code:
...
mylist = []
...
Then, after fetching each image, add its path to the list:
...
img_path = 'direct-' + str(x) +'.jpg'
mylist.append(img_path)
...
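As a minimal sketch of this idea, assuming the image bytes have already been downloaded: the hypothetical helper save_images below writes each image to disk and returns the list of saved paths, and list_saved_images (also hypothetical) rebuilds the list from whatever files are already in the directory. The function names, the use of pathlib, and the default prefix are illustrative choices, not part of the original code.

```python
from pathlib import Path


def save_images(image_bytes, directory="directory", prefix="direct-"):
    """Write each bytes object to its own .jpg file and return the saved paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)  # no need for os.chdir
    saved = []
    for index, content in enumerate(image_bytes):
        path = out_dir / f"{prefix}{index}.jpg"
        path.write_bytes(content)
        saved.append(path)  # only files that were actually written end up in the list
    return saved


def list_saved_images(directory="directory", pattern="*.jpg"):
    """Return the image files already present in the directory, sorted by name."""
    return sorted(Path(directory).glob(pattern))
```

If the goal is a list of the files that exist on disk rather than the paths the loop intended to write, something like list_saved_images("directory") may be the more direct route, since it does not depend on the download loop having run in the same process.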
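Edit:
I ran your updated code, and image_tags comes back empty. In fact, the page fetched by
url = "https://www.reddit.com/r/drawing"
r = requests.get(url)
data = r.text
does not contain any images. I suspect reddit has some kind of protection that prevents you from getting the images this way.
Try adding print(data) and you will see what I mean.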
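Rather than reading through the whole print(data) output, a small diagnostic can summarize what came back. The sketch below is an assumption-laden illustration, not the answerer's method: describe_response and fetch_and_describe are hypothetical names, the parsing uses the standard library's html.parser instead of Beautiful Soup so that it runs without extra packages, and sending a custom User-Agent header is only a guess at something that might change the response.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip <img> tags that have no src attribute
                self.sources.append(src)


def describe_response(status_code, html):
    """Summarize a response: status code, body size, and image URLs found."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return {
        "status_code": status_code,
        "html_length": len(html),
        "img_sources": collector.sources,
    }


def fetch_and_describe(url, user_agent="example-scraper/0.1"):
    """Fetch a page and summarize it (user_agent value is a made-up example)."""
    import requests  # third-party; imported lazily so the helpers above need only the stdlib

    response = requests.get(url, headers={"User-Agent": user_agent}, timeout=10)
    return describe_response(response.status_code, response.text)
```

A status code other than 200, or an empty img_sources list alongside a large html_length, would both be consistent with the page not being served in the form the scraper expects.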
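You should use [...] so that reddit does not limit your requests.
Ok, so I put mylist = [] before the for loop. Then I appended img_path to mylist under "with open(img_path, 'wb') as f:". But when I print(mylist), I still get an empty list: "[]". Any ideas?
@AliHalawa Seems legit.
@AliHalawa Also, remove the useless try/except. It hides the errors in your code, and it is hard to write code when errors are swallowed, because no error message tells you what went wrong.
Ok, I removed it, but I still don't get any errors and mylist is still empty.
@AliHalawa Please edit the question and add the updated code at the end. If the list is empty, the status code is probably not 200, which means the append() part never executes, because the if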
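The two points raised in these comments (a bare try/except hides failures, and a non-200 status silently skips the append) can be made visible with a loop that records why each URL was skipped. This is only a sketch: download_all is a hypothetical helper, and the fetch parameter is assumed to be any callable that returns an object with status_code and content attributes, such as requests.get.

```python
def download_all(urls, fetch):
    """Try every URL and record why each one succeeded or was skipped.

    `fetch` is any callable returning an object with `status_code` and
    `content` attributes, for example `requests.get`.
    """
    saved, skipped = [], []
    for url in urls:
        try:
            response = fetch(url)
        except Exception as error:  # broad on purpose for the sketch; narrow it in real code
            skipped.append((url, repr(error)))  # keep the error instead of hiding it
            continue
        if response.status_code != 200:
            skipped.append((url, f"HTTP {response.status_code}"))
            continue
        saved.append((url, response.content))
    return saved, skipped
```

Printing skipped after a run such as download_all(urls, requests.get) would show whether the list stayed empty because of exceptions, because of non-200 responses, or because there were no URLs to begin with.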