Python BeautifulSoup: find all titles
from bs4 import BeautifulSoup as bs
import requests
bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')
title = html.find_all(class_='trn-defstat__value')[4]
print(title)
The HTML is:
<div class="trn-defstat__value">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" style="height: 35px; padding-right: 8px;" title="ASH"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" style="height: 35px; padding-right: 8px;" title="JÄGER"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" style="height: 35px; padding-right: 8px;" title="BANDIT"/>
</div>
The result I want is:
ASH
JÄGER
BANDIT
What should I do?
This should help you:
from bs4 import BeautifulSoup
html = """
<div class="trn-defstat__value">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" title="ASH" style="height: 35px; padding-right: 8px;">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" title="JÄGER" style="height: 35px; padding-right: 8px;">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" title="BANDIT" style="height: 35px; padding-right: 8px;">
</div>
"""
soup = BeautifulSoup(html, 'html.parser')
imgs = soup.find_all('img')
for img in imgs:
    print(img['title'])
Output:
ASH
JÄGER
BANDIT
Here is the full code:
from bs4 import BeautifulSoup as bs
import requests
bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')
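divs = html.find_all('div', class_="trn-defstat__value")
imgs = []
for div in divs:
    # find_all returns an empty list (it does not raise) when a div has no <img>
    imgs.append(div.find_all('img'))
# drop the empty results
imgs = [ele for ele in imgs if ele != []]
# flatten the list of lists into a single list of <img> tags
imgs = [j for sub in imgs for j in sub]
for img in imgs:
    print(img['title'])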
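The last two list comprehensions in the full code are easy to misread, so here is a minimal sketch of what they do. It uses plain strings in place of the `<img>` tags, and the sample data is made up for illustration. The first comprehension drops the empty lists produced by divs with no images; the second flattens the remaining list of lists.

```python
# Stand-in for the per-div results of div.find_all('img'):
# each inner list holds the images found in one div (hypothetical data).
per_div = [[], ["ASH", "JÄGER"], [], ["BANDIT"]]

# Step 1: keep only the non-empty lists.
non_empty = [ele for ele in per_div if ele != []]

# Step 2: flatten the list of lists into one flat list.
flat = [j for sub in non_empty for j in sub]

print(non_empty)
print(flat)
```

Since an empty inner list contributes nothing when flattened, step 2 on its own would give the same final list; step 1 only makes the intent explicit.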
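Just use the .get() function to read the attribute, passing in the attribute name.
I suggest you use html5lib; I believe it is a better parser:
pip install html5lib
I can't quite tell which part of the site you are trying to scrape, but keep in mind that it sometimes helps to first grab the block of HTML that contains the details you want :) This script prints all the titles in the Top Operators section:
from bs4 import BeautifulSoup as bs
import requests
bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.content, 'html5lib')
container = html.find("div", class_= "trn-defstat mb0 top-operators")
imgs = container.find_all("img")
for img in imgs:
    print(img.get("title"))
Prints:
ASH
JÄGER
BANDIT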
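As a minimal sketch of why `.get()` can be the safer choice: on a bs4 tag, `img['title']` raises `KeyError` when the attribute is missing, while `img.get('title')` returns `None`. The HTML below is a made-up fragment, not the real page, and it is parsed with the built-in `html.parser` so that nothing extra needs installing.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the second <img> has no title attribute.
html = """
<div class="trn-defstat__value">
<img src="ash.png" title="ASH"/>
<img src="unknown.png"/>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
imgs = soup.find_all("img")

# .get() returns None for a missing attribute instead of raising.
titles = [img.get("title") for img in imgs]
print(titles)

# Subscript access raises KeyError for the same tag.
try:
    imgs[1]["title"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

When scraping a page whose markup you do not control, filtering out the `None` values afterwards is usually simpler than wrapping every lookup in a try block.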
from bs4 import BeautifulSoup as bs
import requests
bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')
# find Top Operators tag
operators = html.find(class_='trn-defstat__name', text='Top Operators')
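for img in operators.find_next('div').find_all('img'):
    print(img['title'])
Or using CSS:
for img in html.select('.trn-defstat__name:contains("Top Operators") + * img'):
    print(img['title'])
Output:
ASH
JÄGER
BANDIT
I changed my code to this: ``from bs4 import BeautifulSoup as bs import requests bsURL = "" respinse = requests.get(bsURL) html = bs(respinse.text, 'html.parser') imgs = html.find_all(".trn-defstat__value")['img'] for img in imgs: print(img.get("title"))`` but it raises an error. What should I do?
@레이화이트raywhite Check my updated code. If you found my answer helpful, please accept it. Also, you forgot to accept the other answer I gave (to your earlier question) as the best answer. Make sure you accept this one as the best answer too, and please upvote both of my answers. Thanks a lot.
What do index=0, imgs = [ele for ele in imgs if ele != []] and imgs = [j for sub in imgs for j in sub] mean?
@ice I agree with you. But I think the OP is trying to get all the div tags and then loop through them to get the image titles.
``from bs4 import BeautifulSoup as bs import requests bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT" respinse = requests.get(bsURL) html = bs(respinse.text, 'html5lib') imgs = html.find_all('img', class_="trn-defstat__value") for img in imgs: print(img.get("title"))`` I wrote it like this, but the result is null.
Try response.content instead of response.text. The class value may be wrong, or you should first get the container tag that holds the img tags you want to scrape.
The result is null... same.
I have checked the site you are trying to scrape. Many tags have the class trn-defstat__value, so calling .find() directly on the soup variable is not a good idea. I suggest you first find the block of HTML that contains those img tags.
Thanks, I will consider that too.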
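On the suggestion to use `.content` instead of `.text`: in requests, `response.content` is the raw bytes of the body and `response.text` is those bytes decoded with an encoding that requests infers, and BeautifulSoup accepts either. The pure-Python sketch below makes no network call, and the byte string is only a stand-in for a response body. It shows how a wrong encoding guess would garble a title such as JÄGER. Whether this explains the empty result reported in the thread is not clear from the discussion.

```python
# Bytes as a server might send them: the title "JÄGER" encoded as UTF-8.
raw = "JÄGER".encode("utf-8")

# Decoding with the right codec recovers the title.
good = raw.decode("utf-8")

# Decoding with a wrong guess (here ISO-8859-1) yields mojibake instead.
bad = raw.decode("iso-8859-1")

print(good)
print(bad == good)
```

Passing the raw bytes to BeautifulSoup leaves the encoding detection to the parser, which can also read a charset declared inside the HTML itself.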