Python BeautifulSoup: find all titles

Tags: python, beautifulsoup, request, web-crawler

My code:

from bs4 import BeautifulSoup as bs
import requests


bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')


title = html.find_all(class_='trn-defstat__value')[4]

print(title)

The HTML it prints is:

<div class="trn-defstat__value">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" style="height: 35px; padding-right: 8px;" title="ASH"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" style="height: 35px; padding-right: 8px;" title="JÄGER"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" style="height: 35px; padding-right: 8px;" title="BANDIT"/>
</div>

The result I want is:

ASH
JÄGER
BANDIT

What should I do?

This should help you:

from bs4 import BeautifulSoup

html = """
<div class="trn-defstat__value">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" title="ASH" style="height: 35px; padding-right: 8px;">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" title="JÄGER" style="height: 35px; padding-right: 8px;">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" title="BANDIT" style="height: 35px; padding-right: 8px;">
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# Collect every <img> tag in the snippet
imgs = soup.find_all('img')

# Print the title attribute of each image
for img in imgs:
    print(img['title'])
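
Here is the full code:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
response = requests.get(bsURL)
html = bs(response.text, 'html.parser')

# Many blocks on the page share this class; only some of them contain images
divs = html.find_all('div', class_="trn-defstat__value")

# find_all() returns an empty list when nothing matches, so no try/except is needed
imgs = []
for div in divs:
    imgs.append(div.find_all('img'))

imgs = [ele for ele in imgs if ele != []]    # drop the divs that had no images
imgs = [j for sub in imgs for j in sub]      # flatten the list of lists

for img in imgs:
    print(img['title'])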
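Output:

ASH
JÄGER
BANDIT

Just use the .get() method to read the attribute, passing in the attribute name.

I also suggest using html5lib; I believe it is a better parser: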

pip install html5lib
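
If html5lib might not be installed wherever the script runs, one option is to fall back to the built-in parser. The following is only a sketch: `make_soup` and the sample markup are hypothetical names made up for illustration, and it relies on BeautifulSoup raising `FeatureNotFound` when the requested parser is missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="html5lib", fallback="html.parser"):
    """Parse markup with the preferred parser, or the fallback if it is missing."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # BeautifulSoup raises this when the requested parser is not installed.
        return BeautifulSoup(markup, fallback)


# Hypothetical markup, shaped like the block discussed in the question.
sample = '<div class="trn-defstat__value"><img title="ASH"><img title="BANDIT"></div>'

soup = make_soup(sample)
titles = [img.get("title") for img in soup.find_all("img")]
print(titles)
```

Either parser finds the same two images here; the parsers mainly differ in how they repair badly formed HTML.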

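I am not sure which part of the site you are trying to scrape, but keep in mind that it often helps to first get the block of HTML that contains the details you want to scrape :)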
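
As a rough offline illustration of that advice, the sketch below narrows the search to one container before collecting the images. The sample markup is an assumption modelled on the class names mentioned in this thread (the "Kills" block is invented), and `titles_in_container` is a hypothetical helper, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Assumed markup: two stat blocks share the trn-defstat__value class,
# but only the second one contains images.
SAMPLE_HTML = """
<div class="trn-defstat">
  <div class="trn-defstat__name">Kills</div>
  <div class="trn-defstat__value">1,234</div>
</div>
<div class="trn-defstat mb0 top-operators">
  <div class="trn-defstat__name">Top Operators</div>
  <div class="trn-defstat__value">
    <img src="badge-ash.png" title="ASH"/>
    <img src="badge-jager.png" title="JÄGER"/>
    <img src="badge-bandit.png" title="BANDIT"/>
  </div>
</div>
"""


def titles_in_container(markup, container_selector):
    """Return the image titles found inside the first element matching the selector."""
    soup = BeautifulSoup(markup, "html.parser")
    container = soup.select_one(container_selector)
    if container is None:
        # The container is missing, for example because the page layout changed.
        return []
    return [img.get("title") for img in container.find_all("img")]


print(titles_in_container(SAMPLE_HTML, "div.top-operators"))
print(titles_in_container(SAMPLE_HTML, "div.no-such-block"))
```

Checking for `None` before calling `find_all` avoids an `AttributeError` when the container is not on the page.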

This script prints all titles in the Top Operators section:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
response = requests.get(bsURL)
html = bs(response.content, 'html5lib')

# Narrow the search to the Top Operators block before looking for images
container = html.find("div", class_="trn-defstat mb0 top-operators")
imgs = container.find_all("img")

for img in imgs:
    print(img.get("title"))
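
The script above reads the attribute with `.get("title")`, while the other snippets in this thread use `img['title']`. The two only differ when the attribute is missing, which the small sketch below tries to show. The markup is made up for the example.

```python
from bs4 import BeautifulSoup

# Made-up markup: the second image has no title attribute.
html = '<div><img src="a.png" title="ASH"/><img src="b.png"/></div>'
soup = BeautifulSoup(html, "html.parser")
with_title, without_title = soup.find_all("img")

# Both forms return the value when the attribute exists.
print(with_title["title"])
print(with_title.get("title"))

# .get() returns None (or a default you pass in) when the attribute is missing.
print(without_title.get("title"))
print(without_title.get("title", "unknown"))

# Subscripting raises KeyError instead.
try:
    without_title["title"]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)
```

So `.get()` is the more forgiving choice when some of the scraped images may lack a title.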
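
This version finds the "Top Operators" label by its text and reads the images from the div that follows it: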

from bs4 import BeautifulSoup as bs
import requests


bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
response = requests.get(bsURL)
html = bs(response.text, 'html.parser')

# Find the tag whose text is "Top Operators" (older bs4 releases call this argument text=)
operators = html.find(class_='trn-defstat__name', string='Top Operators')

for img in operators.find_next('div').find_all('img'):
    print(img['title'])

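Prints:

ASH
JÄGER
BANDIT

Or using a CSS selector:

# Newer soupsieve releases spell this pseudo-class :-soup-contains(); :contains() is deprecated there
for img in html.select('.trn-defstat__name:contains("Top Operators") + * img'):
    print(img['title'])

Comments:

I changed my code to this: ``from bs4 import BeautifulSoup as bs import requests bsURL = "" respinse = requests.get(bsURL) html = bs(respinse.text, 'html.parser') imgs = html.find_all(".trn-defstat__value")['img'] for img in imgs: print(img.get("title"))`` but it raises an error. What should I do?

@레이화이트raywhite Check my updated code. If you found my answer helpful, please accept it. You also forgot to accept the other answer I gave (to your previous question) as the best answer. Please accept this one as the best answer too, and make sure to upvote both of my answers. Thanks a lot.

What does ``index=0 imgs = [ele for ele in imgs if ele != []] imgs = [j for sub in imgs for j in sub]`` mean?

@ice I agree with you. But I think the OP is trying to get all the div tags and then loop through them to get the image titles.

``from bs4 import BeautifulSoup as bs import requests bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT" respinse = requests.get(bsURL) html = bs(respinse.text, 'html5lib') imgs = html.find_all('img', class_="trn-defstat__value") for img in imgs: print(img.get("title"))`` I wrote it like this, but the result is null.

Try response.content instead of response.text. The class value may be wrong, or you should first get the content tag that contains the imgs tags you want to scrape.

The result is null... same.

I've checked the website you are trying to scrape. A lot of tags have the class trn-defstat__value, so calling .find() directly on the soup variable is not a good idea. I suggest you first find the block of HTML that contains those img tags.

Thanks, I'll consider that too.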
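
One of the comments asks what the two list comprehensions in the full code do. As a minimal sketch, with plain strings standing in for the tags that `find_all('img')` would return for each div (the values are placeholders), the first comprehension drops the empty results and the second flattens the remaining list of lists:

```python
from itertools import chain

# Placeholder data: one inner list per div, empty where the div held no images.
per_div = [[], ["ASH", "JÄGER", "BANDIT"], [], ["badge"]]

# Step 1: keep only the divs that actually contained images.
non_empty = [found for found in per_div if found != []]

# Step 2: flatten the list of lists into one flat list.
flat = [item for sub in non_empty for item in sub]

# Step 1 is optional, since empty lists add nothing when flattening.
flat_direct = list(chain.from_iterable(per_div))

print(non_empty)
print(flat)
print(flat == flat_direct)
```

Read the second comprehension left to right as two nested loops: `for sub in non_empty`, then `for item in sub`.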