Python BeautifulSoup: find all titles

python web-crawler

My code:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')

title = html.find_all(class_='trn-defstat__value')[4]

print(title)

HTML result ->

<div class="trn-defstat__value">
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" style="height: 35px; padding-right: 8px;" title="ASH"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" style="height: 35px; padding-right: 8px;" title="JÄGER"/>
<img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" style="height: 35px; padding-right: 8px;" title="BANDIT"/>
</div>

What should I do?

This should help you:

from bs4 import BeautifulSoup

html = """
<div class="trn-defstat__value">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-ash.16913d82e3.png" title="ASH" style="height: 35px; padding-right: 8px;">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-jager.600b2773be.png" title="JÄGER" style="height: 35px; padding-right: 8px;">
    <img src="https://trackercdn.com/rainbow6-ubi/assets/images/badge-bandit.385144d970.png" title="BANDIT" style="height: 35px; padding-right: 8px;">
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# every <img> carries the operator name in its title attribute
imgs = soup.find_all('img')

for img in imgs:
    print(img['title'])

Output:

ASH
JÄGER
BANDIT

Here is the full code:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')
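# many elements on the page share this class, so collect all of them
divs = html.find_all('div', class_="trn-defstat__value")

# find_all returns an empty list when a div has no <img>,
# so no try/except is needed here
imgs = []
for div in divs:
    imgs.append(div.find_all('img'))

imgs = [ele for ele in imgs if ele != []]  # drop the empty lists
imgs = [j for sub in imgs for j in sub]    # flatten the list of lists

for img in imgs:
    print(img['title'])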
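
The two list comprehensions near the end of the script can look cryptic, so here is a minimal sketch of what they do, using plain strings in place of the `<img>` tags. The sample data is made up for illustration and is not taken from the page.

```python
# Stand-in for the per-div results: one list of images per div,
# with empty lists for divs that contain no <img> (illustrative data only).
per_div = [[], ["ASH", "JÄGER"], [], ["BANDIT"]]

# Step 1: drop the empty lists.
non_empty = [ele for ele in per_div if ele != []]

# Step 2: flatten the remaining list of lists into one flat list.
flat = [j for sub in non_empty for j in sub]

print(non_empty)
print(flat)
```

The filtering step is not strictly required: flattening `per_div` directly gives the same flat list, because an empty inner list contributes nothing.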

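Just use the .get() function to get an attribute, passing in the attribute name.

I suggest you use html5lib; I believe it is a better parser:

pip install html5lib

I can't quite tell which part of the site you are trying to scrape, but it often helps to first get the block of HTML code that contains the details you want to scrape :)

This script will print all the titles in the Top Operators section:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)

# the html5lib parser must be installed first (pip install html5lib)
html = bs(respinse.content, 'html5lib')

# narrow the search down to the Top Operators block first
container = html.find("div", class_="trn-defstat mb0 top-operators")

imgs = container.find_all("img")

for img in imgs:
    print(img.get("title"))

Prints:

ASH
JÄGER
BANDIT
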
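As a small offline sketch of why `.get()` can be preferable to `img['title']`: subscripting a tag raises `KeyError` when the attribute is missing, while `.get()` returns `None`. The HTML fragment below is made up for illustration (it is not copied from the site), and its second image deliberately has no `title`.

```python
from bs4 import BeautifulSoup

# Illustrative fragment only: the second <img> has no title attribute.
sample = """
<div class="trn-defstat__value">
    <img src="badge-ash.png" title="ASH">
    <img src="badge-unknown.png">
</div>
"""
soup = BeautifulSoup(sample, "html.parser")

titles = []
for img in soup.find_all("img"):
    # .get() returns None instead of raising when the attribute is absent
    titles.append(img.get("title"))

print(titles)

# Subscript access on the tag without a title raises KeyError.
try:
    soup.find_all("img")[1]["title"]
    missing_raises = False
except KeyError:
    missing_raises = True

print(missing_raises)
```

Which form to use is a design choice: `img['title']` fails loudly when the markup changes, while `img.get("title")` lets the loop continue and leaves the missing values for you to filter out.
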
Alternatively, find the "Top Operators" label and read the images from the div that follows it:

from bs4 import BeautifulSoup as bs
import requests

bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT"
respinse = requests.get(bsURL)
html = bs(respinse.text, 'html.parser')

# find Top Operators tag
operators = html.find(class_='trn-defstat__name', text='Top Operators')

# the images sit in the next <div> after that label
for img in operators.find_next('div').find_all('img'):
    print(img['title'])


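Prints:

ASH
JÄGER
BANDIT

Or using CSS:

# note: recent soupsieve versions deprecate :contains in favor of :-soup-contains
for img in html.select('.trn-defstat__name:contains("Top Operators") + * img'):
    print(img['title'])

I changed my code to this: `from bs4 import BeautifulSoup as bs import requests bsURL = "" respinse = requests.get(bsURL) html = bs(respinse.text, 'html.parser') imgs = html.find_all(".trn-defstat__value")['img'] for img in imgs: print(img.get("title"))` but it raises an error. What should I do?

@레이화이트raywhite Check my updated code. If you find my answer helpful, please accept it. Also, you forgot to accept the other answer I gave (to your previous question) as the best answer; please accept that one as well, and upvote both of my answers. Thanks a lot.

What does this mean? `index = 0 imgs = [ele for ele in imgs if ele != []] imgs = [j for sub in imgs for j in sub]`

@ice I agree with you. But I think OP is trying to get all the div tags and then loop through them to get the image titles.

`from bs4 import BeautifulSoup as bs import requests bsURL = "https://r6.tracker.network/profile/pc/Spoit.GODSENT" respinse = requests.get(bsURL) html = bs(respinse.text, 'html5lib') imgs = html.find_all("img", class_="trn-defstat__value") for img in imgs: print(img.get("title"))` I wrote it like this, but the result is null.

Try `response.content` instead of `response.text`.

The class value might be wrong, or you should first get the `content` tag that contains the `imgs` tags you want to scrape.

The result is null… same.

I have checked the website you are trying to scrape. A lot of tags have the class `trn-defstat__value`, so calling `.find()` directly on the `soup` variable is not a good idea. I suggest you first find the block of HTML code that contains those img tags.

Thanks, I will keep that in mind too.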
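
The advice to first find the block that contains the `img` tags can be tried offline. The sketch below uses a made-up fragment in which several elements share the `trn-defstat__value` class. The `top-operators` wrapper follows the class name used in the html5lib script, but the rest of the structure is an assumption and not the site's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment: several stats share the same value class,
# but only the "top-operators" block contains the images we want.
sample = """
<div class="trn-defstat"><div class="trn-defstat__name">Level</div>
  <div class="trn-defstat__value">120</div></div>
<div class="trn-defstat mb0 top-operators"><div class="trn-defstat__name">Top Operators</div>
  <div class="trn-defstat__value">
    <img src="badge-ash.png" title="ASH">
    <img src="badge-jager.png" title="JÄGER">
  </div></div>
"""
soup = BeautifulSoup(sample, "html.parser")

# Searching the whole document by the shared class returns unrelated stats too.
all_values = soup.find_all(class_="trn-defstat__value")
print(len(all_values))

# Narrow down to the container first, then search only inside it.
container = soup.find("div", class_="top-operators")
titles = []
if container is not None:  # find() returns None when nothing matches
    titles = [img.get("title") for img in container.find_all("img")]

print(titles)
```

Checking for `None` matters on the live site: if the response does not contain that block, calling `find_all` on the result of `find()` would otherwise fail with `AttributeError`.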