Python: How do I select a div by multiple classes with BeautifulSoup?
Tags: python, beautifulsoup, python-webbrowser


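I'm trying to get the torrent links from SkidRowReloaded.

On the post detail page there is a div like the one below. I tried to get it by id, but I think the id is dynamic, so I tried to get it by class instead, and that did not work either.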

<div id="tabs-105235-0-0" aria-labelledby="ui-id-1" class="ui-tabs-panel ui-widget-content ui-corner-bottom" role="tabpanel" aria-hidden="false">
Error:

Full code:

    import os
    import webbrowser

    import requests
    from bs4 import BeautifulSoup

    # Clear the console (Windows only).
    clear = lambda: os.system('cls')
    clear()

    # Collect the post links from the front page.
    r = requests.get('https://www.skidrowreloaded.com/')
    source = BeautifulSoup(r.content, "lxml")
    source2 = source.find_all("h2")
    games = []
    for i in source2:
        games.append(i.a.get("href"))

    # Open the most recent post and look for the tab panel by class.
    lastgame = games[0]
    r = requests.get(lastgame)
    source = BeautifulSoup(r.content, "lxml")
    source2 = source.find("div", {"class": "ui-tabs-panel ui-widget-content ui-corner-bottom"})
    source3 = source2.find_all("a")

    k = 0
    for i in source3:
        if k == 0:
            # Skip the first link (Steam).
            k = k + 1
        else:
            if i.get("href") == "https://www.skidrowreloaded.com":
                # Skip links that only point back to the site.
                pass
            else:
                # Print the link and open it in the browser.
                print(i.get("href"))
                webbrowser.open(i.get("href"))
            k = k + 1

As mentioned elsewhere, you can use find_all.

Edit: looking at response.text, the div is there, but it does not carry the classes you are searching for, so the lookup comes back empty. Those ui-* classes are probably added by JavaScript in the browser, which requests does not run. You can search with a regular expression on the id instead:

    import re

    import requests
    from bs4 import BeautifulSoup

    response = requests.get("your URL here")
    soup = BeautifulSoup(response.text, 'html.parser')

    # Match every div whose id starts with "tabs".
    raw_data = soup.find_all("div", id=re.compile("^tabs"))
    for ele in raw_data:
        a_tag = ele.find("a")  # first link only; use ele.find_all("a") for all of them
        # do something with the a_tag

To get all the links, try this:

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.skidrowreloaded.com/projection-first-light-goldberg/"
    html = requests.get(url).text

    # Download links on this page open in a new tab, so filter on target="_blank".
    anchors = BeautifulSoup(html, "html.parser").find_all("a", {"target": "_blank"})

    skip = 'https://www.skidrowreloaded.com'
    # a.get() avoids a KeyError for anchors that have no href attribute.
    print([a['href'] for a in anchors
           if a.get('href', '').startswith('https') and a['href'] != skip])

Output:

    ['https://store.steampowered.com/app/726490/Projection_First_Light/',
     'https://mega.nz/file/geogAATS#-0U0PklF-Q5i5l_SELzYx3klh5FZob9HaD4QKcFH_8M',
     'https://uptobox.com/rqnlpcp7yb3v',
     'https://1fichier.com/?0syphwpyndpo38af04ky',
     'https://yadi.sk/d/KAmlsBmGaI1f2A',
     'https://pixeldra.in/u/wmcsjuhv',
     'https://dropapk.to/v6r7mjfgxjq6',
     'https://gofile.io/?c=FRWL1o',
     'https://racaty.net/dkvdyjqvg02e',
     'https://bayfiles.com/L0k7Qea2pb',
     'https://tusfiles.com/2q00y4huuv15',
     'https://megaup.net/2f0pv/Projection.First.Light-GoldBerg.zip',
     'https://letsupload.org/88t5',
     'https://filesupload.org/0d7771dfef54d055',
     'https://dl.bdupload.in/17ykjrifizrb',
     'https://clicknupload.co/o0k9dnd3iwoy',
     'https://dailyuploads.net/n1jihwjwdmjp',
     'https://userscloud.com/nircdd4q1t5w',
     'https://rapidgator.net/file/b6b8f5782c7c2bdb534214342b58ef18',
     'https://turbobit.net/m308zh1hdpba.html',
     'https://hitfile.net/5OhkcqZ',
     'https://filerio.in/0wbvn4md4i91',
     'https://mirrorace.org/m/1Fiic',
     'https://go4up.com/dl/0ee9f4866312b5/Projection.First.Light-GoldBerg.zip',
     'https://katfile.com/w74l823vuyw5/Projection.First.Light-GoldBerg.zip.html',
     'https://multiup.org/download/3d355ba18d58234c792da7a872ab4998/Projection.First.Light-GoldBerg.zip',
     'https://dl1.indishare.in/hs55pkx4ex82']
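For anyone who wants to try the class and id lookups without hitting the site, here is a minimal offline sketch. The HTML string is a hypothetical stand-in, not the real page markup. It assumes the usual BeautifulSoup behaviour that a space-separated class string is compared against the exact attribute value, while a CSS selector only requires that every listed class be present.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the fetched page (not the real site HTML).
HTML = """
<div id="tabs-105235-0-0" class="ui-corner-bottom ui-tabs-panel ui-widget-content">
  <a href="https://example.com/one">one</a>
  <a href="https://example.com/two">two</a>
</div>
<div id="sidebar" class="ui-widget-content"><a href="https://example.com/x">x</a></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A multi-class string must equal the attribute value exactly,
# so the different class order in HTML above gives no match.
exact = soup.find("div", {"class": "ui-tabs-panel ui-widget-content ui-corner-bottom"})

# A CSS selector requires all listed classes, in any order.
by_css = soup.select("div.ui-tabs-panel.ui-widget-content.ui-corner-bottom")

# Matching on the id prefix does not depend on the classes at all.
by_id = soup.find_all("div", id=re.compile(r"^tabs"))

# Collect every link inside the matched panels.
hrefs = [a["href"] for div in by_id for a in div.find_all("a", href=True)]
print(exact, len(by_css), hrefs)
```

If the classes are missing from the downloaded HTML altogether, as the answer reports for this site, neither class-based lookup will help and the id prefix is the more robust handle.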
Can you provide the full HTML? Or even just a link to the page would do.

Sure, this is the latest post on skidrow:

OK... so what you want is the text about the game further down. Am I right?

Oh, no, I need all the "a" tags, like MEGA, YANDEX, TORRENT. In short, if I can get all the "a" elements from this div, I'm done. But there is only one such div, so I guess I should use "find"? I'm trying to get all the "a" tags from this div, but I can't reach the div itself.

I've updated the code in the edit. find_all does return a list of items that you can keep searching in.

But what should I do if a link opens in a tab on the same page (so no target is defined)?
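The last comment is not answered in the thread. One possible approach, sketched here under assumptions, is to stop relying on the target attribute and instead take every anchor with an href inside the tab panels, keeping only absolute links that point off-site. The markup, the SITE_HOST constant, and the external_links helper are all hypothetical names made up for this illustration.

```python
from urllib.parse import urlparse

from bs4 import BeautifulSoup

# Hypothetical markup: only one anchor carries target="_blank".
HTML = """
<div id="tabs-1-0-0">
  <a href="https://www.skidrowreloaded.com">home</a>
  <a href="https://mega.nz/file/abc" target="_blank">MEGA</a>
  <a href="https://yadi.sk/d/xyz">YANDEX</a>
  <a href="#tabs-1-0-1">next tab</a>
</div>
"""

SITE_HOST = "www.skidrowreloaded.com"  # assumed host to exclude


def external_links(html, site_host=SITE_HOST):
    """Return absolute http(s) links inside tab panels that point off-site."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # div[id^="tabs"] selects divs whose id starts with "tabs".
    for a in soup.select('div[id^="tabs"] a[href]'):
        parsed = urlparse(a["href"])
        # Fragment-only links have an empty scheme and are dropped here.
        if parsed.scheme in ("http", "https") and parsed.netloc != site_host:
            links.append(a["href"])
    return links


print(external_links(HTML))
```

Filtering on the host name rather than on target keeps links that open in the same tab, while still dropping in-page tab switches and links back to the site.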