Python bs4: how to extract the text in a <p> tag
Tags: python, html, beautifulsoup, html-parsing

I'm practicing parsing, and I'd really like to know how I can extract the text from this exact <p> tag. There are many such tags on the page, and I only need the information in one of them. Thanks for your help.
import requests as r
from bs4 import BeautifulSoup

def find_info(self):
    api = r.get(self.url)  # url is above in the description
    soup = BeautifulSoup(api.text, "html.parser")
    soup.find_all('p')
    # and here I'm stuck.
    # I need to get the text from the chunk of HTML below.
<p>
<strong>
Bitcoin price today
</strong>
is ₽3.795.164 RUB with a 24-hour trading volume of ₽6.527.780.409.893 RUB. Bitcoin is down,12% in the last 24 hours. The current CoinMarketCap ranking is #1, with a market cap of ₽70.707.857.530.563 RUB. It has a circulating supply of 18.631.043 BTC coins and a max. supply of 21.000.000 BTC coins.
</p>
I tried different approaches, but with so many p tags I don't know how to get this exact one.

Use a CSS selector to get the paragraph you want. Here's how:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://coinmarketcap.com/currencies/bitcoin/").content
print(BeautifulSoup(page, "html.parser").select_one('.about___1OuKY p').getText())
Output:
Bitcoin price today is $51,393.64 USD with a 24-hour trading volume of $88,784,693,272 USD. Bitcoin is up 4.87% in the last 24 hours. The current CoinMarketCap ranking is #1, with a market cap of $957,517,202,639 USD. It has a circulating supply of 18,631,043 BTC coins and a max. supply of 21,000,000 BTC coins.
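
Class names such as `.about___1OuKY` look auto-generated and may change when the site is rebuilt, so a selector built on them can stop matching. As a rough alternative sketch (not part of the original answer), the paragraph can be located by the text of its <strong> heading instead. The sample HTML and the helper name `find_paragraph_by_heading` below are assumptions made for illustration; the markup is modeled on the snippet in the question.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the snippet in the question; the two
# "unrelated" paragraphs are invented for illustration.
SAMPLE_HTML = """
<div>
  <p>Some unrelated paragraph.</p>
  <p>
    <strong>Bitcoin price today</strong>
    is ₽3.795.164 RUB with a 24-hour trading volume of ₽6.527.780.409.893 RUB.
  </p>
  <p>Another unrelated paragraph.</p>
</div>
"""


def find_paragraph_by_heading(html, heading):
    """Return the text of the first <p> containing a <strong> with `heading`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for strong in soup.find_all("strong"):
        if heading in strong.get_text():
            paragraph = strong.find_parent("p")
            if paragraph is not None:
                # Collapse newlines and runs of spaces into single spaces.
                return " ".join(paragraph.get_text().split())
    return None


text = find_paragraph_by_heading(SAMPLE_HTML, "Bitcoin price today")
print(text)
```

Returning None when nothing matches lets the caller handle a missing paragraph explicitly instead of hitting an AttributeError on a failed lookup.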
You can use the get_text() method, or simply check the documentation.
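
To illustrate what get_text() returns for a tag shaped like the one in the question, here is a small sketch on an inline HTML string (the shortened sentence "is up." is a placeholder, not real page content). It compares the default call with one that passes `separator` and `strip`, which is often more convenient when the markup contains line breaks.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the <p> block in the question.
html = "<p>\n<strong>\nBitcoin price today\n</strong>\nis up.\n</p>"
p = BeautifulSoup(html, "html.parser").find("p")

# Default: text nodes are concatenated as-is, newlines included.
raw = p.get_text()

# Strip each text node, drop empty ones, and join the rest with a space.
clean = p.get_text(separator=" ", strip=True)

print(repr(raw))
print(clean)
```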