Python: get text from an element by id
I am trying to capture the text of an element by its id with BeautifulSoup. The result should be 30,66. My current code prints the complete span element instead:
[<span class="mainValueAmount simpleTextFit" id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue">30,66</span>]
How can I get just the value 30,66?
from bs4 import BeautifulSoup
u = '<div class="widgetBox" data-name="pvEnergy"><div class="widgetHead">PV-Energie</div><div class="widgetBody"><div class="mainValue"><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue" class="mainValueAmount simpleTextFit">30,66</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldUnit" class="mainValueUnit">kWh</span><br><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldPeriodTitle" class="mainValueDescription">Heute</span></div></div><div id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalDiv" class="widgetFooter">Gesamt: <span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalValue">158,953</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalUnit">MWh</span></div></div>'
idAktWert = 'ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue'
soup = BeautifulSoup(u, "html.parser")
aktWert = soup.select("#" + idAktWert)
print(aktWert)
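Thanks for your help.
Use .text on the selected element. You only need its text, not the tag itself.
soup.select returns a list of the selected elements, so the code above prints that list rather than a value. To retrieve only the text inside the tag, take an element from the list and read its text attribute. Example: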
from bs4 import BeautifulSoup
u = '<div class="widgetBox" data-name="pvEnergy"><div class="widgetHead">PV-Energie</div><div class="widgetBody"><div class="mainValue"><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue" class="mainValueAmount simpleTextFit">30,66</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldUnit" class="mainValueUnit">kWh</span><br><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldPeriodTitle" class="mainValueDescription">Heute</span></div></div><div id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalDiv" class="widgetFooter">Gesamt: <span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalValue">158,953</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalUnit">MWh</span></div></div>'
idAktWert = 'ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue'
soup = BeautifulSoup(u, "html.parser")
aktWert = soup.select("#" + idAktWert)[0]  # index 0 picks the first match from the returned list
print(aktWert.text)
30,66
from bs4 import BeautifulSoup
u = '<div class="widgetBox" data-name="pvEnergy"><div class="widgetHead">PV-Energie</div><div class="widgetBody"><div class="mainValue"><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue" class="mainValueAmount simpleTextFit">30,66</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldUnit" class="mainValueUnit">kWh</span><br><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldPeriodTitle" class="mainValueDescription">Heute</span></div></div><div id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalDiv" class="widgetFooter">Gesamt: <span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalValue">158,953</span><span id="ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldTotalUnit">MWh</span></div></div>'
idAktWert = 'ctl00_ContentPlaceHolder1_PublicPagePlaceholder1_PageUserControl_ctl00_PublicPageLoadFixPage_energyYieldWidget_energyYieldValue'
soup = BeautifulSoup(u, "html.parser")
aktWert = soup.select("#" + idAktWert)
# aktWert is a list, so take its first element (index 0)
print(aktWert[0].get_text())  # outputs 30,66
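Both answers index into the list that soup.select returns, which raises an IndexError when nothing matches. As a possible alternative, here is a minimal sketch using select_one, which returns a single tag or None. The shortened markup, the id energyYieldValue, and the helper name text_by_id are made up for illustration and are not from the original page.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the widget markup; the short ids are hypothetical.
html = (
    '<div class="mainValue">'
    '<span id="energyYieldValue" class="mainValueAmount">30,66</span>'
    '<span id="energyYieldUnit" class="mainValueUnit">kWh</span>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")


def text_by_id(soup, element_id):
    """Return the stripped text of the element with this id, or None if absent."""
    tag = soup.select_one("#" + element_id)  # a single Tag or None, not a list
    if tag is None:
        return None
    return tag.get_text(strip=True)


print(text_by_id(soup, "energyYieldValue"))  # 30,66
print(text_by_id(soup, "doesNotExist"))      # None

# find(id=...) is an equivalent lookup that needs no CSS selector string.
print(soup.find(id="energyYieldValue").get_text(strip=True))  # 30,66
```

Returning None for a missing id lets the caller decide how to handle a page whose layout has changed, instead of failing on the [0] index.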