How do I get the second span using BeautifulSoup in Python?
I'm trying to get the second span value in this div, along with other divs like the one shown below:
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
I tried things like i.span, i.contents, i.children, and so on.
I'd really appreciate any help, thanks.
Try this:
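from io import StringIO
from bs4 import BeautifulSoup as bs
data = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
'''
soup = bs(StringIO(data), 'html.parser')
# Select the span elements that are direct children of the matching divs
spans = soup.select('div[class="C(#959595) Fz(11px) D(ib) Mb(6px)"] > span')
print(spans[1].text)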
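The selector approach above can also target the second span directly, rather than indexing into the full result list. The following is a minimal sketch, assuming BeautifulSoup 4.7 or later (where `select` is backed by the soupsieve package and supports `:nth-of-type`). The helper name `second_span_texts` is made up for illustration.

```python
from bs4 import BeautifulSoup

HTML = """
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
"""


def second_span_texts(html):
    """Return the text of the second <span> child of every matching div.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Match the full class attribute string, then pick the 2nd span child.
    selector = (
        'div[class="C(#959595) Fz(11px) D(ib) Mb(6px)"] '
        "> span:nth-of-type(2)"
    )
    return [span.get_text(strip=True) for span in soup.select(selector)]


print(second_span_texts(HTML))
```

Because the selector is evaluated per div, this returns one value for each matching div, which is convenient when the page contains many of them.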
You basically already have it; for each div, you just need to call find_next twice to reach the second span:
from bs4 import BeautifulSoup

# HTML is assumed to hold the markup shown in the question
soup = BeautifulSoup(HTML, 'html.parser')
divs = soup.find_all('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'})
for div in divs:
    # want the second span in the div
    span = div.find_next('span').find_next('span')
    print(span.string)
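One caveat with chaining `find_next`: it walks forward through the whole document, so for a div that contains only one span it will return a span from a later element. A more defensive sketch is shown here, using `find_all` with `recursive=False` to look only at the div's direct children. The sample markup with a one-span div and the helper name `second_span_text` are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

CLASS = "C(#959595) Fz(11px) D(ib) Mb(6px)"

# Hypothetical sample: the first div has only one span.
HTML = f"""
<div class="{CLASS}"><span>ONLY ONE</span></div>
<div class="{CLASS}">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
"""


def second_span_text(div):
    """Text of the second direct-child <span> of div, or None if absent."""
    spans = div.find_all("span", recursive=False)
    return spans[1].get_text(strip=True) if len(spans) > 1 else None


soup = BeautifulSoup(HTML, "html.parser")
divs = soup.find_all("div", {"class": CLASS})
results = [second_span_text(div) for div in divs]
print(results)
```

Returning `None` for a div without a second span lets the caller decide whether to skip it or report it, instead of silently picking up a value from a neighbouring div.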
There are several ways to get the value you want:
from simplified_scrapy.simplified_doc import SimplifiedDoc
html='''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
'''
doc = SimplifiedDoc(html)
divs = doc.getElementsByClass('C(#959595) Fz(11px) D(ib) Mb(6px)')
for div in divs:
    value = div.getElementByTag('span',start='</span>') # Use start to skip the first
    print(value)
    value = div.getElementByTag('span',before='<span>',end=len(div.html)) # Locate the last
    print(value)
    value = div.i.next # Use <i> to locate
    print(value)
    value = div.spans[-1]
    print(value)
    print(value.text)
Result:
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
TRYING TO GET THIS
Is there anything cleaner, something like doc.find_last('span') or span = div.find_all('span')?
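On the question of a cleaner way to get the last span: I am not aware of a `find_last` method in BeautifulSoup, but indexing the list returned by `find_all` with `[-1]` has the same effect. This is a minimal sketch under that assumption; the helper name `last_span_text` is hypothetical.

```python
from bs4 import BeautifulSoup

HTML = """
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
<span>VALUE 1</span>
<i aria-hidden="true" class="Mx(4px)">•</i>
<span>TRYING TO GET THIS</span>
</div>
"""


def last_span_text(div):
    """Text of the last <span> inside div, or None if there is none."""
    spans = div.find_all("span")
    # find_all returns a list-like result, so negative indexing works.
    return spans[-1].get_text(strip=True) if spans else None


soup = BeautifulSoup(HTML, "html.parser")
div = soup.find("div")
print(last_span_text(div))
```

In this markup the last span is also the second one, so the result matches the other approaches; the two differ only when a div holds more than two spans.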