How do I get the second span with BeautifulSoup in Python?

python web-scraping

I am trying to get the second span value in this div, and in other divs like the one shown below:

<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>

I have tried things like i.span, i.contents, i.children, and so on. I would really appreciate any help, thanks.

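Try this:

from io import StringIO
from bs4 import BeautifulSoup as bs

data = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>
'''

# BeautifulSoup accepts a file-like object as well as a string
soup = bs(StringIO(data), 'html.parser')
# select the <span> elements that are direct children of the div
spans = soup.select('div[class="C(#959595) Fz(11px) D(ib) Mb(6px)"] > span')
print(spans[1].text)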
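The selector approach above can also pick the second span directly instead of indexing into a list. The following is a minimal sketch, assuming the markup from the question and a BeautifulSoup version whose `select_one` supports the `:nth-of-type()` pseudo-class; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question
html = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# Match the div on its exact class attribute, then take the second
# <span> among its direct children (the <i> is not counted)
second = soup.select_one(
    'div[class="C(#959595) Fz(11px) D(ib) Mb(6px)"] > span:nth-of-type(2)'
)
second_text = second.get_text(strip=True)
print(second_text)
```

Note that `select_one` returns `None` when nothing matches, so real pages with missing spans would need a check before calling `get_text`.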

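You basically have it already; for each div, you just need to call find_next twice to reach the second span:

from bs4 import BeautifulSoup

# HTML is assumed to hold the page source as a string
soup = BeautifulSoup(HTML, 'html.parser')
divs = soup.find_all('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'})
for div in divs:
    # want the second span in the div
    span = div.find_next('span').find_next('span')
    print(span.string)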
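One caveat with chaining `find_next`: it keeps searching through the rest of the document, so a div with only one span would yield a span from outside that div. A possible alternative, sketched here under the assumption that the spans are direct children of the div, is to collect the spans inside each div and index into the list. The second div and the trailing span in the sample data are made up to illustrate the edge case.

```python
from bs4 import BeautifulSoup

# First div is from the question; the second div and the trailing
# span are hypothetical additions to show the edge case
html = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>ONLY ONE SPAN</span>
</div>
<span>OUTSIDE</span>
'''

soup = BeautifulSoup(html, 'html.parser')

results = []
for div in soup.find_all('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'}):
    # recursive=False restricts the search to direct children of this div
    spans = div.find_all('span', recursive=False)
    # Record None when the div has fewer than two spans
    results.append(spans[1].get_text(strip=True) if len(spans) > 1 else None)

print(results)
```

Here the search never leaves the div, so the span after the second div is ignored.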

There are several ways to get the value you want:

from simplified_scrapy.simplified_doc import SimplifiedDoc
html='''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>
'''
doc = SimplifiedDoc(html)
divs = doc.getElementsByClass('C(#959595) Fz(11px) D(ib) Mb(6px)')
for div in divs:
  value = div.getElementByTag('span',start='</span>') # Use start to skip the first
  print (value)
  value = div.getElementByTag('span',before='<span>',end=len(div.html)) # Locate the last
  print (value)
  value = div.i.next # Use <i> to locate
  print (value)
  value = div.spans[-1]
  print (value)
  print (value.text)

Is there anything cleaner, like "doc.find_last('span')" or span = div.find_all('span')?

Output:

{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
{'tag': 'span', 'html': 'TRYING TO GET THIS'}
TRYING TO GET THIS
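On the question of something cleaner for "the last span": as far as I know BeautifulSoup has no built-in `find_last`, but indexing the result of `find_all` with `[-1]` gives the same effect. The sketch below wraps that in a small helper; `last_span_text` is a hypothetical name, not part of any library.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question
html = '''
<div class="C(#959595) Fz(11px) D(ib) Mb(6px)">
    <span>VALUE 1</span>
    <i aria-hidden="true" class="Mx(4px)">•</i>
    <span>TRYING TO GET THIS</span>
</div>
'''


def last_span_text(div):
    """Hypothetical helper: text of the last <span> inside div, or None."""
    spans = div.find_all('span')
    return spans[-1].get_text(strip=True) if spans else None


soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div', {'class': 'C(#959595) Fz(11px) D(ib) Mb(6px)'})
last_text = last_span_text(div)
print(last_text)
```

This only equals "the second span" when the div holds exactly two spans, as in the question's markup.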