Python: selecting a div with Beautiful Soup

Tags: python, html, web-scraping, beautifulsoup

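Hello, I have HTML like the following, and when I parse it with BeautifulSoup I cannot select the text by class. I think the problem is that the nested tags are not being recognized as children. How can I select the text of the span tag?

Thanks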

<div data-component="new_enquiry_form_app" data-props="{"isTelRequired":false,"placement":"top",}">
  <section class="enquiry-form-box__wrapper">
    <div class="enquiry-form-box enquiry-form-box--inverted"> 
      <form class="enquiry-form-box__form" tabindex="-1">
        <fieldset class="enquiry-form-box__wrapper">
          <div class="enquiry-form-box__fields">
            <div class="k-ns">
              <span class="text-gray block mt-3 font-bold text-sm">Property reference: 412</span>
            </div>
          </div>
        </fieldset>
      </form>
    </div>
  </section>

Try this:

from bs4 import BeautifulSoup

html = '''<div data-component="new_enquiry_form_app" data-props="{"isTelRequired":false,"placement":"top",}">
  <section class="enquiry-form-box__wrapper">
    <div class="enquiry-form-box enquiry-form-box--inverted"> 
      <form class="enquiry-form-box__form" tabindex="-1">
        <fieldset class="enquiry-form-box__wrapper">
          <div class="enquiry-form-box__fields">
            <div class="k-ns">
              <span class="text-gray block mt-3 font-bold text-sm">Property reference: 412</span>
            </div>
          </div>
        </fieldset>
      </form>
    </div>
  </section>'''
soup = BeautifulSoup(html, 'html.parser')
span = soup.select_one('span.text-gray.block.mt-3.font-bold.text-sm')
print(span.get_text())
For the live page, here is one way, using Selenium:

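from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service

# Selenium 4 style: the geckodriver path is passed through a Service object.
driver = webdriver.Firefox(service=Service(executable_path='c:/program/geckodriver.exe'))
driver.get('https://www.kyero.com/en/property/7689206-villa-for-sale-sant-joan-de-labritja')

# The browser runs the page's JavaScript, so the span exists in the live DOM.
span = driver.find_element(By.CSS_SELECTOR, 'span.text-gray.block.mt-3.font-bold.text-sm')
print(span.text)
driver.quit()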
Prints:

Property reference: 412

Note that in this code geckodriver is loaded from c:/program/geckodriver.exe.

@Andrej Kesely was quicker with the other answer, so I am giving a Selenium answer.

To print the reference label, you can use this script (the data is stored in a JavaScript variable in the HTML document):

Prints:

Property reference: 412
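
A minimal sketch of that idea, not the answerer's script: it assumes the page embeds its data as a JSON object assigned to a JavaScript variable. The variable name `window.__DATA__`, the JSON keys, and the sample markup are all hypothetical, so the real page would need to be inspected to find the actual names.

```python
import json
import re

# Hypothetical page source: the variable name `window.__DATA__` and the
# JSON keys are assumptions for illustration, not the real page's markup.
html = '''
<div data-component="new_enquiry_form_app"></div>
<script>window.__DATA__ = {"property": {"reference": "412"}};</script>
'''


def extract_reference(page_html):
    """Return the reference stored in the embedded JSON, or None if absent."""
    # Capture the object literal assigned to the (assumed) variable,
    # up to the closing brace that is followed by a semicolon.
    match = re.search(r'window\.__DATA__\s*=\s*(\{.*?\})\s*;', page_html, re.DOTALL)
    if match is None:
        return None
    data = json.loads(match.group(1))
    return data['property']['reference']


print('Property reference:', extract_reference(html))
```

Because the value is read from the raw page source, this approach needs no browser, but it breaks as soon as the site renames the variable or changes the JSON layout.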

Comments:

What text do you want? `Property reference: 412`?

@MendelG yes, Property reference: 412

Thanks for the answer. On its own this code prints "Property reference: 412", but unfortunately when I try it on the whole web page it prints nothing.

Then give me the URL @Arthur

Updated my answer @Arthur

Thanks a lot, but I am still curious: why is there no way to select just the span, or the div in this case, as in other cases?

The data is loaded with JavaScript, and requests does not run JavaScript.

Thanks for the answer!
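
The last point can be checked directly: if the text is missing from the HTML the server returns, no selector will find it. Below is a rough sketch using two hypothetical strings in place of a real download. `static_html` stands in for what `requests.get(url).text` might return, and `rendered_html` for what a browser holds after its scripts have run; neither is copied from the actual page.

```python
import re

# Hypothetical markup: what the server sends before any JavaScript runs.
static_html = '<div data-component="new_enquiry_form_app"></div>'

# Hypothetical markup: the same element after the page's scripts filled it in.
rendered_html = (
    '<div data-component="new_enquiry_form_app">'
    '<span class="text-gray block mt-3 font-bold text-sm">'
    'Property reference: 412</span></div>'
)


def find_reference(page_html):
    """Return the 'Property reference: N' text if it is in the markup, else None."""
    match = re.search(r'Property reference:\s*\d+', page_html)
    return match.group(0) if match else None


print(find_reference(static_html))    # None
print(find_reference(rendered_html))  # Property reference: 412
```

If the same check on the downloaded source comes back empty, the options are the ones given in the answers: drive a real browser with Selenium, or read the data from wherever the page's scripts take it.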