Python: tag search with BeautifulSoup fails on JavaScript-generated HTML


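I am trying to capture the URLs in an HTML page. The page needs an asynchronous call to fully generate the embedded URLs. For some reason, BeautifulSoup is not picking up the tags.

My code: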

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()
link = "https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019320000008/a8-kq1202012282019.htm"
driver.get(link)
html = driver.execute_script("return document.documentElement.outerHTML")

soup = BeautifulSoup(html,'lxml')
print(soup.findall('a'))

When I inspect it, the relevant part of the soup looks like this:

         <td style="vertical-align:top;padding-left:2px;padding-top:2px;padding-bottom:2px;padding-right:2px;">
           <div style="font-size:9pt;">
            <a href="https://www.sec.gov/Archives/edgar/data/320193/000032019320000008/a8-kexhibit991q1202012.htm" style="text-decoration:underline;color:#0000FF;-sec-extract:exhibit;" tabindex="18">
             <span style="font-family:Helvetica,sans-serif;font-size:9pt;">
              Press release issued by Apple Inc. on January 28, 2020.
             </span>

I am trying to get the href of the a tag.

What am I doing wrong?

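from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup
import re
from time import sleep

# Run Firefox without a visible window
options = Options()
options.add_argument('-headless')
driver = webdriver.Firefox(options=options)

driver.get("https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019320000008/a8-kq1202012282019.htm")
# Give the page's JavaScript time to render the document
sleep(2)
soup = BeautifulSoup(driver.page_source, 'html.parser')

# Keep only anchors whose style attribute starts with "text"
for item in soup.findAll("a", style=re.compile("^text")):
    print(item.get("href"))

driver.quit()

Output: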

https://www.sec.gov/Archives/edgar/data/320193/000032019320000008/a8-kexhibit991q1202012.htm
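The filtering step can be tried without a browser. Below is a minimal sketch that runs the same style-based search against the snippet from the question; the closing tags were added by me so the fragment parses cleanly, and the variable names are only illustrative. It also shows why the spelling of the method matters: BeautifulSoup's search method is find_all (findAll is the older alias), while soup.findall is treated as a lookup for a tag literally named "findall" and, as far as I know, returns None.

```python
import re
from bs4 import BeautifulSoup

# Fragment taken from the question; closing tags added (assumption) so it is well-formed.
html = """
<td style="vertical-align:top;padding-left:2px;padding-top:2px;padding-bottom:2px;padding-right:2px;">
  <div style="font-size:9pt;">
    <a href="https://www.sec.gov/Archives/edgar/data/320193/000032019320000008/a8-kexhibit991q1202012.htm" style="text-decoration:underline;color:#0000FF;-sec-extract:exhibit;" tabindex="18">
      <span style="font-family:Helvetica,sans-serif;font-size:9pt;">
        Press release issued by Apple Inc. on January 28, 2020.
      </span>
    </a>
  </div>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all accepts a compiled regex as an attribute filter; an <a> matches
# when the pattern is found in its style attribute.
links = [a.get("href") for a in soup.find_all("a", style=re.compile(r"^text"))]
print(links)

# soup.findall is attribute-style tag lookup, i.e. soup.find("findall").
# There is no <findall> tag, so this is None and calling it raises TypeError.
print(soup.findall)
```

If the href list comes back empty on the real page, the more likely cause is that the HTML was captured before the JavaScript finished rendering, not the filter itself.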

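html=self.driver.page_...

Does this answer your question?

How do you want to pick out one particular link from all the other links on the page?

Thanks. What does "options" do? Also, I don't see how Selenium runs the JavaScript. What happens when "driver.get" is called?

The headless option runs the browser without a graphical user interface. The program behaves like a browser but does not display any GUI. If the answer meets your goal, please accept my answer by ticking the check mark next to it.

@Windstorm1981 Yes, it behaves the same as when you browse manually, which means the URL has been opened and the JavaScript has been rendered. re.compile is a regular expression operation; here it matches any anchor tag whose style starts with the word "text".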