Python BeautifulSoup returns empty results - Fatal编程技术网

I am very new to BeautifulSoup, and I am trying to scrape text from a website with the code below. However, find_all returns nothing.

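import bs4 as bs
import urllib.request

# Download the raw HTML exactly as the server sends it (no JavaScript is run)
source = urllib.request.urlopen('https://beta.regulations.gov/document/USCIS-2019-0010-9175').read()
soup = bs.BeautifulSoup(source, 'html.parser')
# Look for every element carrying the CSS class "px-2"
text = soup.find_all(class_="px-2")
print(text)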


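As mentioned in the comments, the data is loaded dynamically via JavaScript. But if you open the Network tab in Firefox/Chrome, you can see where the data comes from: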

import requests

url = 'https://beta.regulations.gov/document/USCIS-2019-0010-9175'
ajax_url = 'https://beta.regulations.gov/api/documentdetails/{}'

document_id = url.split('/')[-1]
data = requests.get(ajax_url.format(document_id)).json()

# from pprint import pprint # <-- uncomment to see all data
# pprint(data)

print(data['data']['attributes']['content'])
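
The last line above indexes three levels deep into the JSON and raises a KeyError if the API ever returns a different shape (for example an error payload). A minimal sketch of a more defensive lookup, assuming only the data -> attributes -> content nesting shown in the answer; the helper name extract_content is hypothetical and not part of requests or the site's API:

```python
from typing import Any, Optional


def extract_content(payload: Any) -> Optional[str]:
    """Return payload['data']['attributes']['content'], or None if any level is missing.

    Hypothetical helper: the key path is taken from the answer's print() line,
    not from any documented schema.
    """
    node = payload
    for key in ("data", "attributes", "content"):
        # Stop as soon as the expected nesting is not there
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    # Only hand back text; anything else is treated as "no content"
    return node if isinstance(node, str) else None


# Small in-memory payloads standing in for the decoded JSON response
print(extract_content({"data": {"attributes": {"content": "hello"}}}))  # hello
print(extract_content({"errors": ["not found"]}))  # None
```

With the answer's code, this would be called as extract_content(data) in place of the direct data['data']['attributes']['content'] lookup.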

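Most likely the content is retrieved using JavaScript.

Most likely the content is not in the original source but is generated by JavaScript. You can check the original page source to see whether it is there. BeautifulSoup cannot execute JS; it only looks at the HTML source.

I am voting to close this. The data is generated dynamically on the site, and this exact problem has been discussed many times on Stack Overflow before. I do not believe it adds anything of value. Other questions already cover this problem, and I am sure there are many more.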
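
The check suggested in the comments (is the class actually present in the raw page source?) can be sketched without any third-party packages. This is only an illustration: it uses the standard library's html.parser as a stand-in for BeautifulSoup, the names ClassFinder and class_in_static_html are hypothetical, and STATIC_PAGE is a made-up page that mimics content injected by a script, not the real regulations.gov markup.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag in the static HTML carries a given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # A class attribute may hold several space-separated class names
            if name == "class" and value and self.class_name in value.split():
                self.found = True


def class_in_static_html(html, class_name):
    """Return True if class_name appears on a real tag in the unrendered HTML."""
    finder = ClassFinder(class_name)
    finder.feed(html)
    return finder.found


# Made-up page: the "px-2" element only exists inside a script string,
# so it would appear in the browser but not as a tag in the static source.
STATIC_PAGE = """
<html><body>
<div id="app"></div>
<script>
  document.getElementById("app").innerHTML = '<div class="px-2">comment text</div>';
</script>
</body></html>
"""

print(class_in_static_html(STATIC_PAGE, "px-2"))  # False
print(class_in_static_html('<div class="px-2 mt-1">hi</div>', "px-2"))  # True
```

If the same check comes back False for the HTML you downloaded, an empty find_all result is expected, and the data has to be fetched from wherever the page's JavaScript gets it.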
Rescind the increase in fees. This is draconian. For all intents and purposes, denying access to this information will prevent many Americans from knowing where they came from. This is an outrage. This is not the mark of a democracy. I strongly disagree with this fee increase