Python: Getting JSON data from a website using BeautifulSoup


python python-3.x


Sorry, I'm a bit new to this. I want to get one specific piece of JSON data:

"getMe":"IneedThisData"

from bs4 import BeautifulSoup
import json

html_doc = """
Sample
<script>utag_cfg_ovrd = window.utag_cfg_ovrd || {};utag_cfg_ovrd.noview = true;</script>
<script>window.REDUX_STATE = {"appConfig":
{"dataLab":"energy","min":"max":"getMe":"IneedThisData"}
</script>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
data = json.load(soup.find('script', 'window.REDUX_STATE').text)

I get an error that says

AttributeError: 'NoneType' object has no attribute 'text'

and I'm still stuck on loading the data into a variable.

Assuming that

"min":"max":"getMe"

is a typo and should actually be

"min":"max","getMe"

(without the typo it is valid JSON), you can use the following code:

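import re

soup = BeautifulSoup(html_doc, 'html.parser')
# Find the <script> tag whose text mentions window.REDUX_STATE
# (string= is the current name of the older text= argument)
tag = soup.find("script", string=re.compile(r"window\.REDUX_STATE"))
text = str(tag.contents[0])
# Split on the first "=" only, so any "=" inside the JSON is left alone
splits = text.split("=", 1)
data = json.loads(splits[1])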
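
Splitting the script text on "=" is sensitive to what surrounds the JSON, for example a trailing semicolon or further statements after the assignment. A minimal sketch of a more tolerant variant follows, using only the standard library. The helper name extract_assigned_json and the sample script text are made up for illustration, and the sketch assumes the value assigned to the variable is strict JSON rather than a general JavaScript literal.

```python
import json
import re


def extract_assigned_json(script_text, variable_name):
    """Return the JSON value assigned to variable_name, or None if absent."""
    # Locate "<variable_name> =" and note where the assigned value starts.
    match = re.search(re.escape(variable_name) + r"\s*=\s*", script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (a trailing ";" or more JavaScript).
    value, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return value


# Made-up script text for illustration; with BeautifulSoup this would
# typically be the text content of the matching <script> tag.
sample_script = (
    'window.REDUX_STATE = {"appConfig": '
    '{"url": "a=b", "getMe": "IneedThisData"}}; var x = 1;'
)
state = extract_assigned_json(sample_script, "window.REDUX_STATE")
print(state["appConfig"]["getMe"])  # prints: IneedThisData
```

If the assigned value is not valid JSON, raw_decode raises json.JSONDecodeError, so real pages may need a try/except around the call.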

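In your code,

soup.find('script', 'window.REDUX_STATE')

does not match any tag, so find() returns None. That is why you get the AttributeError.

The second positional argument of find() is attrs, which filters tags by their attributes (a bare string there is matched against the CSS class). "window.REDUX_STATE" is not an attribute of any tag.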
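
To make the failure mode concrete, here is a small sketch of how a string in the second position of find() behaves. The sample page and its class name "state" are invented for this illustration and are not taken from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, made up for this illustration.
sample_html = """
<html><head><title>Sample</title>
<script class="state">window.REDUX_STATE = {"getMe": "IneedThisData"}</script>
</head></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# A string in the second position is matched against the class attribute,
# so this looks for <script class="window.REDUX_STATE"> and finds nothing.
missing = soup.find("script", "window.REDUX_STATE")

# The same positional form does match when such a class really exists.
by_class = soup.find("script", "state")

# Checking for None first avoids the AttributeError from the question.
script_text = missing.text if missing is not None else None
```

Here missing is None while by_class is the script tag, which is why calling .text directly on the result of the original find() fails.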

I really like this approach. Thanks!
