Python BeautifulSoup: how to get data from window.__INITIAL_STATE__


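How do I strip "window.__INITIAL_STATE__=" and get the data out of it?

I also don't need the data from "window.__CONFIG__=" or "window.__USER_ID__=".

Ideally the result would be JSON.

I have the following code: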

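import urllib.request
from bs4 import BeautifulSoup

def html():
    # url is assumed to be defined elsewhere in the script
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    # Select the <script> tag whose text contains 'user_id'
    results = soup.select_one("script:-soup-contains('user_id')").string
    print(results)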
The output I get is:

window.__INITIAL_STATE__={"room_info":{"my_id":45316761,"user_id":45316761,"loginname":"ahmad  Talent

You can get the data using the re/json modules:

import re
import json
import requests

url = "https://nonolive.com/45316761"

html_doc = requests.get(url).text
data = re.search(r"window\.__INITIAL_STATE__=(.*?);", html_doc).group(1)
data = json.loads(data)

# pretty print the data:
print(json.dumps(data, indent=4))
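
The pattern above stops at the first ";" it meets, so it would cut the JSON short if any string value happened to contain a semicolon. As a hedged alternative, here is a minimal sketch that lets the json module find the end of the object itself. The helper name extract_initial_state and the sample page fragment are made up for illustration; they are not taken from the real page.

```python
import json
import re


def extract_initial_state(html_doc):
    """Return the object assigned to window.__INITIAL_STATE__, or None."""
    match = re.search(r"window\.__INITIAL_STATE__\s*=\s*", html_doc)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (e.g. ";window.__CONFIG__=...").
    data, _end = json.JSONDecoder().raw_decode(html_doc, match.end())
    return data


# Hypothetical page fragment; note the semicolon inside a string value.
sample = (
    '<script>window.__INITIAL_STATE__={"room_info":{"user_id":45316761,'
    '"intro":"hi; there"}};window.__CONFIG__={"a":1};</script>'
)
state = extract_initial_state(sample)
print(state["room_info"]["user_id"])
```

With a real page you would pass requests.get(url).text (or the .string of the script tag selected with BeautifulSoup) instead of the sample string.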


Prints:

{
"room_info": {
"my_id": 45316761,
"user_id": 45316761,
"loginname": "ahmad Talent\ud83c\udfac",
"status": 10,
"avatar": "https://nono-vpic-dl.akamaized.net/download/file/fra/nonolive-fra/nnphotos/45316761/7acc36198d61d535b556b5c9da4722a8.jpg",
"intro": "",
"anchor_group": [
"official idol"
],
"anchor_intro": "...",
"anchor_live": 13,
"pic": "https://nono-vpic-dl.akamaized.net/download/file/fra/nonolive-fra/imgs/3f95b611-bce2-4090-a354-74884b986105.jpg",
"fans": 204,
"exp": 20683.416664,
"level": 49,
"location": "Iran",
"country": "Iran",
"finance_country": "Syria",
"user_cluster": "aws_singapore"
},
"match_live_room_list": [
18433562,
23281295,
34344391,
47277372,
47256825,
54286750,
47255406,
47353920,
47256699,
47795451,
47262679,
8646078,
23419136,
21453881,
29813714,
29710653,
47262066,
29368045,
13938893,
48391752,
14673269,
29333298,
18485050,
20545338,
14485392,
19220336,
14597081,
32203926,
32284062,
15130785,
47543990,
8623919,
34033944,
34030962,
34099216,
34403020,
19973173,
12376400,
35225245,
35303307,
35277251,
35357151,
35397160,
35157680,
35486592,
35517567,
35517948,
35530929,
26480251,
35541332,
19267293,
35791502,
17367640,
35003909,
35857349,
35684312,
36294570,
35858792,
8181931,
8181894,
8181904,
20519246,
100008,
36155173,
36346677,
36641437,
36641273,
36639480,
36641374,
36639155,
36643700,
36671965,
37993657,
37209374,
31627285,
37397273,
38191007,
34969242,
8021848,
37256644,
38890560,
35679023,
35963867,
35678785,
35664149,
35678453,
36146858,
38566654,
47623047,
38565866,
33489767,
38566762,
40605811,
37683851,
36172817,
36114494,
37669650,
40589540,
36277491,
41085963,
38965463,
38575592,
39590981,
36771882,
33514817,
37409947,
37557443,
38814672,
36878613,
39786744,
38985315,
40227952,
39768448,
39597105,
2880999,
745773,
43248166,
40693308,
38018122,
36730051,
37930534,
42377740,
36912971,
38283433,
47397760,
48544218,
47928342,
47288183,
34803161,
47353280,
47660138,
47851530,
36240127,
41677978,
31433574,
34134849,
48223842,
44517516,
41686787,
44084034,
32136191,
30911886,
32764558
],
"default_title": "Nonolive - Game and Video Live Streaming"
}
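
Once json.loads has run, the result is an ordinary Python dict, so individual fields can be read by key. The sketch below uses only keys that appear in the raw output quoted in the question (room_info, user_id, loginname); the raw string is a shortened stand-in written for illustration, not the full page data.

```python
import json

# Shortened, hypothetical stand-in for the text captured after
# "window.__INITIAL_STATE__=".
raw = '{"room_info": {"my_id": 45316761, "user_id": 45316761, "loginname": "ahmad Talent"}}'
data = json.loads(raw)

room = data["room_info"]
user_id = room["user_id"]
# .get() avoids a KeyError if a field is missing on some pages
login = room.get("loginname", "")
print(user_id, login)
```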

Can you share the URL?
You need to execute JavaScript. Use Selenium WebDriver instead of BS4.
@AndrejKesely