Parsing JavaScript with BeautifulSoup in Python

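I am trying to parse content that sits inside JavaScript. I have a rough idea of how to do it, but I am not entirely sure. I have read a few examples, and I think the re library might be one way to go.

Here is my code so far: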

import requests
import json
import re
from bs4 import BeautifulSoup

url = r'https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=13&rver=6.7.6643.0&wp=MBI_SSL&wreply=https:%2f%2faccount.xbox.com%2fen-us%2faccountcreation%3freturnUrl%3dhttps:%252f%252fwww.xbox.com:443%252fen-US%252f%26pcexp%3dtrue%26uictx%3dme%26rtc%3d1&lc=1033&id=292543&aadredir=1'


s = requests.Session()
soup = BeautifulSoup(s.get(url).content, 'html.parser')

# Print the sixth <script type="text/javascript"> element on the page
print(soup.find_all("script", type="text/javascript")[5].prettify())
Here is just a snippet of the parsed content. I am trying to access this data, in particular the "value".

Thanks in advance for any replies.

When I run this, the output is ['DVSXQ...']. How do I remove the [' and the ']? It is dynamic content, so it keeps changing.
Also, just replace value with the new line in the code I just edited.
I understand that, but I only need the string, without the "['" at the start and the "']" at the end. Thanks!
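The stray [' and '] in that output come from printing a list: re.findall always returns a list of matches, even when there is only one. Below is a minimal sketch of that behaviour. The sample string and the field name "token" are made up for illustration and stand in for the real script text, whose token changes on every request.

```python
import re

# Hypothetical stand-in for the script text returned by the page.
sample = '<input type="hidden" name="token" value="abc!123*xyz$$"/>'

# With one capture group, findall returns only the captured text of each match.
matches = re.findall(r'value="([^"]+)"/', sample)
print(matches)  # a list, so it prints with brackets and quotes

# Index the list to get the bare string instead of calling str() on the list.
token = matches[0] if matches else None
print(token)
```

Indexing the list avoids the chain of str(...).replace(...) calls, which would also strip those character sequences if they happened to occur inside the token itself.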
from bs4 import BeautifulSoup as bs
import requests
import re

url = 'https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=13&rver=6.7.6643.0&wp=MBI_SSL&wreply=https:%2f%2faccount.xbox.com%2fen-us%2faccountcreation%3freturnUrl%3dhttps:%252f%252fwww.xbox.com:443%252fen-US%252f%26pcexp%3dtrue%26uictx%3dme%26rtc%3d1&lc=1033&id=292543&aadredir=1'
page = requests.get(url)
# The 'lxml' parser requires the lxml package to be installed
html = bs(page.text, 'lxml')
# Take the sixth text/javascript script; this index may change if the page layout changes
script = html.find_all('script', type="text/javascript")[5].prettify()
# Capture only the text between value=" and "/ so no brackets or quotes need stripping afterwards
match = re.search(r'value="([^"]+)"/', script)
value = match.group(1) if match else None
print(value)
Output:
DVSXQahhtomXS2Y4k2itS5MPP52mJgUkC7LH!W*1DmjHiWk*npajBfgXK5yp3*!bu3Wuvvs7xavleUV3nIbjLZHckj73QMe8wipwXhCqpXuUZQ2wnJvNYAVNCg9XxKPuIovp7!sLbumrufuYefyzM6UQLkMb5c7MuImDofVhLlKxpI7Pohe8sO2x8r63TtFCTDphWzqXKJE3B8DRK*AhMbFsmdP0sj2CXMZ7dyTfLJSr1zWBlaHTqJPLvhgzLSiaEg$$
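
Relying on the fixed index [5] breaks as soon as the page adds or removes a script. One possible alternative, sketched here under assumptions, is to scan every script and return the first value that matches. SAMPLE_HTML, the field name "token", and the helper extract_value are hypothetical and exist only so the example runs offline; the real login page's markup may differ.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical page used for illustration; not the real login page markup.
SAMPLE_HTML = """
<html><head>
<script type="text/javascript">var a = 1;</script>
<script type="text/javascript">
var cfg = {field: '<input type="hidden" name="token" value="abc!123*xyz$$"/>'};
</script>
</head></html>
"""

VALUE_RE = re.compile(r'value="([^"]+)"')


def extract_value(html):
    """Return the first value="..." found in a text/javascript script, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script", type="text/javascript"):
        # .string is None for scripts without inline text (for example src-only scripts)
        match = VALUE_RE.search(script.string or "")
        if match:
            return match.group(1)
    return None


print(extract_value(SAMPLE_HTML))
```

For the live page, the same helper could be called as extract_value(requests.get(url).text). If several scripts contain a value="..." attribute, the pattern would need to be narrowed, for example by also matching the field's name.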