How to access this item using Beautiful Soup

Tags: javascript, python, html, web-scraping, beautifulsoup

I am trying to access the element in

<script type="text/javascript">ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");<script>

The end result I hope to get is:

javascript = "http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls"

Instead, this is my output:

Traceback (most recent call last):
  File "c:\Users\John asd\Documents\GitHub\asd.net\testing.py", line 184, in <module>
    javascript = n['ReportPopper']
  File "C:\Users\John asd\asd\Local\Programs\Python\Python37\lib\site-packages\bs4\element.py", line 1016, in __getitem__
    return self.attrs[key]
KeyError: 'ReportPopper'
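The call returns a regular expression object, which means h is a regular expression object.

A regex object has its own methods, which take optional pos and endpos parameters: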
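The pos and endpos parameters mentioned above can be illustrated with a small sketch. The pattern and the sample text here are hypothetical, chosen only for illustration; they are not taken from the question, and the variable h referred to above is not shown in the thread.

```python
import re

# Hypothetical pattern and text, used only to illustrate pos and endpos
pattern = re.compile(r"\d+")
text = "abc 123 def 456"

# With no pos/endpos, the whole string is searched
first = pattern.search(text)
print(first.group())

# pos=7 starts the search at index 7, so the first number is skipped
later = pattern.search(text, pos=7)
print(later.group())

# endpos=6 restricts the search to text[:6], which is "abc 12"
clipped = pattern.search(text, pos=0, endpos=6)
print(clipped.group())
```

These arguments exist on the compiled pattern's search, match, and fullmatch methods; the module-level functions such as re.search do not accept them.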


With bs4 4.7.1+, if that string is present in the response, you can use :contains

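from bs4 import BeautifulSoup as bs
# import requests
# r = requests.get(url)
# html = r.content
html = '<script type="text/javascript">ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");<script>'
soup = bs(html, 'lxml')
# Select the first <script> whose text contains "ReportPopper"
# (newer Soup Sieve releases rename :contains to :-soup-contains)
s = soup.select_one('script:contains(ReportPopper)').text
# The URL is the text between the first pair of double quotes
url = s.split('"')[1]
print(url)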
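As background to the KeyError in the question: subscripting a Tag, as in n['ReportPopper'], looks the key up in the tag's HTML attributes, while the ReportPopper(...) call is part of the script's text. The following is a minimal sketch of that difference. It assumes a correctly closed </script> tag and the built-in html.parser, neither of which comes from the original post.

```python
from bs4 import BeautifulSoup

# Assumption: a well-formed closing </script> tag, parsed with the stdlib-backed parser
html = (
    '<script type="text/javascript">'
    'ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");'
    '</script>'
)
soup = BeautifulSoup(html, "html.parser")
tag = soup.find("script")

# tag[...] reads from tag.attrs, which only holds HTML attributes such as "type"
print(tag.attrs)
print("ReportPopper" in tag.attrs)

# The JavaScript call is the tag's text content, so read it from there
script_text = tag.string
url = script_text.split('"')[1]
print(url)
```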

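What is the value of report_post_url? Please check that the element you posted is correct. I mean:
ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");
You're welcome. It should be a bit more efficient than a regular expression.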
from bs4 import BeautifulSoup
import re

html = """<script>ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");</script>"""

soup = BeautifulSoup(html, 'lxml')
scripts = soup.find_all("script")

# Escape the parentheses so they match the literal call rather than opening a group
pattern = re.compile(r'ReportPopper\(.*\);')

for script in scripts:
    text = script.text
    if pattern.search(text):
        # Keep what follows "ReportPopper(" and drop the trailing ");"
        print(text.split("ReportPopper(")[1][:-2])

Output:

"http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls"
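The answer above uses the regex only to test for a match and then slices the string, which leaves the surrounding double quotes in the output. As a possible variation, a capture group can return the bare URL directly. This is a sketch, not the answerer's method; script_text is a hypothetical variable standing in for the text taken from the script tag.

```python
import re

# Hypothetical stand-in for the text extracted from the <script> tag
script_text = 'ReportPopper("http://asd.asd.asd/ReportOutput/asd-asd-41cc-asd-asd.xls");'

# Escaped parentheses match the literal call; the group captures what sits between the quotes
pattern = re.compile(r'ReportPopper\("([^"]*)"\)')

match = pattern.search(script_text)
url = match.group(1) if match else None
print(url)
```

Guarding on match avoids an AttributeError when a script tag does not contain the call.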