Web scraping: using requests.get() on a page that uses a servlet


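I am trying to use the requests library and BeautifulSoup in Python to get data from the web page below. Unfortunately, the site appears to use a servlet to retrieve its data, and I am not sure how to handle that.

I have tried two kinds of queries. The first goes directly to the results page: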

import requests
from bs4 import BeautifulSoup

url = 'http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?bin=1014398&go4=+GO+&requestid=0'
html = requests.get(url)
soup = BeautifulSoup(html.text, 'html.parser')

The second goes through the search page:

url = 'http://a810-bisweb.nyc.gov/bisweb/bispi00.jsp'
html = requests.get(url, params={'bin':1014398})
soup = BeautifulSoup(html.text, 'html.parser')

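Both attempts end in a request timeout, possibly because I am not formatting the request correctly. Is there a way to successfully capture the HTML of the results page?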

Try using selenium:

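from bs4 import BeautifulSoup
from selenium import webdriver
import time

url = 'http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet?bin=1014398&go4=+GO+&requestid=0'
driver = webdriver.Chrome()  # requires Chrome and a matching driver
driver.get(url)

# Give the page a few seconds to finish loading before reading its source.
time.sleep(3)

# The 'html5lib' parser must be installed separately (pip install html5lib).
soup = BeautifulSoup(driver.page_source, 'html5lib')

# quit() ends the whole browser session, not just the current window.
driver.quit()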
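
If you would rather retry the plain requests approach from the question, it may help to rule out a formatting problem first. The sketch below rebuilds the results-page URL from its individual parameters and shows that the `+GO+` in the original URL is just the encoded form of the value " GO ". The helper names `build_profile_url` and `fetch_profile_html` are hypothetical, and the browser-like User-Agent header and the 30-second timeout are assumptions on my part; I have not confirmed that either one resolves the timeout on this site.

```python
from urllib.parse import urlencode

BASE_URL = "http://a810-bisweb.nyc.gov/bisweb/PropertyProfileOverviewServlet"


def build_profile_url(bin_number, request_id=0):
    """Build the servlet URL with the same query parameters as the question's URL."""
    params = {"bin": bin_number, "go4": " GO ", "requestid": request_id}
    # urlencode turns the spaces in " GO " into "+", which yields "go4=+GO+".
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_profile_html(bin_number, timeout=30):
    """Fetch the results page with requests (hypothetical helper).

    The User-Agent header and the timeout value are assumptions, not
    settings known to be required by this site.
    """
    # Third-party import kept local so the URL helper has no dependencies.
    import requests

    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(
        build_profile_url(bin_number), headers=headers, timeout=timeout
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


if __name__ == "__main__":
    print(build_profile_url(1014398))
```

Since the rebuilt URL is identical to the one in the question, the query string itself looks fine, which suggests the timeout comes from somewhere else (for example, how the server treats non-browser clients). If `fetch_profile_html` still times out, the selenium approach above is the fallback.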