Python: Capturing a screen image from an interactive web map


python node.js web-scraping


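I need to extract the map component from the following page into a static image:

The page contains a Leaflet-based interactive web map whose layer data is updated daily through a Web Map Service. The extracted image should include whatever layers are loaded on the map.

This also needs to be automated, so that nobody has to open a web browser and visit the URL. The extracted image will go into a Word document.

I am a Python and Node.js programmer, but I could not do this with BeautifulSoup for Python or Cheerio for Node.js, because the map is not an img element but several dynamic divs. How can I capture the map component as an image?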

You can use:

from time import sleep

from PIL import Image
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.maximize_window()  # maximize window
driver.get("http://www.bom.gov.au/water/landscape/#/sm/Relative/day/-35.30/145.17/5/Point////2018/12/16/")
sleep(5)  # give the map tiles time to load before taking the screenshot
element = driver.find_element(By.XPATH, '//*[@id="mapid"]')  # this is the map xpath
location = element.location
size = element.size
driver.save_screenshot("canvas.png")  # screenshot of the visible page
# crop box: (left, top) is the map's top-left corner, (right, bottom) its bottom-right
left = location['x']
top = location['y']
right = location['x'] + size['width']
bottom = location['y'] + size['height']
im = Image.open('canvas.png')
im = im.crop((int(left), int(top), int(right), int(bottom)))
im.save('canvas_el.png')  # your file
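
One caveat with the crop above: `element.location` and `element.size` are reported in CSS pixels, while the saved screenshot can be larger on a high-DPI display, so the crop may land on the wrong region. The sketch below is not part of the original answer. The helper name `crop_box` and its `ratio` parameter are hypothetical; in a live session the ratio could be read with `driver.execute_script("return window.devicePixelRatio")`.

```python
def crop_box(location, size, ratio=1.0):
    """Return (left, top, right, bottom) in screenshot pixels.

    location and size are the dicts Selenium returns for an element
    (CSS pixels); ratio is screenshot pixels per CSS pixel.
    """
    left = location["x"] * ratio
    top = location["y"] * ratio
    right = (location["x"] + size["width"]) * ratio
    bottom = (location["y"] + size["height"]) * ratio
    # Round to whole pixels before passing the box to Image.crop
    return tuple(int(round(v)) for v in (left, top, right, bottom))


# Example with made-up numbers: a 600x400 map at (10, 50) on a 2x display
print(crop_box({"x": 10, "y": 50}, {"width": 600, "height": 400}, ratio=2.0))
```

The returned tuple can be passed straight to `im.crop(...)`. Selenium's `element.screenshot("canvas_el.png")` is another option that skips the cropping step altogether.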

If you need to loop over each layer, use the following code:

from time import sleep
from selenium.webdriver.common.by import By

# make the layer selector visible
driver.find_elements(By.CLASS_NAME, "leaflet-control-layers-toggle")[0].click()
# select each layer in turn and wait 5 seconds for it to load
layers = driver.find_elements(By.CLASS_NAME, "leaflet-control-layers-selector")
for layer in layers:
    layer.click()
    sleep(5)
    # you can also capture screenshots here
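
The loop above leaves the capture step as a comment. A minimal sketch of filling it in, assuming the map element and the layer controls have already been located as shown: the helper names `layer_filename` and `capture_layers` and the file naming scheme are hypothetical, and the fixed wait is a crude stand-in for checking that the tiles have finished loading.

```python
import re
from time import sleep


def layer_filename(index, label=""):
    """Build a safe file name such as 'layer_00_soil_moisture.png'."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_").lower()
    suffix = f"_{slug}" if slug else ""
    return f"layer_{index:02d}{suffix}.png"


def capture_layers(map_element, layers, labels=None, wait_seconds=5):
    """Click each layer control, wait, and screenshot the map element.

    map_element and layers are Selenium WebElements; labels is an
    optional list of names used only for the output file names.
    """
    saved = []
    for index, layer in enumerate(layers):
        layer.click()
        sleep(wait_seconds)  # crude wait for the layer to render
        label = labels[index] if labels else ""
        name = layer_filename(index, label)
        map_element.screenshot(name)  # WebElement.screenshot writes a PNG
        saved.append(name)
    return saved
```

With the variables from the answer, this would be called as `capture_layers(element, layers)` and returns the list of files it wrote.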

I still don't quite understand your question. If possible, could you explain it in simple terms? Where are you stuck?

OK, put simply: I need a Node.js or Python script that takes the map component at the URL and saves it as an image.

Rather than scraping it, why not use the actual netCDF data source provided on the page with a raster map renderer, so that you can get whatever colours, resolution, etc. you need?

I think it should be possible to take a screenshot using Selenium WebDriver on a headless Firefox instance; you might want to try that.
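
The headless Firefox suggestion fits the requirement that nobody has to open a browser. A rough sketch of how it might look, not taken from the thread: `build_url` and `capture_map` are hypothetical helpers, the URL fragment layout (latitude, longitude, zoom, then year/month/day) is inferred from the single example URL in the answer, and whether the site expects zero-padded month and day values is unknown.

```python
from datetime import date
from time import sleep

BASE = "http://www.bom.gov.au/water/landscape/"


def build_url(day, lat=-35.30, lon=145.17, zoom=5):
    """Rebuild the page URL from the answer for a given date.

    The field meanings are guessed from one example URL, so treat
    them as an assumption. Month and day are left unpadded.
    """
    return (f"{BASE}#/sm/Relative/day/{lat:.2f}/{lon:.2f}/{zoom}"
            f"/Point////{day.year}/{day.month}/{day.day}/")


def capture_map(url, out_path, wait_seconds=10):
    """Open the page in headless Firefox and save the map as a PNG."""
    # Imported here so build_url can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")  # no visible browser window
    driver = webdriver.Firefox(options=options)
    try:
        driver.set_window_size(1600, 1200)  # arbitrary size for the capture
        driver.get(url)
        sleep(wait_seconds)  # crude wait for the map layers to load
        # "mapid" is the element id used in the answer
        driver.find_element(By.ID, "mapid").screenshot(out_path)
    finally:
        driver.quit()  # always close the browser, even on errors
```

A daily job could then call `capture_map(build_url(date.today()), "map.png")`. From there, a library such as python-docx could place the PNG into the Word document.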