Web scraping: Scraping data from an interactive chart

web-scraping scrapy

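I am trying to get the data from the interactive chart at the bottom of the website below:

I have used the developer tools in Chrome, but I cannot find the data points in the Elements tab.

I would appreciate it if someone could take a look and tell me whether the data points are stored somewhere on the page.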

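The website renders the chart from the data in an Excel file, so you don't have to search the page for the chart's data points. I have written a scraper for you: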

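import os

import scrapy
import wget
from xlrd import open_workbook


class MarshSpider(scrapy.Spider):
    name = 'marsh'
    allowed_domains = ['www.marsh.com']
    start_urls = ['https://www.marsh.com/us/insights/research/global-insurance-market-index-q4-2020.html']

    def parse(self, response):
        # The chart container exposes the spreadsheet path in its data-csv-url attribute
        xlsx_url = response.xpath('//div[contains(@class,"htmleditor")]//@data-csv-url').get()
        if xlsx_url is None:
            self.logger.warning("No data-csv-url attribute found on %s", response.url)
            return
        # Resolve the relative path against the page URL and download the file to disk
        file = wget.download(response.urljoin(xlsx_url))
        try:
            # Note: xlrd 2.0+ reads only .xls files; .xlsx needs xlrd<2.0 or another reader
            data = open_workbook(file)
            worksheet = data.sheet_by_index(0)

            # Skip row 0 (presumably the header row) and emit one item per data row
            for row in range(1, worksheet.nrows):
                yield {
                    "Global Insurance Composite Renewal Rate": worksheet.cell(row, 1).value,
                    "Global Casualty Insurance Renewal Rate": worksheet.cell(row, 2).value,
                }
        finally:
            os.remove(file)  # delete the downloaded file once all rows have been yielded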
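
In case it helps with similar pages: the reason the data points do not show up in the Elements tab is that the page only carries a link to the data file, stored in a `data-*` attribute. Below is a minimal standard-library sketch of how one might list such attributes in a page's HTML and resolve them to absolute URLs. The names `DataAttrFinder` and `find_data_urls`, the sample markup, and the file path in it are all hypothetical and only illustrate the idea; the real page's markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DataAttrFinder(HTMLParser):
    """Collect data-* attributes whose value looks like a file path or URL."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, attribute name, attribute value)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Rough heuristic: a path or file name contains "/" or "."
            if name.startswith("data-") and value and ("/" in value or "." in value):
                self.found.append((tag, name, value))


def find_data_urls(html, base_url):
    """Return (attribute name, absolute URL) pairs for matching data-* attributes."""
    finder = DataAttrFinder()
    finder.feed(html)
    return [(name, urljoin(base_url, value)) for _tag, name, value in finder.found]


# Hypothetical markup, loosely modelled on what the spider's XPath expects
SAMPLE_HTML = """
<div class="htmleditor section">
  <div class="chart" data-csv-url="/content/dam/example/chart-data.xlsx" data-height="400"></div>
</div>
"""
```

Calling `find_data_urls(SAMPLE_HTML, "https://www.marsh.com/us/insights/")` on the sample returns only the `data-csv-url` entry, resolved against the site root; `data-height` is skipped because its value does not look like a path.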

Great, thank you. I completely missed that when I was looking through the developer tools.

Glad it works :) Could you accept the answer?