Python Scrapy: Writing an XLSX file from pandas in Scrapy


python excel dataframe scrapy


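I'm new to Scrapy and want to write the scraped data to Excel. I know how to write a DataFrame, and I can get the data from a single page. What confuses me is where I should define the DataFrame so that all of the data gets written to Excel.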

import scrapy 
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule
import pandas as pd


class MonarkSpider(CrawlSpider):
    name = 'monark'    
    allowed_domains = ['monark.com.pk']
    start_urls = ['https://monark.com.pk/collections/t-shirts']

    #Extracting links   
    rules = (Rule(LinkExtractor(restrict_xpaths=('//h4[@class="h6 m-0 ff-main"]/a')), callback='parse_item', follow=True),)


    def parse_item(self, response):
        item = {
            'Product_Name': response.xpath('//h1[@class="h3 m-0"]/text()').get(),
            'Price': response.xpath('//span[@class="price price--sale"]/span[2]/text()').get(),
            'Description': response.xpath('//div[@class="tabs__content rte overflow-hidden"]/p/text()').get(),
        }
        # writing data
        df = pd.DataFrame([item], columns=['Product_Name', 'Price', 'Description'])
        df.to_excel('Monark.xlsx', sheet_name='Monark_DATA', index=False, header=True)

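There are many ways to do this. One is to have parse_item return a dict or a list and keep accumulating those results. Once all the data has been collected, create a DataFrame and write it to Excel. In other words, creating df and calling df.to_excel can happen outside the class: build the list inside the class, then build the df and write it once at the end.

By far the simplest approach is to use Scrapy's built-in feed export to write the data to CSV, for example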

scrapy crawl monark -o data.csv

Excel can open these files, or you can use pandas after the crawl to convert them to Excel. Is there a particular reason this doesn't work for you?
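The "accumulate first, write once at the end" suggestion could be sketched as a Scrapy item pipeline. This is only one possible arrangement, not necessarily what the answerer had in mind: the class name ExcelExportPipeline and the helper build_frame are made up for illustration, and the file name, sheet name, and column names are taken from the question.

```python
import pandas as pd


class ExcelExportPipeline:
    """Hypothetical pipeline: collect every item, then write one workbook."""

    # Column order copied from the question's DataFrame.
    columns = ["Product_Name", "Price", "Description"]

    def open_spider(self, spider):
        # Start with an empty list each time a crawl begins.
        self.rows = []

    def process_item(self, item, spider):
        # Called once per item yielded by the spider; keep a plain-dict copy.
        self.rows.append(dict(item))
        return item

    def build_frame(self):
        # One DataFrame for the whole crawl instead of one per page.
        return pd.DataFrame(self.rows, columns=self.columns)

    def close_spider(self, spider):
        # Runs once when the crawl finishes.
        # Writing .xlsx requires an Excel engine such as openpyxl.
        self.build_frame().to_excel(
            "Monark.xlsx", sheet_name="Monark_DATA", index=False, header=True
        )
```

For this to work, parse_item would need to yield item instead of writing the file itself, and the pipeline would have to be enabled in settings.py with something like ITEM_PIPELINES = {"myproject.pipelines.ExcelExportPipeline": 300}, where myproject is a placeholder for the actual project package.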