Scrapy/Python: How do I add items to a list created inside a class?

python list class collections scrapy

I am trying to add an object to a list that is created inside another object. Here are my classes:

import scrapy

# Helper classes
class job(scrapy.Item): # containing class, it has a List of 'batches'
    job_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    planned = scrapy.Field()
    executed = scrapy.Field()   
    def __init__(self):
        self.batches = [] 

class batch(scrapy.Item): # this class goes inside a job class, and
                          # also stores a list of 'units'
    batch_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    def __init__(self):
        self.units = [] 

class unit(scrapy.Item): # Finally, this class stores a list of data
    unit_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    def __init__(self):
        self.datos = []

Here is the code I am trying to run (unfortunately, it raises errors):

Any help would be greatly appreciated.

Thanks.

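According to the official documentation:

The Field class is just an alias to the built-in dict class and doesn't provide any extra functionality or attributes. In other words, Field objects are plain-old Python dicts. A separate class is used to support the item declaration syntax based on class attributes.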

So you can declare batches, units, and datos as Field objects instead of assigning them in __init__, as follows:

import scrapy

# Helper classes
class job(scrapy.Item): # containing class, it has a List of 'batches'
    job_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    planned = scrapy.Field()
    executed = scrapy.Field()
    batches = scrapy.Field()   


class batch(scrapy.Item): # this class goes inside a job class, and
                          # also stores a list of 'units'
    batch_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    units = scrapy.Field()

class unit(scrapy.Item): # Finally, this class stores a list of data
    unit_name = scrapy.Field()
    status = scrapy.Field()
    start = scrapy.Field()
    end = scrapy.Field()
    operator = scrapy.Field()
    recipe = scrapy.Field()
    datos = scrapy.Field()

In your function, you can then change the code to the following:

def inicializa_batches(self, lista_batches, jobs):
    # 1 - lista_batches is the extract() of the portion of response.css
    #     that holds the required data
    # 2 - jobs is a list of job() objects created earlier
    for batchname in lista_batches:
        bn = str(batchname.strip())  # better to work with a plain text string
        if len(bn) > 0:
            newbatch = batch()  # create a new batch object
            newbatch['batch_name'] = bn
            for job in jobs:
                job['batches'] = []
                nom_job = job['job_name']
                if nom_job[0:4] == bn[0:4]:  # match on the first 4 characters
                    job['batches'].append(newbatch)
        self.log(bn)

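Thanks for your answer. But now every time I run the code, the batches get overwritten (a job can have several batches, a batch can have several units, and a unit can have several different data items).

That is caused by job['batches'] = [] inside the "for job in jobs:" loop: the list is reset on every pass. What is the expected behavior of the inicializa_batches function?

inicializa_batches goes through the whole document looking for specific text in a specific class. It has to create a series of batch objects and append them to the batches field of the job objects iterated in the "for job in jobs" loop. Why am I doing it this way? Because the source document is not properly structured: the whole document is a single table with no sections of any kind, no divs or anything. Fortunately, the key data points have specific classes, but there is no way to tell which job each batch belongs to.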
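
One possible way around the overwriting described above is to create the batches list only when a job does not have one yet, rather than resetting it on every pass. The sketch below is not the original code: it uses plain dicts as stand-ins for the job and batch items so it runs without Scrapy, and the function name assign_batches and the sample names are made up for illustration. It assumes that scrapy.Item behaves like a mutable mapping, so setdefault should work the same way on a real item as long as batches is declared as a Field; that is worth checking against your Scrapy version.

```python
def assign_batches(batch_names, jobs):
    """Attach each non-empty batch name to every job whose name
    shares its first 4 characters with the batch name."""
    for raw_name in batch_names:
        name = str(raw_name).strip()
        if not name:
            continue  # skip empty strings coming from the extraction
        for job in jobs:
            # Create the list only if this job has none yet,
            # so batches added on earlier passes are kept.
            batches = job.setdefault('batches', [])
            if job['job_name'][:4] == name[:4]:
                # A fresh dict per job avoids sharing one batch object
                # between several jobs.
                batches.append({'batch_name': name, 'units': []})
    return jobs


# Hypothetical sample data, only to show the behavior.
jobs = [{'job_name': 'AB12-job'}, {'job_name': 'CD34-job'}]
assign_batches(['AB12-b1', ' AB12-b2 ', '', 'CD34-b1'], jobs)
print([b['batch_name'] for b in jobs[0]['batches']])  # ['AB12-b1', 'AB12-b2']
print([b['batch_name'] for b in jobs[1]['batches']])  # ['CD34-b1']
```

The same idea would carry over to units inside a batch and datos inside a unit: initialize each list once, then only append to it.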