Python: problem extracting data from a CSV

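This code tries to look up email addresses on Google for companies listed in a CSV file (the company name is in the 3rd column). The first column holds the value I want to export as "SIRET". How can I do that?

If I extract it while reading the CSV for start_urls, my URLs will be wrong. If I read it in parse, I won't get the data that matches each response, and I may run into errors from opening the file twice.

How can I get the value read the first time into SIRET in the parse function?

I have been struggling with this for hours :(

Best regards. Here is the code: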

import csv
import re

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "googlemailverif"

    # Build one search URL per CSV row from the 3rd column (company name).
    with open('input.csv', "r") as csvfile:
        datareader = csv.reader(csvfile)
        start_urls = ['https://www.google.fr/search?q=email' + str(row[2]) for row in datareader]

    # starting parsing
    def parse(self, response):
        yield {
            'url': response.url,
            'nom': "nom",  # placeholder, not yet filled from the CSV
            'emails': re.findall(r"[a-zA-Z0-9_\.+-]+@[a-zA-Z0-9_\.+-]+\.[a-zA-Z]{2,6}", ''.join(response.xpath("//body//text()").extract()).strip()),
            'SIRET': "SIRET",  # placeholder, should hold the 1st CSV column
        }
Now you have one list containing the SIRET values and another containing the URLs:

sirets, start_urls = zip(*[(row[0], 'https://www.google.fr/search?q=email'+str(row[2])) for row in datareader])
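A minimal, standard-library-only sketch of that unpacking, under a few assumptions that are not in the original post: the file has a header row that should be skipped, the company name is URL-encoded, and a space is placed between "email" and the name. The sample rows are shortened to three columns, and the helper name `build_pairs` is hypothetical.

```python
import csv
import io
from urllib.parse import quote_plus

# Shortened stand-in for input.csv: only the first three columns are kept.
SAMPLE = '''"SIRET","NIC","L1_NORMALISEE"
"005720164","00028","SA SAINTE ISABELLE"
"005720784","00031","ETABLISSEMENTS DECAYEUX"
'''

SEARCH_URL = "https://www.google.fr/search?q="


def build_pairs(csvfile):
    """Return (sirets, urls): two tuples aligned by index."""
    reader = csv.reader(csvfile)
    next(reader, None)  # skip the header row (assumption: the file has one)
    pairs = [
        (row[0], SEARCH_URL + quote_plus("email " + row[2]))
        for row in reader
    ]
    if not pairs:
        return (), ()  # zip(*[]) would yield nothing to unpack
    sirets, urls = zip(*pairs)
    return sirets, urls


sirets, start_urls = build_pairs(io.StringIO(SAMPLE))
for siret, url in zip(sirets, start_urls):
    print(siret, url)
```

The two tuples are linked only by position. Inside a spider, responses can come back in a different order than the requests were issued, so an index into `sirets` is not a reliable way to find the SIRET for a given response.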
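Here is an excerpt of the CSV (the rows are at the end of the post).

I get the same "SIRET" as the sirets value every time, whereas the other variable increments and changes each time.


Thanks a lot +++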

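Take a look at the Scrapy documentation, especially the method …

@roganjosh If you look at the original question, you will see it was already a class, just badly formatted. All I did was press Ctrl-K.

Thanks, this is very encouraging! Still, self.sirets[0] or self.siret always shows the first row and nothing after row 1. I don't know how to increment the sirets variable.

Can you post a few lines from datareader? It is hard to debug without data.

I think I understood something. When row[0] is assigned to sirets, it does not go through the same process as Scrapy's requests, so the variable is not incremented correctly. The value should actually travel with the URL through the middleware and the pipeline so that it ends up matched in the output. Do you know how to do that?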
"SIRET","NIC","L1_NORMALISEE","L2_NORMALISEE","L3_NORMALISEE","L4_NORMALISEE","L5_NORMALISEE","L6_NORMALISEE","L7_NORMALISEE","L1_DECLAREE","L2_DECLAREE","L3_DECLAREE","L4_DECLAREE","L5_DECLAREE","L6_DECLAREE","L7_DECLAREE","NUMVOIE","INDREP","TYPVOIE","LIBVOIE","CODPOS","CEDEX","RPET","LIBREG","DEPET","ARRONET","CTONET","COMET","LIBCOM","DU","TU","UU","EPCI","TCD","ZEMET","SIEGE","ENSEIGNE","IND_PUBLIPO","DIFFCOM","AMINTRET","NATETAB","LIBNATETAB","APET700","LIBAPET","DAPET","TEFET","LIBTEFET","EFETCENT","DEFET","ORIGINE","DCRET","DATE_DEB_ETAT_ADM_ET","ACTIVNAT","LIEUACT","ACTISURF","SAISONAT","MODET","PRODET","PRODPART","AUXILT","NOMEN_LONG","SIGLE","NOM","PRENOM","CIVILITE","RNA","NICSIEGE","RPEN","DEPCOMEN","ADR_MAIL","NJ","LIBNJ","APEN700","LIBAPEN","DAPEN","APRM","ESSEN","DATEESS","TEFEN","LIBTEFEN","EFENCENT","DEFEN","CATEGORIE","DCREN","AMINTREN","MONOACT","MODEN","PRODEN","ESAANN","TCA","ESAAPEN","ESASEC1N","ESASEC2N","ESASEC3N","ESASEC4N","VMAJ","VMAJ1","VMAJ2","VMAJ3","DATEMAJ"
"005720164","00028","SA SAINTE ISABELLE","","","236 ROUTE D AMIENS","","80100 ABBEVILLE","FRANCE","SA SAINTE-ISABELLE","","","236 RTE D AMIENS","","80100 ABBEVILLE","","236","","RTE","D AMIENS","80100","","32","Nord-Pas-de-Calais-Picardie","80","1","98","001","ABBEVILLE","80","4","01","248000556","41","2209","1","","1","O","201209","","","8610Z","Activités hospitalières","2008","22","100 à 199 salariés","100","2015","1","19830928","19830928","NR","99","","P","S","O","","0","SA SAINTE-ISABELLE","","","","","","00028","32","80001","","5599","SA à conseil d'administration (s.a.i.)","8610Z","Activités hospitalières","2008","","","","22","100 à 199 salariés","100","2015","ETI","19570101","201209","1","S","O","","","","","","","","","","","","2014-07-30T00:00:00"
"005720784","00031","ETABLISSEMENTS DECAYEUX","","","ZONE INDUSTRIELLE","","80210 FEUQUIERES EN VIMEU","FRANCE","ETABLISSEMENTS DECAYEUX","","","ZONE INDUSTRIELLE","","80210 FEUQUIERES EN VIMEU","","","","","ZONE INDUSTRIELLE","80210","","32","Nord-Pas-de-Calais-Picardie","80","1","17","308","FEUQUIERES EN VIMEU","80","1","18","248000630","15","0055","0","","1","O","201209","","","2572Z","Fabrication de serrures et de ferrures","2008","22","100 à 199 salariés","100","2015","4","19930401","19930401","NR","99","","P","S","O","","0","ETABLISSEMENTS DECAYEUX","","","","","","00015","32","80308","","5710","SAS/// société par actions simplifiée","2599A","Fabrication d'articles métalliques ménagers","2008","","N","20160915","32","250 à 499 salariés","200","2015","ETI","19570101","201209","3","S","O","2012","6","2599A","2599A","2599B","2572Z","4649Z","","","","","2001-12-13T00:00:00"