Python: "too many values to unpack" error when writing Scrapy pipeline data to the database


Good afternoon. I am trying to fetch data from the web and store it in SQL Server. I am using the pymssql library and the connection is already established, but when an item is processed I get the error "too many values to unpack", so I have also attached the MyItem class. I cannot see an obvious mistake. Below is the code in pipelines.py:

# -*- coding: utf-8 -*-
import pymssql
from scrapy import signals   
import json   
import codecs   
class MyPipeline(object):   
    def __init__(self):
         self.conn = pymssql.connect(host=r".\\MyPC",user='sa',password='XXXX',database='Webmining')
         self.cursor = self.conn.cursor()
    def process_item(self, item, spider):
         try:
             self.cursor.executemany("INSERT INTO RecruitInformation(recruitNumber,name,detailLink,publishTime,catalog,worklocation) VALUES (%d,%s,%s,%t,%s,%s)",(item['recruitNumber'],item['name'],item['detailLink'],item['publishTime'],item['catalog'],item['worklocation']))
             self.conn.commit()
         except pymssql.InterfaceError, e:
             print ("pymssql.InterfaceError")
         except pymssql.DataError, e:
             print ("pymssql.DataError")
         except pymssql.OperationalError, e:
             print ("pymssql.OperationalError")
         except pymssql.IntegrityError, e:
             print ("pymssql.IntegrityError")
         except pymssql.InternalError, e:
             print ("pymssql.InternalError")
         except pymssql.ProgrammingError, e:
             print ("pymssql.ProgrammingError")
         except pymssql.NotSupportedError, e:
             print ("pymssql.NotSupportedError")
             return item
    def spider_closed(self, spider):
         self.conn.close()

The code in item.py is as follows:
import scrapy
from scrapy.item import Item, Field  
class MyItem(Item): 
     name = Field()         
     catalog = Field()          
     workLocation = Field()     
     recruitNumber = Field()       
     detailLink = Field()      
     publishTime = Field()

The code from the spider is as follows:

class MySpider(CrawlSpider):
    name = "xxxx"
    allowed_domains = ["xxxx.com"]
    start_urls = ["http://xx.xxxx.com/position.php"]
    rules = [Rule(sle(allow=("/position.php\?&start=\d{,4}#a")),
                  follow=True, callback='parse_item')]

    def parse_item(self, response):
        items = []
        sel = Selector(response)
        base_url = get_base_url(response)
        sites_even = sel.css('table.tablelist tr.even')
        for site in sites_even:
            item = MyItem()
            item['name'] = site.css('.l.square a').xpath('text()').extract()
            relative_url = site.css('.l.square a').xpath('@href').extract()[0]
            item['detailLink'] = urljoin_rfc(base_url, relative_url)
            item['catalog'] = site.css('tr > td:nth-child(2)::text').extract()
            item['workLocation'] = site.css('tr > td:nth-child(4)::text').extract()
            item['recruitNumber'] = site.css('tr > td:nth-child(3)::text').extract()
            item['publishTime'] = site.css('tr > td:nth-child(5)::text').extract()
            items.append(item)
        sites_odd = sel.css('table.tablelist tr.odd')
        for site in sites_odd:
            item = MyItem()
            item['name'] = site.css('.l.square a').xpath('text()').extract()
            relative_url = site.css('.l.square a').xpath('@href').extract()[0]
            item['detailLink'] = urljoin_rfc(base_url, relative_url)
            item['catalog'] = site.css('tr > td:nth-child(2)::text').extract()
            item['workLocation'] = site.css('tr > td:nth-child(4)::text').extract()
            item['recruitNumber'] = site.css('tr > td:nth-child(3)::text').extract()
            item['publishTime'] = site.css('tr > td:nth-child(5)::text').extract()
            items.append(item)
        return items

    def _process_request(self, request):
        info('process ' + str(request))
        return request

Please try using self.cursor.execute in your code instead of self.cursor.executemany.
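
To make the suggestion concrete, here is a minimal sketch of what process_item() might look like built around cursor.execute(). It is only an illustration, not the poster's final code: it assumes each item field already holds a single scalar value (see the comment thread below) and that recruitNumber can be converted to an integer. Note that pymssql only supports the %s and %d placeholders, so the %t from the question is replaced with %s:

# Sketch only: assumes every item field is a single scalar value and that
# recruitNumber can be cast to int; %t is not a valid pymssql placeholder,
# so %s is used for publishTime instead.
def process_item(self, item, spider):
    try:
        self.cursor.execute(
            "INSERT INTO RecruitInformation "
            "(recruitNumber, name, detailLink, publishTime, catalog, worklocation) "
            "VALUES (%d, %s, %s, %s, %s, %s)",
            (int(item['recruitNumber']),
             item['name'],
             item['detailLink'],
             item['publishTime'],
             item['catalog'],
             item['worklocation']))
        self.conn.commit()
    except pymssql.Error as e:
        # keep the simple print-style error handling from the question
        print("insert failed: %s" % e)
    return item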

Since I am not familiar with Scrapy: please look at the definition of the item. It behaves like a dictionary and holds more than one value per field, so it cannot be used in the SQL statement as it is. But how do you fetch the values one by one? The code that fills the fields is crucial here, because I think you are filling each field with a list instead of a single value. Please edit your question and add the code from the spider.

I have added the code from the spider, and I have also replaced executemany() with execute(). @GHajba

Can you check what is actually in item['name'], for example? My guess is that every field is a list, while detailLink is a simple string.

The contents of item['name'] turn out to be a list of job-title strings, for example u'TEG07-CDC高级交互设计师(深圳)' and u'TEG10-IDC资源管理经理（深圳）', so the field does hold a list rather than a single value.
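
Following up on the comments, one way to end up with a single string per field is to keep only the first extracted value inside the spider's loops. The sketch below does that with a hypothetical first_or_none() helper and assumes each table row has exactly one matching cell; the same change would apply to the tr.odd loop:

# Hypothetical helper: take the first extracted string, or None if nothing matched.
def first_or_none(values):
    return values[0].strip() if values else None

for site in sites_even:
    item = MyItem()
    # extract() returns a list; keep only the first match so each field is one string
    item['name'] = first_or_none(site.css('.l.square a').xpath('text()').extract())
    relative_url = site.css('.l.square a').xpath('@href').extract()[0]
    item['detailLink'] = urljoin_rfc(base_url, relative_url)
    item['catalog'] = first_or_none(site.css('tr > td:nth-child(2)::text').extract())
    item['workLocation'] = first_or_none(site.css('tr > td:nth-child(4)::text').extract())
    item['recruitNumber'] = first_or_none(site.css('tr > td:nth-child(3)::text').extract())
    item['publishTime'] = first_or_none(site.css('tr > td:nth-child(5)::text').extract())
    items.append(item)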