Getting a weird 'exceptions.TypeError' on Python code
I'm getting the following error on the line below and I don't know why. It worked before, but it broke somewhere along the way while I was debugging the code. Any help? I'm not sure how much of the code is useful to post; if this isn't enough, let me know and I'll update. Basically, I'm just trying to pull all of the links out of a previously messy list into a single clean list.
exceptions.TypeError: 'generator' object has no attribute '__getitem__'
item['playerurl'] = re.findall(r'"[^"]*"',"".join(item['playerurl'])) #used to parse
Edit: the item declaration in the items file
class TeamStats(Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    team = Field()
    division = Field()
    rosterurl = Field()
    player_desc = Field()
    playerurl = Field()
    pass
I'll post my entire code:
        ##the above code is for the real run but the below code is just for testing as it hits less pages
        division = response.xpath('//div[@id="content"]//div[contains(@class, "mod-teams-list-medium")]')
        for team in response.xpath('//div[@id="content"]//div[contains(@class, "mod-teams-list-medium")]'): #goes through all teams in each division
            item = TeamStats() #creates new TeamStats item
            item['division'] = division.xpath('.//div[contains(@class, "mod-header")]/h4/text()').extract()[0] #extracts the text which represents division, team and roster url
            item['team'] = team.xpath('.//h5/a/text()').extract()[0]
            item['rosterurl'] = "http://espn.go.com" + team.xpath('.//div/span[2]/a[3]/@href').extract()[0]
            request = scrapy.Request(item['rosterurl'], callback = self.parseWPNow) #opens up roster url to parse player data
            request.meta['play'] = item
            yield request #run the request through parseWPNow

    def parseWPNow(self, response): #after each request in parse, this is run
        item = response.meta['play'] #current item gets restored through meta tag
        item = self.parseRoster(item, response) #goes through and takes basic player data while filling playerurl (needed for next step)
        item = self.parsePlayer(item, response) #gets player stats
        return item #returns filled item object and on to next item

    def parseRoster(self, item, response):
        players = Player() #creates player object to be filled
        int = 0
        for player in response.xpath("//td[@class='sortcell']"): #fills basic player stats in each player object
            players['name'] = player.xpath("a/text()").extract()[0]
            players['position'] = player.xpath("following-sibling::td[1]/text()").extract()[0]
            players['age'] = player.xpath("following-sibling::td[2]/text()").extract()[0]
            players['height'] = player.xpath("following-sibling::td[3]/text()").extract()[0]
            players['weight'] = player.xpath("following-sibling::td[4]/text()").extract()[0]
            players['college'] = player.xpath("following-sibling::td[5]/text()").extract()[0]
            players['salary'] = player.xpath("following-sibling::td[6]/text()").extract()[0]
            players['height'] = players['height']
            yield players
        item['playerurl'] = response.xpath("//td[@class='sortcell']/a").extract() #playerurl is important for extracting the data info
        yield item

    def parsePlayer(self,item,response):
        item['playerurl'] = re.findall(r'"[^"]*"',"".join(item['playerurl'])) #used to parse
        for each in item['playerurl']: #goes through each player in url and sets up requests1 to extract requests
            each = each[1:-1]
            each = each[:30]+"gamelog/"+each[30:]
            request1 = scrapy.Request(each, callback = self.parsePlayerNow)
            yield request1
It looks like item is not a dictionary. It is a generator. You should review your logic and find where item ends up being a generator.
Note that a generator is an iterable, but unlike a list it does not support indexing. For example:
gen = (e for e in [1,2])
print type(gen)
# <type 'generator'>
If you then try to index it, for example with gen[0], you will get an exception:
TypeError: 'generator' object has no attribute '__getitem__'
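The same point as a small runnable sketch, written for Python 3 rather than the Python 2 used in this thread (there the message reads "'generator' object is not subscriptable" instead of mentioning __getitem__). The helper name try_index is made up for illustration:

```python
def try_index(obj):
    """Return obj[0], or the TypeError message if obj cannot be indexed."""
    try:
        return obj[0]
    except TypeError as exc:
        return "TypeError: %s" % exc

gen = (e for e in [1, 2])

# Indexing the generator itself fails; nothing is consumed by the attempt.
indexing_result = try_index(gen)

# Materializing the generator into a list first makes indexing work.
as_list = list(gen)
first = try_index(as_list)

print(indexing_result)
print(first)
```

Converting to a list is only one option; calling next(gen) or looping over the generator also works when you do not need random access.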
Edit: Yes, item is a generator. Your parsePlayer method "returns" a generator (because of the yield statement). See this example:
def f():
    a = 1
    yield a + 1

print f()
# <generator object f at 0x0000000002A793A8>
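The question's code appears to follow the same pattern twice in a row, which may explain why the error only shows up on the first line of parsePlayer. The sketch below is a simplified stand-in, not the Scrapy code: it uses plain dicts, Python 3, and the hypothetical names parse_roster and parse_player. Because both functions contain yield, calling them runs nothing; the failure appears only once something iterates the result.

```python
def parse_roster(item):
    # Because of `yield`, calling this function runs none of its body;
    # the call just returns a generator object.
    item['playerurl'] = ['<a href="http://example.com/p/1">A</a>']
    yield item

def parse_player(item):
    # Also a generator function: this line runs only when iteration starts.
    item['playerurl'] = list(item['playerurl'])
    yield item

item = {'team': 'Example'}
item = parse_roster(item)      # item is now a generator, not a dict
result = parse_player(item)    # no error yet, nothing has executed

try:
    next(result)               # iterating finally runs parse_player's body
    error = None
except TypeError as exc:
    error = exc

print(type(item).__name__)
print(error)
```

In the question, the framework does the iterating after parseWPNow returns, so the traceback points at the indexing line even though the generator was created one call earlier.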
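Well, what is item?
item is an incomplete item that I declared in my code.
In the code you posted, you haven't declared item.
I'm pretty sure I declared it. I use it in several different functions without any problems.
I have declared it as an item, and I'm pretty sure I'm not using it as a generator.
Do you understand the difference between yield and return?
@user3042850 As Haken Lid said, there is a difference between yield and return.
Not 100%. If you're pointing at the "yield item" under the parseRoster function, the reason I have yield item is that I can't mix yield and return in the same function. That is also why I had to create another function called parseWPNow instead of returning item at the end of my original for loop. I know yield is used for generators so that you can reuse a function with different inputs, but I haven't got the hang of it yet.
The problem is actually in the parsePlayer method.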
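One possible way out, sketched under assumptions: have the roster step return the item instead of yielding it, so the next step receives something it can index. A plain dict and a hard-coded list of anchor strings stand in for the Scrapy Item and the response, the example.com URLs are invented, and the names parse_roster and parse_player_urls are hypothetical. The code is Python 3.

```python
import re

def parse_roster(item, anchors):
    # Store the extracted anchors on the item and return it directly,
    # rather than yielding, so the caller gets the item itself.
    item['playerurl'] = list(anchors)
    return item

def parse_player_urls(item):
    # Same regex as in the question: grab every double-quoted string.
    quoted = re.findall(r'"[^"]*"', "".join(item['playerurl']))
    return [q[1:-1] for q in quoted]   # strip the surrounding quotes

anchors = ['<a href="http://example.com/player/1">A</a>',
           '<a href="http://example.com/player/2">B</a>']
item = parse_roster({'team': 'Example'}, anchors)
urls = parse_player_urls(item)
print(urls)
```

In the real spider the second step would still yield one request per URL; the point here is only that the function filling item['playerurl'] returns the item rather than a generator.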