Python: extracting elements into different fields in Scrapy
I am new to Scrapy and am trying to scrape some images from my supplier's website. This is the HTML code:
<div class="image-additional caruofredsel-additional">
<a href="http://scssports.in/image/cache/data/adidas-t-t-husgado-white-unisex-sports-shoes-g40422-500x500.jpg" title="Adidas T T Husgado Badminton and Squash Shoes" class="colorbox" rel="colorbox"><img class="fade-image"
src="http://scssports.in/image/cache/data/adidas-t-t-husgado-white-unisex-sports-shoes-g40422-80x80.jpg" title="Adidas T T Husgado Badminton and Squash Shoes" alt="Adidas T T Husgado Badminton and Squash Shoes" />
</a>
Thanks.

Since the <img> tag is nested inside the <a> tag, the following should work:
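# hxs is the selector built from the response, as in the question
titles = hxs.select("//div[@class='image-additional caruofredsel-additional']")
items = []
for title in titles:  # use a distinct loop variable so the list is not shadowed
    item = SampleItem()
    # Full-size image URL, taken from the link's href
    item["imgurl2"] = title.select("a/@href").extract()
    # Thumbnail URL, taken from the <img> nested inside the link
    item["imgurl1"] = title.select("a/img/@src").extract()
    # Append only after both fields are filled
    items.append(item)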
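
To check the two relative paths without a running spider, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree in place of Scrapy's selector. This is a substitute technique, not the Scrapy API: ElementTree only supports a subset of XPath, so attributes are read with .get() instead of @href and @src. The wrapping <root> element, the closing </div>, the plain dict used in place of SampleItem, and the function name extract_image_urls are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Markup from the question. The <root> wrapper and the closing </div> are
# assumptions added so the fragment parses as well-formed XML.
HTML = """<root>
<div class="image-additional caruofredsel-additional">
<a href="http://scssports.in/image/cache/data/adidas-t-t-husgado-white-unisex-sports-shoes-g40422-500x500.jpg" title="Adidas T T Husgado Badminton and Squash Shoes" class="colorbox" rel="colorbox"><img class="fade-image"
src="http://scssports.in/image/cache/data/adidas-t-t-husgado-white-unisex-sports-shoes-g40422-80x80.jpg" title="Adidas T T Husgado Badminton and Squash Shoes" alt="Adidas T T Husgado Badminton and Squash Shoes" />
</a>
</div>
</root>"""


def extract_image_urls(markup):
    """Return one dict per matching <div>, with both URL fields as lists."""
    root = ET.fromstring(markup)
    items = []
    # Same container match as the Scrapy expression in the answer
    for div in root.findall(".//div[@class='image-additional caruofredsel-additional']"):
        item = {
            # Equivalent of a/@href: the full-size image link
            "imgurl2": [a.get("href") for a in div.findall("a")],
            # Equivalent of a/img/@src: the thumbnail nested inside the link
            "imgurl1": [img.get("src") for img in div.findall("a/img")],
        }
        items.append(item)
    return items


items = extract_image_urls(HTML)
for item in items:
    print(item["imgurl2"], item["imgurl1"])
```

Each field is kept as a list, mirroring what extract() returns in the answer, so a <div> that holds several links would yield several URLs per field.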