Unable to scrape by iterating with Scrapy and CSS (tags: css, scrapy)

The HTML code is as follows:

<td class="column-3">
                                (price per 1,000 images)<br>
0-1M images                                    -
<span class="price-data " data-amount="{&quot;regional&quot;:{&quot;asia-pacific-southeast&quot;:0.5,&quot;australia-east&quot;:0.5,&quot;brazil-south&quot;:0.5,&quot;canada-central&quot;:0.5,&quot;central-india&quot;:0.5,&quot;europe-north&quot;:0.5,&quot;europe-west&quot;:0.5,&quot;united-kingdom-south&quot;:0.5,&quot;us-east&quot;:0.5,&quot;us-east-2&quot;:0.5,&quot;us-south-central&quot;:0.5,&quot;us-west-2&quot;:0.5,&quot;us-west-central&quot;:0.5}}" data-decimals="3" data-decimals-force="0" data-region-unavailable="N/A" data-has-valid-price="true">$0.50</span>                                    <br>
1M-5M images                                    -
<span class="price-data " data-amount="{&quot;regional&quot;:{&quot;asia-pacific-southeast&quot;:0.4,&quot;australia-east&quot;:0.4,&quot;brazil-south&quot;:0.4,&quot;canada-central&quot;:0.4,&quot;central-india&quot;:0.4,&quot;europe-north&quot;:0.4,&quot;europe-west&quot;:0.4,&quot;united-kingdom-south&quot;:0.4,&quot;us-east&quot;:0.4,&quot;us-east-2&quot;:0.4,&quot;us-south-central&quot;:0.4,&quot;us-west-2&quot;:0.4,&quot;us-west-central&quot;:0.4}}" data-decimals="3" data-decimals-force="0" data-region-unavailable="N/A" data-has-valid-price="true">$0.40</span>                                    <br>
5M+ images                                    -
<span class="price-data " data-amount="{&quot;regional&quot;:{&quot;asia-pacific-southeast&quot;:0.325,&quot;australia-east&quot;:0.325,&quot;brazil-south&quot;:0.325,&quot;canada-central&quot;:0.325,&quot;central-india&quot;:0.325,&quot;europe-north&quot;:0.325,&quot;europe-west&quot;:0.325,&quot;united-kingdom-south&quot;:0.325,&quot;us-east&quot;:0.325,&quot;us-east-2&quot;:0.325,&quot;us-south-central&quot;:0.325,&quot;us-west-2&quot;:0.325,&quot;us-west-central&quot;:0.325}}" data-decimals="3" data-decimals-force="0" data-region-unavailable="N/A" data-has-valid-price="true">$0.325</span>                                    <br>
                            </td>
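URL:

How can I iterate over this and scrape the data? I want to split the td tag into one piece per br tag and then scrape each piece. I don't want to use XPath; I want to get the result with CSS.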

import json
from scrapy.selector import Selector

dumb = 'Your response, or above text'  # placeholder: put the HTML above (or response.text) here
html_dumb = Selector(text=dumb)
# all non-empty text nodes directly under <td>, without whitespace and the trailing '-'
td_vals = [x.strip().strip('- ') for x in
           html_dumb.xpath("//td/text()").extract() if x.strip()]
f_val = td_vals[0]  # separate the first one, here "(price per 1,000 images)"
td_vals = td_vals[1:]
# all span data-amount attributes; use //span/text() instead if you need the displayed price
span_vals = [x.strip() for x in html_dumb.xpath("//span/@data-amount").extract() if x.strip()]
inner_json = {}
result = {}

for td_val, span_val in zip(td_vals, span_vals):
    inner_json[td_val] = json.loads(span_val)  # build the inner dictionary

result[f_val] = inner_json  # attach it to the outer dictionary

{u'(price per 1,000 images)': {u'5M+ images': {u'regional': {u'united-kingdom-south': 0.325, u'europe-north': 0.325, u'brazil-south': 0.325, u'us-west-2': 0.325, u'us-south-central': 0.325, u'central-india': 0.325, u'us-east': 0.325, u'canada-central': 0.325, u'europe-west': 0.325, u'us-east-2': 0.325, u'us-west-central': 0.325, u'asia-pacific-southeast': 0.325, u'australia-east': 0.325}}, u'0-1M images': {u'regional': {u'united-kingdom-south': 0.5, u'europe-north': 0.5, u'brazil-south': 0.5, u'us-west-2': 0.5, u'us-south-central': 0.5, u'central-india': 0.5, u'us-east': 0.5, u'canada-central': 0.5, u'europe-west': 0.5, u'us-east-2': 0.5, u'us-west-central': 0.5, u'asia-pacific-southeast': 0.5, u'australia-east': 0.5}}, u'1M-5M images': {u'regional': {u'united-kingdom-south': 0.4, u'europe-north': 0.4, u'brazil-south': 0.4, u'us-west-2': 0.4, u'us-south-central': 0.4, u'central-india': 0.4, u'us-east': 0.4, u'canada-central': 0.4, u'europe-west': 0.4, u'us-east-2': 0.4, u'us-west-central': 0.4, u'asia-pacific-southeast': 0.4, u'australia-east'...
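
Since the question asks for CSS rather than XPath, here is a minimal sketch of the same extraction using Scrapy's CSS selectors. It relies on the `::text` and `::attr(name)` pseudo-elements that Scrapy adds on top of standard CSS. The function names (`clean_label`, `build_result`, `scrape_with_css`) are made up for illustration, and the selectors assume the `td.column-3` / `span.price-data` markup shown in the question.

```python
import json


def clean_label(text):
    """Strip whitespace and the trailing ' -' separator from a td text node."""
    return text.strip().strip("- ").strip()


def build_result(text_nodes, amounts):
    """Pair each tier label with its parsed data-amount JSON.

    text_nodes: raw text nodes of the <td>; the first non-empty one is the header.
    amounts: data-amount attribute strings, in document order.
    """
    labels = [clean_label(t) for t in text_nodes if t.strip()]
    header, tiers = labels[0], labels[1:]
    return {header: {tier: json.loads(raw) for tier, raw in zip(tiers, amounts)}}


def scrape_with_css(html):
    """CSS-only extraction (hypothetical helper, assumes Scrapy is installed)."""
    # Imported lazily so the pure-Python helpers above work without Scrapy.
    from scrapy.selector import Selector

    sel = Selector(text=html)
    # td.column-3::text yields only the td's own text nodes, not the span text.
    text_nodes = sel.css("td.column-3::text").getall()
    amounts = sel.css("td.column-3 span.price-data::attr(data-amount)").getall()
    return build_result(text_nodes, amounts)
```

Calling `scrape_with_css(response.text)` inside a spider callback should give the same nested dictionary as the XPath version. On older Scrapy releases, `.extract()` can be used in place of `.getall()`.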

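It is completely unclear what you want. CSS cannot "split" anything, nor can it "find matches". Could you clarify what data you are trying to retrieve? Or are you just asking whether you can iterate over br?

I want the o/p as follows: {"price per 1,000 images": {"0-1M images": {"germany-central": 0.009, "united-kingdom-south": 0.01, "europe-north": 0.008, "us-east-2": 0.009}, "1M-5M images": {"germany-central": 0.009, "united-kingdom-south": 0.01, "europe-north": 0.008, "us-east": 0.009, "asia-pacific-east": 0.01, "united-kingdom-west": 0.01}, "5M+ images": {"germany-central": 0.009, "united-kingdom-south": 0.01, "europe-north": 0.008}}}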
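
The desired output maps each tier directly to its region prices, while the parsed `data-amount` JSON wraps them in a `regional` key. One possible way to bridge that gap is a small post-processing step. The sketch below is only an illustration: `flatten_regional` is a hypothetical helper, and the sample values are taken from the HTML in the question, not from the numbers in the comment.

```python
def flatten_regional(result):
    """Drop the intermediate 'regional' level and the parentheses around the header."""
    flat = {}
    for header, tiers in result.items():
        flat[header.strip("()")] = {
            # Fall back to the data itself if there is no 'regional' key.
            tier: data.get("regional", data) for tier, data in tiers.items()
        }
    return flat


# Sample input shaped like the scraped result (values from the question's HTML).
nested = {
    "(price per 1,000 images)": {
        "0-1M images": {"regional": {"europe-north": 0.5, "us-east-2": 0.5}},
        "5M+ images": {"regional": {"europe-north": 0.325}},
    }
}
print(flatten_regional(nested))
```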