Using js2xml and Scrapy, how do I iterate over a JSON object to select a specific node?
I'm trying to use js2xml to iterate over a JSON response embedded in a page. My question is: how can I select the "stores" node and pass only that on as my response? The JSON looks like this:
<script>
window.appData = {
"ressSize": "large",
"cssPath": "http://css.bbystatic.com/",
"imgPath": "http://images.bbystatic.com/",
"jsPath": "http://js.bbystatic.com/",
"bbyDomain": "http://www.bestbuy.com/",
"bbySslDomain": "https://www-ssl.bestbuy.com/",
"isUserLoggedIn": false,
"zipCode": "46801",
"stores": [{
"id": "2727",
"name": "GLENBROOK SQUARE",
"addr1": "4201 coldwater rd",
"addr2": "spc g10",
"city": "fort wayne",
"state": "IN",
"country": "US",
"zipCode": "46805",
"phone": "260-482-5230"...
</script>
import js2xml

def parse(self, response):
    js = response.xpath('//script[contains(.,"window.appData")]/text()').extract_first()
    jstree = js2xml.parse(js)
    app_data_node = jstree.xpath('//assign[left//identifier[@name="appData"]]/right/*')[0]
    app_data = js2xml.make_dict(app_data_node)
    for store in app_data['stores']:
        yield store
The response to this gives me:
large
http://css.bbystatic.com/
http://images.bbystatic.com/
http://js.bbystatic.com/
http://www.bestbuy.com/
https://www-ssl.bestbuy.com/
false
{'bbyDomain': 'http://www.bestbuy.com/',
 'bbySslDomain': 'https://www-ssl.bestbuy.com/',
 'cssPath': 'http://css.bbystatic.com/',
 'imgPath': 'http://images.bbystatic.com/',
 'isUserLoggedIn': False,
 'jsPath': 'http://js.bbystatic.com/',
 'preferredStores': [],
 'ressSize': 'large',
 'stores': [],
 'zipCode': ''}
Any ideas would be helpful.
Let's take New York as the location. In a Scrapy callback, you could write it like this:
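import js2xml

def parse(self, response):
    # Grab the text of the inline <script> that assigns window.appData
    js = response.xpath('//script[contains(.,"window.appData")]/text()').extract_first()
    # Parse the JavaScript source into an XML (lxml) tree
    jstree = js2xml.parse(js)
    # Select the object literal on the right-hand side of the appData assignment
    app_data_node = jstree.xpath('//assign[left//identifier[@name="appData"]]/right/*')[0]
    # Convert that object literal into a plain Python dict
    app_data = js2xml.make_dict(app_data_node)
    # Yield only the entries under the "stores" key
    for store in app_data['stores']:
        yield store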
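As a point of comparison, here is a minimal standard-library sketch of the same idea. It does not use js2xml. It assumes the object assigned to window.appData is strict JSON (double-quoted keys, no trailing commas), which appears to be the case for the snippet in the question. The name extract_stores and the trimmed SAMPLE_SCRIPT are hypothetical and used only for illustration.

```python
import json
import re

# Trimmed-down stand-in for the <script> text shown in the question.
SAMPLE_SCRIPT = '''
window.appData = {
    "ressSize": "large",
    "zipCode": "46801",
    "stores": [{"id": "2727", "name": "GLENBROOK SQUARE", "city": "fort wayne"}],
    "preferredStores": []
};
'''


def extract_stores(script_text):
    """Return the list under "stores", or [] if appData is not found."""
    # Locate the assignment; the trailing \s* leaves match.end() on the "{".
    match = re.search(r'window\.appData\s*=\s*', script_text)
    if match is None:
        return []
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (here, the trailing ";").
    app_data, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return app_data.get("stores", [])


stores = extract_stores(SAMPLE_SCRIPT)
for store in stores:
    print(store["id"], store["name"])
```

In a Scrapy callback, the text returned by extract_first() could be passed to a helper like this. If the embedded object is not valid JSON (for example, unquoted keys), this sketch would raise an error, and js2xml remains the more robust choice.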
Please provide the source URL so the code can be tested with real data. The value of the "stores" key is an empty array in the HTML source:
window.appData = {"ressSize": "large", "cssPath": "http://css.bbystatic.com/", "imgPath": "http://images.bbystatic.com/", "jsPath": "http://js.bbystatic.com/", "bbyDomain": "http://www.bestbuy.com/", "bbySslDomain": "https://www-ssl.bestbuy.com/", "isUserLoggedIn": false, "zipCode": "", "stores": [], "preferredStores": []};
When I pull up the page, it isn't empty. But I do see what you describe. I think the list is only generated when a zipCode is populated. Thanks a lot for the help!