Python 2.7: Prevent Scrapy from removing square brackets and curly braces from a URL

I need to pass a nested dict as a parameter to a GET request.

This is the form of the query that works:

query = {%22channel%22:%22rent%22,%22page%22:2,%22pageSize%22:12,%22filters%22:{%22agencyIds%22:[%22CBPHMG%22]}}
And this is what I see in the Scrapy log:

%7B%22pageSize%22:%20300,%20%22page%22:%208,%20%22channel%22:%20%22rent%22,%20%22filters%22:%20%7B%22agencyIds%22:%20%22VDTUED%22%7D%7D
The problem is with the square brackets and curly braces.

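What I do now is simply call json.dumps on the dict and append the result to the URL. I also tried escaping with backslashes to keep the symbols from being changed, to no avail.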

 q = {"channel":"sold","page":1,"pageSize":300,"filters":{"agencyIds":["PRDNEW"]}}
 query = json.dumps(q)
 query = query.replace('"', '\\"')
 url = url + query
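One detail worth checking in the log line above: the `%20` sequences after each colon and comma are consistent with `json.dumps` being called with its default separators, `', '` and `': '`, which put spaces into the serialized string. A minimal sketch, assuming Python 3 and only the standard library, comparing the default output with compact separators (the dict is the `q` from the snippet above):

```python
import json

q = {"channel": "sold", "page": 1, "pageSize": 300,
     "filters": {"agencyIds": ["PRDNEW"]}}

# Default separators are (', ', ': '), so the output contains spaces,
# which end up percent-encoded once the string is placed in a URL.
default_json = json.dumps(q)

# Compact separators produce JSON without any whitespace.
compact_json = json.dumps(q, separators=(",", ":"))

print(default_json)
print(compact_json)
```

This only removes the spaces; the braces, brackets and quotes still need to be percent-encoded before the string is used as a query parameter.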
The following code, using the requests library on Python 3, works fine:

import requests

url = "https://services.realestate.com.au/services/listings/search"

querystring = {"query":"{\"channel\":\"buy\",\"page\":2,\"pageSize\":12,\"filters\":{\"agencyIds\":[\"CBPHMG\"]}}"}

headers = {'cache-control': 'no-cache'}

response = requests.request("GET", url, headers=headers, params=querystring)

print(response.text)
You can use w3lib.url.add_or_replace_parameter to append the query parameter to the URL. It URL-encodes the value the same way python requests does:

$ scrapy shell
2017-07-18 11:03:28 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
(...)
>>> url = "https://services.realestate.com.au/services/listings/search"
>>> querystring = {"query":"{\"channel\":\"buy\",\"page\":2,\"pageSize\":12,\"filters\":{\"agencyIds\":[\"CBPHMG\"]}}"}
This is the same input data as in the python requests example.

Call add_or_replace_parameter with the parameter name and its value (note: Scrapy already depends on w3lib):

>>> from w3lib.url import add_or_replace_parameter
>>> add_or_replace_parameter(url, 'query', querystring['query'])
'https://services.realestate.com.au/services/listings/search?query=%7B%22channel%22%3A%22buy%22%2C%22page%22%3A2%2C%22pageSize%22%3A12%2C%22filters%22%3A%7B%22agencyIds%22%3A%5B%22CBPHMG%22%5D%7D%7D'
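For comparison, the same encoded URL can be reproduced with the standard library alone. This is a sketch assuming Python 3; `build_search_url` is a hypothetical helper name, not part of Scrapy or w3lib, and it only handles a base URL that has no existing query string.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://services.realestate.com.au/services/listings/search"


def build_search_url(base_url, query):
    """Hypothetical helper: serialize `query` to compact JSON and
    percent-encode it as the `query` parameter of `base_url`."""
    # Compact separators keep spaces out of the value
    # (urlencode would otherwise turn each space into '+').
    payload = json.dumps(query, separators=(",", ":"))
    # urlencode percent-encodes braces, brackets, quotes, colons and commas.
    return base_url + "?" + urlencode({"query": payload})


search = {"channel": "buy", "page": 2, "pageSize": 12,
          "filters": {"agencyIds": ["CBPHMG"]}}
new_url = build_search_url(BASE_URL, search)
print(new_url)
```

Building the value from a dict rather than a hand-escaped string avoids the backslash escaping entirely; the resulting string can then be used as the request URL.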
Here, in the Scrapy shell, fetching the new URL returns a JSON response, as expected:

>>> new_url = add_or_replace_parameter(url, 'query', querystring['query'])
>>> fetch(new_url)
2017-07-18 11:04:45 [scrapy.core.engine] INFO: Spider opened
2017-07-18 11:04:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://services.realestate.com.au/services/listings/search?query=%7B%22channel%22%3A%22buy%22%2C%22page%22%3A2%2C%22pageSize%22%3A12%2C%22filters%22%3A%7B%22agencyIds%22%3A%5B%22CBPHMG%22%5D%7D%7D> (referer: None)


>>> import json
>>> data = json.loads(response.text)
>>> data.keys()
dict_keys(['prettyUrl', 'totalResultsCount', 'resolvedQuery', '_links', 'tieredResults', 'channel'])

>>> from pprint import pprint
>>> pprint(data)
{'_links': {'adCall': {'href': 'https://sasinator.realestate.com.au/rea/hserver/site=rea/area=buy.resultslist/proptype=villa/constructionStatus=established/sub=marsden/state=qld/pcode=4132/region=logan/price=200k_300k/platform={platform}/version={version}/pos={position}/size={size}/viewid={viewId}/random={random}',
                       'templated': True},
            'canonical': {'href': 'http://www.realestate.com.au/buy/by-cbphmg/list-2'},
            'exclusiveShowcaseUrl': {'href': 'https://services.realestate.com.au/services/listings/exclusiveShowcase?query=%7B%22propertyTypes%22:[],%22atlasIds%22:[],%22channel%22:%22buy%22%7D'},
            'neighbourhoodsUrl': {'href': 'http://www.realestate.com.au/neighbourhoods?state=qld'},
            'next': {},
            'ofi': {'href': 'https://services.realestate.com.au/services/listings/ofi/{date}/daytotals?query=%7B%22channel%22:%22buy%22,%22pageSize%22:%2212%22,%22page%22:%222%22,%22filters%22:%7B%22agencyIds%22:%5B%22CBPHMG%22%5D%7D%7D',
                    'templated': True},
            'prettyUrl': {'href': '/buy/by-cbphmg/list-2'},
            'saveSearchUrl': {'href': 'https://www.realestate.com.au/saved-searches/#/save?search=%7B%22channel%22:%22buy%22,%22pageSize%22:%2212%22,%22page%22:%222%22,%22filters%22:%7B%22agencyIds%22:%5B%22CBPHMG%22%5D%7D%7D'},
            'self': {'href': 'https://services.realestate.com.au/services/listings/search?query=%7B%22channel%22:%22buy%22,%22pageSize%22:%2212%22,%22page%22:%222%22,%22filters%22:%7B%22agencyIds%22:%5B%22CBPHMG%22%5D%7D%7D'}},
 'channel': 'buy',
 'prettyUrl': '/buy/by-cbphmg/list-2',
 'resolvedQuery': {'channel': 'buy',
                   'filters': {'agencyIds': ['CBPHMG']},
                   'page': '2',
                   'pageSize': '12'},
 'tieredResults': [{'count': 11,
                    'results': [{...}],
                    'tier': 1}],
 'totalResultsCount': 23}
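When debugging this kind of request, it can help to decode a URL taken from the crawl log back into a dict and confirm that the brackets and braces survived. A minimal sketch, assuming Python 3 and the standard library only; `decode_query` is a hypothetical helper name:

```python
import json
from urllib.parse import urlsplit, parse_qs


def decode_query(url, param="query"):
    """Hypothetical helper: extract a JSON-valued query parameter
    from `url` and parse it back into Python objects."""
    # parse_qs percent-decodes values and returns a list per parameter name.
    values = parse_qs(urlsplit(url).query)
    return json.loads(values[param][0])


# The URL from the "Crawled (200)" log line, split for readability.
crawled = ("https://services.realestate.com.au/services/listings/search"
           "?query=%7B%22channel%22%3A%22buy%22%2C%22page%22%3A2%2C"
           "%22pageSize%22%3A12%2C%22filters%22%3A%7B%22agencyIds%22"
           "%3A%5B%22CBPHMG%22%5D%7D%7D")
decoded = decode_query(crawled)
print(decoded)
```

If `agencyIds` comes back as a list and the nested `filters` dict is intact, the query was encoded correctly on the way out.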