Python: dynamically assembling a Scrapy GET request string
I've been using Firebug, and I have the following dictionary to query an API:
url = "htp://my_url.aspx#top"
querystring = {"dbkey":"x1","stype":"id","s":"27"}
headers = {
'accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
'upgrade-insecure-requests': "1",
'user-agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.125",
}
With python-requests this is straightforward:
import requests
response = requests.request("GET", url, headers=headers, params=querystring)
print(response.text)
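Under the hood, requests simply url-encodes the params dict and appends it to the URL after a "?". The same query string can be produced directly with the standard library (shown here with Python 3's urllib.parse; in Python 2 the same function lives at urllib.urlencode):

```python
from urllib.parse import urlencode

querystring = {"dbkey": "x1", "stype": "id", "s": "27"}

# urlencode turns the dict into a "key=value&key=value" string,
# which is what requests appends to the URL for you
qs = urlencode(querystring)
print(qs)  # dbkey=x1&stype=id&s=27
```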
How can I do the same in Scrapy? From my reading, I know the following works for a POST:
r = Request(my_url, method="post", headers= headers, body=payload, callback=self.parse_method)
I tried:
r = Request("GET", url, headers=headers, body=querystring, callback=self.parse_third_request)
and I get:
r = Request("GET", url, headers=headers, body=querystring, callback=self.parse_third_request)
TypeError: __init__() got multiple values for keyword argument 'callback'
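The error comes from positional-argument mapping: Request's first two positional parameters are url and callback, so the literal "GET" lands in the url slot and the real url lands in the callback slot, which then collides with the callback= keyword. A toy stand-in function with the same leading parameters (an illustration, not Scrapy's actual class) reproduces the same TypeError:

```python
# A toy stand-in mimicking the leading parameters of scrapy.http.Request;
# this is only an illustration of the argument mapping, not Scrapy itself.
def request(url, callback=None, method="GET", headers=None, body=None):
    return url, callback

try:
    # "GET" fills url, the url string fills callback --
    # then callback= supplies callback a second time
    request("GET", "http://example.com", callback="parse_third_request")
except TypeError as e:
    print(e)  # ... got multiple values for argument 'callback'
```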
Edit:
Changed it to:
r = Request(method="GET", url=url, headers=headers, body=querystring, callback=self.parse_third_request)
Now I get:
File "C:\envs\r2\tutorial\tutorial\spiders\parker_spider.py", line 90, in parse_second_request
r = Request(method="GET", url=url, headers=headers, body=querystring, callback=self.parse_third_request)
File "C:\envs\virtalenvs\teat\lib\site-packages\scrapy\http\request\__init__.py", line 26, in __init__
self._set_body(body)
File "C:\envs\virtalenvs\teat\lib\site-packages\scrapy\http\request\__init__.py", line 68, in _set_body
self._body = to_bytes(body, self.encoding)
File "C:\envs\virtalenvs\teat\lib\site-packages\scrapy\utils\python.py", line 117, in to_bytes
'object, got %s' % type(text).__name__)
TypeError: to_bytes must receive a unicode, str or bytes object, got dict
Edit 2:
I now have:
yield Request(method="GET", url=url, headers=headers, body=urllib.urlencode(querystring), callback=self.parse_third_request)
def parse_third_request(self, response):
from scrapy.shell import inspect_response
inspect_response(response, self)
print("hi")
return None
There are no errors, but in the shell, when I do "response.url", I only get the base url without the parameters.

Take a look at the signature of Request's __init__ method:

class scrapy.http.Request(url[, callback, method='GET', headers, body, cookies, meta, encoding='utf-8', priority=0, dont_filter=False, errback])

In your case the positional arguments "GET" and url were mapped onto url and callback, so callback ended up receiving two values. Use the method keyword argument instead (it defaults to "GET" anyway):

r = Request(url, method="GET", headers=headers, body=querystring, callback=self.parse_third_request)

Thanks; the problem now seems to be that I'm passing a dict for the params. Is there a simple fix?
@user61629 Ah yes, you can use json.dumps(querystring). Don't forget to import json.
@user61629 You know, that last suggestion may not be right. I think you should use urllib.urlencode(querystring) instead. Let me know what works for you.
Thanks.
@user61629 OK, I'm afraid you can't use body here; try something like url + '?' + urllib.urlencode(querystring) instead. Let me know if it works.
The problem now is that it produces something of the form htp://my_url.aspx#top?s=27&s=d&dbkey=jkl, and the #top should be at the end. Maybe I should use requests to form the url.
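The thread ends on the fragment problem: naively concatenating url + '?' + urlencode(...) puts the query after the "#top" fragment, but in a URL the fragment must come last. A minimal sketch of a fix using only the standard library (Python 3 names; a well-formed http:// stand-in is used in place of the asker's redacted url) is to split the URL, set its query component, and reassemble it, which keeps the fragment at the end. The resulting url can then be passed to a plain scrapy Request; alternatively, Scrapy's FormRequest with method='GET' and formdata= is documented to append the encoded form data to the URL as a query string.

```python
from urllib.parse import urlsplit, urlunsplit, urlencode

def add_query(url, params):
    """Insert a url-encoded query into `url`, keeping any #fragment last."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    # replace (or set) the query component; the fragment stays at the end
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))

url = "http://my_url.aspx#top"  # hypothetical stand-in for the asker's url
querystring = {"dbkey": "x1", "stype": "id", "s": "27"}
print(add_query(url, querystring))
# http://my_url.aspx?dbkey=x1&stype=id&s=27#top
```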