Python requests: OSError 104 (ECONNRESET), "Connection broken" error
Tags: python, python-3.x, python-requests, azure-databricks, chunked-encoding

Hi all, I am trying to pull data from an API using Python's requests module. There are about 20000 pages, so the API has to be called 20000 times, and each call returns about 10 MB of data. By the end of the process, it creates a JSON file of about 100 GB. This is the code I have written:
import json
import requests

# url, headers and the initial next_page_cursor are defined earlier (not shown)
with open('file.json', 'wb', buffering=100 * 1048576) as f:
    while next_page_cursor != "":
        with requests.get(url, headers=headers) as response:
            json_response = json.loads(response.content.decode('utf-8'))
            # The JSON response looks something like this:
            # {
            #     content: [{}, {}, {}, ... 50 dictionaries],
            #     next_page_cursor: "abcd"
            # }
            next_page_cursor = json_response['next_page_cursor']
            # Write each record as one JSON document per line
            for data in json_response['content']:
                f.write((json.dumps(data) + "\n").encode())
But after running successfully for a few pages, the code fails with the following error:
Traceback (most recent call last):
  File "<command-1206920060120926>", line 65, in <module>
    with requests.get(data_url, headers = headers) as response:
  File "/databricks/python/lib/python3.7/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/databricks/python/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/databricks/python/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/databricks/python/lib/python3.7/site-packages/requests/sessions.py", line 686, in send
    r.content
  File "/databricks/python/lib/python3.7/site-packages/requests/models.py", line 828, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
  File "/databricks/python/lib/python3.7/site-packages/requests/models.py", line 753, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: OSError("(104, \'ECONNRESET\')")', OSError("(104, 'ECONNRESET')"))
You need to use response.iter_content.
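A minimal sketch of that suggestion, assuming each page (about 10 MB here) still fits in memory. The name fetch_page and its session and chunk_size parameters are illustrative only and are not part of requests. iter_content yields raw bytes, so the chunks are joined and decoded before parsing, which keeps next_page_cursor accessible. Note that streaming by itself may not stop the server or a proxy from resetting the connection.

```python
import json

import requests


def fetch_page(url, headers, session=None, chunk_size=1024 * 1024):
    """Download one page in chunks and return the parsed JSON document.

    `session` is a hypothetical hook: any object with a requests-style
    `get` method (for example a requests.Session) can be passed in.
    """
    http = session or requests
    # stream=True defers the body download until iter_content is consumed.
    with http.get(url, headers=headers, stream=True) as response:
        response.raise_for_status()
        body = bytearray()
        for chunk in response.iter_content(chunk_size=chunk_size):
            body.extend(chunk)
    # The chunks are bytes; joined together they are the original JSON text.
    return json.loads(body.decode("utf-8"))
```

The returned dictionary can then be used exactly like json_response in the question, for example page['next_page_cursor'] and page['content'].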
But this will return the response in binary format. So how do I capture the next page cursor from the binary response?

Ah, sorry, maybe I answered too quickly. Your case looks different: the JSON response tells you whether you have to make the next request, so this may not suit your case. Another thing I can think of is trying a timeout value in requests.get.
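As a rough illustration of the timeout idea, the sketch below passes a timeout to requests.get and retries the same page when the connection is reset mid-transfer. The function name get_json_with_retry, the session hook, and the values for max_attempts, timeout and backoff_seconds are assumptions chosen for the example, not values from the question. Because a page is only written to the file after it has been fetched completely, repeating a failed request should not duplicate records.

```python
import time

import requests


def get_json_with_retry(url, headers, session=None, max_attempts=5,
                        timeout=(10, 300), backoff_seconds=2.0):
    """Fetch one page as JSON, retrying on dropped or stalled connections.

    `timeout` is (connect seconds, read seconds); all defaults here are
    illustrative guesses and would need tuning for a real API.
    """
    http = session or requests
    transient = (
        requests.exceptions.ChunkedEncodingError,  # the error in the traceback
        requests.exceptions.ConnectionError,
        requests.exceptions.Timeout,
    )
    for attempt in range(1, max_attempts + 1):
        try:
            response = http.get(url, headers=headers, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except transient:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait a little longer after each failed attempt.
            time.sleep(backoff_seconds * attempt)
```

In the loop from the question, the requests.get call and the json.loads line could be replaced by json_response = get_json_with_retry(url, headers).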