Puzzling formatting magic on a Python API endpoint

python python-3.x api python-requests opendata

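I am writing a wrapper for this API (the Deutsche Bahn fahrplan-plus API used below).

However, I can't seem to get the same result as a simple curl request from the following code: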

>>> import requests
>>> header = {'Authorization': 'Bearer 36e39957ace6f405a82cfb09522d0a8d'}
>>> departure_data = requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/departureBoard/8011160?date=2017-06-30', headers=header)

# Now, using a journey's details id, let's request some journey details from the endpoint
>>> requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/' + departure_data.json()[0]['detailsId'], headers=header)
<Response [404]>
>>> requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/' + departure_data.json()[0]['detailsId'], headers=header).request.url
'https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/782334%2F275830%2F795514%2F136979%2F80%3fstation_evaId%3D8098160'

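This is where the magic happens. The original journey ID

'782334%2F275830%2F795514%2F136979%2F80%3fstation_evaId%3D8098160'

becomes:

'782334%252F275830%252F795514%252F136979%252F80%253fstation_evaId%253D8098160'

and a status is returned (the Python request above got a 404).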

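Seemingly out of nowhere, extra characters are added to my journey ID. I copy and paste it into the given field and nothing more, so I know it isn't me.

I believe some kind of encoding/decoding is going on, but I have never seen this before and, honestly, I don't know what to make of it.

How do I handle this in code? Apparently I need to do something more than simply parse the output of the departureBoard endpoint? Or, better yet, am I just missing something obvious?

I have sent several emails to the DB developers but have not heard back so far.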

What you are seeing is double URL encoding. The percent sign (%) is itself being URL-encoded as the corresponding %25 sequence:

/ -> %2F -> %252F
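The chain above can be reproduced with the standard library. This is only a small sketch; the variable names are illustrative, and safe="" is passed so that "/" is escaped as well (by default urllib.parse.quote leaves "/" alone).

```python
from urllib.parse import quote, unquote

raw = "/"
once = quote(raw, safe="")    # first encoding pass
twice = quote(once, safe="")  # second pass escapes the "%" itself
print(raw, "->", once, "->", twice)

# Decoding peels off exactly one layer per call.
assert unquote(twice) == once
assert unquote(once) == raw
```

Each encoding pass turns every "%" into "%25", which is why an already encoded ID grows longer when it is encoded again.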

Try URL-decoding departure_data.json()[0]['detailsId'] before doing the following:

>>> requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/' + departure_data.json()[0]['detailsId'], headers=header)

For example, like this:

requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/' + urllib.unquote(urllib.unquote(departure_data.json()[0]['detailsId'])), headers=header)

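The API defines four endpoints:

GET /location/{name}
GET /arrivalBoard/{id}
GET /departureBoard/{id}
GET /journeyDetails/{id}

detailsId is the value you can use to hit the /journeyDetails/{id} endpoint. The minimal working code, shown in the listing at the end of this page, therefore calls urllib.parse.quote on it.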

The value of journey_id is itself URL-encoded and decodes to something that looks like a URL fragment:

urllib.parse.unquote(journey_id)
# -> '564552/203236/867650/245641/80?station_evaId=8098160'

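So it looks as if you could simply use the raw value for further requests, but that is a misconception.

Treat the ID as an opaque plain-text value that needs to be encoded, just as you would encode any other arbitrary value before using it in a URL.

When the value is quoted, its percent signs are escaped as %25, which results in the longer value: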

'564552%2F203236%2F867650%2F245641%2F80%3fstation_evaId%3D8098160'
'564552%252F203236%252F867650%252F245641%252F80%253fstation_evaId%253D8098160'
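As a rough sketch of that advice, the ID can be escaped in one place before it is appended to the endpoint. The helper name journey_details_url and the constant BASE_URL are made up for this illustration; only the endpoint URL and the sample ID come from the text above.

```python
from urllib.parse import quote

# Endpoint taken from the question; the constant name is illustrative.
BASE_URL = "https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/"


def journey_details_url(details_id: str) -> str:
    """Build the journeyDetails URL, treating the ID as one opaque path segment."""
    # safe="" also escapes "/", so no character of the ID can alter the URL structure.
    return BASE_URL + quote(details_id, safe="")


details_id = "564552%2F203236%2F867650%2F245641%2F80%3fstation_evaId%3D8098160"
url = journey_details_url(details_id)
print(url)
```

The resulting URL ends with the longer, double-encoded value, and it can be passed to requests.get together with the Authorization header.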

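Since the Deutsche Bahn API is self-documenting via Swagger, it is probably easiest to install a Swagger client and let it create an API wrapper for you. One such client looks useful, but there are plenty of others to try.

That way you can concentrate on making API requests and getting data, while low-level plumbing such as URL encoding, and even authorization, happens transparently in the background.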

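I think what is actually needed is urllib.quote rather than urllib.unquote (it looks like the correct string to put into the URL is encoded twice). Also, in Python 3 these functions are urllib.parse.quote and urllib.parse.unquote.

Nice! I had assumed you were using Python 2, but either way, glad you figured it out.

When you say it has to be urlencoded, is that simply known, or did you read it somewhere in the documentation?

Well, you cannot put a plain-text value into a URL, i.e. build the URL by string concatenation, without risking breaking it. It is still done all the time, and it usually works because the urlencoded version of a value often looks exactly like the plain version. It is always wrong, though, because the URL breaks as soon as the value contains characters that are special in URLs, such as the special characters in this ID. Incorrectly encoded data makes the server misinterpret the request. When I saw that the server's endpoint was GET /journeyDetails/{id}, it was immediately clear to me that {id} is a moving target and that whatever goes into it must be escaped properly.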
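To make the point about string concatenation concrete, here is a small sketch that uses only the standard library and the sample ID from the answer above. It makes no network request and merely parses the two candidate URLs; the variable names are illustrative.

```python
from urllib.parse import quote, unquote, urlsplit

base = "https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/"
details_id = "564552%2F203236%2F867650%2F245641%2F80%3fstation_evaId%3D8098160"

# Concatenating the decoded text: its "/" and "?" become URL structure.
broken = urlsplit(base + unquote(details_id))
# Escaping the ID first: it stays inside a single path segment.
intact = urlsplit(base + quote(details_id, safe=""))

print("broken path: ", broken.path)
print("broken query:", broken.query)
print("intact path: ", intact.path)
print("intact query:", repr(intact.query))
```

With the decoded text, part of the ID ends up as extra path segments and a query string, so the {id} path parameter no longer carries the full value. With the escaped ID, everything stays in one segment and the query is empty.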

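import urllib.parse  # import the submodule explicitly rather than relying on a bare "import urllib"

import requests

header = {'Authorization': 'Bearer 36e39957ace6f405a82cfb09522d0a8d'}
departure_data = requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/departureBoard/8011160?date=2017-06-30', headers=header)

# The detailsId of the first departure is the {id} for /journeyDetails/{id}
journey_id = departure_data.json()[0]['detailsId']
# Quote the ID before appending it: its percent signs are escaped as %25
journey_details = requests.get('https://api.deutschebahn.com/fahrplan-plus/v1/journeyDetails/' + urllib.parse.quote(journey_id), headers=header)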
