C++ URL encoding confusion
I am trying to encode a request. The request looks like this:
https://www.overpass-api.de/api/interpreter?data=area["name"="Nicaragua"]["admin_level"="2"]->.boundaryarea;(node["type"="route"]["route"="bus"](area.boundaryarea);way["type"="route"]["route"="bus"](area.boundaryarea);>;relation["type"="route"]["route"="bus"](area.boundaryarea);>>;);out meta;
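As you can see, it contains a lot of special characters. If I pass this URL to curl as-is, curl will not process it because of some of those characters. So I decided to encode the URL, both with my own method and with curl's. Here is the sample code that encodes it with curl: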
std::string d = ...;
CURL *curl = curl_easy_init();
if(curl) {
char *output = curl_easy_escape(curl, d.c_str(), d.length());
if(output) {
printf("Encoded: %s\n", output);
curl_free(output);
}
}
This encodes the entire request, and the result looks like this:
https%3A%2F%2Fwww.overpass-api.de%2Fapi%2Finterpreter%3Fdata%3D ...
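If I hand that to curl, it fails and reports that it could not resolve the host, which makes sense to me. So I decided to check what Chrome does when it encodes the URL, thanks to the developer tools. This is what it looks like: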
https://www.overpass-api.de/api/interpreter?data=area[%22name%22=%22Nicaragua%22][%22admin_level%22=%222%22]-%3E.boundaryarea;(node[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);way[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);%3E;relation[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);%3E%3E;);out%20meta;
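If I give this to curl, it processes it correctly.
Why are some characters encoded while the rest are not? And why does curl accept it in this form?
Edit: More importantly, how can I replicate this in my code?
Do not escape the entire URL as a single string. Escape only the individual parts that actually need escaping, such as the query parameters. Even then, within each name=value pair, escape the name and the value separately as needed; otherwise the = delimiter inside a name=value pair and the & delimiter between pairs will be escaped as well, which is not what you want.
Try something like this: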
std::string query_encode(const std::string &s)
{
std::string ret;
// curl_easy_escape() escapes way more than it needs to in
// a URL Query component! Which is not TECHNICALLY wrong, but
// it won't produce the output you are expecting...
/*
char *output = curl_easy_escape(curl, s.c_str(), s.length());
if (output) {
ret = output;
curl_free(output);
}
*/
#define IS_BETWEEN(ch, low, high) (ch >= low && ch <= high)
#define IS_ALPHA(ch) (IS_BETWEEN(ch, 'A', 'Z') || IS_BETWEEN(ch, 'a', 'z'))
#define IS_DIGIT(ch) IS_BETWEEN(ch, '0', '9')
#define IS_HEXDIG(ch) (IS_DIGIT(ch) || IS_BETWEEN(ch, 'A', 'F') || IS_BETWEEN(ch, 'a', 'f'))
for(size_t i = 0; i < s.size();)
{
char ch = s[i++];
if (IS_ALPHA(ch) || IS_DIGIT(ch))
{
ret += ch;
}
else if ((ch == '%') && (i + 1 < s.size()) && IS_HEXDIG(s[i+0]) && IS_HEXDIG(s[i+1])) // bounds check keeps a trailing '%' from reading past the end
{
ret += s.substr(i-1, 3);
i += 2;
}
else
{
switch (ch)
{
case '-':
case '.':
case '_':
case '~':
case '!':
case '$':
case '&':
case '\'':
case '(':
case ')':
case '*':
case '+':
case ',':
case ';':
case '=':
case ':':
case '@':
case '/':
case '?':
case '[':
case ']':
ret += ch;
break;
default:
{
static const char hex[] = "0123456789ABCDEF";
char pct[] = "% ";
pct[1] = hex[(ch >> 4) & 0xF];
pct[2] = hex[ch & 0xF];
ret.append(pct, 3);
break;
}
}
}
}
return ret;
}
std::string d = "https://www.overpass-api.de/api/interpreter?data=" + query_encode("area[\"name\"=\"Nicaragua\"][\"admin_level\"=\"2\"]->.boundaryarea;(node[\"type\"=\"route\"][\"route\"=\"bus\"](area.boundaryarea);way[\"type\"=\"route\"][\"route\"=\"bus\"](area.boundaryarea);>;relation[\"type\"=\"route\"][\"route\"=\"bus\"](area.boundaryarea);>>;);out meta;");
std::cout << "Encoded: " + d + "\n";
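The earlier advice to escape each name and each value separately, so that the = and & delimiters survive, can be sketched as follows. This is only an illustration under stated assumptions: `escape_component` and `build_query` are hypothetical helper names that do not come from the answer or from libcurl, and the escaper deliberately encodes everything except the RFC 3986 unreserved characters, which is stricter than the `query_encode` function above.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: percent-encode every byte except the RFC 3986
// unreserved characters (A-Z a-z 0-9 - . _ ~).
std::string escape_component(const std::string &s)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char ch : s)
    {
        const bool unreserved =
            (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
            (ch >= '0' && ch <= '9') ||
            ch == '-' || ch == '.' || ch == '_' || ch == '~';
        if (unreserved)
        {
            out += static_cast<char>(ch);
        }
        else
        {
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0xF];
        }
    }
    return out;
}

// Hypothetical helper: escape names and values on their own, then join
// them with literal '=' and '&' so the delimiters are never escaped.
std::string build_query(const std::vector<std::pair<std::string, std::string>> &params)
{
    std::string query;
    for (const auto &p : params)
    {
        if (!query.empty())
        {
            query += '&';
        }
        query += escape_component(p.first);
        query += '=';
        query += escape_component(p.second);
    }
    return query;
}
```

The result would be appended after the "?" of the base URL. A '&' or '=' that appears inside a name or value is escaped, while the delimiters added by `build_query` stay literal.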
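You have to escape the individual parts of the URI. Take a look at what JavaScript's encodeURIComponent function does; that is the way to go.
I am using the function below, which mimics JavaScript's encodeURIComponent, to encode the individual parts: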
std::string encodeURIComponent(std::string const&value)
{
std::ostringstream oss;
oss
The difference from Chrome's encoding is that Chrome leaves the "general delimiters" unencoded (and they do not need to be encoded).
https://www.overpass-api.de/api/interpreter?data=area[%22name%22=%22Nicaragua%22][%22admin_level%22=%222%22]-%3E.boundaryarea;(node[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);way[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);%3E;relation[%22type%22=%22route%22][%22route%22=%22bus%22](area.boundaryarea);%3E%3E;);out%20meta;
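A minimal sketch of what an encodeURIComponent-style encoder could look like in C++ is given below. It is not the answerer's original code: the name `encode_uri_component_sketch` is hypothetical, the input is assumed to be a byte string (for example UTF-8), and the set of characters left untouched (A-Z a-z 0-9 - _ . ! ~ * ' ( )) follows the documented behaviour of JavaScript's encodeURIComponent.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical sketch of an encodeURIComponent-style encoder.
// Every byte outside the unescaped set is written as %XX (uppercase hex).
std::string encode_uri_component_sketch(const std::string &value)
{
    // Punctuation that encodeURIComponent leaves as-is.
    static const std::string kept_punctuation = "-_.!~*'()";

    std::ostringstream oss;
    oss << std::uppercase << std::hex << std::setfill('0');
    for (unsigned char ch : value)
    {
        const bool alnum =
            (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
            (ch >= '0' && ch <= '9');
        if (alnum || kept_punctuation.find(static_cast<char>(ch)) != std::string::npos)
        {
            oss << static_cast<char>(ch);
        }
        else
        {
            // setw applies only to the next field, i.e. the hex value.
            oss << '%' << std::setw(2) << static_cast<int>(ch);
        }
    }
    return oss.str();
}
```

Unlike the Chrome output shown above, this sketch also encodes the brackets, ';' and '=', so it is meant for a single component such as the value of the data parameter, not for a whole URL.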