Java library for URL encoding if necessary (like a browser)

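If I put http://localhost:9000/space test into the address bar of a web browser, it calls the server with the URL http://localhost:9000/space%20test. Likewise, http://localhost:9000/specÁÉÍtest is encoded as http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest.

If I put the already encoded URLs into the address bar (that is, http://localhost:9000/space%20test and http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest), they stay the same (they are not double-encoded).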

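Is there any Java API or library that does this encoding? The URLs come from the user, so I don't know whether they are already encoded.

(If not, would it be enough to search the input string for % and encode only when none is found, or are there special cases where that does not work?)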

Edit:

URLEncoder.encode("space%20test", "UTF-8") returns space%2520test, which is not what I want, because it is double-encoded.
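
The double encoding described above can be reproduced with a few lines. This is a minimal sketch; the class name DoubleEncodingDemo and its helper method are made up for illustration and are not part of the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: shows that URLEncoder escapes every '%' it sees.
final class DoubleEncodingDemo {

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Unencoded input: the space becomes '+' (form encoding), not %20.
        System.out.println(encode("space test"));     // space+test
        // Already encoded input: the '%' itself is escaped to %25.
        System.out.println(encode("space%20test"));   // space%2520test
        // Non-ASCII input is escaped as UTF-8 bytes.
        System.out.println(encode("\u00C1\u00C9\u00CD")); // %C3%81%C3%89%C3%8D
    }
}
```

Note that URLEncoder implements form encoding, so it also differs from the browser for a plain space.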

Edit 2:

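Furthermore, browsers handle partially encoded URLs, such as http://localhost:9000/specÁÉ%C3%8Dtest, well, without double-encoding them. In this case the server receives the following URL: http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest. It is the same as the encoded form of …specÁÉÍtest.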

The standard Java API itself can do URL encoding and decoding.

Try the URLEncoder and URLDecoder classes.

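To encode text so that it can pass safely through the internet:

import java.io.UnsupportedEncodingException;
import java.net.*;
...
try {
    encodedValue = URLEncoder.encode(rawValue, "UTF-8");
} catch (UnsupportedEncodingException uee) {
    // Cannot happen: every Java platform supports UTF-8
    throw new AssertionError(uee);
}

And to decode:

try {
    decodedValue = URLDecoder.decode(encodedValue, "UTF-8");
} catch (UnsupportedEncodingException uee) {
    throw new AssertionError(uee);
}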

Use Java's URLEncoder.

Note: encoding a complete URL gives an undesired result; for example, http:// is encoded as http%3A%2F%2F.
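
One way to avoid escaping the scheme and the slashes is to encode a single component only. The sketch below is not from the answer: it uses the multi-argument java.net.URI constructor as an alternative, and the class name PathOnlyEncoding and the method build are hypothetical. Because these constructors always quote a '%' character, this approach suits only input that is known to be unencoded.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: escapes illegal characters in the path only.
final class PathOnlyEncoding {

    static String build(String scheme, String authority, String rawPath)
            throws URISyntaxException {
        // The 5-argument constructor quotes illegal characters per component;
        // toASCIIString() additionally escapes non-ASCII characters as UTF-8.
        return new URI(scheme, authority, rawPath, null, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // http://localhost:9000/space%20test
        System.out.println(build("http", "localhost:9000", "/space test"));
        // http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest
        System.out.println(build("http", "localhost:9000", "/spec\u00C1\u00C9\u00CDtest"));
        // An existing escape is quoted again: http://localhost:9000/space%2520test
        System.out.println(build("http", "localhost:9000", "/space%20test"));
    }
}
```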

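Edit: to prevent encoding a URL twice, you could check whether the URL contains a %, since that character is only valid as part of an escape. However, if the user has messed up the encoding (for example, by encoding the URL only partially, or by using a % in the URL that is not part of an escape), there is not much this method can do.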
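
The check described above could look roughly like this. It is only a heuristic sketch: the class PercentCheck and both method names are invented here, and no check of this kind can tell whether a user meant a well-formed sequence such as %20 literally.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating the "look for %" heuristic.
final class PercentCheck {

    // Matches a '%' that is NOT followed by two hexadecimal digits.
    private static final Pattern BAD_PERCENT =
            Pattern.compile("%(?![0-9A-Fa-f]{2})");

    // True if every '%' in the string starts a well-formed %XX escape.
    static boolean hasOnlyValidEscapes(String s) {
        return !BAD_PERCENT.matcher(s).find();
    }

    // True if the string contains at least one escape and no stray '%'.
    static boolean looksEncoded(String s) {
        return s.indexOf('%') >= 0 && hasOnlyValidEscapes(s);
    }
}
```

For example, looksEncoded("space%20test") is true and looksEncoded("100%") is false. A partially encoded string such as specÁÉ%C3%8Dtest also passes the check even though it still contains characters that need escaping, which is the limitation the answer points out.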

Why do I need URL encoding?

The URL specification RFC 1738 specifies that only a small set of characters 
can be used in a URL. Those characters are:

A to Z (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
a to z (abcdefghijklmnopqrstuvwxyz)
0 to 9 (0123456789)
$ (Dollar Sign)
- (Hyphen / Dash)
_ (Underscore)
. (Period)
+ (Plus sign)
! (Exclamation / Bang)
* (Asterisk / Star)
' (Single Quote)
( (Open Bracket)
) (Closing Bracket)

How does URL encoding work?

All offending characters are replaced by a % and a two digit hexadecimal value 
that represents the character in the proper ISO character set. Here are a 
couple of examples:

$ (Dollar Sign) becomes %24
& (Ampersand) becomes %26
+ (Plus) becomes %2B
, (Comma) becomes %2C
: (Colon) becomes %3A
; (Semi-Colon) becomes %3B
= (Equals) becomes %3D
? (Question Mark) becomes %3F
@ (Commercial A / At) becomes %40
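
The replacement rule quoted above can be illustrated in a few lines. This sketch assumes UTF-8 as the character set, which matches the browser output shown in the question (the quoted text speaks of an ISO character set). The class PercentEncodeDemo is hypothetical, and it deliberately escapes every byte, whereas a real encoder leaves the permitted characters alone.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: turns each UTF-8 byte of the input into %XX.
final class PercentEncodeDemo {

    static String percentEncodeAll(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so that bytes above 0x7F print as two hex digits.
            out.append('%').append(String.format("%02X", b & 0xFF));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncodeAll("$"));      // %24
        System.out.println(percentEncodeAll("@"));      // %40
        System.out.println(percentEncodeAll("\u00C1")); // %C3%81
    }
}
```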

A simple example:

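import java.util.logging.Level;
import java.util.logging.Logger;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class TextHelper {
    // The bundled JavaScript engine (Nashorn) was removed in JDK 15, so on
    // newer JDKs getEngineByName returns null unless an engine is added.
    private static final ScriptEngine engine = new ScriptEngineManager()
        .getEngineByName("JavaScript");

    /**
     * Escapes a string with JavaScript's escape(), then restores the URL
     * delimiter characters :/;@<>=&%$#+,? so that they stay unescaped.
     * Note that escape() turns Á into %C1, not the UTF-8 form %C3%81.
     *
     * @param str the string to encode
     * @return the encoded result, or null if the script fails
     */
    public static String escapeJavascript(String str) {
        try {
            // Bind the input as a variable so that quotes or backslashes in
            // it cannot break the script.
            engine.put("input", str.replace("%20", " "));
            return engine.eval("escape(input)").toString()
                // replace() is literal; replaceAll() treats "$" in the
                // replacement as a group reference and fails on it.
                .replace("%3A", ":")
                .replace("%2F", "/")
                .replace("%3B", ";")
                .replace("%40", "@")
                .replace("%3C", "<")
                .replace("%3E", ">")
                .replace("%3D", "=")
                .replace("%26", "&")
                .replace("%25", "%")
                .replace("%24", "$")
                .replace("%23", "#")
                .replace("%2B", "+")
                .replace("%2C", ",")
                .replace("%3F", "?");
        } catch (ScriptException ex) {
            Logger.getLogger(TextHelper.class.getName())
                .log(Level.SEVERE, null, ex);
            return null;
        }
    }
}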

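Finally, I checked what Firefox and Chrome do. I requested the same URL in both browsers and captured the HTTP request with netcat (nc -l -p 9000).

Chrome:

GET /!%22$%&'()*+,-./:;%3C=%3E?@[\]^_`{|}~%7F HTTP/1.1

Firefox encodes more characters than Chrome. Here it is in a table: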

Char | Hex    | Dec     | Encoded by
-----------------------------------------
"    | %22    | 34      | Firefox, Chrome
'    | %27    | 39      | Firefox
<    | %3C    | 60      | Firefox, Chrome
>    | %3E    | 62      | Firefox, Chrome
`    | %60    | 96      | Firefox
     | %7F    | 127     | Firefox, Chrome

Because it uses URLEncoder.encode, it handles the ÁÉÍ characters as well as ASCII characters.
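
The utility this answer refers to is not reproduced here. As a rough illustration only, and not the author's code, the following sketch shows one way to encode "only if necessary": existing %XX escapes are copied through, a fixed set of ASCII characters is kept, and everything else is escaped as UTF-8 bytes. Unlike the answer's utility it does not call URLEncoder.encode. The class name LenientUrlEncoder and the KEEP set (assumed here to be the RFC 3986 unreserved and reserved characters) are assumptions, and the result does not replicate either browser exactly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of an encoder that leaves existing escapes alone.
final class LenientUrlEncoder {

    // ASCII characters copied through unchanged (an assumed set).
    private static final String KEEP =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
            + "-._~:/?#[]@!$&'()*+,;=";

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    static String encode(String url) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < url.length()) {
            char c = url.charAt(i);
            if (c == '%' && i + 2 < url.length()
                    && isHex(url.charAt(i + 1)) && isHex(url.charAt(i + 2))) {
                out.append(url, i, i + 3); // well-formed escape: keep as is
                i += 3;
            } else if (KEEP.indexOf(c) >= 0) {
                out.append(c);
                i++;
            } else {
                // Everything else, including a stray '%', becomes %XX bytes.
                int cp = url.codePointAt(i);
                String s = new String(Character.toChars(cp));
                for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
                i += Character.charCount(cp);
            }
        }
        return out.toString();
    }
}
```

Because the output contains only kept characters and well-formed escapes, running encode on its own result changes nothing, so an already encoded URL is not double-encoded.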

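Here is a Scala snippet. This encoder encodes the non-ASCII characters and the reserved characters in a URL. In addition, because the operation is idempotent, the URL is not double-encoded.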

import java.net.URL
import scala.util.parsing.combinator.RegexParsers

object IdempotentUrlEncoder extends RegexParsers {
  override def skipWhitespace = false
  private def segment = rep(char)
  private def char = unreserved | escape | any ^^ { java.net.URLEncoder.encode(_, "UTF-8") }
  private def unreserved = """[A-Za-z0-9.~!$&'()*+,;=:@-]""".r
  private def escape = """%[A-Fa-f0-9]{2}""".r
  private def any = """.""".r
  private def encodeSegment(input: String): String = parseAll(segment, input).get.mkString
  private def encodeSearch(input: String): String = encodeSegment(input)

  def encode(url: String): String = {
    val u = new URL(url)
    val path = u.getPath.split("/").map(encodeSegment).mkString("/")
    val query = u.getQuery match {
      case null => ""
      case q: String => "?" + encodeSearch(q)
    }
    val hash = u.getRef match {
      case null => ""
      case h: String => "#" + encodeSegment(h)
    }
    s"${u.getProtocol}://${u.getAuthority}$path$query$hash"
  }
}

Example usage (test code):

import org.scalatest.{FunSuite, Matchers}

class IdempotentUrlEncoderSpec extends FunSuite with Matchers {
  import IdempotentUrlEncoder._

  test("Idempotent operation") {
    val url = "http://ja.wikipedia.org/wiki/文字"
    assert(encode(url) == encode(encode(url)))
  }

  test("Segment encoding") {
    encode("http://ja.wikipedia.org/wiki/文字")
      .shouldBe("http://ja.wikipedia.org/wiki/%E6%96%87%E5%AD%97")
  }

  test("Query string encoding") {
    encode("http://qiita.com/search?utf8=✓&sort=rel&q=開発&sort=rel")
      .shouldBe("http://qiita.com/search?utf8=%E2%9C%93&