HTTP URL Address Encoding in Java
My Java standalone application gets a URL (which points to a file) from the user, and I need to hit it and download it. The problem I am facing is that I am not able to encode the HTTP URL address properly. Example:
URL: http://search.barnesandnoble.com/booksearch/first book.pdf
java.net.URLEncoder.encode(url.toString(), "ISO-8859-1");
returns me:
http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst+book.pdf
But what I want is
http://search.barnesandnoble.com/booksearch/first%20book.pdf
(space replaced by %20)
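I guess URLEncoder is not designed to encode HTTP URLs. The JavaDoc says "Utility class for HTML form encoding". Is there any other way to do this?
Yes, URL encoding is going to encode that string so that it can be passed properly in a URL to a final destination. For example, you could not have [...]. URL-encoding the parameter would fix that parameter value.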
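So I have two options for you:
FYI, you need commons-codec and commons-logging to use this method at runtime.
URLEncoder can encode HTTP URLs just fine, as you've unfortunately discovered. The string you passed in (the "first book.pdf" URL) was correctly and completely encoded into URL-encoded form. You could pass that entire long string of gobbledygook that you got back as a parameter in a URL, and it could be decoded back into exactly the string you passed in.
It sounds like you want to do something a little different from passing the entire URL as a parameter. From what I gather, you're trying to create a search URL whose last part is whatever the user passes in. The only thing that you need to encode is that "whateverTheUserPassesIn" bit, so perhaps all you need to do is something like this: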
String url = "http://search.barnesandnoble.com/booksearch/" +
URLEncoder.encode(userInput,"UTF-8");
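That should produce something rather more valid for you. (Note that URLEncoder still turns a space into "+" rather than "%20".)
The URI class can help; in the documentation of URL you find:
Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use URI.
Use one of the constructors with more than one argument, like: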
URI uri = new URI(
"http",
"search.barnesandnoble.com",
"/booksearch/first book.pdf",
null);
URL url = uri.toURL();
//or String request = uri.toString();
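(The single-argument constructor of URI does NOT escape illegal characters.)
The above code escapes only illegal characters; it does NOT escape non-ASCII characters (see fatih's comment).
The toASCIIString method can be used to get a String containing only US-ASCII characters: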
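To make the difference between the constructors concrete, here is a minimal sketch using the URL from the question. The class and method names (`UriConstructorDemo`, `singleArgRejects`, `multiArg`) are made up for illustration and are not part of any library.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriConstructorDemo {
    /** Returns true if the single-argument constructor rejects the raw string. */
    static boolean singleArgRejects(String raw) {
        try {
            new URI(raw); // parses the string as-is; nothing is escaped
            return false;
        } catch (URISyntaxException e) {
            return true;
        }
    }

    /** Builds the URI from components; illegal characters in the path are quoted. */
    static String multiArg(String host, String path) {
        try {
            return new URI("http", host, path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "http://search.barnesandnoble.com/booksearch/first book.pdf";
        System.out.println(singleArgRejects(raw)); // true: the space is illegal
        System.out.println(multiArg("search.barnesandnoble.com", "/booksearch/first book.pdf"));
        // http://search.barnesandnoble.com/booksearch/first%20book.pdf
    }
}
```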
URI uri = new URI(
"http",
"search.barnesandnoble.com",
"/booksearch/é",
null);
String request = uri.toASCIIString();
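As a rough illustration of the difference between toString and toASCIIString, the sketch below builds the same URI and prints both forms. The class name `AsciiStringDemo` and the helper `build` are hypothetical; the "é" is written as a Unicode escape only so the result does not depend on the source file encoding.

```java
import java.net.URI;
import java.net.URISyntaxException;

class AsciiStringDemo {
    /** Builds http://host/path from components, quoting illegal characters. */
    static URI build(String host, String path) {
        try {
            return new URI("http", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is "é"
        URI uri = build("search.barnesandnoble.com", "/booksearch/\u00e9");
        System.out.println(uri.toString());      // the non-ASCII character is kept as-is
        System.out.println(uri.toASCIIString()); // http://search.barnesandnoble.com/booksearch/%C3%A9
    }
}
```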
For a URL with a query like http://www.google.com/ig/api?weather=São Paulo, use the 5-parameter version of the constructor:
URI uri = new URI(
"http",
"www.google.com",
"/ig/api",
"weather=São Paulo",
null);
String request = uri.toASCIIString();
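Please be warned that most of the answers above are INCORRECT.
The URLEncoder class, despite its name, is NOT what is needed here. It's unfortunate that Sun named this class so annoyingly. URLEncoder is meant for passing data as parameters, not for encoding the URL itself.
In other words, "http://search.barnesandnoble.com/booksearch/first book.pdf" is the URL. Parameters would be, for example, "http://search.barnesandnoble.com/booksearch/first book.pdf?parameter1=this&param2=that". The parameters are what you would use URLEncoder on.
The following two examples highlight the differences between the two.
The following produces the wrong parameters, according to the HTTP standard. Note that the ampersand (&) and plus (+) are encoded incorrectly.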
uri = new URI("http", null, "www.google.com", 80,
"/help/me/book name+me/", "MY CRZY QUERY! +&+ :)", null);
// URI: http://www.google.com:80/help/me/book%20name+me/?MY%20CRZY%20QUERY!%20+&+%20:)
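The following will produce the correct parameters, with the query properly encoded. Note the spaces, ampersands, and plus marks. Be aware, though, that the multi-argument URI constructors always quote the percent character, so the output of URLEncoder is encoded a second time here (%21 becomes %2521), as the result shows.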
uri = new URI("http", null, "www.google.com", 80, "/help/me/book name+me/", URLEncoder.encode("MY CRZY QUERY! +&+ :)", "UTF-8"), null);
// URI: http://www.google.com:80/help/me/book%20name+me/?MY+CRZY+QUERY%2521+%252B%2526%252B+%253A%2529
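One way to avoid encoding the query twice is to let URI escape only the scheme, host, port and path, and to append the URLEncoder output afterwards as plain text. The following is a sketch under that assumption, not a definitive recipe; the class `QueryStringDemo`, its methods, and the parameter name `q` are invented for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class QueryStringDemo {
    /** Form-encodes one parameter name or value as UTF-8. */
    static String encodeParam(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Escapes host/port/path with URI, then appends the already-encoded query. */
    static String build(String host, int port, String path, String name, String value) {
        try {
            // Query and fragment are null here, so URI never sees the '%' signs
            String base = new URI("http", null, host, port, path, null, null).toASCIIString();
            return base + "?" + encodeParam(name) + "=" + encodeParam(value);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("www.google.com", 80, "/help/me/book name+me/",
                "q", "MY CRZY QUERY! +&+ :)"));
        // http://www.google.com:80/help/me/book%20name+me/?q=MY+CRZY+QUERY%21+%2B%26%2B+%3A%29
    }
}
```

The path keeps its literal "+" because "+" is a legal path character, while the same character inside the parameter value becomes %2B.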
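If you have an encoded "/" (%2F) in your URL, there is still a problem.
RFC 3986, Section 2.2 says: "If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed."
But there is an issue with Tomcat:
- Fixed in Apache Tomcat 6.0.10
Important: Directory traversal CVE-2007-0450
Tomcat permits '\', '%2F' and '%5C' [...]. The following Java system properties have been added to Tomcat to provide additional control of the handling of path delimiters in URLs (both options default to false):
- org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH: true|false
- org.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH: true|false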
set JAVA_OPTS=%JAVA_OPTS% %LOGGING_CONFIG% -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true
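To see why an encoded slash is a special case, the sketch below compares the raw and the decoded path of a URI that contains %2F. It only illustrates java.net.URI and says nothing about what Tomcat does; the host example.com, the path, and the class name `EncodedSlashDemo` are assumptions made for this example.

```java
import java.net.URI;

class EncodedSlashDemo {
    /** The path exactly as written, with escapes intact. */
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    /** The path after percent-decoding; %2F becomes indistinguishable from "/". */
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String u = "http://example.com/files/a%2Fb/c";
        System.out.println(rawPath(u));     // /files/a%2Fb/c  (two segments after "files")
        System.out.println(decodedPath(u)); // /files/a/b/c    (looks like three segments)
    }
}
```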
The solution I developed is much more stable than any other:
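import java.nio.charset.StandardCharsets;

public class URLParamEncoder {

    public static String encode(String input) {
        StringBuilder resultStr = new StringBuilder();
        // Work on the UTF-8 bytes so that non-ASCII characters become valid %XX sequences
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int ch = b & 0xFF;
            if (isUnsafe(ch)) {
                resultStr.append('%');
                resultStr.append(toHex(ch / 16));
                resultStr.append(toHex(ch % 16));
            } else {
                resultStr.append((char) ch);
            }
        }
        return resultStr.toString();
    }

    // Maps a value in 0..15 to its uppercase hexadecimal digit
    private static char toHex(int ch) {
        return (char) (ch < 10 ? '0' + ch : 'A' + ch - 10);
    }

    // Every non-ASCII byte and the listed ASCII characters are percent-encoded
    private static boolean isUnsafe(int ch) {
        if (ch >= 128)
            return true;
        return " %$&+,/:;=?@<>#%".indexOf(ch) >= 0;
    }
}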
I created a new project to help construct HTTP URLs:
new UrlBuilder("search.barnesandnoble.com", "booksearch/first book.pdf").toString()
// Requires: import java.nio.charset.StandardCharsets;
public static String encodeURLComponent(final String s)
{
    if (s == null)
    {
        return "";
    }
    final StringBuilder sb = new StringBuilder();
    // Encode the UTF-8 bytes of the whole string, so that characters made of
    // surrogate pairs are handled correctly as well
    for (final byte b : s.getBytes(StandardCharsets.UTF_8))
    {
        final char c = (char) (b & 0xff);
        if (((c >= 'A') && (c <= 'Z')) || ((c >= 'a') && (c <= 'z')) ||
            ((c >= '0') && (c <= '9')) ||
            (c == '-') || (c == '.') || (c == '_') || (c == '~'))
        {
            // RFC 3986 unreserved characters pass through unchanged
            sb.append(c);
        }
        else
        {
            sb.append('%');
            sb.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xf, 16)));
            sb.append(Character.toUpperCase(Character.forDigit(b & 0xf, 16)));
        }
    }
    return sb.toString();
}
URI uri = new URI(
"http",
null, // this is for userInfo
"www.google.com",
8080, // port number as int
"/ig/api",
"weather=São Paulo",
null);
String request = uri.toASCIIString();
String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();
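The snippet above can be wrapped into a small helper to see what it does with a URL that is already partly encoded. This is only a sketch: `ReassembleDemo` and `escape` are hypothetical names, example.com is a placeholder host, and `new URL(String)` is kept because the snippet uses it even though recent JDKs mark it as deprecated.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class ReassembleDemo {
    /** Splits the raw string with URL and re-assembles it with the 7-argument URI constructor. */
    static String escape(String raw) {
        try {
            URL url = new URL(raw); // lenient: accepts the unescaped spaces
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(),
                    url.getPath(), url.getQuery(), url.getRef());
            return uri.toString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
        // http://abc.dev.domain.com/0007AC/ads/800x480%2015sec%20h.264.mp4

        // An input that is already escaped gets escaped again, because the
        // multi-argument constructors always quote '%'
        System.out.println(escape("http://example.com/a%20b"));
        // http://example.com/a%2520b
    }
}
```

The second call shows why the approach is only safe for input that is known to be unencoded.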
public URL convertToURLEscapingIllegalCharacters(String string) {
    try {
        String decodedURL = URLDecoder.decode(string, "UTF-8");
        URL url = new URL(decodedURL);
        URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
        return uri.toURL();
    } catch (Exception ex) {
        ex.printStackTrace();
        return null;
    }
}
String retVal = "";
try {
retVal = URLEncoder.encode(in_, "UTF8");
} catch (UnsupportedEncodingException ex) {
Log.get().exception(Log.Level.Error, "urlEncode ", ex);
}
return retVal;
UriUtils.encodeUri(input, "UTF-8")
/**
* Encode URL (except :, /, ?, &, =, ... characters)
* @param url to encode
* @param encodingCharset url encoding charset
* @return encoded URL
* @throws UnsupportedEncodingException
*/
public static String encodeUrl (String url, String encodingCharset) throws UnsupportedEncodingException{
return new URLCodec().encode(url, encodingCharset).replace("%3A", ":").replace("%2F", "/").replace("%3F", "?").replace("%3D", "=").replace("%26", "&");
}
String urlToEncode = "http://www.growup.com/folder/intérieur-à_vendre?o=4";
Utils.encodeUrl(urlToEncode, "UTF-8");
public static URL convertToURLEscapingIllegalCharacters(String toEscape) throws MalformedURLException, URISyntaxException {
URL url = new URL(toEscape);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
//if a % is included in the toEscape string, it will be re-encoded to %25 and we don't want re-encoding, just encoding
return new URL(uri.toString().replace("%25", "%"));
}
android.net.Uri.encode(urlString, ":/");
/**
 * Percent-encodes a string so it's suitable for use in a URL Path (not a query string / form encode, which uses + for spaces, etc)
 */
public static String percentEncode(String encodeMe) {
    if (encodeMe == null) {
        return "";
    }
    String encoded = encodeMe.replace("%", "%25");
    encoded = encoded.replace(" ", "%20");
    encoded = encoded.replace("!", "%21");
    encoded = encoded.replace("#", "%23");
    encoded = encoded.replace("$", "%24");
    encoded = encoded.replace("&", "%26");
    encoded = encoded.replace("'", "%27");
    encoded = encoded.replace("(", "%28");
    encoded = encoded.replace(")", "%29");
    encoded = encoded.replace("*", "%2A");
    encoded = encoded.replace("+", "%2B");
    encoded = encoded.replace(",", "%2C");
    encoded = encoded.replace("/", "%2F");
    encoded = encoded.replace(":", "%3A");
    encoded = encoded.replace(";", "%3B");
    encoded = encoded.replace("=", "%3D");
    encoded = encoded.replace("?", "%3F");
    encoded = encoded.replace("@", "%40");
    encoded = encoded.replace("[", "%5B");
    encoded = encoded.replace("]", "%5D");
    return encoded;
}
/**
 * Percent-decodes a string, such as used in a URL Path (not a query string / form encode, which uses + for spaces, etc)
 */
public static String percentDecode(String encodeMe) {
    if (encodeMe == null) {
        return "";
    }
    String decoded = encodeMe.replace("%21", "!");
    decoded = decoded.replace("%20", " ");
    decoded = decoded.replace("%23", "#");
    decoded = decoded.replace("%24", "$");
    decoded = decoded.replace("%26", "&");
    decoded = decoded.replace("%27", "'");
    decoded = decoded.replace("%28", "(");
    decoded = decoded.replace("%29", ")");
    decoded = decoded.replace("%2A", "*");
    decoded = decoded.replace("%2B", "+");
    decoded = decoded.replace("%2C", ",");
    decoded = decoded.replace("%2F", "/");
    decoded = decoded.replace("%3A", ":");
    decoded = decoded.replace("%3B", ";");
    decoded = decoded.replace("%3D", "=");
    decoded = decoded.replace("%3F", "?");
    decoded = decoded.replace("%40", "@");
    decoded = decoded.replace("%5B", "[");
    decoded = decoded.replace("%5D", "]");
    // "%25" must be decoded last so that a literal "%" is never decoded twice
    decoded = decoded.replace("%25", "%");
    return decoded;
}
@Test
public void testPercentEncode_Decode() {
    assertEquals("", percentDecode(percentEncode(null)));
    assertEquals("", percentDecode(percentEncode("")));
    assertEquals("!", percentDecode(percentEncode("!")));
    assertEquals("#", percentDecode(percentEncode("#")));
    assertEquals("$", percentDecode(percentEncode("$")));
    assertEquals("@", percentDecode(percentEncode("@")));
    assertEquals("&", percentDecode(percentEncode("&")));
    assertEquals("'", percentDecode(percentEncode("'")));
    assertEquals("(", percentDecode(percentEncode("(")));
    assertEquals(")", percentDecode(percentEncode(")")));
    assertEquals("*", percentDecode(percentEncode("*")));
    assertEquals("+", percentDecode(percentEncode("+")));
    assertEquals(",", percentDecode(percentEncode(",")));
    assertEquals("/", percentDecode(percentEncode("/")));
    assertEquals(":", percentDecode(percentEncode(":")));
    assertEquals(";", percentDecode(percentEncode(";")));
    assertEquals("=", percentDecode(percentEncode("=")));
    assertEquals("?", percentDecode(percentEncode("?")));
    assertEquals("@", percentDecode(percentEncode("@")));
    assertEquals("[", percentDecode(percentEncode("[")));
    assertEquals("]", percentDecode(percentEncode("]")));
    assertEquals(" ", percentDecode(percentEncode(" ")));
    // Get a little complex
    assertEquals("[]]", percentDecode(percentEncode("[]]")));
    assertEquals("a=d%*", percentDecode(percentEncode("a=d%*")));
    assertEquals(") (", percentDecode(percentEncode(") (")));
    assertEquals("%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25",
            percentEncode("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %"));
    assertEquals("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %", percentDecode(
            "%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25"));
    assertEquals("%23456", percentDecode(percentEncode("%23456")));
}
// Requires: java.nio.charset.StandardCharsets, java.util.Arrays, java.util.HashSet
/***
 * Replaces any byte of the UTF-8 form of s that is not specifically an
 * unreserved character with the equivalent percent sequence.
 * @param s the string to encode
 * @return the percent-encoded string
 */
public static String encodeURIcomponent(String s)
{
    StringBuilder o = new StringBuilder();
    // Encode the UTF-8 bytes, so non-ASCII characters yield valid %XX pairs
    for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
        char ch = (char) (b & 0xFF);
        if (isSafe(ch)) {
            o.append(ch);
        }
        else {
            o.append('%');
            o.append(toHex(ch / 16));
            o.append(toHex(ch % 16));
        }
    }
    return o.toString();
}

// Maps a value in 0..15 to its uppercase hexadecimal digit
private static char toHex(int ch)
{
    return (char)(ch < 10 ? '0' + ch : 'A' + ch - 10);
}

// https://tools.ietf.org/html/rfc3986#section-2.3
public static final HashSet<Character> UnreservedChars = new HashSet<Character>(Arrays.asList(
    'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
    'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
    '0','1','2','3','4','5','6','7','8','9',
    '-','_','.','~'));

public static boolean isSafe(char ch)
{
    return UnreservedChars.contains(ch);
}
org.apache.commons.text.StringEscapeUtils.escapeHtml4("my text % & < >");
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.8</version>
</dependency>