Java servlet request parameter character encoding

java servlets character-encoding

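I have a Java servlet that receives data from an upstream system via an HTTP GET request. The request includes a parameter named "text". If the upstream system sets this parameter to: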

TEST3 please ignore:

it appears in the upstream system's log as:

00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c   //TEST3 pl
00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e   //ease ign
00 6f 00 72 00 65 00 3a                           //ore:

(The // comments do not actually appear in the log.)

In my servlet, I read this parameter with:

String text = request.getParameter("text");

If I print the value of text to the console, it appears as:

T E S T 3  p l e a s e  i g n o r e :

If I inspect the value of text in the debugger, it appears as:

\u0000T\u0000E\u0000S\u0000T\u00003\u0000 \u0000p\u0000l\u0000e\u0000a\u0000s\u0000e\u0000 \u0000i\u0000g\u0000n\u0000o\u0000r\u0000e\u0000:

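So there seems to be a character encoding problem. The upstream system is supposed to use UTF-16. My guess is that the servlet assumes UTF-8 and therefore reads twice as many characters as it should. For the message "TEST3 please ignore:", the first byte of each character is 00. When the servlet reads it, that byte is interpreted as a space, which explains the space that appears before each character when the servlet logs the message.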
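The effect described here can be reproduced in isolation. The sketch below assumes the sender used big-endian UTF-16 (as the hex dump suggests) and that the container decoded the bytes with a single-byte charset such as ISO-8859-1; for these particular bytes, UTF-8 decoding gives the same characters. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MisdecodeSketch {
    // Encodes text as UTF-16BE, then decodes those bytes one byte per character,
    // which is an assumption about what the container does with the parameter.
    static String misdecode(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_16BE);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String seen = misdecode("TEST3");
        System.out.println(seen.length());        // 10: twice the 5 original characters
        System.out.println((int) seen.charAt(0)); // 0: the NUL character that precedes 'T'
        System.out.println(seen.charAt(1));       // T
    }
}
```

Many consoles render the NUL character as a blank, which would match the spaced-out output shown above.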

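Obviously, my goal is simply to get the message "TEST3 please ignore:" when I read the text request parameter. My guess is that I can achieve this by specifying the character encoding of the request parameter, but I don't know how to do that.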

It looks like it is encoded in UTF-16LE (little-endian). Here is a class that successfully prints the string:

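import java.io.UnsupportedEncodingException;
import java.math.BigInteger;

public class Test {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String hex = "00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c" +
                     "00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e" +
                     "00 6f 00 72 00 65 00 3a"; // + " 00";
        // Strip the spaces, parse the hex digits as one big number, and decode its bytes.
        // Note: BigInteger.toByteArray() returns the minimal representation, so the
        // leading 00 byte of the input is not part of the decoded array.
        byte[] bytes = new BigInteger(hex.replaceAll(" ", ""), 16).toByteArray();
        System.out.println(new String(bytes, "UTF-16LE"));
    }
}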

Output:

TEST3 please ignore?

Adding two zeros to the end of the input outputs:

TEST3 please ignore:
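A side note on the test above, offered as a sketch and not as a correction of the answer: BigInteger.toByteArray() drops the leading 00 byte, so the remaining bytes pair up shifted by one position. That may be why UTF-16LE appears to work and why the last character needs an extra 00. Parsing the dump byte by byte and decoding it as UTF-16BE reproduces the message without any padding. The helper names parseHex and decode are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class HexDecodeSketch {
    // Parses space-separated hex pairs into bytes, keeping leading zero bytes.
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Decodes the dump as big-endian UTF-16 (high byte first, as in "00 54").
    static String decode(String hex) {
        return new String(parseHex(hex), StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String hex = "00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c "
                   + "00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e "
                   + "00 6f 00 72 00 65 00 3a";
        System.out.println(decode(hex)); // TEST3 please ignore:
    }
}
```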

Update

To make this work in your servlet, you can try:

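  String value = request.getParameter("text");
  try {
      // getBytes() with no argument uses the platform default charset
      value = new String(value.getBytes(), "UTF-16LE");
  } catch (java.io.UnsupportedEncodingException ex) {
      // UTF-16LE is a charset every Java platform must support, so this should not occur
  }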

Update

Please see the following, which verifies that the generated hex is in fact UTF-16LE. Try using a filter for this:

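import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CustomCharacterEncodingFilter implements Filter {

    public void init(FilterConfig config) throws ServletException {
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Must be set before any request parameter is read
        request.setCharacterEncoding("UTF-8");
        response.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    public void destroy() {
    }
}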

This should set the correct encoding for the entire application.

new String(req.getParameter("<my request value>").getBytes("ISO-8859-1"),"UTF-8")
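The one-liner above re-decodes a parameter that the container has already turned into a String. A minimal standalone sketch of that idea follows, assuming the container decoded the raw bytes as ISO-8859-1 (which maps each byte to exactly one character, so the original bytes can be recovered). The class name, the redecode helper, and the sample values are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RedecodeSketch {
    // Recovers the raw bytes from a string that was decoded as ISO-8859-1,
    // then decodes them again with the charset the sender actually used.
    static String redecode(String misdecoded, Charset actual) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of "café" read as ISO-8859-1, then repaired.
        String original = "caf\u00e9";
        String broken = new String(original.getBytes(StandardCharsets.UTF_8),
                                   StandardCharsets.ISO_8859_1);
        System.out.println(redecode(broken, StandardCharsets.UTF_8).equals(original)); // true

        // The shape seen in the question: a NUL before each letter.
        System.out.println(redecode("\0T\0E\0S\0T", StandardCharsets.UTF_16BE)); // TEST
    }
}
```

If the container used a different charset for the first decoding step, the first argument to getBytes would have to match it; that detail depends on the container configuration and is not stated in the thread.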

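GET parameters must be ASCII or URL-encoded; you can't use a special character set in them. What is your web container? What is your HTML file's charset? That might help.

@MaurícioLinhares do you have a link for that statement?

Yes - the last character should be ":" rather than "?".

@Don, that's because the last 00 after 3a is missing. If you add it, it decodes correctly. Either the encoder of that string messed up, or you may have forgotten to copy the last two zeros.

You're right, it was probably my copy-paste error. By the way, are you sure this isn't big endian? Thanks for your help.

No problem. I'm not a character encoding expert, but I'm pretty sure it's little endian, because big endian doesn't decode the string at all :)

This solved my problem, but I don't fully understand why... :( I dug a little deeper and found that calling request.setCharacterEncoding("UTF-8"); is the only thing I needed (and it makes more sense).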