Converting an InputStream to a String and back (round trip) using UTF-8 in Java

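Consider the following code snippet:

byte[] b = new byte[]{ 0, 0, 0, -127 };  // a possible byte array

// convert the byte array to a String using UTF-8
String s = new String(b, StandardCharsets.UTF_8);

Now try to convert the string back to a byte array:

b = s.getBytes(StandardCharsets.UTF_8);

When we compare the result with the original byte array, the values are not the same after the round trip: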

[0, 0, 0, -17, -65, -67]
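
To see where the three extra bytes come from, here is a minimal sketch (the class name RoundTripDemo and the helper roundTrip are made up for illustration). The byte -127 is 0x81, which in UTF-8 can only appear as a continuation byte, so on its own it is malformed. The String(byte[], Charset) constructor replaces malformed input with the charset's replacement string, which for UTF-8 is U+FFFD, and that character encodes back as the three bytes EF BF BD, that is -17, -65, -67.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTripDemo {
    // Decodes the bytes as UTF-8 and immediately encodes the result again.
    static byte[] roundTrip(byte[] input) {
        String s = new String(input, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = {0, 0, 0, -127}; // -127 == 0x81, a lone continuation byte

        String s = new String(original, StandardCharsets.UTF_8);
        // The malformed byte has already been replaced at this point.
        System.out.println((int) s.charAt(3)); // 65533, i.e. U+FFFD

        System.out.println(Arrays.toString(roundTrip(original)));
        // [0, 0, 0, -17, -65, -67]
    }
}
```

The original value 0x81 is lost as soon as the String is created, so nothing done to the String afterwards can recover it.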

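Can anyone suggest how to convert the string back to the original byte array?

The most robust answer is to convert between the byte array and a hexadecimal string, where 1 byte == 2 characters in the range 0 to F. Such a string contains only ASCII characters, so it passes through UTF-8 unchanged:

b = s.getBytes(StandardCharsets.UTF_8);

Then convert the hex string back to a byte array. Other Stack Overflow questions cover both directions, bytes to hex and hex to bytes.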
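
As a rough sketch of the two conversions mentioned above, assuming no particular library: the class HexCodec and its methods toHex and fromHex are hypothetical names, not taken from the linked answers. On Java 17 and later, java.util.HexFormat offers the same functionality out of the box.

```java
import java.util.Arrays;

public class HexCodec {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    // Encodes every byte as exactly two hex characters.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;           // treat the byte as unsigned 0..255
            out[2 * i] = DIGITS[v >>> 4];      // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    // Decodes a string produced by toHex back into the original bytes.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = {0, 0, 0, -127};
        String hex = toHex(original);
        System.out.println(hex); // 00000081
        System.out.println(Arrays.equals(original, fromHex(hex))); // true
    }
}
```

The cost of this approach is size: the string is twice as long as the byte array.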

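Although I can't see why you would need an invalid UTF-8 string, here is a solution for you. Paste this code into a TestDrive class (a runnable class that contains a static void main(String[] args) function):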

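// Requires "import java.util.Arrays;" at the top of the file.
public static void main(String[] args) {
    byte[] bytes1 = new byte[]{0, 0, 0, -127};
    int[] unsigned = toUnsignedInt(bytes1);      // bytes as values 0..255
    String utf8String = toUtf8String(unsigned);  // one char per byte
    char[] chars = utf8String.toCharArray();
    byte[] bytes2 = toBytes(chars);              // narrow each char back to a byte
    System.out.println(Arrays.equals(bytes1, bytes2)); // prints true
}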

// Not called by main. Widening a byte to an int keeps its signed value (-128..127).
private static int[] toSigned(byte[] unsigned) {
    int[] signed = new int[unsigned.length];
    for (int i = 0; i < unsigned.length; i++) {
        signed[i] = unsigned[i];
    }
    return signed;
}

private static int[] toUnsignedInt(byte[] signed) {
    int[] unsigned = new int[signed.length];
    for (int i = 0; i < signed.length; i++) {
        unsigned[i] = Byte.toUnsignedInt(signed[i]);
    }
    return unsigned;
}

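// Despite its name, this performs no UTF-8 decoding: each value 0..255
// simply becomes the char with the same code point.
private static String toUtf8String(int[] unsigned) {
    char[] chars = toChars(unsigned);
    return new String(chars);
}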

private static char[] toChars(int[] unsigned) {
    char[] chars = new char[unsigned.length];
    for (int i = 0; i < unsigned.length; i++) {
        chars[i] = (char) unsigned[i];
    }
    return chars;
}

private static byte[] toBytes(char[] chars) {
    int[] unsigned = toUnsignedInt(chars);
    byte[] bytes = new byte[unsigned.length];
    for (int i = 0; i < unsigned.length; i++) {
        bytes[i] = (byte) unsigned[i];
    }
    return bytes;
}

private static int[] toUnsignedInt(char[] chars) {
    int[] unsigned = new int[chars.length];
    for (int i = 0; i < chars.length; i++) {
        unsigned[i] = (int) chars[i];
    }
    return unsigned;
}
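
The helpers above map each byte to the char with the same numeric value and back. As far as I can tell, that is the same mapping the built-in ISO-8859-1 charset performs, so a shorter sketch is possible. The class name Latin1RoundTrip and the methods bytesToString and stringToBytes are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // ISO-8859-1 maps byte values 0x00..0xFF one-to-one to U+0000..U+00FF.
    static String bytesToString(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses bytesToString. Chars above U+00FF would be replaced with '?',
    // so this is only lossless for strings produced by bytesToString.
    static byte[] stringToBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] original = {0, 0, 0, -127};
        String s = bytesToString(original);
        System.out.println((int) s.charAt(3)); // 129
        System.out.println(Arrays.equals(original, stringToBytes(s))); // true
    }
}
```

The resulting string may contain control characters, so it is still a poor choice for anything that treats the value as readable text.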
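
The most robust answer is not to convert the byte array at all but to pass it around as it is, avoiding the String and the round trip entirely. A String is not a container for binary data.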

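Your byte array is not a valid UTF-8 string. Check how many bytes a UTF-8 character takes. If your input stream contains these bytes, it was not read from a UTF-8 source. Make sure you are using the correct encoding first.

I still can't understand why a UTF-8 conversion would need signed values.

There is no InputStream here.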
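
To check the claim that the array is not valid UTF-8, one option is a strict decoder that reports malformed input instead of silently replacing it. This is only a sketch: the class Utf8Check and the method isValidUtf8 are hypothetical names, while CharsetDecoder and CodingErrorAction come from java.nio.charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Returns true only if the whole array decodes as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown because REPORT was requested instead of the default REPLACE.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8(new byte[]{0, 0, 0, -127})); // false
        System.out.println(isValidUtf8("abc".getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

A check like this makes the data loss visible up front, before any replacement characters end up in a String.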