Java Android: GZIP decompression breaks Unicode characters at the buffer boundary
I decompress received gzip data into a string. The problem is that with a buffer size of 512, Unicode characters get broken at the buffer boundary, so I end up with text containing question marks. It happens with non-Latin letters:
…а��БГМц…
public static String decompress(byte[] compressed) throws IOException {
final int BUFFER_SIZE = 512;
ByteArrayInputStream is = new ByteArrayInputStream(compressed);
GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
StringBuilder string = new StringBuilder();
byte[] data = new byte[BUFFER_SIZE];
int bytesRead;
while ((bytesRead = gis.read(data)) != -1) {
string.append(new String(data, 0, bytesRead));
}
gis.close();
is.close();
return string.toString();
}
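You can wrap the GZIPInputStream in an InputStreamReader and read characters instead of bytes. That way, a potentially invalid encoding at a buffer boundary is no longer a problem.

The bug is in the algorithm: it assumes that every chunk read ends (and starts) on a UTF-8 byte sequence boundary. A multi-byte character can be split across two reads, and decoding each half separately yields replacement characters. Collect all the bytes first and decode them once at the end.
So do the following instead: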
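public static String decompress(byte[] compressed) throws IOException {
final int BUFFER_SIZE = 512;
ByteArrayInputStream is = new ByteArrayInputStream(compressed);
GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
byte[] data = new byte[BUFFER_SIZE];
int bytesRead;
// Accumulate raw bytes; do not decode chunk by chunk.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while ((bytesRead = gis.read(data)) != -1) {
baos.write(data, 0, bytesRead);
}
gis.close();
is.close();
// Decode once, so no multi-byte sequence is ever split.
return baos.toString("UTF-8");
}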
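
The InputStreamReader suggestion from the first answer comes without code, so here is a minimal sketch of how it might look. It assumes the compressed payload is UTF-8 text; the class name GzipReaderSketch and the compress helper are made up for illustration and are not part of the original question. The reader keeps incomplete byte sequences internally between reads, so a character that straddles a buffer boundary is decoded correctly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipReaderSketch {

    // Decompress gzip data by reading chars, not bytes.
    // Assumption: the original text was encoded as UTF-8.
    public static String decompress(byte[] compressed) throws IOException {
        final int BUFFER_SIZE = 512;
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed), BUFFER_SIZE),
                StandardCharsets.UTF_8)) {
            char[] chars = new char[BUFFER_SIZE];
            int charsRead;
            while ((charsRead = reader.read(chars)) != -1) {
                sb.append(chars, 0, charsRead);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper, only used to produce test input.
    public static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // One 1-byte letter followed by 2-byte Cyrillic letters, repeated,
        // so that multi-byte sequences fall on odd byte offsets.
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            input.append("aБГМц");
        }
        String roundTrip = decompress(compress(input.toString()));
        System.out.println(roundTrip.equals(input.toString()));
    }
}
```

StandardCharsets is available on Android from API level 19; on older versions the charset name "UTF-8" could be passed as a string instead. Compared with buffering everything in a ByteArrayOutputStream, this variant decodes while streaming, which may matter for large payloads.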