Storing binary data in a QR code (ZXING Java library)


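My Java program needs to send a binary payload via a QR code, but I can't get it to work. I have tried several QR code libraries and many approaches, and all of them seem to have the same problem. My current implementation uses ZXING.

The problem is that every Java library I have tried focuses on String payloads and offers no support for binary data. The commonly suggested workaround is to encode the binary data as Base64. However, my data is already near the size limit of QR codes, and the roughly 33% inflation caused by Base64 encoding (4 output characters for every 3 input bytes) makes it too large. I have already spent significant effort reducing the payload size: it currently consists of 4-character hashes delimited by newlines, all compressed at the maximum level with the Java Deflater class. I can't make it any smaller.

I need a way to store binary data in a QR code with minimal data-inflation overhead.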

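I developed a solution that costs only about 8% in storage efficiency. It takes advantage of a built-in compression optimization in the ZXING QR code library.

Explanation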

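ZXING automatically detects whether your String payload is purely alphanumeric (by its own definition) and, if so, packs every 2 alphanumeric characters into 11 bits. ZXING's definition of "alphanumeric" is limited to uppercase letters, the digits 0-9, and a few special symbols ('/', ':', etc.). In total, the definition allows 45 possible values, so each character is effectively a base45 digit, and 2 of these base45 digits are packed into 11 bits.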
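To make the packing concrete, here is a minimal sketch of how QR alphanumeric mode sizes its data bits, written without any ZXING dependency. The class name `AlphanumPackingSketch` and its methods are hypothetical and are not part of ZXING; the 45-character alphabet and the 11-bit pair / 6-bit single group sizes come from the alphanumeric mode of the QR code specification. It covers only the data bits and ignores the mode indicator, the character-count field, and error correction.

```java
final class AlphanumPackingSketch {
    // The 45 characters of QR alphanumeric mode; a character's index is its base45 digit value.
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

    /** Returns true if every character of s belongs to the 45-character alphabet. */
    static boolean isAlphanumeric(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (ALPHABET.indexOf(s.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    /** Packs s into a string of '0'/'1': 11 bits per character pair, 6 bits for a trailing single. */
    static String packToBits(String s) {
        StringBuilder bits = new StringBuilder();
        int i = 0;
        for (; i + 1 < s.length(); i += 2) {
            int value = 45 * ALPHABET.indexOf(s.charAt(i)) + ALPHABET.indexOf(s.charAt(i + 1));
            appendBinary(bits, value, 11);
        }
        if (i < s.length()) {
            appendBinary(bits, ALPHABET.indexOf(s.charAt(i)), 6);
        }
        return bits.toString();
    }

    // Appends value as exactly 'width' binary digits, most significant bit first.
    private static void appendBinary(StringBuilder out, int value, int width) {
        for (int bit = width - 1; bit >= 0; bit--) {
            out.append((value >> bit) & 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumeric("AC-42")); // true
        System.out.println(isAlphanumeric("ac-42")); // false: lowercase is not in the alphabet
        System.out.println(packToBits("AC-42"));     // 0011100111011100111001000010 (11 + 11 + 6 bits)
    }
}
```

Five characters cost 28 bits here instead of the 40 bits that byte mode would need, which is the saving the rest of this answer relies on.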

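2 digits in base 45 give 2,025 possible values, while 11 bits have a maximum storage capacity of 2,048 possible states. Compared with raw binary, only about 1.1% of those states go unused: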

  45 ^ 2 = 2,025
  2 ^ 11 = 2,048
  2,048 - 2,025 = 23
  23 / 2,048 = 0.01123046875 = 1.123%
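However, this is the ideal / theoretical efficiency. My implementation processes the data in chunks, using a Java long as the computational buffer. Since Java longs are signed, only the lower 7 bytes are usable: the conversion code requires consistently positive values, and using the highest (8th) byte would contaminate the sign bit and randomly produce negative values.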
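The chunk size determines how many base45 digits each chunk needs, and that is where most of the extra overhead comes from. The sketch below derives the digit count for every chunk size from 0 to 8 bytes; the class and method names are hypothetical. For a 7-byte chunk it yields 11 digits, which QR alphanumeric mode stores at 5.5 bits per digit on average, so 56 bits of input occupy about 60.5 bits (60.5 / 56 ≈ 1.08, roughly 8% overhead).

```java
import java.math.BigInteger;

final class Base45DigitCountSketch {
    /**
     * Smallest number of base45 digits d such that 45^d >= 256^numBytes,
     * i.e. d digits can represent every possible value of numBytes bytes.
     */
    static int digitsForBytes(int numBytes) {
        BigInteger states = BigInteger.valueOf(256).pow(numBytes);
        BigInteger capacity = BigInteger.ONE;
        int digits = 0;
        while (capacity.compareTo(states) < 0) {
            capacity = capacity.multiply(BigInteger.valueOf(45));
            digits++;
        }
        return digits;
    }

    public static void main(String[] args) {
        // Prints the digit count for each chunk size from 0 to 8 bytes.
        for (int n = 0; n <= 8; n++) {
            System.out.println(n + " bytes -> " + digitsForBytes(n) + " base45 digits");
        }
    }
}
```

The results (0, 2, 3, 5, 6, 8, 9, 11, 12) are the same values as the lookup table used in the implementation further down.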

Real-World Test:

Encoding a 2KB buffer of random bytes in 7-byte chunks (the usable part of a long), we get the following results:

  Raw Binary Size:        2,048
  Encoded String Size:    3,218
  QR Code Alphanum Size:  2,213 (after the QR Code compresses 2 base45 digits to 11 bits)
This is a real-world storage efficiency loss of only about 8%:

  2,213 - 2,048 = 165
  165 / 2,048 = 0.08056640625 = 8.0566%
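These figures can be cross-checked with simple arithmetic, without running the encoder. The sketch below is a hypothetical helper (none of its names come from ZXING) that assumes 7-byte chunks, the per-chunk digit counts 0, 2, 3, 5, 6, 8, 9, 11 for 0 to 7 bytes, and 11 bits per character pair plus 6 bits for a leftover single character. It counts payload bits only, with no mode header or error correction.

```java
final class QrSizeEstimateSketch {
    static final int CHUNK_BYTES = 7;
    // Base45 digits needed for a chunk of 0..7 bytes.
    static final int[] DIGITS_FOR_BYTES = {0, 2, 3, 5, 6, 8, 9, 11};

    /** Length of the base45 string produced for numBytes bytes of input. */
    static int encodedLength(int numBytes) {
        int fullChunks = numBytes / CHUNK_BYTES;
        int tailBytes = numBytes % CHUNK_BYTES;
        return fullChunks * DIGITS_FOR_BYTES[CHUNK_BYTES] + DIGITS_FOR_BYTES[tailBytes];
    }

    /** Bytes occupied by numChars alphanumeric characters, rounded up to a whole byte. */
    static int qrAlphanumBytes(int numChars) {
        int bits = (numChars / 2) * 11 + (numChars % 2) * 6;
        return (bits + 7) / 8;
    }

    public static void main(String[] args) {
        int chars = encodedLength(2048);
        int qrBytes = qrAlphanumBytes(chars);
        System.out.println("Encoded String Size:   " + chars);   // 3218
        System.out.println("QR Code Alphanum Size: " + qrBytes); // 2213
        System.out.printf("Overhead: %.3f%%%n", 100.0 * (qrBytes - 2048) / 2048);
    }
}
```

For 2,048 input bytes this gives 292 full chunks plus a 4-byte tail, i.e. 292 × 11 + 6 = 3,218 characters and 2,213 bytes, matching the measured values above.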
Solution

I implemented it as a self-contained static utility class, so all you have to do is call:

//Encode
final byte[] myBinaryData = ...;
final String encodedStr = BinaryToBase45Encoder.encodeToBase45QrPayload(myBinaryData);

//Decode
final byte[] decodedBytes = BinaryToBase45Encoder.decodeBase45QrPayload(encodedStr);
Or, alternatively, you can do it via InputStreams:

//Encode
final InputStream in_1 = ... ;
final String encodedStr = BinaryToBase45Encoder.encodeToBase45QrPayload(in_1);

//Decode
final InputStream in_2 = ... ;
final byte[] decodedBytes = BinaryToBase45Encoder.decodeBase45QrPayload(in_2);
Here's the implementation:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

/**
 * For some reason none of the Java QR Code libraries support binary payloads. At least, none that
 * I could find anyway. The commonly suggested workaround for this is to use Base64 encoding.
 * However, this results in a roughly 33% payload size inflation (4 characters per 3 bytes). If
 * your payload is already near the size limit of QR codes, this is not possible.
 *
 * This class implements an encoder which takes advantage of a built-in compression optimization
 * of the ZXING QR Code library, to enable the storage of Binary data into a QR Code, with a
 * storage efficiency loss of only about 8%.
 *
 * The built-in optimization is this: ZXING will automatically detect if your String payload is
 * purely AlphaNumeric (by their own definition), and if so, it will automatically compress 2
 * AlphaNumeric characters into 11 bits.
 *
 *
 * ----------------------
 *
 *
 * The included ALPHANUMERIC_TABLE is the conversion table used by the ZXING library as a reverse
 * index for determining if a given input data should be classified as alphanumeric.
 *
 * See:
 *
 *      com.google.zxing.qrcode.encoder.Encoder.chooseMode(String content, String encoding)
 *
 * which scans through the input string one character at a time and passes them to:
 *
 *      getAlphanumericCode(int code)
 *
 * in the same class, which uses that character as a numeric index into the
 * ALPHANUMERIC_TABLE.
 *
 * If you examine the values, you'll notice that it ignores / disqualifies certain values, and
 * effectively converts the input into base 45 (0 -> 44; -1 is interpreted by the calling code
 * to mean a failure). This is confirmed in the function:
 *
 *      appendAlphanumericBytes(CharSequence content, BitArray bits)
 *
 * where they pack 2 of these base 45 digits into 11 bits. This presents us with an opportunity.
 * If we can take our data, and convert it into a compatible base 45 alphanumeric representation,
 * then the QR Encoder will automatically pack that data into sub-byte chunks.
 *
 * 2 digits in base 45 is 2,025 possible values. 11 bits has a maximum storage capacity of 2,048
 * possible states. This is only a loss of 1.1% in storage efficiency behind raw binary.
 *
 *      45 ^ 2 = 2,025
 *      2 ^ 11 = 2,048
 *      2,048 - 2,025 = 23
 *      23 / 2,048 = 0.01123046875 = 1.123%
 *
 * However, this is the ideal / theoretical efficiency. This implementation processes data in
 * chunks, using a Long as a computational buffer. However, since Java Longs are signed, we
 * can only use the lower 7 bytes. The conversion code requires continuously positive values;
 * using the highest 8th byte would contaminate the sign bit and randomly produce negative
 * values.
 *
 *
 * Real-World Test:
 *
 * Using a 7 byte Long to encode a 2KB buffer of random bytes, we get the following results.
 *
 *      Raw Binary Size:        2,048
 *      Encoded String Size:    3,218
 *      QR Code Alphanum Size:  2,213 (after the QR Code compresses 2 base45 digits to 11 bits)
 *
 * This is a real-world storage efficiency loss of only 8%.
 *
 *      2,213 - 2,048 = 165
 *      165 / 2,048 = 0.08056640625 = 8.0566%
 */
public class BinaryToBase45Encoder {
    public final static int[] ALPHANUMERIC_TABLE;

    /*
     * You could probably just copy & paste the array literal from the ZXING source code; it's only
     * an array definition. But I was unsure of the licensing issues with posting it on the internet,
     * so I did it this way.
     */
    static {
        //Copy lookup table from ZXING Encoder class (a private static field, hence the reflection)
        try {
            final Field sourceTable = com.google.zxing.qrcode.encoder.Encoder.class.getDeclaredField("ALPHANUMERIC_TABLE");
            sourceTable.setAccessible(true);
            ALPHANUMERIC_TABLE = (int[]) sourceTable.get(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            //Fail fast: with a null table, the next static block would die with an opaque NullPointerException
            throw new ExceptionInInitializerError(e);
        }
    }

    public static final int NUM_DISTINCT_ALPHANUM_VALUES = 45;
    public static final char[] alphaNumReverseIndex = new char[NUM_DISTINCT_ALPHANUM_VALUES];

    static {
        //Build AlphaNum Index
        final int len = ALPHANUMERIC_TABLE.length;
        for (int x = 0; x < len; x++) {
            // The base45 result which the alphanum lookup table produces.
            // i.e. the base45 digit value which String characters are
            // converted into.
            //
            // We use this value to build a reverse lookup table to find
            // the String character we have to send to the encoder, to
            // make it produce the given base45 digit value.
            final int base45DigitValue = ALPHANUMERIC_TABLE[x];

            //Ignore the -1 records
            if (base45DigitValue > -1) {
                //The index into the lookup table which produces the given base45 digit value.
                //
                //i.e. to produce a base45 digit with the numeric value in base45DigitValue, we need
                //to send the Encoder a String character with the numeric value in x.
                alphaNumReverseIndex[base45DigitValue] = (char) x;
            }
        }
    }

    /*
     * The storage capacity of one digit in the number system; i.e. the maximum
     * possible number of distinct values which can be stored in 1 logical digit
     */
    public static final int QR_PAYLOAD_NUMERIC_BASE = NUM_DISTINCT_ALPHANUM_VALUES;

    /*
     * We can't use all 8 bytes, because the Long is signed, and the conversion math
     * requires consistently positive values. If we populated all 8 bytes, then the
     * last byte has the potential to contaminate the sign bit, and break the
     * conversion math. So, we only use the lower 7 bytes, and avoid this problem.
     */
    public static final int LONG_USABLE_BYTES = Long.BYTES - 1;

    //The following mapping was determined by brute-forcing -1 Long (all bits 1), and compressing to base45 until it hit zero.
    public static final int[] BINARY_TO_BASE45_DIGIT_COUNT_CONVERSION = new int[] {0,2,3,5,6,8,9,11,12};
    public static final int NUM_BASE45_DIGITS_PER_LONG = BINARY_TO_BASE45_DIGIT_COUNT_CONVERSION[LONG_USABLE_BYTES];
    public static final Map<Integer, Integer> BASE45_TO_BINARY_DIGIT_COUNT_CONVERSION = new HashMap<>();

    static {
        //Build Reverse Lookup
        int len = BINARY_TO_BASE45_DIGIT_COUNT_CONVERSION.length;
        for (int x=0; x<len; x++) {
            int numB45Digits = BINARY_TO_BASE45_DIGIT_COUNT_CONVERSION[x];
            BASE45_TO_BINARY_DIGIT_COUNT_CONVERSION.put(numB45Digits, x);
        }
    }

    public static String encodeToBase45QrPayload(final byte[] inputData) throws IOException {
        return encodeToBase45QrPayload(new ByteArrayInputStream(inputData));
    }

    public static String encodeToBase45QrPayload(final InputStream in) throws IOException {
        //Init conversion state vars
        final StringBuilder strOut = new StringBuilder();
        int data;
        long buf = 0;

        // Process all input data in chunks of size LONG.BYTES, this allows for economies of scale
        // so we can process more digits of arbitrary size before we hit the wall of the binary
        // chunk size in a power of 2, and have to transmit a sub-optimal chunk of the "crumbs"
        // left over; i.e. the slack space between where the multiples of QR_PAYLOAD_NUMERIC_BASE
        // and the powers of 2 don't quite line up.
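        //
        // End of input is detected via read() returning -1; available() is only an estimate and
        // may report 0 on a stream that still has data (e.g. a socket).
        while (true) {
            //Fill buffer
            int numBytesStored = 0;
            while (numBytesStored < LONG_USABLE_BYTES && (data = in.read()) != -1) {
                //Push byte into buffer
                buf = (buf << 8) | data; //8 bits per byte

                //Increment
                numBytesStored++;
            }

            //Stop once the stream is exhausted
            if (numBytesStored == 0) {
                break;
            }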

            //Write out in lower base
            final StringBuilder outputChunkBuffer = new StringBuilder();
            final int numBase45Digits = BINARY_TO_BASE45_DIGIT_COUNT_CONVERSION[numBytesStored];
            int numB45DigitsProcessed = 0;
            while(numB45DigitsProcessed < numBase45Digits) {
                //Chunk out a digit
                final byte digit = (byte) (buf % QR_PAYLOAD_NUMERIC_BASE);

                //Drop digit data from buffer
                buf = buf / QR_PAYLOAD_NUMERIC_BASE;

                //Write Digit
                outputChunkBuffer.append(alphaNumReverseIndex[(int) digit]);

                //Track output digits
                numB45DigitsProcessed++;
            }

            /*
             * The way this code works, the processing output results in a First-In-Last-Out digit
             * reversal. So, we need to buffer the chunk output, and feed it to the OutputStream
             * backwards to correct this.
             *
             * We could probably get away with writing the bytes out in inverted order, and then
             * flipping them back on the decode side, but just to be safe, I'm always keeping
             * them in the proper order.
             */
            strOut.append(outputChunkBuffer.reverse().toString());
        }

        //Return
        return strOut.toString();
    }

    public static byte[] decodeBase45QrPayload(final String inputStr) throws IOException {
        //Prep for InputStream
        //All 45 alphanumeric characters are plain ASCII, so pin the charset instead of relying on the platform default
        final byte[] buf = inputStr.getBytes(java.nio.charset.StandardCharsets.US_ASCII);

        return decodeBase45QrPayload(new ByteArrayInputStream(buf));
    }

    public static byte[] decodeBase45QrPayload(final InputStream in) throws IOException {
        //Init conversion state vars
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        int data;
        long buf = 0;
        int x=0;

        // Process all input data in chunks of size LONG.BYTES, this allows for economies of scale
        // so we can process more digits of arbitrary size before we hit the wall of the binary
        // chunk size in a power of 2, and have to transmit a sub-optimal chunk of the "crumbs"
        // left over; i.e. the slack space between where the multiples of QR_PAYLOAD_NUMERIC_BASE
        // and the powers of 2 don't quite line up.
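        //
        // End of input is detected via read() returning -1; available() is only an estimate.
        while (true) {
            //Convert & Fill Buffer
            int numB45Digits = 0;
            while (numB45Digits < NUM_BASE45_DIGITS_PER_LONG && (data = in.read()) != -1) {
                //Translate back through lookup table; reject anything outside the 45-character alphabet
                final int digit = data < ALPHANUMERIC_TABLE.length ? ALPHANUMERIC_TABLE[data] : -1;
                if (digit < 0) {
                    throw new IOException("Not a QR alphanumeric character, code: " + data);
                }

                //Shift buffer up one digit to make room, then append next digit
                buf = buf * QR_PAYLOAD_NUMERIC_BASE + digit;

                //Increment
                numB45Digits++;
            }

            //Stop once the stream is exhausted
            if (numB45Digits == 0) {
                break;
            }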

            //Write out in higher base
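            final LinkedList<Byte> outputChunkBuffer = new LinkedList<>();
            final Integer numBytesBoxed = BASE45_TO_BINARY_DIGIT_COUNT_CONVERSION.get(numB45Digits);
            if (numBytesBoxed == null) {
                //e.g. 1, 4, 7 or 10 digits: no whole number of bytes maps to such a group
                throw new IOException("Truncated payload: trailing group of " + numB45Digits + " base45 digits");
            }
            final int numBytes = numBytesBoxed;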
            int numBytesProcessed = 0;
            while(numBytesProcessed < numBytes) {
                //Chunk out 1 byte
                final byte chunk = (byte) buf;

                //Shift buffer to next byte
                buf = buf >> 8; //8 bits per byte

                //Write byte to output
                //
                //Again, we need to invert the order of the bytes, so as we chunk them off, push
                //them onto a FILO stack; inverting their order.
                outputChunkBuffer.push(chunk);

                //Increment
                numBytesProcessed++;
            }

            //Write chunk buffer to output stream (in reverse order)
            while (outputChunkBuffer.size() > 0) {
                out.write(outputChunkBuffer.pop());
            }
        }

        //Return
        out.flush();
        out.close();
        return out.toByteArray();
    }
}
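If you would rather avoid reflection into ZXING internals, the lookup tables can also be built from the 45-character alphabet that the QR code specification defines for alphanumeric mode. This is a sketch of that alternative rather than a drop-in replacement: the class name, method names, and the 128-entry table size are my own assumptions, and you should confirm that the resulting values agree with the ALPHANUMERIC_TABLE in the ZXING version you use.

```java
import java.util.Arrays;

final class AlphanumTableSketch {
    // The 45 characters of QR alphanumeric mode; a character's index is its base45 digit value.
    static final String ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

    /** Forward table: character code (0..127) -> base45 digit value, or -1 if not allowed. */
    static int[] buildForwardTable() {
        int[] table = new int[128];
        Arrays.fill(table, -1);
        for (int digit = 0; digit < ALPHABET.length(); digit++) {
            table[ALPHABET.charAt(digit)] = digit;
        }
        return table;
    }

    /** Reverse table: base45 digit value (0..44) -> character to hand to the QR encoder. */
    static char[] buildReverseTable() {
        return ALPHABET.toCharArray();
    }

    public static void main(String[] args) {
        int[] forward = buildForwardTable();
        char[] reverse = buildReverseTable();
        System.out.println(forward['A']);      // 10
        System.out.println(forward['a']);      // -1 (lowercase is not allowed)
        System.out.println(reverse[44]);       // :
    }
}
```

The forward table plays the role of ALPHANUMERIC_TABLE and the reverse table plays the role of alphaNumReverseIndex in the class above.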
@Test
public void stringEncodingTest() throws IOException {
    //Init test data
    final String testStr = "Some cool input data! !@#$%^&*()_+";

    //Encode
    final String encodedStr = BinaryToBase45Encoder.encodeToBase45QrPayload(testStr.getBytes("UTF-8"));

    //Decode
    final byte[] decodedBytes = BinaryToBase45Encoder.decodeBase45QrPayload(encodedStr);
    final String decodedStr = new String(decodedBytes, "UTF-8");

    //Output
    //assertEquals always runs, whereas the 'assert' keyword is skipped unless the JVM is started with -ea
    assertEquals(testStr, decodedStr);
    System.out.println("They match!");
}

@Test
public void binaryEncodingAccuracyTest() throws IOException {
    //Init test data
    final int maxBytes = 10_000;
    for (int x=1; x<=maxBytes; x++) {
        System.out.print("x: " + x + "\t");

        //Encode
        final byte[] inputArray = getTestBytes(x);
        final String encodedStr = BinaryToBase45Encoder.encodeToBase45QrPayload(inputArray);

        //Decode
        final byte[] decodedBytes = BinaryToBase45Encoder.decodeBase45QrPayload(encodedStr);

        //Output (assertArrayEquals also catches a length mismatch, e.g. extra trailing bytes)
        assertArrayEquals(inputArray, decodedBytes);
        System.out.println("Passed!");
    }
}

@Test
public void binaryEncodingEfficiencyTest() throws IOException, WriterException, NoSuchMethodException, InvocationTargetException, IllegalAccessException {
    //Init test data
    final byte[] inputData = new byte[2048];
    new Random().nextBytes(inputData);

    //Encode
    final String encodedStr = BinaryToBase45Encoder.encodeToBase45QrPayload(inputData);

    //Write to QR Code Encoder // Have to use Reflection to force access, since the function is not public.
    final BitArray qrCode = new BitArray();
    final Method appendAlphanumericBytes = com.google.zxing.qrcode.encoder.Encoder.class.getDeclaredMethod("appendAlphanumericBytes", CharSequence.class, BitArray.class);
    appendAlphanumericBytes.setAccessible(true);
    appendAlphanumericBytes.invoke(null, encodedStr, qrCode);

    //Output
    final int origSize = inputData.length;
    final int qrSize = qrCode.getSizeInBytes();
    System.out.println("Raw Binary Size:\t\t" + origSize + "\nEncoded String Size:\t" + encodedStr.length() + "\nQR Code Alphanum Size:\t" + qrSize);

    //Calculate Storage Efficiency Loss (positive = the QR payload is larger than the raw input)
    final int delta = qrSize - origSize;
    final double loss = ((double) delta) / origSize;
    System.out.println("Storage Efficiency Loss: " + String.format("%.3f", loss * 100) + "%");
}

public static byte[] getTestBytes(int numBytes) {
    final Random rand = new Random();
    final ByteArrayOutputStream bos = new ByteArrayOutputStream();
    for (int x=0; x<numBytes; x++) {
        //bos.write(255);// -1 (byte) = 255 (int) = 1111 1111

        byte b = (byte) rand.nextInt();
        bos.write(b);
    }
    return bos.toByteArray();
}