What's the most standard Java way to store raw binary data along with XML?


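I need to store a large amount of binary data in a file, but I also want to read/write that file's header in XML format.

Yes, I could store the binary data in some XML value and let it be serialized using base64 encoding. But this wouldn't be space-efficient.

Can I "mix" XML data and raw binary data in a more or less standardized way?

I was thinking about two options:

  • Is there a way to do this using JAXB?

  • Or is there a way to take some existing XML data and append binary data to it, in such a way that the boundary is recognized?

  • Isn't the concept I'm looking for somehow used by SOAP?

  • Or is it used in the email standard? (Separation of binary attachments)

The scheme I'm trying to achieve: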

[meta-info-about-boundary][XML-data][boundary][raw-binary-data]
Thank you all!

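I don't think so: XML libraries are generally not designed to handle XML plus extra data.

However, you could probably get away with something as simple as a special stream wrapper, which would expose an "XML"-containing stream and a binary stream (from the special "format"). JAXB (or any other XML library) could then work on the "XML" stream while the binary stream is kept separate.

Also remember to take "binary" versus "text" files into account.


Happy coding.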
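
As a rough illustration of the stream-wrapper idea, here is a minimal sketch that uses only the JDK. The layout (a 4-byte length, the XML header, then the raw bytes until end of file) and the class name XmlHeaderContainer are assumptions made for this example, not a standard format. The header string would be handed to JAXB or any other XML library, while the payload never passes through the XML layer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical container layout (not a standard):
// [int xmlLength][xml bytes][raw binary until end of stream]
class XmlHeaderContainer {

    // Writes the XML header, prefixed by its byte length, followed by the raw payload.
    static void write(OutputStream out, String xmlHeader, byte[] payload) throws IOException {
        byte[] xml = xmlHeader.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(xml.length);
        data.write(xml);
        data.write(payload);
        data.flush();
    }

    // Reads only the XML header; afterwards the stream is positioned at the first payload byte.
    static String readHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // DataInputStream does not read ahead
        byte[] xml = new byte[data.readInt()];
        data.readFully(xml);
        return new String(xml, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        write(file, "<header><name>demo</name></header>", new byte[] {0, 1, 2, (byte) 0xFF});

        InputStream in = new ByteArrayInputStream(file.toByteArray());
        String header = readHeader(in);     // pass this to an XML library
        byte[] payload = in.readAllBytes(); // the untouched binary data (Java 9+)
        System.out.println(header + " + " + payload.length + " payload bytes");
    }
}
```

Because both parts are written as bytes through the same binary stream, the "binary" versus "text" concern reduces to fixing the XML encoding explicitly (UTF-8 here).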

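You can leverage AttachmentMarshaller and AttachmentUnmarshaller for this. This is the bridge that JAXB/JAX-WS uses to pass binary content as attachments. You can use the same mechanism to do what you want.


Proof of Concept

Below is how it could be implemented. This should work with any JAXB implementation (it works for me, as well as with the reference implementation).

Message Format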

[xml_length][xml][attach1_length][attach1]...[attachN_length][attachN]
Root

Here is an object with multiple byte[] properties.

import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Root {

    private byte[] foo;
    private byte[] bar;

    public byte[] getFoo() {
        return foo;
    }

    public void setFoo(byte[] foo) {
        this.foo = foo;
    }

    public byte[] getBar() {
        return bar;
    }

    public void setBar(byte[] bar) {
        this.bar = bar;
    }

}
Demo

This class demonstrates how MessageWriter and MessageReader are used:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import javax.xml.bind.JAXBContext;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Root.class);

        Root root = new Root();
        root.setFoo("HELLO WORLD".getBytes());
        root.setBar("BAR".getBytes());

        MessageWriter writer = new MessageWriter(jc);
        FileOutputStream outStream = new FileOutputStream("file.xml");
        writer.write(root, outStream);
        outStream.close();

        MessageReader reader = new MessageReader(jc);
        FileInputStream inStream = new FileInputStream("file.xml");
        Root root2 = (Root) reader.read(inStream);
        inStream.close();

        System.out.println(new String(root2.getFoo()));
        System.out.println(new String(root2.getBar()));
    }

}
MessageWriter

Responsible for writing the message in the desired format:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

import javax.activation.DataHandler;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.attachment.AttachmentMarshaller;

public class MessageWriter {

    private JAXBContext jaxbContext;

    public MessageWriter(JAXBContext jaxbContext) {
        this.jaxbContext = jaxbContext;
    }

    /**
     * Write the message in the following format:
     * [xml_length][xml][attach1_length][attach1]...[attachN_length][attachN] 
     */
    public void write(Object object, OutputStream stream) {
        try {
            Marshaller marshaller = jaxbContext.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
            BinaryAttachmentMarshaller attachmentMarshaller = new BinaryAttachmentMarshaller();
            marshaller.setAttachmentMarshaller(attachmentMarshaller);
            ByteArrayOutputStream xmlStream = new ByteArrayOutputStream();
            marshaller.marshal(object, xmlStream);
            byte[] xml = xmlStream.toByteArray();
            xmlStream.close();

            ObjectOutputStream messageStream = new ObjectOutputStream(stream);

            // writeInt emits all four bytes of the length; write(int) would emit only
            // the low-order byte. A reader must call readInt() to match.
            messageStream.writeInt(xml.length); // [xml_length]
            messageStream.write(xml); // [xml]

            for(Attachment attachment : attachmentMarshaller.getAttachments()) {
                messageStream.writeInt(attachment.getLength()); // [attachX_length]
                messageStream.write(attachment.getData(), attachment.getOffset(), attachment.getLength());  // [attachX]
            }

            messageStream.flush();
        } catch(Exception e) {
            throw new RuntimeException(e);
        }
    }

    private static class BinaryAttachmentMarshaller extends AttachmentMarshaller {

        private static final int THRESHOLD = 10;

        private List<Attachment> attachments = new ArrayList<Attachment>();

        public List<Attachment> getAttachments() {
            return attachments;
        }

        @Override
        public String addMtomAttachment(DataHandler data, String elementNamespace, String elementLocalName) {
            return null;
        }

        @Override
        public String addMtomAttachment(byte[] data, int offset, int length, String mimeType, String elementNamespace, String elementLocalName) {
            // compare the attachment's own length, not the length of the backing array
            if(length < THRESHOLD) {
                return null;
            }
            int id = attachments.size() + 1;
            attachments.add(new Attachment(data, offset, length));
            return "cid:" + String.valueOf(id);
        }

        @Override
        public String addSwaRefAttachment(DataHandler data) {
            return null;
        }

        @Override
        public boolean isXOPPackage() {
            return true;
        }

    }

    public static class Attachment {

        private byte[] data;
        private int offset;
        private int length;

        public Attachment(byte[] data, int offset, int length) {
            this.data = data;
            this.offset = offset;
            this.length = length;
        }

        public byte[] getData() {
            return data;
        }

        public int getOffset() {
            return offset;
        }

        public int getLength() {
            return length;
        }

    }

}
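
One detail to watch when reading such a file back: OutputStream.write(int) emits only the low-order byte of its argument, so a length above 255 survives only if it is written with writeInt and read with readInt. The following is a minimal sketch of the reading side of the [length][bytes] framing under that assumption. FrameReaderSketch and readFrames are hypothetical names for this example, and the sketch deliberately leaves out the JAXB unmarshalling step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads [length][bytes] frames written through an
// ObjectOutputStream with writeInt(length) followed by write(bytes).
class FrameReaderSketch {

    static List<byte[]> readFrames(InputStream in) throws IOException {
        List<byte[]> frames = new ArrayList<>();
        ObjectInputStream message = new ObjectInputStream(in);
        while (true) {
            int length;
            try {
                length = message.readInt(); // 4-byte length prefix
            } catch (EOFException endOfMessage) {
                break;                      // no further frames
            }
            byte[] frame = new byte[length];
            message.readFully(frame);
            frames.add(frame);
        }
        return frames;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buffer);
        byte[] xml = "<root/>".getBytes("UTF-8");
        byte[] attachment = new byte[300];  // longer than 255 bytes on purpose
        out.writeInt(xml.length);
        out.write(xml);
        out.writeInt(attachment.length);
        out.write(attachment);
        out.flush();

        List<byte[]> frames = readFrames(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println("frames: " + frames.size()
                + ", attachment bytes: " + frames.get(1).length);
    }
}
```

In the message format above, the first frame would be the XML document and the remaining frames the attachments, in the order their cid references were handed out.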

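I followed the concept suggested by Blaise Doughan, but without attachment marshallers:

I let an XmlAdapter convert a byte[] to a URI reference and back, where each reference points to a separate file that stores the raw data. The XML file and all binary files are then put into a zip.

This is similar to the approach taken by OpenOffice and the ODF format, which is in fact a zip containing a few XML files and binary files.

(In the example code, no actual binary files are written, and no zip is created.)

Output of Bindings.java: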

storage://myZipVFS/1
storage://myZipVFS/2
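
The packaging step of this approach, putting the XML and the referenced binaries into one zip, could look roughly like the sketch below. It uses only java.util.zip. The entry names content.xml and bin/1, the href attribute, and the class ZipPackageSketch are assumptions made for illustration; the XmlAdapter that would produce the references is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical packaging helper: one XML entry plus one entry per binary blob.
class ZipPackageSketch {

    // Packs the XML document and every referenced binary into one zip archive.
    static byte[] pack(String xml, Map<String, byte[]> binaries) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            for (Map.Entry<String, byte[]> binary : binaries.entrySet()) {
                zip.putNextEntry(new ZipEntry(binary.getKey()));
                zip.write(binary.getValue());
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    // Reads every entry back into memory, keyed by entry name.
    static Map<String, byte[]> unpack(byte[] archive) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                entries.put(entry.getName(), zip.readAllBytes()); // reads the current entry only (Java 9+)
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> binaries = new LinkedHashMap<>();
        binaries.put("bin/1", new byte[] {1, 2, 3});
        byte[] archive = pack("<root><foo href=\"bin/1\"/></root>", binaries);
        System.out.println(unpack(archive).keySet());
    }
}
```

For very large payloads, streaming each entry to and from disk would be preferable to holding whole arrays in memory as this sketch does.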

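This is not natively supported by JAXB, since you don't want to serialize the binary data into XML, but it can usually be done at a higher level when using JAXB. The way I do this is with web services (SOAP and REST) using MIME multipart/mixed messages (defined in RFC 2046). Originally designed for email, the format works well for sending XML together with binary data, and most web service frameworks, such as Axis or Jersey, support it almost transparently.

Here is an example of sending an XML object together with a binary file to a REST web service, using Jersey with its multipart module.

XML object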

@XmlRootElement
public class Book {
   private String title;
   private String author;
   private int year;

   //getter and setters...
}
Client

// 'service' is assumed to be a Jersey client WebResource for the server's base URI
byte[] bin = ...; // some binary data

Book b = new Book();
b.setAuthor("John");
b.setTitle("wild stuff");
b.setYear(2012);

// one body part for the JAXB object, one for the raw bytes
MultiPart multiPart = new MultiPart();
multiPart.bodyPart(new BodyPart(b, MediaType.APPLICATION_XML_TYPE));
multiPart.bodyPart(new BodyPart(bin, MediaType.APPLICATION_OCTET_STREAM_TYPE));

ClientResponse response = service.path("rest").path("multipart")
        .type(MultiPartMediaTypes.MULTIPART_MIXED)
        .post(ClientResponse.class, multiPart);
Server

@POST
@Consumes(MultiPartMediaTypes.MULTIPART_MIXED)
public Response post(MultiPart multiPart) {
    for(BodyPart part : multiPart.getBodyParts()) {
        System.out.println(part.getMediaType());
    }

    return Response.status(Response.Status.ACCEPTED).
            entity("Attachments processed successfully.").
            type(MediaType.TEXT_PLAIN).build();

}
I tried sending a file of 110917 bytes. Using Wireshark, you can see that the data is sent directly over HTTP, like this:

Hypertext Transfer Protocol
   POST /org.etics.test.rest.server/rest/multipart HTTP/1.1\r\n
   Content-Type: multipart/mixed; boundary=Boundary_1_353042220_1343207087422\r\n
   MIME-Version: 1.0\r\n
   User-Agent: Java/1.7.0_04\r\n
   Host: localhost:8080\r\n
   Accept: text/html, image/gif, image/jpeg\r\n
   Connection: keep-alive\r\n
   Content-Length: 111243\r\n
   \r\n
   [Full request URI: http://localhost:8080/org.etics.test.rest.server/rest/multipart]

   MIME Multipart Media Encapsulation, Type: multipart/mixed, Boundary: "Boundary_1_353042220_1343207087422"
     [Type: multipart/mixed]
     First boundary: --Boundary_1_353042220_1343207087422\r\n
        Encapsulated multipart part:  (application/xml)
        Content-Type: application/xml\r\n\r\n
        eXtensible Markup Language
          <?xml
          <book>
            <author>
              John
            </author>
            <title>
              wild stuff
            </title>
            <year>
              2012
            </year>
          </book>
     Boundary: \r\n--Boundary_1_353042220_1343207087422\r\n
        Encapsulated multipart part:  (application/octet-stream)
        Content-Type: application/octet-stream\r\n\r\n
        Media Type
          Media Type: application/octet-stream (110917 bytes)
     Last boundary: \r\n--Boundary_1_353042220_1343207087422--\r\n
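
The boundary framing visible in this capture can also be produced by hand, which may be useful when the same layout is written to a file rather than sent over HTTP. The sketch below builds a two-part multipart/mixed body with plain JDK classes. The class name MultipartMixedSketch and the boundary value are assumptions for this example, and the caller is responsible for picking a boundary that does not occur inside the binary data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the part layout shown in the capture:
// --boundary, part headers, blank line, part body, ..., --boundary--
class MultipartMixedSketch {

    static byte[] build(String boundary, String xml, byte[] binary) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ascii(out, "--" + boundary + "\r\n");                 // first boundary
        ascii(out, "Content-Type: application/xml\r\n\r\n");
        out.write(xml.getBytes(StandardCharsets.UTF_8));
        ascii(out, "\r\n--" + boundary + "\r\n");             // boundary between parts
        ascii(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(binary);                                    // raw bytes, no base64
        ascii(out, "\r\n--" + boundary + "--\r\n");           // last boundary
        return out.toByteArray();
    }

    private static void ascii(ByteArrayOutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        byte[] body = build("Boundary_demo", "<book><title>wild stuff</title></book>",
                new byte[] {1, 2, 3});
        System.out.println(body.length + " bytes in multipart body");
    }
}
```

The overhead is a fixed number of bytes per part, which is the space advantage over base64 that the question asks about.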