Storm JMS spout: how do I collect Avro messages and send them downstream to HDFS?


hadoop jms apache-storm


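I am new to the Avro format. I am trying to use the Storm JMS spout to collect Avro messages from a JMS queue and send them to HDFS with the HDFS bolt.

The queue is sending Avro, but I cannot get the messages into HDFS in Avro format with the HDFS bolt.

How do I collect the Avro messages correctly and send them downstream without encoding errors in HDFS?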

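The existing HDFS bolt does not support writing Avro files, so the following changes are needed to work around that. In this sample code I take the JMS messages from my spout, convert the JMS bytes messages to Avro records, and write them to HDFS.

This code can serve as an example of how to modify the methods in AbstractHdfsBolt.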

public void execute(Tuple tuple) {
    try {
        // bytesMessage is the javax.jms.BytesMessage behind this tuple;
        // how it is obtained from the tuple is not shown in this excerpt.
        long length = bytesMessage.getBodyLength();
        byte[] bytes = new byte[(int) length];
        bytesMessage.readBytes(bytes);

        // Decode the raw Avro binary payload into a record using the schema.
        datumReader = new SpecificDatumReader<IndexedRecord>(schema);
        decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        result = datumReader.read(null, decoder);

        synchronized (this.writeLock) {
            // Append through DataFileWriter so the output is an Avro container file,
            // not a stream of bare records.
            dataFileWriter.append(result);
            dataFileWriter.sync();
            this.offset += bytes.length;
            if (this.syncPolicy.mark(tuple, this.offset)) {
                if (this.out instanceof HdfsDataOutputStream) {
                    ((HdfsDataOutputStream) this.out).hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));
                } else {
                    this.out.hsync();
                    this.out.flush();
                }
                this.syncPolicy.reset();
            }
            dataFileWriter.flush();
        }

        // Ack once the record has been written, as the stock HdfsBolt does.
        this.collector.ack(tuple);

        if (this.rotationPolicy.mark(tuple, this.offset)) {
            rotateOutputFile(); // synchronized
            this.offset = 0;
            this.rotationPolicy.reset();
        }
    } catch (IOException | JMSException e) {
        LOG.warn("write/sync failed.", e);
        this.collector.fail(tuple);
    }
}

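@Override
void closeOutputFile() throws IOException {
    // Closing the DataFileWriter flushes the last Avro block and closes this.out,
    // and lets create() be called again on the next rotation.
    dataFileWriter.close();
}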

@Override
Path createOutputFile() throws IOException {
    Path path = new Path(this.fileNameFormat.getPath(), this.fileNameFormat.getName(this.rotation, System.currentTimeMillis()));
    this.out = this.fs.create(path);
    dataFileWriter.create(schema, out);
    return path;
}

@Override
void doPrepare(Map conf, TopologyContext topologyContext, OutputCollector collector) throws IOException {
    LOG.info("Preparing HDFS Bolt...");
    // Load the Avro schema. An unreadable schema file fails prepare
    // instead of leaving schema null for later calls.
    schema = new Schema.Parser().parse(new File("/home/*******/********SchemafileName.avsc"));
    this.fs = FileSystem.get(URI.create(this.fsUrl), hdfsConfig);
    // DataFileWriter writes the Avro container format (header, schema, sync markers).
    datumWriter = new SpecificDatumWriter<IndexedRecord>(schema);
    dataFileWriter = new DataFileWriter<IndexedRecord>(datumWriter);
}


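You should add the exception message to the question.

Hi Joshua, I am not getting any exception in Storm. I can read the data from JMS and write it to HDFS, but when I try to read the .avro file written by the HDFS bolt with Hive, I get an error. This is the error: java.io.IOException: java.io.IOException: Not a data file.

I think Storm needs something like Flume's Avro event serializer in the HDFS bolt. It looks like Storm needs a mechanism in the HDFS bolt for serializing tuples as Avro.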
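A "Not a data file" error from a reader usually means the file does not begin with the Avro object container header. The Avro specification defines that header as starting with four magic bytes: the ASCII characters `Obj` followed by the byte 1. Raw Avro binary records written straight to a stream have no such header. The sketch below is a quick way to test a file's first bytes; the class and method names (`AvroMagicCheck`, `looksLikeAvroContainer`) are made up for illustration and it uses only the JDK. To check a file on HDFS you would pass the stream returned by `FileSystem.open(path)` instead of the in-memory streams used here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Storm or Avro.
final class AvroMagicCheck {

    // First four bytes of every Avro object container file: 'O', 'b', 'j', 0x01.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Returns true if the stream starts with the Avro container magic bytes.
    static boolean looksLikeAvroContainer(InputStream in) throws IOException {
        byte[] head = new byte[MAGIC.length];
        try {
            new DataInputStream(in).readFully(head);
        } catch (EOFException e) {
            return false; // fewer than four bytes: cannot be a container file
        }
        return Arrays.equals(head, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: a header-like prefix and some plain bytes.
        byte[] containerLike = {'O', 'b', 'j', 1, 4, 0};
        byte[] rawBytes = "plain text".getBytes(StandardCharsets.UTF_8);

        System.out.println(looksLikeAvroContainer(new ByteArrayInputStream(containerLike))); // true
        System.out.println(looksLikeAvroContainer(new ByteArrayInputStream(rawBytes)));      // false
    }
}
```

If the check returns false for files produced by the bolt, the records are being written without going through `DataFileWriter`, which is what the modified bolt above is meant to fix.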