Deserializing a MapWritable inside a custom Java class
I am currently trying to deserialize a custom object in which one field is a MapWritable and the other is a String. Serialization appears to work, but I cannot verify that the object is being recreated correctly. Here are my fields and the write()/readFields() methods:

I keep hitting this exception in my MapReduce job:
java.lang.Exception: java.io.EOFException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readUTF(DataInputStream.java:609)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:207)
at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:167)
at decisiontree.data.ExchangeDataSample.readFields(ExchangeDataSample.java:98)
at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:96)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:146)
at org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1688)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1637)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1489)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
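The EOFException at DataInputStream.readFully above is the classic symptom of asymmetric write/read logic: the reading side tries to consume more bytes than the writing side produced. A minimal stdlib reproduction of that failure mode (no Hadoop needed; the class name and string literal are illustrative only):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class EofDemo {
    public static void main(String[] args) throws IOException {
        // Write side: exactly one UTF string.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF("label");

        // Read side: asks for more data than was ever written.
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()));
        in.readUTF();            // consumes everything that was written
        try {
            in.readUTF();        // nothing left -> EOFException
        } catch (EOFException e) {
            System.out.println("EOFException: read and write calls are not symmetric");
        }
    }
}
```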
I would really appreciate any help. Thanks.

You are getting this exception because you never check for end of file while reading. Try changing your readFields method to:
@Override
public void readFields(DataInput in) throws IOException {
    values.clear();
    byte[] b = new byte[1024];
    // checks for end of file
    if (((DataInputStream) in).read(b) != -1) {
        values.readFields(in);
        labelColumn = in.readLine();
    }
}
Thanks for the help, but I still hit the same error at values.readFields(in). Is that because the data was already consumed by the read in the if statement, which then returns -1? Should I wrap it in a try/catch instead? Also, don't I need to explicitly reset the MapWritable values somewhere? I'm not very familiar with serializing/deserializing complex structures.

I wrapped it in a try/catch, but simply printing the string labelColumn after in.readLine() shows a bunch of garbled data that should belong to the MapWritable object. If I can't get this working I may have to try a Java JSON library or something.

@Alexamos Yes, that garbage should be the content that was never deserialized. Something odd is going on here, but if you want a quick fix I'd suggest switching to JSON, which really is much easier. Take a look at this library:

I ended up using the Jackson JSON library for this. I upvoted your answer because it was your comment that pointed me there. Thanks.

Glad to hear it! ;)
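For reference, the underlying problem discussed above is that in.readLine() is not the inverse of any DataOutput call, and the 1024-byte probe read in the suggested fix itself consumes bytes from the stream. A sketch of a symmetric write()/readFields() pair using only java.io, where a plain Map stands in for MapWritable (Hadoop is not assumed on the classpath, and the class and accessor names here are hypothetical):

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-java.io analogue of a Hadoop Writable: write() and readFields()
// must produce/consume exactly the same bytes in the same order.
public class ExchangeDataSampleSketch {
    private final Map<String, String> values = new LinkedHashMap<>(); // stands in for MapWritable
    private String labelColumn;

    public Map<String, String> values() { return values; }

    public String labelColumn() { return labelColumn; }

    public void setLabelColumn(String labelColumn) { this.labelColumn = labelColumn; }

    public void write(DataOutput out) throws IOException {
        out.writeInt(values.size());           // length prefix, so the reader knows when to stop
        for (Map.Entry<String, String> e : values.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
        out.writeUTF(labelColumn);             // writeUTF, never paired with readLine()
    }

    public void readFields(DataInput in) throws IOException {
        values.clear();                        // instances are reused, so reset state first
        int n = in.readInt();                  // mirrors writeInt
        for (int i = 0; i < n; i++) {
            values.put(in.readUTF(), in.readUTF());
        }
        labelColumn = in.readUTF();            // mirrors writeUTF exactly
    }
}
```

Round-tripping an instance through a byte array (write into a DataOutputStream, then readFields from a DataInputStream over the same bytes) is an easy way to verify the symmetry before running a full MapReduce job.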