Java: SequenceFile is not created in Hadoop
I'm writing a MapReduce job to test some calculations. I split my input across the maps so that each map does one part of the calculation, and the result will be a list of (x, y) pairs, which I want to flush into a SequenceFile.

The map part goes fine, but when the reducer kicks in I get the following error:

Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://172.16.199.132:9000/user/hduser/FractalJob_1452257628594_410365359/out/reduce-out
Another observation is that this error only appears when I use more than one map.
Updated: here is my mapper and reducer code.
public static class RasterMapper extends Mapper<IntWritable, IntWritable, IntWritable, IntWritable> {

    private int imageS;
    private static Complex mapConstant;

    @Override
    public void setup(Context context) throws IOException {
        imageS = context.getConfiguration().getInt("image.size", -1);
        mapConstant = new Complex(context.getConfiguration().getDouble("constant.re", -1),
                context.getConfiguration().getDouble("constant.im", -1));
    }

    @Override
    public void map(IntWritable begin, IntWritable end, Context context)
            throws IOException, InterruptedException {
        for (int x = begin.get(); x < end.get(); x++) {
            for (int y = 0; y < imageS; y++) {
                float hue = 0, brightness = 0;
                Complex z = new Complex(2.0 * (x - imageS / 2) / (imageS / 2),
                        1.33 * (y - imageS / 2) / (imageS / 2));
                int icolor = startCompute(generateZ(z), 0);
                if (icolor != -1) {
                    brightness = 1f;
                }
                hue = (icolor % 256) / 255.0f;
                Color color = Color.getHSBColor(hue, 1f, brightness);
                try {
                    context.write(new IntWritable(x + y * imageS), new IntWritable(color.getRGB()));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

    private static Complex generateZ(Complex z) {
        return (z.times(z)).plus(mapConstant);
    }

    private static int startCompute(Complex z, int color) {
        if (z.abs() > 4) {
            return color;
        } else if (color >= 255) {
            return -1;
        } else {
            return startCompute(generateZ(z), color + 1);
        }
    }
}
public static class ImageReducer extends Reducer<IntWritable, IntWritable, WritableComparable<?>, Writable> {

    private SequenceFile.Writer writer;

    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        Path outDir = new Path(conf.get(FileOutputFormat.OUTDIR));
        Path outFile = new Path(outDir, "pixels-out");
        Option optPath = SequenceFile.Writer.file(outFile);
        Option optKey = SequenceFile.Writer.keyClass(IntWritable.class);
        Option optVal = SequenceFile.Writer.valueClass(IntWritable.class);
        Option optCom = SequenceFile.Writer.compression(CompressionType.NONE);
        try {
            writer = SequenceFile.createWriter(conf, optCom, optKey, optPath, optVal);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        try {
            writer.append(key, values.iterator().next());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        writer.close();
    }
}
Looking more closely at the logs, I found that the problem is in the way I feed data to the maps. I split the image size into several sequence files so that the maps can read from them and compute the colors of the pixels in that region.

This is how I create the files:
try {
    int offset = 0;
    // generate an input file for each map task
    for (int i = 0; i < mapNr; ++i) {
        final Path file = new Path(input, "part" + i);
        final IntWritable begin = new IntWritable(offset);
        final IntWritable end = new IntWritable(offset + imgSize / mapNr);
        offset = end.get();
        Option optPath = SequenceFile.Writer.file(file);
        Option optKey = SequenceFile.Writer.keyClass(IntWritable.class);
        Option optVal = SequenceFile.Writer.valueClass(IntWritable.class);
        Option optCom = SequenceFile.Writer.compression(CompressionType.NONE);
        SequenceFile.Writer writer = SequenceFile.createWriter(conf, optCom, optKey, optPath, optVal);
        try {
            writer.append(begin, end);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            writer.close();
        }
        System.out.println("Wrote input for Map #" + i);
    }
} catch (IOException e) {
    e.printStackTrace();
}
You don't have to worry about creating your own sequence files; MapReduce's output format can do that automatically. So in your driver class you would use:
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
job.setOutputFormatClass(SequenceFileOutputFormat.class);
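Put together, the driver could look roughly like the sketch below. The class name `FractalJobDriver` and the use of command-line arguments for the paths are assumptions; `RasterMapper` and `ImageReducer` are the classes from the question:

```java
// Minimal driver sketch, not the asker's exact code. Assumes the per-map
// SequenceFiles created above live under args[0] and output goes to args[1].
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class FractalJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "fractal");
        job.setJarByClass(FractalJobDriver.class);

        job.setMapperClass(RasterMapper.class);
        job.setReducerClass(ImageReducer.class);

        // Input: the directory of small (begin, end) SequenceFiles.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        SequenceFileInputFormat.addInputPath(job, new Path(args[0]));

        // Output: let the framework write the SequenceFile itself.
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```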
Then in your reducer you would write:
context.write(key, values.iterator().next());
and delete the setup and cleanup methods entirely.
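With the output format doing the writing, the whole reducer shrinks to something like the sketch below (the loop over values is included in case a key ever carries more than one value; with unique pixel indices as keys there will normally be exactly one):

```java
// Simplified reducer sketch: no manual SequenceFile.Writer, no setup/cleanup.
// SequenceFileOutputFormat writes the output file for us.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class ImageReducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Iterate defensively; drop the loop if one value per key is guaranteed.
        for (IntWritable value : values) {
            context.write(key, value);
        }
    }
}
```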
By the way, it looks like you don't need a reducer at all. If you aren't doing any computation in the reducer and aren't doing anything with the grouping (which I don't think you are), then why not just remove it? job.setOutputFormatClass(SequenceFileOutputFormat.class) will write your mapper output to sequence files.
If you only want one output file, set
job.setNumReduceTasks(1);
and as long as your final data is no larger than one block size, you will get the output you want.
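The two options differ in how many output files you get; a minimal fragment contrasting them (assuming the same `job` object as above):

```java
// Option A: no reducer at all. Fastest, since nothing is shuffled,
// but you get one output part file per map task.
job.setNumReduceTasks(0);

// Option B: a single pass-through reducer. One output file,
// at the cost of funneling all map output through one task.
// job.setNumReduceTasks(1);
```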
It's worth noting that you currently output only one value per key - you should make sure that is really what you want, and if it isn't, include a loop in the reducer to iterate over the values.

Sorry for reposting the question; I've updated it. A single reducer should force everything into one file.

Hey, sorry for the late response. I did what you said, but the logger still says the job failed because a task failed: failedMaps:1, failedReduces:0, along with an
Exception in thread "main" java.io.FileNotFoundException.
I'll add the configuration above; maybe that's where the problem is.