Scala: Reducer task is not called in my MapReduce job


This is a word-count MapReduce job. I have my own input format.

Job driver:

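import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

val job = new Job(new Configuration())

job.setMapperClass(classOf[CountMapper])
job.setReducerClass(classOf[CountReducer])

job.setJobName("tarun-test-1")
job.setInputFormatClass(classOf[MyInputFormat])
// args(0) is the input path, args(1) is the output path
FileInputFormat.setInputPaths(job, new Path(args(0)))
FileOutputFormat.setOutputPath(job, new Path(args(1)))

// Final (reducer) output types
job.setOutputKeyClass(classOf[Text])
job.setOutputValueClass(classOf[LongWritable])

job.setNumReduceTasks(1)

// waitForCompletion returns true only if the job succeeded
println("status: " + job.waitForCompletion(true))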
Mapper:

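import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Mapper

class CountMapper extends Mapper[LongWritable, Text, Text, LongWritable] {

    // Every word is emitted with a count of 1; the same instance is reused for each write.
    private val valueOut = new LongWritable(1L)

    override def map(k: LongWritable, v: Text, context: Mapper[LongWritable, Text, Text, LongWritable]#Context): Unit = {
        // Split the comma-separated line and emit (normalized word, 1) for each token.
        v.toString.split(",").foreach { word =>
            context.write(new Text(word.toLowerCase.trim), valueOut)
        }
    }
}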
Reducer:

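// Hadoop's reduce() takes a java.lang.Iterable. A bare `Iterable` in Scala
// resolves to scala.Iterable, which would not override the Hadoop method,
// so the Java type is imported explicitly.
import java.lang.Iterable

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Reducer

class CountReducer extends Reducer[Text, LongWritable, Text, LongWritable] {

    override def reduce(k: Text, values: Iterable[LongWritable], context: Reducer[Text, LongWritable, Text, LongWritable]#Context): Unit = {
        println("Inside reduce method..")
        // Sum all counts emitted for this key.
        val valItr = values.iterator()
        var sum = 0L
        while (valItr.hasNext) {
            sum = sum + valItr.next().get()
        }

        context.write(k, new LongWritable(sum))
        println("done reducing.")
    }
}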
The mapper is being invoked, and according to the logs the RecordReader is reading the splits correctly. However, the reducer is never called.

Try setting:
job.setMapOutputKeyClass and job.setMapOutputValueClass
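
As a rough sketch of that suggestion, the intermediate (map output) types can be declared next to the final output types. The object name MapOutputTypesSketch and the helper configureTypes are hypothetical and only used for illustration; the setter calls are the ones on org.apache.hadoop.mapreduce.Job. When the map output types match the final output types, as they do in this question, Hadoop falls back to the final types, so this is probably not the cause here, but stating them explicitly does no harm.

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.Job

object MapOutputTypesSketch {
  // Hypothetical helper: declares both the intermediate and the final
  // key/value types on an already created job.
  def configureTypes(job: Job): Unit = {
    // Types written by the mapper and read by the reducer
    job.setMapOutputKeyClass(classOf[Text])
    job.setMapOutputValueClass(classOf[LongWritable])
    // Types written by the reducer to the output files
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[LongWritable])
  }
}
```

In the driver above this would be called as MapOutputTypesSketch.configureTypes(job) before job.waitForCompletion(true).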

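What do you mean by having your own input format? Where is it? What do you mean by reduce not being called? How do you know? Is there any input/output? Counters? Errors? Logs?

MyInputFormat is my own InputFormat. The InputFormat works as expected, and I can see the RecordReader reading the input (key, value) pairs for the mapper correctly. I added logging to the map task and it logs things as expected. However, the reduce logs are never printed, and the final status is false.

MapOutputKeyClass and MapOutputValueClass are not required if they are the same as OutputKeyClass and OutputValueClass.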
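
One way to answer the "Counters?" question is to print Hadoop's built-in task counters once the job has finished. This is a minimal sketch, assuming it is called after job.waitForCompletion has returned; the object name JobDiagnostics and the method printShuffleCounters are hypothetical. A non-zero MAP_OUTPUT_RECORDS together with a zero REDUCE_INPUT_RECORDS would suggest the failure happens between the map and reduce phases rather than inside reduce itself.

```scala
import org.apache.hadoop.mapreduce.{Job, TaskCounter}

object JobDiagnostics {
  // Hypothetical helper: prints the counters that show whether records
  // left the mappers and reached the reducer.
  def printShuffleCounters(job: Job): Unit = {
    // getCounters may return null if the counters cannot be retrieved
    Option(job.getCounters) match {
      case Some(counters) =>
        Seq(
          TaskCounter.MAP_OUTPUT_RECORDS,
          TaskCounter.REDUCE_INPUT_GROUPS,
          TaskCounter.REDUCE_INPUT_RECORDS,
          TaskCounter.REDUCE_OUTPUT_RECORDS
        ).foreach { c =>
          println(s"${c.name}: ${counters.findCounter(c).getValue}")
        }
      case None =>
        println("Counters are not available for this job.")
    }
  }
}
```

Note that println output from inside reduce goes to the reduce task's stdout log on the cluster, not to the driver console, so the task logs are the place to look for "Inside reduce method..".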