Hadoop: error when using TotalOrderPartitioner in MapReduce

I have written the program below. I ran it without the TotalOrderPartitioner and it worked fine, so I don't think there is any problem with the Mapper or Reducer classes themselves.

However, when I add the TotalOrderPartitioner code, i.e. write the partition file and then put it into the DistributedCache, I get the following error and really don't know how to proceed:

[train@sandbox TOTALORDERPARTITIONER]$ hadoop jar totalorderpart.jar average.AverageJob countries totpart

// countries is the input directory, totpart is the output directory

16/01/18 04:14:00 INFO input.FileInputFormat: Total input paths to process : 4
16/01/18 04:14:00 INFO partition.InputSampler: Using 6 samples
16/01/18 04:14:00 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/01/18 04:14:00 INFO compress.CodecPool: Got brand-new compressor [.deflate]
java.io.IOException: wrong key class: org.apache.hadoop.io.LongWritable is not class org.apache.hadoop.io.Text
    at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.append(SequenceFile.java:1380)
    at org.apache.hadoop.mapreduce.lib.partition.InputSampler.writePartitionFile(InputSampler.java:340)
    at average.AverageJob.run(AverageJob.java:132)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at average.AverageJob.main(AverageJob.java:146)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
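From the stack trace, the failure happens inside InputSampler.writePartitionFile, before the job itself starts. My understanding (which may be wrong) is that the sampler reads keys directly from the job's InputFormat, and TextInputFormat produces LongWritable byte offsets, while the partition file is written with the map output key class (Text). Below is a minimal sketch of what I mean, in which the two types agree; KeyValueTextInputFormat (which assumes tab-separated input), the class name SamplerTypeSketch, and the path sampler-input are placeholders for illustration only:

package average;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;

// Minimal sketch: KeyValueTextInputFormat emits (Text, Text) records, so the
// keys InputSampler collects have the same class as the map output key and
// writePartitionFile can append them without the "wrong key class" error.
public class SamplerTypeSketch {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setInputFormatClass(KeyValueTextInputFormat.class); // keys are Text, not LongWritable
        job.setMapOutputKeyClass(Text.class);
        job.setNumReduceTasks(6); // writePartitionFile emits numReduceTasks - 1 split points
        FileInputFormat.setInputPaths(job, new Path("sampler-input")); // hypothetical path
        InputSampler.Sampler<Text, Text> sampler =
                new InputSampler.RandomSampler<Text, Text>(0.2, 6, 5);
        InputSampler.writePartitionFile(job, sampler);
    }
}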

My code:

package average;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.StringUtils;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;


public class AverageJob extends Configured implements Tool {

public enum Counters {MAP, COMBINE, REDUCE};

public static class AverageMapper extends Mapper<LongWritable, Text, Text, Text> {

    private Text mapOutputKey = new Text();
    private Text mapOutputValue = new Text();
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

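        // Hadoop's StringUtils.split(str, escapeChar, separator): split the
        // line on ',' while treating '\' as an escape character.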
        String[] words = StringUtils.split(value.toString(), '\\', ',');
        mapOutputKey.set(words[1].trim());

        StringBuilder moValue = new StringBuilder();
        moValue.append(words[9].trim()).append(",1");
        mapOutputValue.set(moValue.toString());
        context.write(mapOutputKey, mapOutputValue);

        context.getCounter(Counters.MAP).increment(1);
    }
}

public static class AverageCombiner extends Reducer<Text, Text, Text, Text> {
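    // Emits partial (sum,count) pairs instead of averages: averaging is not
    // associative, so only the final reducer divides sum by count.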

    private Text combinerOutputValue = new Text();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        int count=0;
        long sum=0;
        for(Text value: values)
        {
            String[] strValues = StringUtils.split(value.toString(), ','); 
            sum+= Long.parseLong(strValues[0]);
            count+= Integer.parseInt(strValues[1]);
        }
        combinerOutputValue.set(sum + "," + count);
        context.write(key, combinerOutputValue);

        context.getCounter(Counters.COMBINE).increment(1);
    }
}


public static class AverageReducer extends Reducer<Text, Text, Text, DoubleWritable> {


    private DoubleWritable reduceOutputKey = new DoubleWritable();

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        int count=0;
        double sum=0;
        for(Text value: values)
        {
            String[] strValues = StringUtils.split(value.toString(), ',');
            sum+= Double.parseDouble(strValues[0]);
            count+= Integer.parseInt(strValues[1]);
        }

        reduceOutputKey.set(sum/count);
        context.write(key, reduceOutputKey);

        context.getCounter(Counters.REDUCE).increment(1);
    }

}


@Override
public int run(String[] args) throws Exception {

    Configuration conf = getConf();
    Job job = Job.getInstance(conf);
    job.setJarByClass(getClass());

    Path in = new Path(args[0]);
    Path out = new Path(args[1]);
    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(DoubleWritable.class);

    job.setMapperClass(AverageMapper.class);
    job.setCombinerClass(AverageCombiner.class);

    job.setPartitionerClass(TotalOrderPartitioner.class);
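    // TotalOrderPartitioner routes each key to a reducer by comparing it
    // against the sampled split points, so the reducers' outputs, taken in
    // order, form one globally sorted result.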

    job.setReducerClass(AverageReducer.class);

    job.setNumReduceTasks(6);

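    // RandomSampler(freq, numSamples, maxSplitsSampled): sample each key with
    // probability 0.2, keep at most 6 samples, reading at most 5 input splits.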
    InputSampler.Sampler<Text, Text> sampler = new InputSampler.RandomSampler<Text, Text>(0.2, 6, 5);
    InputSampler.writePartitionFile(job, sampler);

    String partitionFile = TotalOrderPartitioner.getPartitionFile(conf);
    URI partitionUri = new URI(partitionFile + "#" + TotalOrderPartitioner.DEFAULT_PATH);
    job.addCacheFile(partitionUri);
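    // The "#" fragment symlinks the cached file into each task's working
    // directory as TotalOrderPartitioner.DEFAULT_PATH ("_partition.lst"),
    // where the partitioner looks for it by default.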

    return job.waitForCompletion(true)?0:1;
}

public static void main(String[] args) {

    int result=0;
    try
    {
        result = ToolRunner.run(new Configuration(), new AverageJob(), args);
        System.exit(result);
    }
    catch (Exception e)
    {
        e.printStackTrace();            
    }
}
}