How do I sort a dataset by a column in descending order using Java Hadoop MapReduce?

My data file is:

Utsav   Chatterjee  Dangerous   Soccer  Coldplay    4
Rodney  Purtle  Awesome Football    Maroon5 3
Michael Gross   Amazing Basketball  Iron Maiden 6
Emmanuel    Ezeigwe Cool    Pool    Metallica   5
John    Doe Boring  Golf    Linkin Park 8
David   Bekham  Godlike Soccer  Justin Beiber   89
Abhishek    Kumar   Geek    Cricket Abhishek Kumar  7
Abhishek    Singh   Geek    Cricket Abhishek Kumar  7
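I want to pass the column number as an argument when invoking hadoop jar, and I need to sort the whole dataset in descending order by that column. I can easily do this in ascending order by emitting the chosen column as the key in the mapper output. However, I cannot get it to work in descending order.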

My Mapper and Reducer code is:

    public static class Map extends Mapper<LongWritable,Text,Text,Text>{
        // Not static: a static map() cannot override Mapper.map().
        @Override
        public void map(LongWritable key, Text value, Context context)
        throws IOException,InterruptedException 
        {
            // Column index passed in from the driver through the job configuration.
            Configuration conf = context.getConfiguration();
            String param = conf.get("columnRef");
            int colref = Integer.parseInt(param);
            String line = value.toString();
            String[] parts = line.split("\t");
            // Emit the selected column as the key so the shuffle sorts on it.
            context.write(new Text(parts[colref]), value);
        }
    }

    public static class Reduce extends Reducer<Text,Text,Text,Text>{
        public void reduce(Text key, Iterable<Text> value, Context context)
        throws IOException,InterruptedException 
        {
            for (Text text : value) {
                context.write(text,null );
            }
        }
    }
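I suspect the problem is in my comparator. Can anyone help? When I run this with the column at index 5 (the last, numeric column) as the sort key, I still get the results in ascending order.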

Driver class:

public static void main(String[] args) throws Exception {

        Configuration conf= new Configuration();
        conf.set("columnRef", args[2]);

        Job job = new Job(conf, "Sort");

        job.setJarByClass(Sort.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setSortComparatorClass(DescendingKeyComparator.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        Path outputPath = new Path(args[1]);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        outputPath.getFileSystem(conf).delete(outputPath);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
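Any suggestions on how to accomplish this (descending order) would be very helpful!!
Thanks.

In your driver class, look at the following line of code:
job.setSortComparatorClass(DescendingKeyComparator.class);

You have set the class to DescendingKeyComparator.class. Set it to sortComparator.class instead. It should work.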
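
For reference, the ordering a descending sort comparator needs to produce can be sketched in plain Java. This is only an illustration under assumptions: the class and method names are hypothetical, java.util.Comparator stands in for Hadoop's comparator machinery so the snippet runs without a cluster, and String ordering stands in for Text ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not from the original post: mimics
// "emit one column as the key, then sort the keys in descending order".
final class DescendingColumnSketch {

    // Sorts tab-separated rows by the text of column `col`, largest key first.
    static List<String> sortByColumnDescending(List<String> rows, int col) {
        // Ascending order on the extracted key, comparable to the default sort on Text keys.
        Comparator<String> ascending = Comparator.comparing(row -> row.split("\t")[col]);
        List<String> sorted = new ArrayList<>(rows);
        // Reversing the comparator flips the order; List.sort is stable, so ties keep input order.
        sorted.sort(ascending.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "Utsav\tChatterjee\tDangerous\tSoccer\tColdplay\t4",
            "Rodney\tPurtle\tAwesome\tFootball\tMaroon5\t3",
            "Michael\tGross\tAmazing\tBasketball\tIron Maiden\t6");
        sortByColumnDescending(rows, 5).forEach(System.out::println);
    }
}
```

In a real job the same reversal would presumably live in the class registered through job.setSortComparatorClass, typically a WritableComparator subclass. Note that this sketch still compares keys as text, not as numbers.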

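I think the reason it is not sorted as expected is that the value being sorted on is emitted by the mapper as Text. It should be LongWritable/IntWritable. Text keys are compared byte by byte, so "10" sorts before "9".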
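
The difference between text and numeric ordering can be seen in a small standalone sketch. It uses plain Java strings and longs in place of Text and LongWritable, which is an assumption made so that it runs without Hadoop, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not from the original post.
final class NumericVersusTextOrder {

    // Descending order when the keys are compared as text.
    static List<String> descendingAsText(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.reverseOrder());
        return sorted;
    }

    // Descending order when the keys are parsed and compared as numbers.
    static List<String> descendingAsNumbers(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort(Comparator.comparingLong((String k) -> Long.parseLong(k.trim())).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("4", "89", "7", "10");
        System.out.println(descendingAsText(keys));
        System.out.println(descendingAsNumbers(keys));
    }
}
```

With the keys 4, 89, 7 and 10, the text comparison yields 89, 7, 4, 10, while the numeric comparison yields 89, 10, 7, 4. The sample data in the question happens to sort the same way under both, which can hide the problem until a value such as 10 appears.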