Is there a way to select only the row with the highest value for each title in MapReduce (Hadoop)?

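The data below shows a title, a month, and the value (sum) for each combination of title (key) and month (key). For each title I want to select only the row with the highest value, for example "Fly 08 (09, 11) 4" or "Go 06 45", as you can see in my actual output. Please let me know whether this is possible. If you have any questions, tell me and I will try to clarify.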

Fly,07,1
Fly,08,4
Fly,09,4
Fly,10,1
Fly,11,4
Fly,12,2
Gentle Ben,05,2
Gentle Ben,06,3
Gentle Ben,07,2
Gentle Ben,08,2
Gentle Ben,09,2
German aircraft guns and cannons,11,1
Go,04,20
Go,05,29
Go,06,45
Go,07,24
Go,08,28
Go,09,37

You can read the values in the mapper and compute the maximum in the reducer, like this:

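import java.io.IOException;
import java.util.stream.StreamSupport;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTileValue {

    public static class MaxTileValueMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Each input line is expected to be: title,month,value
            String[] row = value.toString().split(",");
            if (row.length == 3) {
                String tile = row[0];
                String val = row[2];
                // Emit (title, value); the month is dropped in this version.
                context.write(new Text(tile), new IntWritable(Integer.parseInt(val.trim())));
            }
        }
    }

    public static class MaxTileValueReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // All values for one title arrive together; keep the largest.
            int max = StreamSupport.stream(values.spliterator(), false)
                    .mapToInt(IntWritable::get)
                    .max()
                    .orElse(0);
            context.write(key, new IntWritable(max));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "MaxTileValue");

        // Lets Hadoop locate the jar that contains these classes on the cluster.
        job.setJarByClass(MaxTileValue.class);
        job.setMapperClass(MaxTileValueMapper.class);
        job.setReducerClass(MaxTileValueReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // args[0] is the input path, args[1] the output path (must not exist yet).
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}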
The output of the above MapReduce job on this CSV is:

Fly     4
Gentle Ben      3
German aircraft guns and cannons        1
Go      45

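You need to send the first column to the reducer as the key and the remaining two columns as the value, so that all rows starting with the same key go to the same reducer, which then finds the maximum. In the reducer, iterate over the rows and keep track of the highest value. If only one row has the maximum, the second column of the output holds a single month; otherwise the months of all tied rows are appended. Here is the code for your reference: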

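// MaxValueGroupedMapper.java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxValueGroupedMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] val = value.toString().split(",");

        // Skip malformed rows; each line is expected to be title,month,value.
        if (val.length != 3) {
            return;
        }

        // Key: title. Value: "month,value".
        context.write(new Text(val[0]), new Text(val[1] + "," + val[2]));
    }
}

// MaxValueGroupedReducer.java
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxValueGroupedReducer extends Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // Start below any real value so the first row always becomes the maximum.
        int max = Integer.MIN_VALUE;
        StringBuilder months = new StringBuilder();

        for (Text txt : values) {
            String[] st = txt.toString().split(",");
            int data = Integer.parseInt(st[1].trim());
            if (data > max) {
                // New maximum: forget the months collected so far.
                max = data;
                months.setLength(0);
                months.append(st[0]);
            } else if (data == max) {
                // Tie: append this month to the list.
                months.append(',').append(st[0]);
            }
        }

        // Output value: "month[,month...],max".
        context.write(key, new Text(months + "," + max));
    }
}

// MaxValueGroupedDriver.java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxValueGroupedDriver {

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        conf.set("mapreduce.job.queuename", "default");
        // Job.getInstance replaces the deprecated "new Job(conf, name)" constructor.
        Job job = Job.getInstance(conf, "MaxValue");

        job.setJarByClass(MaxValueGroupedDriver.class);
        job.setMapperClass(MaxValueGroupedMapper.class);
        job.setReducerClass(MaxValueGroupedReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Exit with a non-zero status if the job fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}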
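
The tie handling in a reducer like this can be checked without a cluster. The following is a minimal plain-Java sketch, not part of the original answer: the class name LocalMaxPerTitle and the method maxPerTitle are hypothetical, and a LinkedHashMap stands in for Hadoop's shuffle phase, which groups the mapper output by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for local experiments; it does not use any Hadoop classes.
final class LocalMaxPerTitle {

    // Returns one "title,month[,month...],max" line per title, in first-seen order.
    static List<String> maxPerTitle(List<String> rows) {
        // "Map" + "shuffle": group "month,value" strings by title.
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String row : rows) {
            String[] cols = row.split(",");
            if (cols.length != 3) {
                continue; // skip malformed rows
            }
            grouped.computeIfAbsent(cols[0], k -> new ArrayList<>()).add(cols[1] + "," + cols[2]);
        }

        // "Reduce": per title, track the maximum and every month that reaches it.
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            int max = Integer.MIN_VALUE;
            StringBuilder months = new StringBuilder();
            for (String monthValue : e.getValue()) {
                String[] st = monthValue.split(",");
                int data = Integer.parseInt(st[1].trim());
                if (data > max) {
                    max = data;
                    months.setLength(0);
                    months.append(st[0]);
                } else if (data == max) {
                    months.append(',').append(st[0]);
                }
            }
            out.add(e.getKey() + "," + months + "," + max);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "Fly,07,1", "Fly,08,4", "Fly,09,4", "Fly,10,1", "Fly,11,4", "Fly,12,2",
                "Go,05,29", "Go,06,45", "Go,07,24");
        // Prints "Fly,08,09,11,4" and then "Go,06,45".
        maxPerTitle(rows).forEach(System.out::println);
    }
}
```

Resetting the month list whenever a strictly larger value appears is what keeps months of earlier, smaller maxima out of the result.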

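Comments:

Sir, is there a way to select only the single title (row) that has the highest value across all rows?

Yes, you can do that. Set the number of reducers to 1, and in your map class change the key passed to context.write to some dummy value, for example context.write(new Text("A"), value). This ensures that all rows go to a single reducer.

Sorry, I still don't get the point. Is it correct to just add one more context.write? Could you explain it a bit more simply, please?

No, just replace the current context.write with this one.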
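
The single-reducer idea from the comments can be illustrated with a small plain-Java simulation. This is a sketch under assumptions, not the commenter's code: LocalGlobalMax, DUMMY_KEY, mapAll and reduceMax are hypothetical names. In an actual Hadoop job the corresponding steps would be calling job.setNumReduceTasks(1) in the driver and emitting the whole row under one constant key in the mapper, so that a single reduce call sees every row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for local experiments; it does not use any Hadoop classes.
final class LocalGlobalMax {

    // Every row is emitted under this constant key, so one group holds all rows.
    static final String DUMMY_KEY = "A";

    // "Map" step: emit (DUMMY_KEY, wholeRow) for each input row.
    static Map<String, List<String>> mapAll(List<String> rows) {
        Map<String, List<String>> grouped = new HashMap<>();
        for (String row : rows) {
            grouped.computeIfAbsent(DUMMY_KEY, k -> new ArrayList<>()).add(row);
        }
        return grouped;
    }

    // "Reduce" step: scan all rows of the single group and keep the highest one.
    // On a tie, the first row seen wins.
    static String reduceMax(List<String> rows) {
        String best = null;
        int max = Integer.MIN_VALUE;
        for (String row : rows) {
            String[] cols = row.split(",");
            if (cols.length != 3) {
                continue; // skip malformed rows
            }
            int data = Integer.parseInt(cols[2].trim());
            if (data > max) {
                max = data;
                best = row;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("Fly,08,4", "Gentle Ben,06,3", "Go,06,45", "Go,09,37");
        // Prints "Go,06,45".
        System.out.println(reduceMax(mapAll(rows).get(DUMMY_KEY)));
    }
}
```

Routing everything through one key gives up parallelism in the reduce phase, which is acceptable here because only one result row is wanted.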

Output of the grouped job (MaxValueGroupedDriver) on the sample data:

Fly,08,09,11,4
Gentle Ben,06,3
German aircraft guns and cannons,11,1
Go,06,45