How to find the double word count in Hadoop using MapReduce and Java


hadoop mapreduce


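I need MapReduce code in Java to compute a double word count (a count of each pair of adjacent words) in Hadoop.

Input:

"What is your name? What do you want from me? You know the best way to earn money is hard work. What is your aim?"

Double W.C. output: what is 2, is your 2, your name 1, what do you want ...

A quick response would be much appreciated.

Thanks in advance.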

The code below works for me:

package hadoop;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class doubleWc {

    // Mapper: emits ("left,right", 1) for every pair of adjacent comma-separated fields on a line.
    public static class doubMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final Text outkey = new Text();
        private final IntWritable outvalue = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text values, Context context)
                throws IOException, InterruptedException {
            // The input is comma-separated; a line with no comma yields no pairs.
            String[] cols = values.toString().split(",");
            for (int i = 0; i < cols.length - 1; i++) {
                outkey.set(cols[i] + "," + cols[i + 1]);
                context.write(outkey, outvalue);
            }
        }
    }

    // Reducer: sums the 1s emitted for each pair to get its total count.
    public static class douReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable outvalue = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable t : values) {
                sum += t.get();
            }
            outvalue.set(sum);
            context.write(key, outvalue);
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated Job constructor.
        Job job = Job.getInstance(conf, "double program");

        job.setJarByClass(doubleWc.class);
        job.setMapperClass(doubMapper.class);
        job.setReducerClass(douReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path (must not exist yet)

        // Exit with 0 on success and 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
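To make the map and reduce steps easier to follow, here is a minimal stand-alone sketch that imitates the same data flow in plain Java, without Hadoop. The class name BigramFlowSketch and the methods mapLine and countPairs are hypothetical names chosen for illustration, and the in-memory TreeMap only stands in for Hadoop's shuffle and reduce phases; it is not how the framework itself is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone illustration; not part of the Hadoop job.
class BigramFlowSketch {

    // "Map" step: emit one "left,right" key per pair of adjacent fields.
    static List<String> mapLine(String line) {
        String[] cols = line.split(",");
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < cols.length - 1; i++) {
            keys.add(cols[i] + "," + cols[i + 1]);
        }
        return keys;
    }

    // "Shuffle + reduce" step: group identical keys and add 1 for each occurrence.
    static Map<String, Integer> countPairs(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String key : mapLine(line)) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("what,is,your,name", "what,is,your,aim");
        // Prints one "pair<TAB>count" line per distinct pair, sorted by key.
        countPairs(lines).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

For the two sample lines, "what,is" and "is,your" each get a count of 2, while "your,name" and "your,aim" each get 1. Pairs are formed only within a line, which matches the mapper above because it processes one input line per call.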

Your solution is good. Just add some explanation of the map and reduce functions :)

This code runs fine without any errors, but it produces an empty output file..!

Works fine..! After changing the mapper code as follows: public class DoubleMapper1 extends Mapper{ Text outkey=new Text(); IntWritable outvalue=new IntWritable(); public void map(LongWritable key, Text values, Context context) throws IOException, InterruptedException{ String[] cols=values.toString().split("\\w+"); for(int i=0;i ...

Ya.. my input file is "," separated, which is why I used split(","). Thanks
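The empty output mentioned in the comments is consistent with how String.split behaves: a line that contains no comma comes back as a single-element array, so a loop over adjacent pairs never runs. For space-separated text such as the sentence in the question, one possible approach is to split on runs of non-word characters ("\\W+"); note that lowercase "\\w+" matches the words themselves, so splitting on it returns the separators rather than the words. The sketch below is plain Java, and WordPairTokenizerSketch, tokens, and pairs are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper showing one way to tokenize free text into adjacent word pairs.
class WordPairTokenizerSketch {

    // Lower-case the line, split on runs of non-word characters, and drop empty tokens.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        for (String t : line.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Build "left right" strings for every pair of adjacent words.
    static List<String> pairs(String line) {
        List<String> words = tokens(line);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.size() - 1; i++) {
            out.add(words.get(i) + " " + words.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "What is your name? What do you want from me?";
        // No comma in the line, so split(",") returns one element and a pair loop emits nothing.
        System.out.println(line.split(",").length);
        System.out.println(pairs(line));
    }
}
```

In a mapper, the body of pairs would replace the split(",") loop, with each pair written to the context alongside a count of 1. This sketch also pairs words across sentence boundaries ("name what"), which may or may not be wanted.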