Java: sending multiple values to the reducer in MapReduce


I have written some code that performs an operation similar to a SQL GROUP BY.

The dataset I am working with looks like this:


25078868141920090906200937200909619,Sunday,Weekend,Online,Morning,Outgoing,Voice,25078,PayPerSecond,ServicePublishSuccess,17,0,1,21.25635-10-112-30455
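The mapper below builds its key by concatenating fields 5, 8, and 10 of this comma-separated record and takes its value from field 17. A minimal, Hadoop-free sketch of that parsing (the sample line is a simplified stand-in for the real record layout, which is an assumption here):

```java
public class ParseDemo {
    public static void main(String[] args) {
        // Simplified sample record with the same comma-separated shape as the dataset.
        String line = "a,b,c,d,e,Sunday,f,g,Weekend,h,Morning,i,j,k,3.5,l,m,21.25";
        String[] attribute = line.split(",");
        // Key: fields 5, 8 and 10 concatenated; value: field 17 parsed as a double.
        String comb = attribute[5].concat(attribute[8]).concat(attribute[10]);
        double rs = Double.parseDouble(attribute[17]);
        System.out.println(comb + " -> " + rs); // prints "SundayWeekendMorning -> 21.25"
    }
}
```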


public class MyMap extends Mapper<LongWritable, Text, Text, DoubleWritable> {
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
{
String line = value.toString();
String[] attribute = line.split(",");
double rs = Double.parseDouble(attribute[17]);
String comb = new String();
comb = attribute[5].concat(attribute[8].concat(attribute[10]));
context.write(new Text(comb), new DoubleWritable(rs));
}
}
public class MyReduce extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
throws IOException, InterruptedException {
double sum = 0;
Iterator<DoubleWritable> iter = values.iterator();
while (iter.hasNext())
{
double val = iter.next().get();
sum = sum + val;
}
context.write(key, new DoubleWritable(sum));
}
}

In the mapper, the value `rs` carries the 17th field to the reducer for summing. Now I also want to sum the 14th field. How can I send it to the reducer as well?

If your data types are the same, creating an ArrayWritable subclass should do the trick. The class would look something like:

public class DblArrayWritable extends ArrayWritable 
{ 
    public DblArrayWritable() 
    { 
        super(DoubleWritable.class); 
    }
}
The mapper class would then look like:

public class MyMap extends Mapper<LongWritable, Text, Text, DblArrayWritable> 
{
  public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException 
  {

    String line = value.toString();
    String[] attribute=line.split(",");
    // The raw doubles must be wrapped in DoubleWritable before going into the array.
    DoubleWritable[] values = new DoubleWritable[2];
    values[0] = new DoubleWritable(Double.parseDouble(attribute[14]));
    values[1] = new DoubleWritable(Double.parseDouble(attribute[17]));

    String comb=new String();
    comb=attribute[5].concat(attribute[8].concat(attribute[10]));

    // set() is inherited from ArrayWritable; it cannot be chained onto the constructor.
    DblArrayWritable outValue = new DblArrayWritable();
    outValue.set(values);
    context.write(new Text(comb), outValue);

  }
}
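The answer shows the mapper but not a matching reducer. A hedged sketch of one (assuming the job is configured with `DblArrayWritable` as the map output value class) would unpack the array and keep two running sums:

```java
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReduce extends Reducer<Text, DblArrayWritable, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<DblArrayWritable> values, Context context)
            throws IOException, InterruptedException {
        double sum14 = 0, sum17 = 0;
        for (DblArrayWritable val : values) {
            // get() returns the Writable[] stored by set() on the map side.
            Writable[] arr = val.get();
            sum14 += ((DoubleWritable) arr[0]).get();
            sum17 += ((DoubleWritable) arr[1]).get();
        }
        // Two sums to emit, so write them as a comma-separated Text value.
        context.write(key, new Text(sum14 + "," + sum17));
    }
}
```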
You could also handle this by simply concatenating the values and passing them to the reducer as Text; the reducer then splits them apart again.
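A minimal, Hadoop-free sketch of that idea: pack the two numeric fields into one delimited string on the map side and split them back apart on the reduce side (the `|` delimiter is an arbitrary choice; pick one that cannot occur in the data):

```java
public class ConcatDemo {
    public static void main(String[] args) {
        // Map side: pack both numeric fields into one delimited value.
        double field14 = 3.5, field17 = 21.25;
        String packed = field14 + "|" + field17;

        // Reduce side: split the value and parse the fields back out.
        String[] parts = packed.split("\\|");
        double v14 = Double.parseDouble(parts[0]);
        double v17 = Double.parseDouble(parts[1]);
        System.out.println(v14 + " " + v17); // prints "3.5 21.25"
    }
}
```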

Another option is to implement your own Writable class. Here is an example of how that could work:

public static class PairWritable implements Writable 
{
   private double myDouble;
   private String myString;

    // Hadoop serialization: Writable interface methods.
    // Fields must be read in the same order they were written.
    @Override
    public void readFields(DataInput in) throws IOException {
            myDouble = in.readDouble();
            myString = in.readUTF();
    }

    @Override
    public void write(DataOutput out) throws IOException {
            out.writeDouble(myDouble);
            out.writeUTF(myString);
    }

    //End of Implementation

    //Getter and setter methods for the myDouble and myString fields
    public void set(double d, String s) {
        myDouble = d;
        myString = s;
    }

    public double getDouble() {
        return myDouble;
    }
    public String getString() {
        return myString;
    }

}
public class ObjArrayWritable extends ArrayWritable 
{ 
    public ObjArrayWritable() 
    { 
        // ArrayWritable requires a Writable element type; Object.class would
        // not compile, so use the PairWritable defined above.
        super(PairWritable.class); 
    }
}
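For completeness, a hedged sketch of how the mapper's `map()` method might emit this PairWritable (hypothetical usage, assuming the job sets PairWritable as the map output value class and that `attribute` and `comb` are built as in the question's mapper):

```java
// Inside map(): pack the numeric field and a second field into one value.
PairWritable pair = new PairWritable();
pair.set(Double.parseDouble(attribute[17]), attribute[14]);
context.write(new Text(comb), pair);
```

On the reduce side, `values` would then be an `Iterable<PairWritable>`, and each element exposes the two fields through its getters.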