Java: Set Writable in Hadoop?

java hadoop mapreduce

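I am trying to create a SetWritable in Hadoop. Here is my implementation. I have only just started with MapReduce and cannot figure out how to do this. I wrote the code below, but it does not work.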

Custom Writable (needs to hold a Set):

public class TextPair implements Writable {

    private Text first;
    public HashSet<String> valueSet = new HashSet<String>();
    public TextPair() {

    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(valueSet.size());
        Iterator<String> it = valueSet.iterator();
        while (it.hasNext()) {
            this.first = new Text(it.next());
            first.write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        Iterator<String> it = valueSet.iterator();
        while (it.hasNext()) {
            this.first = new Text(it.next());
            first.readFields(in);
        }
    }

}

Mapper code:

public class TokenizerMapper extends Mapper<Object, Text, Text, TextPair> {

    ArrayList<String> al = new ArrayList<String>();
    TextPair tp = new TextPair();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {

        String [] val = value.toString().substring(2,value.toString().length()).split(" ");

        for(String v: val) {
            tp.valueSet.add(v);
        }
        String [] vals = value.toString().split(" ");

        for(int i=0; i<vals.length-1; i++) {
            setKey(vals[0],vals[i+1]);
            System.out.println(getKey());
            context.write(new Text(getKey()), tp); 
        }
    }

    public void setKey(String first,String second) {

        al.clear();
        al.add(first);
        al.add(second);

        java.util.Collections.sort(al);
    }

    public String getKey() {

        String tp = al.get(0)+al.get(1);
        return tp;
    }
 }

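I would say the problem is in how you read and write. Your write method emits the size of the set, but readFields never reads it back; it iterates over valueSet, which is empty on the receiving side, so nothing is read. You need to know how large the set is and use that to read the correct number of Text objects.

I changed your version to hold a Set of Text objects, since they can be read and written easily.
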
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class TextWritable implements Writable {

    private Set<Text> values;

    public TextWritable() {
        values = new HashSet<Text>();
    }

    @Override
    public void write(DataOutput out) throws IOException {

        // Write out the size of the Set
        out.writeInt(values.size());

        // Write out each Text object
        for(Text t : values) {
            t.write(out);
        }
    }

    @Override
    public void readFields(DataInput in) throws IOException {

        // Start from an empty HashSet so values from a previous read are dropped
        values = new HashSet<Text>();

        // Get the number of elements in the set
        int size = in.readInt();

        // Read the correct number of Text objects
        for(int i=0; i<size; i++) {
            Text t = new Text();
            t.readFields(in);
            values.add(t);
        }
    }
}
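
The size-prefix pattern above can be tried without a Hadoop installation. The following is only a minimal sketch under stated assumptions: `StringSetSketch` and `roundTrip` are hypothetical names that do not appear in the question or answer, and `writeUTF`/`readUTF` from `java.io` stand in for `Text.write` and `Text.readFields`. It writes a set to a byte buffer and reads it back into a fresh instance, which is roughly what happens between the map and reduce sides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a Writable holding a set of strings.
// It mirrors the write/readFields contract but depends only on java.io.
public class StringSetSketch {

    private final Set<String> values = new HashSet<>();

    public Set<String> getValues() {
        return values;
    }

    // Write the size prefix first, then every element.
    public void write(DataOutput out) throws IOException {
        out.writeInt(values.size());
        for (String s : values) {
            out.writeUTF(s); // assumption: stands in for Text.write(out)
        }
    }

    // Drop old state, read the size prefix, then read exactly that many elements.
    public void readFields(DataInput in) throws IOException {
        values.clear();
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            values.add(in.readUTF()); // assumption: stands in for Text.readFields(in)
        }
    }

    // Serialize the input into a buffer and deserialize it into a fresh instance.
    public static Set<String> roundTrip(Set<String> input) {
        try {
            StringSetSketch source = new StringSetSketch();
            source.getValues().addAll(input);
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.write(new DataOutputStream(buffer));

            StringSetSketch target = new StringSetSketch();
            target.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return target.getValues();
        } catch (IOException e) {
            // In-memory streams should not fail; rethrow unchecked if they do.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Set<String> input = new HashSet<>(Arrays.asList("a", "b", "c"));
        System.out.println(input.equals(roundTrip(input))); // prints "true"
    }
}
```

Clearing the set at the start of `readFields` matters because Hadoop may reuse the same Writable instance for several records, so leftover elements from an earlier record would otherwise leak into the next one.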
