Hadoop MapReduce: iterating over the input values of a reduce call

hadoop mapreduce

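I am testing a simple MapReduce application, but I am having some trouble understanding what happens when I iterate over the input values of a reduce call.

This is the piece of code that behaves strangely: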

public void reduce(Text key, Iterable<E> values, Context context)
    throws IOException, InterruptedException {

    Iterator<E> statesIter = values.iterator();
    // Keep a reference to the first value of this key
    E first = statesIter.next();

    while (statesIter.hasNext()) {
        E state = statesIter.next();

        // Expected to print the same string on every pass
        System.out.println(first.toString());
        // some other stuff
    }
    // some other stuff
}

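So nothing strange here, except that each println call actually prints a different string. In other words, every time the next() method is called, the object referenced by first changes.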

So why does this strange behavior happen?

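It is a bit counterintuitive, but Hadoop actually reuses the key and value objects: each call to next() overwrites the contents of the same instance instead of returning a new one. If you want to keep a value, you should clone it.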
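To make the effect concrete, here is a minimal sketch in plain Java that simulates the reuse without depending on Hadoop. Everything in it is hypothetical and for illustration only: State stands in for the value type E from the question, and ReusingIterator only imitates the behavior described above. It is not Hadoop's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable value type, standing in for a Hadoop Writable. */
    static final class State {
        private String text = "";

        State() {}

        /** Copy constructor: takes a snapshot of another State. */
        State(State other) { this.text = other.text; }

        void set(String text) { this.text = text; }

        @Override
        public String toString() { return text; }
    }

    /** Hands out the SAME State instance on every next(), overwriting its contents. */
    static final class ReusingIterator implements Iterator<State> {
        private final Iterator<String> source;
        private final State shared = new State();

        ReusingIterator(List<String> data) { this.source = data.iterator(); }

        @Override
        public boolean hasNext() { return source.hasNext(); }

        @Override
        public State next() {
            shared.set(source.next());
            return shared;
        }
    }

    /** Keeps a bare reference to the first value, as in the question (data must be non-empty). */
    public static List<String> firstWithoutCopy(List<String> data) {
        Iterator<State> it = new ReusingIterator(data);
        State first = it.next();
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            it.next();                  // overwrites the object that 'first' points to
            seen.add(first.toString());
        }
        return seen;
    }

    /** Copies the first value before advancing, so later next() calls cannot change it. */
    public static List<String> firstWithCopy(List<String> data) {
        Iterator<State> it = new ReusingIterator(data);
        State first = new State(it.next());
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            it.next();
            seen.add(first.toString());
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c");
        System.out.println(firstWithoutCopy(data)); // prints [b, c]
        System.out.println(firstWithCopy(data));    // prints [a, a]
    }
}
```

In a real reducer the analogous fix would be to copy the value before holding on to it, for example with new Text(value) when the values are Text, or with WritableUtils.clone(value, context.getConfiguration()) for other Writable types. Which one fits depends on the value class, which the question does not show.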
