Java: Why do I get different output from my reducer function after splitting a string and joining it back together? (Java, Hadoop, Split, Reducers)


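I know this is an odd question, so let me give some examples. I'm writing a reducer function that essentially concatenates the values of the Iterator it receives. The strings in the iterator have the format "%s,%s,%s". When I write the code like this: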

public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        StringBuilder indexValue = new StringBuilder();
        while (values.hasNext()) {
            String data = values.next().toString();
            indexValue.append(data);
        }

        output.collect(key, new Text(indexValue.toString()));
}

However, when I split each string and join it back together, like this, I get different output:

public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        StringBuilder indexValue = new StringBuilder();
        while (values.hasNext()) {
            String data = values.next().toString();
            String [] parts = data.split(",");
            indexValue.append(parts[0] + "," + parts[1] + "," + parts[2]);
        }

        output.collect(key, new Text(indexValue.toString()));
}
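
The two reducers above behave the same only when every incoming value has exactly three comma-separated fields. The post does not establish whether that holds in this job, so the following is only a sketch of when split-and-rejoin stops being an identity. The class name `SplitRoundTrip`, the helper `rejoinFirstThree`, and the sample strings are hypothetical. One way such values could arise, purely as an assumption, is if the reducer's own concatenated output were fed back into it (for instance when the same class is also used as a combiner).

```java
class SplitRoundTrip {
    // Mirrors the second reducer: split on commas and keep only the first three fields.
    static String rejoinFirstThree(String data) {
        String[] parts = data.split(",");
        return parts[0] + "," + parts[1] + "," + parts[2];
    }

    public static void main(String[] args) {
        // Hypothetical values, not taken from the job's real output.
        String[] samples = {
            "book.txt,995,1",               // exactly three fields: unchanged
            "book.txt,995,1book.txt,999,1", // two values already concatenated: five fields
            "book.txt,995,1,"               // trailing comma: split() drops the empty field
        };
        for (String s : samples) {
            System.out.println(s + " -> " + rejoinFirstThree(s)
                    + " (fields: " + s.split(",").length + ")");
        }
    }
}
```

With the second sample, everything after the third field is silently discarded, and with the third the trailing comma is lost, because `String.split` removes trailing empty strings. A value with fewer than three fields would throw an `ArrayIndexOutOfBoundsException` instead.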

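Your question confuses me. Also, if you're using a StringBuilder, why are you concatenating strings? That rather defeats the purpose of StringBuilder.
I agree that concatenating strings defeats the purpose of using a StringBuilder, but that still doesn't affect the output. My question boils down to this: why do these two snippets give me completely different output?
I'd have to see the raw data.
I've edited the question to show the mapper function and a sample of the raw data.
@SedrickJefferson I disagree that concatenation defeats the purpose of StringBuilder. Concatenation is a very cheap operation at small scale. The builder's main benefit comes when a string grows as a loop progresses.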
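
For reference, the concatenation debated in the comments can be avoided by chaining `append` calls. This is a minimal sketch with hypothetical names (`AppendParts`, `appendFirstThree`) and made-up values. It produces the same characters as the `+` version, so it would not by itself explain any difference in output.

```java
class AppendParts {
    // Appends the first three comma-separated fields directly to the builder,
    // without building an intermediate concatenated String.
    static void appendFirstThree(StringBuilder sb, String data) {
        String[] parts = data.split(",");
        sb.append(parts[0]).append(',')
          .append(parts[1]).append(',')
          .append(parts[2]);
    }

    public static void main(String[] args) {
        StringBuilder indexValue = new StringBuilder();
        appendFirstThree(indexValue, "a.txt,1,2"); // hypothetical values
        appendFirstThree(indexValue, "b.txt,3,4");
        System.out.println(indexValue);
    }
}
```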
Mapper function:

public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        String line = value.toString();
        String [] parts = line.split("\t");

        int frequency = Integer.parseInt(parts[1]);
        String [] documentDataParts = parts[0].split(",");

        String term = documentDataParts[0];
        String bookFilename = documentDataParts[1];
        String chunk = documentDataParts[2];

        String documentData = bookFilename + "," + chunk + "," + frequency;
        output.collect(new Text(term), new Text(documentData));
}
Sample of the raw data (term,filename,chunk, then a tab, then the frequency):

Ages,LesMiserablesbyVictorHugo.txt,5545 1
Aggeus,LeviathanbyThomasHobbes.txt,1268 1
Aggravateth,LeviathanbyThomasHobbes.txt,995     1
Aggravateth,LeviathanbyThomasHobbes.txt,999     1
Aggravation,LeviathanbyThomasHobbes.txt,1015    1
Aggravation,LeviathanbyThomasHobbes.txt,1691    1
Aggregate,LeviathanbyThomasHobbes.txt,1293      1
Agier,LesMiserablesbyVictorHugo.txt,2790        1
Agincourt,LesMiserablesbyVictorHugo.txt,1510    1
Agn,LesMiserablesbyVictorHugo.txt,5114  1
Agnes,LesMiserablesbyVictorHugo.txt,6450        1
Agnese,LesMiserablesbyVictorHugo.txt,580        1
Agnus,UlyssesbyJamesJoyce.txt,1827      1
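
To see what the mapper emits for one of these lines, its parsing steps can be run outside Hadoop. This is a sketch under the assumption that the two columns are separated by a single tab character, which is what `split("\t")` in the mapper expects. The class name `MapperLineDemo` and the method `parseLine` are hypothetical.

```java
class MapperLineDemo {
    // Follows the same steps as map(): returns {term, documentData} for one input line.
    static String[] parseLine(String line) {
        String[] parts = line.split("\t");
        int frequency = Integer.parseInt(parts[1]);
        String[] documentDataParts = parts[0].split(",");

        String term = documentDataParts[0];
        String bookFilename = documentDataParts[1];
        String chunk = documentDataParts[2];

        return new String[] { term, bookFilename + "," + chunk + "," + frequency };
    }

    public static void main(String[] args) {
        // One line from the sample data, with the separator written as an explicit tab.
        String[] kv = parseLine("Aggravateth,LeviathanbyThomasHobbes.txt,995\t1");
        System.out.println(kv[0] + " => " + kv[1]);
    }
}
```

Each value handed to the reducer by this mapper therefore has exactly three comma-separated fields, as long as the filename itself contains no comma.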