Java streams: collect is not combining all values

java java-8

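I have a list of buckets, and each bucket contains some records. I use streams to sum the values of the records in each bucket. However, I have run into a problem: after my collect, the sums are incorrect. Here is my processing statement so far: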

List<StatAccumulator> results = statData.stream().map(
            list -> list.stream().parallel()
            .collect(
                () -> new StatAccumulator(metrics, groups),
                StatAccumulator::containerize,
                StatAccumulator::combine
            )
        ).collect(Collectors.toList());

StatAccumulator is just a container class that stores each value I am summing across the records:

public class StatAccumulator {
    public StatRecord result;
    private final List<String> metrics;
    private final List<String> groups;
    private Long count;

    public StatAccumulator(List<String> metrics, List<String> groups) {
        this.metrics = metrics;
        this.groups = groups;
    }

    public void containerize(StatRecord initial) {
        //logger.info(initial.toString());
        this.result = new StatRecord(
            initial.v1,
            initial.v2
        );
        this.count = 1l;
    }

    public void combine(StatAccumulator other) {
        result.v1+= other.result.v1;
        result.v2+= other.result.v2;

        this.count += other.count;
        logger.info("Current Combined: "+this.result.v1.toString());
    }
}

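For simplicity, I am using only one bucket and tracking only one value. Before this processing step, I output the value of every record and summed them in Excel to get the expected result (~28k), but the actual result I get tends to be ~5k. So I have confirmed that all of the data is going in, but not all of it is coming out. Does anyone know why I am losing results?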

Your containerize method is incorrect. It should be:

public class StatAccumulator {
    public StatRecord result = new StatRecord(0, 0);
    private final List<String> metrics;
    private final List<String> groups;
    private long count;

    public StatAccumulator(List<String> metrics, List<String> groups) {
        this.metrics = metrics;
        this.groups = groups;
    }

    public void containerize(StatRecord other) {
        // Add each record to the running totals instead of overwriting them.
        this.result.v1 += other.v1;
        this.result.v2 += other.v2;
        this.count++;
    }

    public void combine(StatAccumulator other) {
        result.v1+= other.result.v1;
        result.v2+= other.result.v2;
        this.count += other.count;
        logger.info("Current Combined: "+this.result.v1.toString());
    }
}

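containerize is used to accumulate the result, starting from the initial state created by the supplier. It is called once per element, and it is the only one of the two methods used when the stream is sequential.

combine is used only when the stream is parallel, to merge the accumulated results of two "sub-streams".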
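As a rough illustration of the two roles described above, here is a minimal, self-contained sketch. The class SumAccumulator and the helper sumOf are hypothetical stand-ins (they are not from the question), and plain integers are used in place of StatRecord. The accumulator adds one element to the container, and the combiner merges two partial containers, so the sequential and parallel results should agree.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class CollectSketch {

    // Hypothetical mutable container, standing in for StatAccumulator.
    static class SumAccumulator {
        long sum;    // starts at 0, the identity for addition
        long count;  // number of elements folded in so far

        // Accumulator: folds ONE stream element into this container.
        void add(Integer value) {
            sum += value;
            count++;
        }

        // Combiner: merges another partial container into this one.
        void combine(SumAccumulator other) {
            sum += other.sum;
            count += other.count;
        }
    }

    // Runs the three-argument collect either sequentially or in parallel.
    static SumAccumulator sumOf(List<Integer> values, boolean parallel) {
        Stream<Integer> stream = parallel ? values.parallelStream() : values.stream();
        return stream.collect(SumAccumulator::new, SumAccumulator::add, SumAccumulator::combine);
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
        SumAccumulator sequential = sumOf(values, false);
        SumAccumulator parallel = sumOf(values, true);
        System.out.println("sequential: " + sequential.sum + " over " + sequential.count);
        System.out.println("parallel:   " + parallel.sum + " over " + parallel.count);
    }
}
```

Because the supplier returns a container that is already a valid "empty" result, neither method needs a special case for the first element.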

Likewise, this.count = 1l should be this.count++.
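To see why overwriting loses data, the following sketch reproduces the pattern with a hypothetical Overwriting class (not the question's actual code) whose accumulator assigns instead of adding. On a sequential stream the container ends up holding only the last element. On a parallel stream each sub-stream would keep only its own last element before the combiner adds those together, which would be consistent with a total far below the expected one.

```java
import java.util.Arrays;
import java.util.List;

public class OverwriteSketch {

    // Hypothetical container that mirrors the overwrite-style accumulator.
    static class Overwriting {
        long sum;
        long count;

        // Accumulator that REPLACES the state on every element.
        void add(Integer value) {
            sum = value;  // discards whatever was accumulated before
            count = 1L;   // resets the count instead of incrementing it
        }

        // Combiner is fine on its own, but it only sees what add() kept.
        void combine(Overwriting other) {
            sum += other.sum;
            count += other.count;
        }
    }

    // Sequential collect: the combiner is never needed here.
    static Overwriting collectSequential(List<Integer> values) {
        return values.stream().collect(Overwriting::new, Overwriting::add, Overwriting::combine);
    }

    public static void main(String[] args) {
        Overwriting result = collectSequential(Arrays.asList(10, 20, 30));
        // Only the last element survives: sum is 30 and count is 1, not 60 and 3.
        System.out.println(result.sum + " over " + result.count);
    }
}
```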

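Oh, I think my problem was that I assumed containerize was called for every element of the stream, but it is only called once per thread, right?

No. It is called many times per thread: once for each element of the sub-stream that the thread processes.

Sorry, I misspoke. What I meant was that I thought collect used containerize to initialize a new StatAccumulator for each element of the stream, basically mapping each element to a container. In hindsight, collect wants the container to be constructed and populated in separate steps.

...and since metrics and groups are not used at all, removing them from the StatAccumulator class would let you replace the supplier () -> new StatAccumulator(metrics, groups) with the simpler StatAccumulator::new.

...they aren't actually used. I cleaned up the class before posting and just didn't remove them.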