Does a Java 8 stream reuse itself when the same data and task are processed again?
I am trying to measure, in code, the time taken by stream() and parallelStream() for the same operation.
Code 1
List<String> list = new ArrayList<>();
for (int i = 1; i <= 30000; i++)
    list.add("a");
for (int i = 1; i <= 20000; i++)
    list.add("b");
for (int i = 1; i <= 10000; i++)
    list.add("c");
// part 1: sequential stream
long start = System.currentTimeMillis();
Map<String, Long> countListSequence = list.stream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end = System.currentTimeMillis();
System.out.println("Time taken in by stream() " + (end - start) + " millisec data " + countListSequence);
// part 2: parallel stream
long start1 = System.currentTimeMillis();
Map<String, Long> countListparallel = list.parallelStream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end1 = System.currentTimeMillis();
System.out.println("Time taken by parallelStream() " + (end1 - start1) + " millisec data " + countListparallel);
Time taken in by stream() 109 millisec data {a=30000, b=20000, c=10000}
Time taken by parallelStream() 16 millisec data {a=30000, b=20000, c=10000}
But if I change the order, calling parallelStream() first and then stream() with otherwise identical code:
Code 2
// part 1: parallel stream
long start1 = System.currentTimeMillis();
Map<String, Long> countListparallel = list.parallelStream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end1 = System.currentTimeMillis();
System.out.println("Time taken by parallelStream() " + (end1 - start1) + " millisec data " + countListparallel);
// part 2: sequential stream
long start = System.currentTimeMillis();
Map<String, Long> countListSequence = list.stream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end = System.currentTimeMillis();
System.out.println("Time taken in by stream() " + (end - start) + " millisec data " + countListSequence);
Time taken by parallelStream() 109 millisec data {a=30000, b=20000, c=10000}
Time taken in by stream() 15 millisec data {a=30000, b=20000, c=10000}
My question is: why does stream() in part 2 of Code 2 take less time than parallelStream(), which is the opposite of what Code 1 shows?
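This does not only happen with stream() and parallelStream(). I also tried the same thing with stream() followed by stream() and got the same result: the second stream takes less time than the first.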
Code 3
// part 1: sequential stream, first run
long start1 = System.currentTimeMillis();
Map<String, Long> countListparallel = list.stream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end1 = System.currentTimeMillis();
System.out.println("Time taken by stream() 1 " + (end1 - start1) + " millisec data " + countListparallel);
// part 2: sequential stream, second run
long start = System.currentTimeMillis();
Map<String, Long> countListSequence = list.stream()
        .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
long end = System.currentTimeMillis();
System.out.println("Time taken in by stream() " + (end - start) + " millisec data " + countListSequence);
Time taken by stream() 1 107 millisec data {a=30000, b=20000, c=10000}
Time taken in by stream() 14 millisec data {a=30000, b=20000, c=10000}
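The second print always shows less time than the first. Does the stream computation somehow reuse itself? If so, how is that possible, given that the objects created, countListSequence and countListparallel, are different? I don't understand why the second part of each code sample takes less time than the first.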
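Am I missing something about streams? Thanks.

This is not surprising at all: the JIT compiler in the JVM optimizes code that has already run many times. It is therefore completely normal for code that runs later in a program to be faster than code that runs earlier.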
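To see this effect directly, here is a minimal sketch that runs the same pipeline from the question several times in one JVM and records how long each run takes. The class and method names (`WarmupDemo`, `buildList`, `countByValue`, `timeRuns`) are made up for illustration, and the actual numbers depend on your machine and JVM. Typically the first run is much slower than the later ones, but that is not guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original question.
class WarmupDemo {

    // Builds the same 60,000-element list used in the question.
    static List<String> buildList() {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < 30000; i++) list.add("a");
        for (int i = 0; i < 20000; i++) list.add("b");
        for (int i = 0; i < 10000; i++) list.add("c");
        return list;
    }

    // The operation being timed: count the occurrences of each distinct value.
    static Map<String, Long> countByValue(List<String> list) {
        return list.stream()
                .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
    }

    // Runs the same pipeline `runs` times and returns the elapsed nanoseconds of each run.
    static long[] timeRuns(List<String> list, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            Map<String, Long> counts = countByValue(list);
            elapsed[i] = System.nanoTime() - start;
            // Use the result so the work cannot simply be discarded.
            if (counts.size() != 3) {
                throw new IllegalStateException("unexpected result: " + counts);
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRuns(buildList(), 10);
        for (int i = 0; i < elapsed.length; i++) {
            System.out.println("run " + (i + 1) + ": " + (elapsed[i] / 1_000) + " microseconds");
        }
    }
}
```

Every run creates a brand-new stream and a brand-new result map, so nothing is being reused at the stream level. Whatever speed-up shows up between runs comes from the JVM itself (class loading, lambda linkage, JIT compilation).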
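If you want genuinely useful numbers here, write a benchmark with a tool such as JMH, which takes care of JIT warm-up.

This is probably caused by the lambdas involved: the first call has to set them up, and that takes time. Create a proper benchmark with JMH and compare the results.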
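JMH is the right tool for this, but if you only want a rough idea of what "warm up first, then measure" means, a hand-rolled sketch using nothing but the JDK could look like the following. It is not a substitute for JMH (there is no forking, no statistical analysis, and no protection against many JIT pitfalls). The class name `ManualBenchmark`, its helper methods, and the round counts are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical hand-rolled harness; a rough stand-in, not a replacement for JMH.
class ManualBenchmark {

    // Assumed round counts; tune them for your own machine.
    static final int WARMUP_ROUNDS = 200;
    static final int MEASURED_ROUNDS = 200;

    // Same data shape as in the question: 30000 x "a", 20000 x "b", 10000 x "c".
    static List<String> sampleData() {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < 30000; i++) list.add("a");
        for (int i = 0; i < 20000; i++) list.add("b");
        for (int i = 0; i < 10000; i++) list.add("c");
        return list;
    }

    static Map<String, Long> sequentialCount(List<String> list) {
        return list.stream()
                .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
    }

    static Map<String, Long> parallelCount(List<String> list) {
        return list.parallelStream()
                .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
    }

    // Runs the task WARMUP_ROUNDS times without timing it, then returns the
    // average nanoseconds per call over MEASURED_ROUNDS timed calls.
    static double averageNanos(Supplier<Map<String, Long>> task) {
        long sink = 0; // accumulates result sizes so the calls are not dead code
        for (int i = 0; i < WARMUP_ROUNDS; i++) {
            sink += task.get().size();
        }
        long start = System.nanoTime();
        for (int i = 0; i < MEASURED_ROUNDS; i++) {
            sink += task.get().size();
        }
        long total = System.nanoTime() - start;
        if (sink == 0) {
            throw new IllegalStateException("task produced only empty results");
        }
        return (double) total / MEASURED_ROUNDS;
    }

    public static void main(String[] args) {
        List<String> data = sampleData();
        double sequential = averageNanos(() -> sequentialCount(data));
        double parallel = averageNanos(() -> parallelCount(data));
        System.out.printf("stream():         %.0f ns per call on average%n", sequential);
        System.out.printf("parallelStream(): %.0f ns per call on average%n", parallel);
    }
}
```

Because both variants are warmed up before being timed, the order in which they are measured should matter far less than in the original snippets. Treat the output as a rough indication only and confirm anything important with JMH.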