Java 8 parallel stream performance and CPU usage seem much worse than sequential
While trying out the JDK 8 streams feature, I decided to benchmark parallel against sequential streams. I estimate the value of π by throwing random darts at a unit square and counting how many land inside the unit circle. I took the example from Apache Spark. Here is the code:
package org.sample;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class MyBenchmark {

    // Number of random points per estimate; JMH runs every benchmark once per value.
    @Param({
        "1000000",
        "10000000"
    }) int MAX_COUNT;

    @Benchmark
    public double parallelPiTest() {
        // Count the points that land inside the unit circle, using a parallel stream.
        long count = IntStream.range(1, MAX_COUNT).parallel().filter(i -> {
            double x = Math.random();
            double y = Math.random();
            return (x * x + y * y) < 1.0;
        }).count();
        double pi = 4 * count * 1.0 / MAX_COUNT;
        return pi;
    }

    @Benchmark
    public double sequentialPiTest() {
        // Same computation on a sequential stream.
        long count = IntStream.range(1, MAX_COUNT).filter(i -> {
            double x = Math.random();
            double y = Math.random();
            return (x * x + y * y) < 1.0;
        }).count();
        double pi = 4 * count * 1.0 / MAX_COUNT;
        return pi;
    }
}
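For reference, the estimate that both benchmark methods compute can be written out as follows. This is a sketch added for clarity; the symbols N (number of samples) and C (number of points that land inside the circle) are notation introduced here, not names from the code.

```latex
% (x, y) is drawn uniformly from the unit square [0,1) x [0,1)
\[
  P\left(x^2 + y^2 < 1\right)
    = \frac{\text{area of quarter disc}}{\text{area of unit square}}
    = \frac{\pi/4}{1},
  \qquad\text{so}\qquad
  \pi \approx \hat{\pi} = \frac{4C}{N}.
\]
```

Note that IntStream.range(1, MAX_COUNT) yields MAX_COUNT - 1 values while the code divides by MAX_COUNT; at these sample sizes the resulting bias is negligible, and it affects both variants equally.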
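In a simple test on my 8-core machine (a Windows 7 laptop), the parallel version took almost 5 times as long as the sequential one, with CPU utilization close to 100% on all cores. The sequential version, by contrast, used only about 20% of one core! Confused by these results, I reran the test as a JMH benchmark (the code above) and also with JUnitBenchmarks. The results were consistent: sequential execution was always about 5 times faster than parallel. I also tried 100 iterations, but the results were similar to the 5-iteration results below. Am I missing something fundamental?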
JMH benchmark results:
C:\Users\local\lunaeeworkspace\benchmarktest>mvn clean install
"******::" C:\Progra~1\Java\jdk1.8.0_20
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Auto-generated JMH benchmark 1.0
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ benchmarktest ---
[INFO] Deleting C:\Users\local\lunaeeworkspace\benchmarktest\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ benchmarktest ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Users\local\lunaeeworkspace\benchmarktest\src\main\resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ benchmarktest ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 1 source file to C:\Users\local\lunaeeworkspace\benchmarktest\target\classes
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ benchmarktest ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\Users\local\lunaeeworkspace\benchmarktest\src\test\resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ benchmarktest ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ benchmarktest ---
[INFO] No tests to run.
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ benchmarktest ---
[INFO] Building jar: C:\Users\local\lunaeeworkspace\benchmarktest\target\benchmarktest-1.0.jar
[INFO]
[INFO] --- maven-shade-plugin:2.2:shade (default) @ benchmarktest ---
[INFO] Including org.openjdk.jmh:jmh-core:jar:1.1 in the shaded jar.
[INFO] Including net.sf.jopt-simple:jopt-simple:jar:4.6 in the shaded jar.
[INFO] Including org.apache.commons:commons-math3:jar:3.2 in the shaded jar.
[INFO] Replacing C:\Users\local\lunaeeworkspace\benchmarktest\target\benchmarks.jar with C:\Users\local\lunaeeworkspace\benchmarktest\target\benchmarktest-1.0-shaded.jar
[INFO]
[INFO] --- maven-install-plugin:2.5.1:install (default-install) @ benchmarktest ---
[INFO] Installing C:\Users\local\lunaeeworkspace\benchmarktest\target\benchmarktest-1.0.jar to C:\Users\local\.m2\repository\org\sample\benchmarktest\1.0\benchmarktest-1.0.jar
[INFO] Installing C:\Users\local\lunaeeworkspace\benchmarktest\pom.xml to C:\Users\local\.m2\repository\org\sample\benchmarktest\1.0\benchmarktest-1.0.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 16.070 s
[INFO] Finished at: 2014-09-15T13:25:03-07:00
[INFO] Final Memory: 22M/221M
[INFO] ------------------------------------------------------------------------
C:\Users\local\lunaeeworkspace\benchmarktest>java -jar target/benchmarks.jar
# VM invoker: C:\Program Files\Java\jre1.8.0_20\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.MyBenchmark.parallelPiTest
# Parameters: (MAX_COUNT = 1000000)
# Run progress: 0.00% complete, ETA 00:00:40
# Fork: 1 of 1
# Warmup Iteration 1: 2810219990.000 ns/op
# Warmup Iteration 2: 679604930.000 ns/op
# Warmup Iteration 3: 708517299.500 ns/op
# Warmup Iteration 4: 613861141.500 ns/op
# Warmup Iteration 5: 747273386.500 ns/op
Iteration 1: 636085288.500 ns/op
Iteration 2: 726300915.500 ns/op
Iteration 3: 720032270.000 ns/op
Iteration 4: 758523073.500 ns/op
Iteration 5: 776964284.500 ns/op
Result: 723581166.400 ±(99.9%) 208666306.733 ns/op [Average]
Statistics: (min, avg, max) = (636085288.500, 723581166.400, 776964284.500), stdev = 54189977.210
Confidence interval (99.9%): [514914859.667, 932247473.133]
# VM invoker: C:\Program Files\Java\jre1.8.0_20\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.MyBenchmark.parallelPiTest
# Parameters: (MAX_COUNT = 10000000)
# Run progress: 25.00% complete, ETA 00:00:52
# Fork: 1 of 1
# Warmup Iteration 1: 9589247518.000 ns/op
# Warmup Iteration 2: 8049867519.000 ns/op
# Warmup Iteration 3: 7864790757.000 ns/op
# Warmup Iteration 4: 7766442122.000 ns/op
# Warmup Iteration 5: 7723210219.000 ns/op
Iteration 1: 7525308107.000 ns/op
Iteration 2: 8067847130.000 ns/op
Iteration 3: 7647547652.000 ns/op
Iteration 4: 6964833740.000 ns/op
Iteration 5: 7471811305.000 ns/op
Result: 7535469586.800 ±(99.9%) 1523035846.762 ns/op [Average]
Statistics: (min, avg, max) = (6964833740.000, 7535469586.800, 8067847130.000), stdev = 395527572.797
Confidence interval (99.9%): [6012433740.038, 9058505433.562]
# VM invoker: C:\Program Files\Java\jre1.8.0_20\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.MyBenchmark.sequentialPiTest
# Parameters: (MAX_COUNT = 1000000)
# Run progress: 50.00% complete, ETA 00:01:37
# Fork: 1 of 1
# Warmup Iteration 1: 208653523.167 ns/op
# Warmup Iteration 2: 171440852.571 ns/op
# Warmup Iteration 3: 176369103.714 ns/op
# Warmup Iteration 4: 172637171.571 ns/op
# Warmup Iteration 5: 168770237.714 ns/op
Iteration 1: 171262591.714 ns/op
Iteration 2: 168976818.714 ns/op
Iteration 3: 174889950.143 ns/op
Iteration 4: 171272031.714 ns/op
Iteration 5: 167857761.571 ns/op
Result: 170851830.771 ±(99.9%) 10391714.091 ns/op [Average]
Statistics: (min, avg, max) = (167857761.571, 170851830.771, 174889950.143), stdev = 2698695.149
Confidence interval (99.9%): [160460116.681, 181243544.862]
# VM invoker: C:\Program Files\Java\jre1.8.0_20\bin\java.exe
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.MyBenchmark.sequentialPiTest
# Parameters: (MAX_COUNT = 10000000)
# Run progress: 75.00% complete, ETA 00:00:37
# Fork: 1 of 1
# Warmup Iteration 1: 1898167075.000 ns/op
# Warmup Iteration 2: 1734706264.000 ns/op
# Warmup Iteration 3: 1705265893.000 ns/op
# Warmup Iteration 4: 1704804614.000 ns/op
# Warmup Iteration 5: 1781362794.000 ns/op
Iteration 1: 1725992648.000 ns/op
Iteration 2: 1721125803.000 ns/op
Iteration 3: 1714455544.000 ns/op
Iteration 4: 1719110033.000 ns/op
Iteration 5: 1719564255.000 ns/op
Result: 1720049656.600 ±(99.9%) 15980153.846 ns/op [Average]
Statistics: (min, avg, max) = (1714455544.000, 1720049656.600, 1725992648.000), stdev = 4149995.207
Confidence interval (99.9%): [1704069502.754, 1736029810.446]
# Run complete. Total time: 00:02:10
Benchmark                          (MAX_COUNT)  Mode  Samples           Score     Score error  Units
o.s.MyBenchmark.parallelPiTest         1000000  avgt        5   723581166.400   208666306.733  ns/op
o.s.MyBenchmark.parallelPiTest        10000000  avgt        5  7535469586.800  1523035846.762  ns/op
o.s.MyBenchmark.sequentialPiTest       1000000  avgt        5   170851830.771    10391714.091  ns/op
o.s.MyBenchmark.sequentialPiTest      10000000  avgt        5  1720049656.600    15980153.846  ns/op
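To put a number on the gap, the slowdown can be computed directly from the scores in the summary table. The snippet below is a small illustrative sketch; the class name SlowdownRatio and the helper slowdown are hypothetical names introduced here, and the inputs are copied from the table.

```java
class SlowdownRatio {

    // Hypothetical helper: how many times slower the parallel run is
    // than the sequential run, given both average times in ns/op.
    static double slowdown(double parallelNsPerOp, double sequentialNsPerOp) {
        return parallelNsPerOp / sequentialNsPerOp;
    }

    public static void main(String[] args) {
        // Scores copied from the JMH summary table (ns/op).
        System.out.printf("MAX_COUNT=1000000:  %.2fx%n",
                slowdown(723581166.400, 170851830.771));
        System.out.printf("MAX_COUNT=10000000: %.2fx%n",
                slowdown(7535469586.800, 1720049656.600));
    }
}
```

Both ratios come out a little above 4, so the measured slowdown is stable across the two problem sizes and slightly smaller than the "almost 5 times" seen in the informal test.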
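The likely cause is Math.random(): every thread shares one underlying generator, so the worker threads of the parallel stream contend on it. Giving each thread its own generator through java.util.concurrent.ThreadLocalRandom avoids that. Inside the filter lambda, replace the two Math.random() calls with:
ThreadLocalRandom r = ThreadLocalRandom.current();
double x = r.nextDouble(1);
double y = r.nextDouble(1);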
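As a minimal sketch of how the ThreadLocalRandom lines fit into the original computation, the class below runs the same dart-throwing estimate on either a sequential or a parallel stream. The class name PiEstimate, the method estimatePi and its boolean parameter are hypothetical names introduced for illustration; this is a plain main method rather than a JMH benchmark, so it demonstrates the code shape only and makes no timing claim.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

class PiEstimate {

    // Hypothetical helper: estimates pi from n random points in the unit square.
    // Each worker thread draws from its own ThreadLocalRandom, so no generator
    // state is shared between threads.
    static double estimatePi(int n, boolean parallel) {
        IntStream samples = IntStream.range(0, n); // exactly n samples
        if (parallel) {
            samples = samples.parallel();
        }
        long inside = samples.filter(i -> {
            ThreadLocalRandom r = ThreadLocalRandom.current();
            double x = r.nextDouble(1);
            double y = r.nextDouble(1);
            return x * x + y * y < 1.0; // point lies inside the unit circle
        }).count();
        return 4.0 * inside / n;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("sequential: " + estimatePi(n, false));
        System.out.println("parallel:   " + estimatePi(n, true));
    }
}
```

ThreadLocalRandom.current() is called inside the lambda rather than captured outside it, because the instance belongs to the calling thread and must not be handed to other workers. To measure the effect, the same lambda body can be dropped into the two @Benchmark methods of the question.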