Warning: file_get_contents(/data/phpspider/zhask/data//catemap/0/performance/5.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Java 降低新JVM的性能_Java_Performance_Jvm_Jmh - Fatal编程技术网

Java 降低新JVM的性能

Java 降低新JVM的性能,java,performance,jvm,jmh,Java,Performance,Jvm,Jmh,在较新的JVM上,对for循环中数组中的所有元素求和的性能(吞吐量)比在Java1.8.0JDK的JVM上慢。我执行了JHM基准测试(下图)。在每次测试之前,源代码由提供的javac.exe编译并由java.exe运行,这两个二进制文件都由选定的JDK提供。测试是在Windows 10上执行的,由powershell脚本启动,没有任何程序在后台运行(没有其他JVM)。这台计算机配备了32GB的RAM,因此没有使用硬盘上的虚拟内存 阵列中的10M元素: 阵列中的100M元素: 我的测试的源代码

在较新的JVM上,对for循环中数组中的所有元素求和的性能(吞吐量)比在Java1.8.0JDK的JVM上慢。我执行了JHM基准测试(下图)。在每次测试之前,源代码由提供的javac.exe编译并由java.exe运行,这两个二进制文件都由选定的JDK提供。测试是在Windows 10上执行的,由powershell脚本启动,没有任何程序在后台运行(没有其他JVM)。这台计算机配备了32GB的RAM,因此没有使用硬盘上的虚拟内存

阵列中的10M元素:

阵列中的100M元素:

我的测试的源代码:

@Param({"10000000", "100000000"})
public static int ELEMENTS;

public static void main(String[] args) throws RunnerException, IOException {
    File outputFile = new File(args[0]);

    int javaMajorVersion = Integer.parseInt(System.getProperty("java.version").split("\\.")[0]);

    ChainedOptionsBuilder builder = new OptionsBuilder()
            .include(IteratingBenchmark.class.getSimpleName())
            .mode(Mode.Throughput)
            .forks(2)
            .measurementTime(TimeValue.seconds(10))
            .measurementIterations(50)
            .warmupTime(TimeValue.seconds(2))
            .warmupIterations(10)
            .resultFormat(ResultFormatType.SCSV)
            .result(outputFile.getAbsolutePath());

    if (javaMajorVersion > 8) {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g", "--enable-preview");
    } else {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g");
    }

    new Runner(builder.build()).run();
}

@Benchmark
public static void cStyleForLoop(Blackhole bh, MockData data) {
    long sum = 0;
    for (int i = 0; i < data.randomInts.length; i++) {
        sum += data.randomInts[i];
    }

    bh.consume(sum);
}

@State(Scope.Thread)
public static class MockData {
    private int[] randomInts = new int[ELEMENTS];

    @Setup(Level.Iteration)
    public void setup() {
        Random r = new Random();
        this.randomInts = Stream.iterate(r.nextInt(), i -> i + r.nextInt(1022) + 1).mapToInt(Integer::intValue).limit(ELEMENTS).toArray();
    }
}
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;331,446104;5,563589;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;33,757268;0,431403;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,728461;4,823611;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,075948;0,062830;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,914782;4,450969;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,095232;0,075051;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;325,103055;4,933257;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,228403;0,067954;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,861148;0,443404;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,863602;0,035781;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,781930;0,481579;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,873509;0,033055;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;343,530895;0,445551;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;34,287083;0,035028;"ops/s";100000000
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;33,760346;0,089646;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,075120;0,086171;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,173939;0,044176;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,219283;0,062329;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,808609;0,072664;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,845817;0,074315;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;34,310620;0,087412;"ops/s";100000000
有没有任何有效的解释,为什么Java的新版本比1.8慢(OpenJDK 15除外)?

更新1:

@Param({"10000000", "100000000"})
public static int ELEMENTS;

public static void main(String[] args) throws RunnerException, IOException {
    File outputFile = new File(args[0]);

    int javaMajorVersion = Integer.parseInt(System.getProperty("java.version").split("\\.")[0]);

    ChainedOptionsBuilder builder = new OptionsBuilder()
            .include(IteratingBenchmark.class.getSimpleName())
            .mode(Mode.Throughput)
            .forks(2)
            .measurementTime(TimeValue.seconds(10))
            .measurementIterations(50)
            .warmupTime(TimeValue.seconds(2))
            .warmupIterations(10)
            .resultFormat(ResultFormatType.SCSV)
            .result(outputFile.getAbsolutePath());

    if (javaMajorVersion > 8) {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g", "--enable-preview");
    } else {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g");
    }

    new Runner(builder.build()).run();
}

@Benchmark
public static void cStyleForLoop(Blackhole bh, MockData data) {
    long sum = 0;
    for (int i = 0; i < data.randomInts.length; i++) {
        sum += data.randomInts[i];
    }

    bh.consume(sum);
}

@State(Scope.Thread)
public static class MockData {
    private int[] randomInts = new int[ELEMENTS];

    @Setup(Level.Iteration)
    public void setup() {
        Random r = new Random();
        this.randomInts = Stream.iterate(r.nextInt(), i -> i + r.nextInt(1022) + 1).mapToInt(Integer::intValue).limit(ELEMENTS).toArray();
    }
}
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;331,446104;5,563589;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;33,757268;0,431403;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,728461;4,823611;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,075948;0,062830;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,914782;4,450969;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,095232;0,075051;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;325,103055;4,933257;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,228403;0,067954;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,861148;0,443404;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,863602;0,035781;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,781930;0,481579;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,873509;0,033055;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;343,530895;0,445551;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;34,287083;0,035028;"ops/s";100000000
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;33,760346;0,089646;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,075120;0,086171;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,173939;0,044176;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,219283;0,062329;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,808609;0,072664;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,845817;0,074315;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;34,310620;0,087412;"ops/s";100000000
我对不同的Xmx/Xms值运行相同的测试(对于每个测试Xmx==Xms),结果如下:



更新2:

@Param({"10000000", "100000000"})
public static int ELEMENTS;

public static void main(String[] args) throws RunnerException, IOException {
    File outputFile = new File(args[0]);

    int javaMajorVersion = Integer.parseInt(System.getProperty("java.version").split("\\.")[0]);

    ChainedOptionsBuilder builder = new OptionsBuilder()
            .include(IteratingBenchmark.class.getSimpleName())
            .mode(Mode.Throughput)
            .forks(2)
            .measurementTime(TimeValue.seconds(10))
            .measurementIterations(50)
            .warmupTime(TimeValue.seconds(2))
            .warmupIterations(10)
            .resultFormat(ResultFormatType.SCSV)
            .result(outputFile.getAbsolutePath());

    if (javaMajorVersion > 8) {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g", "--enable-preview");
    } else {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g");
    }

    new Runner(builder.build()).run();
}

@Benchmark
public static void cStyleForLoop(Blackhole bh, MockData data) {
    long sum = 0;
    for (int i = 0; i < data.randomInts.length; i++) {
        sum += data.randomInts[i];
    }

    bh.consume(sum);
}

@State(Scope.Thread)
public static class MockData {
    private int[] randomInts = new int[ELEMENTS];

    @Setup(Level.Iteration)
    public void setup() {
        Random r = new Random();
        this.randomInts = Stream.iterate(r.nextInt(), i -> i + r.nextInt(1022) + 1).mapToInt(Integer::intValue).limit(ELEMENTS).toArray();
    }
}
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;331,446104;5,563589;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;33,757268;0,431403;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,728461;4,823611;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,075948;0,062830;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,914782;4,450969;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,095232;0,075051;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;325,103055;4,933257;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,228403;0,067954;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,861148;0,443404;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,863602;0,035781;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,781930;0,481579;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,873509;0,033055;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;343,530895;0,445551;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;34,287083;0,035028;"ops/s";100000000
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;33,760346;0,089646;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,075120;0,086171;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,173939;0,044176;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,219283;0,062329;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,808609;0,072664;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,845817;0,074315;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;34,310620;0,087412;"ops/s";100000000
  • 首先,我将
    Level.Iteration
    更改为
    Level.Trial
  • 其次,我强制G1垃圾收集器
  • 第三,Xmx/Xms被设置为8GB
结果:

原始数据:

@Param({"10000000", "100000000"})
public static int ELEMENTS;

public static void main(String[] args) throws RunnerException, IOException {
    File outputFile = new File(args[0]);

    int javaMajorVersion = Integer.parseInt(System.getProperty("java.version").split("\\.")[0]);

    ChainedOptionsBuilder builder = new OptionsBuilder()
            .include(IteratingBenchmark.class.getSimpleName())
            .mode(Mode.Throughput)
            .forks(2)
            .measurementTime(TimeValue.seconds(10))
            .measurementIterations(50)
            .warmupTime(TimeValue.seconds(2))
            .warmupIterations(10)
            .resultFormat(ResultFormatType.SCSV)
            .result(outputFile.getAbsolutePath());

    if (javaMajorVersion > 8) {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g", "--enable-preview");
    } else {
        builder = builder.jvmArgs("-Xms20g", "-Xmx20g");
    }

    new Runner(builder.build()).run();
}

@Benchmark
public static void cStyleForLoop(Blackhole bh, MockData data) {
    long sum = 0;
    for (int i = 0; i < data.randomInts.length; i++) {
        sum += data.randomInts[i];
    }

    bh.consume(sum);
}

@State(Scope.Thread)
public static class MockData {
    private int[] randomInts = new int[ELEMENTS];

    @Setup(Level.Iteration)
    public void setup() {
        Random r = new Random();
        this.randomInts = Stream.iterate(r.nextInt(), i -> i + r.nextInt(1022) + 1).mapToInt(Integer::intValue).limit(ELEMENTS).toArray();
    }
}
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;331,446104;5,563589;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;33,757268;0,431403;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,728461;4,823611;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,075948;0,062830;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;322,914782;4,450969;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,095232;0,075051;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;325,103055;4,933257;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;31,228403;0,067954;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,861148;0,443404;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,863602;0,035781;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;300,781930;0,481579;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;29,873509;0,033055;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;343,530895;0,445551;"ops/s";10000000
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;100;34,287083;0,035028;"ops/s";100000000
JDK 1.8.0_241:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;33,760346;0,089646;"ops/s";100000000

JDK 11.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,075120;0,086171;"ops/s";100000000

JDK 12.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,173939;0,044176;"ops/s";100000000

JDK 13.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;31,219283;0,062329;"ops/s";100000000

JDK 14.0.1:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,808609;0,072664;"ops/s";100000000

OpenJDK 14.0.2:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;29,845817;0,074315;"ops/s";100000000

OpenJDK 15:
"Benchmark";"Mode";"Threads";"Samples";"Score";"Score Error (99,9%)";"Unit";"Param: ELEMENTS"
"benchmark.IteratingBenchmark.cStyleForLoop";"thrpt";1;15;34,310620;0,087412;"ops/s";100000000
更新3:
我制作了一个包含基准源代码的脚本,该脚本使用我使用的JMH参数执行基准测试,自动生成png格式的绘图。 此外,我在其他机器(Linux)上执行了基准测试。
Linux机器的结果似乎更乐观:

不幸的是,在我的Windows机器上,结果仍然显示性能下降(不包括JDK 15)

更新4:
-XX:-UseCountedLoopSafepoints的结果

即使从中复制基准测试并使用相同的参数运行,我仍然无法再现结果。在我的环境中,JDK 14的执行速度与JDK 8一样快(甚至快一点)。因此,在这个答案中,我将根据编译代码的反汇编分析两个版本之间的差异

首先,让我们从同一供应商处获取最新的OpenJDK版本。
这里我比较了Windows 64位的和

JMH得分如下:

Benchmark                         (ELEMENTS)   Mode  Cnt    Score   Error  Units
IteratingBenchmark.cStyleForLoop    10000000  thrpt   30  263,137 ± 0,484  ops/s  # JDK 8
IteratingBenchmark.cStyleForLoop    10000000  thrpt   30  264,406 ± 0,788  ops/s  # JDK 14
现在,让我们使用内置的
-prof xperfasm
探查器运行JMH,以查看基准测试最热门部分的反汇编。预计约99.5%的CPU时间花费在C2编译的
cStyleForLoop
方法中

JDK 8上最热的区域

....[Hottest Region 1]..............................................................................
C2, level 4, codes.dbg.IteratingBenchmark::cStyleForLoop, version 574 (71 bytes) 

             0x0000028c5607fc5f: add     r10d,0fffffff9h
             0x0000028c5607fc63: lea     rax,[r12+rcx*8]
             0x0000028c5607fc67: mov     ebx,80000000h
             0x0000028c5607fc6c: cmp     r9d,r10d
             0x0000028c5607fc6f: cmovl   r10d,ebx
             0x0000028c5607fc73: mov     r9d,1h
             0x0000028c5607fc79: cmp     r10d,1h
         ╭   0x0000028c5607fc7d: jle     28c5607fccch
         │   0x0000028c5607fc7f: nop                       ;*lload_2
         │                                                 ; - codes.dbg.IteratingBenchmark::cStyleForLoop@15 (line 25)
  0,07%  │↗  0x0000028c5607fc80: movsxd  rbx,dword ptr [rax+r9*4+10h]
  0,06%  ││  0x0000028c5607fc85: add     rbx,r8
  8,93%  ││  0x0000028c5607fc88: movsxd  rcx,r9d
  0,41%  ││  0x0000028c5607fc8b: movsxd  r8,dword ptr [rax+rcx*4+2ch]
 25,02%  ││  0x0000028c5607fc90: movsxd  rdi,dword ptr [rax+rcx*4+14h]
  0,10%  ││  0x0000028c5607fc95: movsxd  rsi,dword ptr [rax+rcx*4+18h]
  8,56%  ││  0x0000028c5607fc9a: movsxd  rbp,dword ptr [rax+rcx*4+28h]
  0,58%  ││  0x0000028c5607fc9f: movsxd  r13,dword ptr [rax+rcx*4+1ch]
  0,41%  ││  0x0000028c5607fca4: movsxd  r14,dword ptr [rax+rcx*4+20h]
  0,20%  ││  0x0000028c5607fca9: movsxd  rcx,dword ptr [rax+rcx*4+24h]
  8,85%  ││  0x0000028c5607fcae: add     rdi,rbx
  0,38%  ││  0x0000028c5607fcb1: add     rsi,rdi
  0,15%  ││  0x0000028c5607fcb4: add     r13,rsi
  8,57%  ││  0x0000028c5607fcb7: add     r14,r13
 13,76%  ││  0x0000028c5607fcba: add     rcx,r14
  5,51%  ││  0x0000028c5607fcbd: add     rbp,rcx
  8,50%  ││  0x0000028c5607fcc0: add     r8,rbp            ;*ladd
         ││                                                ; - codes.dbg.IteratingBenchmark::cStyleForLoop@24 (line 25)
  8,95%  ││  0x0000028c5607fcc3: add     r9d,8h            ;*iinc
         ││                                                ; - codes.dbg.IteratingBenchmark::cStyleForLoop@26 (line 24)
  0,40%  ││  0x0000028c5607fcc7: cmp     r9d,r10d
         │╰  0x0000028c5607fcca: jl      28c5607fc80h      ;*if_icmpge
         │                                                 ; - codes.dbg.IteratingBenchmark::cStyleForLoop@12 (line 24)
         ↘   0x0000028c5607fccc: cmp     r9d,edx
             0x0000028c5607fccf: jnl     28c5607fce4h
             0x0000028c5607fcd1: nop                       ;*lload_2
                                                           ; - codes.dbg.IteratingBenchmark::cStyleForLoop@15 (line 25)
             0x0000028c5607fcd4: movsxd  r10,dword ptr [rax+r9*4+10h]
             0x0000028c5607fcd9: add     r8,r10            ;*ladd
                                                           ; - codes.dbg.IteratingBenchmark::cStyleForLoop@24 (line 25)
....................................................................................................
JDK 14上最热的区域

....[Hottest Region 1]..............................................................................
c2, level 4, codes.dbg.IteratingBenchmark::cStyleForLoop, version 622 (147 bytes) 

                                                                         ; - codes.dbg.IteratingBenchmark::cStyleForLoop@23 (line 25)
               0x000001e844438f72:   mov     r11d,r10d
               0x000001e844438f75:   add     r11d,0fffffff9h
               0x000001e844438f79:   lea     rax,[r12+r9*8]
               0x000001e844438f7d:   mov     ebx,1h
               0x000001e844438f82:   cmp     r11d,1h
               0x000001e844438f86:   jle     1e8444390c0h                ;*goto {reexecute=0 rethrow=0 return_oop=0}
                                                                         ; - codes.dbg.IteratingBenchmark::cStyleForLoop@29 (line 24)
         ╭     0x000001e844438f8c:   jmp     1e844438ffah
         │     0x000001e844438f8e:   nop
  0,04%  │↗    0x000001e844438f90:   mov     rsi,r8                      ;*lload_2 {reexecute=0 rethrow=0 return_oop=0}
         ││                                                              ; - codes.dbg.IteratingBenchmark::cStyleForLoop@15 (line 25)
  0,04%  ││ ↗  0x000001e844438f93:   movsxd  rdx,dword ptr [rax+rbx*4+10h]
  8,41%  ││ │  0x000001e844438f98:   movsxd  rbp,dword ptr [rax+rbx*4+14h]
  1,23%  ││ │  0x000001e844438f9d:   movsxd  r13,dword ptr [rax+rbx*4+18h]
  0,03%  ││ │  0x000001e844438fa2:   movsxd  r8,dword ptr [rax+rbx*4+2ch]
 23,87%  ││ │  0x000001e844438fa7:   movsxd  r11,dword ptr [rax+rbx*4+28h]
  8,22%  ││ │  0x000001e844438fac:   movsxd  r9,dword ptr [rax+rbx*4+24h]
  1,25%  ││ │  0x000001e844438fb1:   movsxd  rcx,dword ptr [rax+rbx*4+20h]
  0,14%  ││ │  0x000001e844438fb6:   movsxd  r14,dword ptr [rax+rbx*4+1ch]
  0,28%  ││ │  0x000001e844438fbb:   add     rdx,rsi
  7,82%  ││ │  0x000001e844438fbe:   add     rbp,rdx
  1,14%  ││ │  0x000001e844438fc1:   add     r13,rbp
  0,17%  ││ │  0x000001e844438fc4:   add     r14,r13
 14,57%  ││ │  0x000001e844438fc7:   add     rcx,r14
 11,05%  ││ │  0x000001e844438fca:   add     r9,rcx
  5,26%  ││ │  0x000001e844438fcd:   add     r11,r9
  6,32%  ││ │  0x000001e844438fd0:   add     r8,r11                      ;*ladd {reexecute=0 rethrow=0 return_oop=0}
         ││ │                                                            ; - codes.dbg.IteratingBenchmark::cStyleForLoop@24 (line 25)
  8,45%  ││ │  0x000001e844438fd3:   add     ebx,8h                      ;*iinc {reexecute=0 rethrow=0 return_oop=0}
         ││ │                                                            ; - codes.dbg.IteratingBenchmark::cStyleForLoop@26 (line 24)
  1,15%  ││ │  0x000001e844438fd6:   cmp     ebx,edi
         │╰ │  0x000001e844438fd8:   jl      1e844438f90h                ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
         │  │                                                            ; - codes.dbg.IteratingBenchmark::cStyleForLoop@12 (line 24)
         │  │  0x000001e844438fda:   mov     r11,qword ptr [r15+110h]    ; ImmutableOopMap {rax=Oop xmm0=Oop xmm1=Oop }
         │  │                                                            ;*goto {reexecute=1 rethrow=0 return_oop=0}
         │  │                                                            ; - (reexecute) codes.dbg.IteratingBenchmark::cStyleForLoop@29 (line 24)
  0,00%  │  │  0x000001e844438fe1:   test    dword ptr [r11],eax         ;*goto {reexecute=0 rethrow=0 return_oop=0}
         │  │                                                            ; - codes.dbg.IteratingBenchmark::cStyleForLoop@29 (line 24)
         │  │                                                            ;   {poll}
  0,02%  │  │  0x000001e844438fe4:   cmp     ebx,dword ptr [rsp]
         │ ╭│  0x000001e844438fe7:   jnl     1e844439028h
  0,00%  │ ││  0x000001e844438fe9:   mov     rsi,r8
         │ ││  0x000001e844438fec:   vmovq   r8,xmm0
         │ ││  0x000001e844438ff1:   vmovq   rdx,xmm1
  0,01%  │ ││  0x000001e844438ff6:   mov     r11d,dword ptr [rsp]
         ↘ ││  0x000001e844438ffa:   mov     ecx,r10d
           ││  0x000001e844438ffd:   sub     ecx,ebx
           ││  0x000001e844438fff:   add     ecx,0fffffff9h
  0,00%    ││  0x000001e844439002:   mov     r9d,1f40h
           ││  0x000001e844439008:   cmp     r9d,ecx
           ││  0x000001e84443900b:   mov     edi,1f40h
           ││  0x000001e844439010:   cmovnle edi,ecx
  0,02%    ││  0x000001e844439013:   add     edi,ebx
           ││  0x000001e844439015:   vmovq   xmm0,r8
           ││  0x000001e84443901a:   vmovq   xmm1,rdx
           ││  0x000001e84443901f:   mov     dword ptr [rsp],r11d
  0,01%    │╰  0x000001e844439023:   jmp     1e844438f93h
           ↘   0x000001e844439028:   vmovq   rdx,xmm1
               0x000001e84443902d:   cmp     ebx,r10d
               0x000001e844439030:   jnl     1e844439043h
               0x000001e844439032:   nop                                 ;*lload_2 {reexecute=0 rethrow=0 return_oop=0}
                                                                         ; - codes.dbg.IteratingBenchmark::cStyleForLoop@15 (line 25)
               0x000001e844439034:   movsxd  r11,dword ptr [rax+rbx*4+10h]
               0x000001e844439039:   add     r8,r11                      ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                                                                         ; - codes.dbg.IteratingBenchmark::cStyleForLoop@24 (line 25)
               0x000001e84443903c:   inc     ebx                         ;*iinc {reexecute=0 rethrow=0 return_oop=0}
                                                                         ; - codes.dbg.IteratingBenchmark::cStyleForLoop@26 (line 24)
....................................................................................................
如我们所见,循环体在两个JDK上的编译方式类似:

  • 展开8个循环迭代
  • 数组中有8个无边界检查加载,后面是8个
    add
    指令
  • 加载顺序略有不同,但所有地址都共享相同或相邻的缓存线
关键区别在于,在JDK14上,循环迭代被分成两个嵌套块。这是JDK10中出现的优化结果。此优化的思想是将计数循环拆分为热内部部分(不使用safepoint轮询),以及外部部分(使用safepoint轮询指令)

C2 JIT将循环转换为

    for (int i = 0; i < array.length; i += 8000) {
        for (int j = 0; j < 8000; j += 8) {
            int ix = i + j;
            int v0 = array[ix];
            int v1 = array[ix + 1];
            ...
            int v7 = array[ix + 7];
            sum += v0 + v1 + ... + v7;
        }
        safepoint_poll();
    }
for(int i=0;i
请注意,JDK 8版本在计数循环中根本没有safepoint轮询。一方面,这可以使循环运行得更快。但另一方面,这实际上对低延迟应用程序不利,因为暂停时间可能会随着整个循环的持续时间而增加

JDK 14在循环中插入一个safepoint轮询。这可能是您观察到的速度减慢的原因之一,但我不太相信这一点,因为由于循环条带挖掘优化,safepoint轮询在8000次迭代中只执行一次


要验证这一点,您可以使用
-XX:-UseCountedLoopSafepoints
JVM选项禁用安全点轮询。在本例中,JDK14编译版本看起来与JDK8One几乎相同。基准分数也会如此。

有没有原因使你的随机整数不是真正随机的,而是递增的(除非发生溢出)?没有。我想要一个重复值最少的数组。这种使用随机步长递增值的方法似乎很快就能填充数组。为什么重复数很重要?只需使用this.randomInts=r.ints(ELEMENTS.toArray()。给他们所有相同的种子是否会产生影响,这将是有趣的。我不希望实际的int值对性能有任何影响。这在以前是很重要的,因为我在另一个测试中使用了它。当我注意到不同JVM版本之间的奇怪性能时,我尽可能地清除代码并发布在这里。我同意这很有趣。我稍后会检查。基准分数似乎受到了GC开销的影响,GC开销是由
设置中不必要的分配造成的。1) 使用
Level.Trial
代替
Level.Iteration
。2) 在所有JDK版本上设置相同的GC(JDK 8和JDK 9中的默认GC不同+)非常好的描述,很好的@apangin。不幸的是,由于xperf错误(xperf.exe-error:0x00007FFFDFA8F88B处的指令引用了0xffffffffffffff处的内存。无法读取内存)。我认为性能下降只发生在我的机器上,因为它与其他机器相似或在增加。你的回答对我来说已经足够了。谢谢你抽出时间。