Java: Why is iterating through a List<String> slower than splitting a string and iterating over a StringBuilder?


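I'm wondering why iterating through a List<String> is slower than splitting a string and iterating over the result built with a StringBuilder.

This is my code: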

package nl.testing.startingpoint;

import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;

public class Main {

    public static void main(String args[]) {
        NumberFormat formatter = new DecimalFormat("#0.00000");

        List<String> a = new ArrayList<String>();
        StringBuffer b = new StringBuffer();        

        for (int i = 0;i <= 10000; i++)
        {
            a.add("String:" + i);
            b.append("String:" + i + " ");
        }

        long startTime = System.currentTimeMillis();
        for (String aInA : a) 
        {
            System.out.println(aInA);
        }
        long endTime   = System.currentTimeMillis();

        long startTimeB = System.currentTimeMillis();
        for (String part : b.toString().split(" ")) {

            System.out.println(part);
        }
        long endTimeB   = System.currentTimeMillis();

        System.out.println("Execution time from StringBuilder is " + formatter.format((endTimeB - startTimeB) / 1000d) + " seconds");
        System.out.println("Execution time List is " + formatter.format((endTime - startTime) / 1000d) + " seconds");

    }
}
(This is a completely revised answer. See note 1 for why. Thank you for making me take another look! Note that he/she has also posted an answer.)


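Beware of your results: micro-benchmarking in Java is very tricky, and your benchmark code is doing I/O, among other things. For more, see this question and its answers:

And indeed, as far as I can tell, your results misled you (and initially me). Although an enhanced for loop over a String array is much faster than one over an ArrayList (more on that below), the overhead of .toString().split(" ") still appears to dominate and makes that version slower than the ArrayList version. Markedly slower.

Let's determine which is faster using a thoroughly designed and tested micro-benchmarking tool: JMH.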

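I'm using Linux, so here's how I set it up ($ just indicates a command prompt; what you type comes after it):

1. First, I installed Maven, since I don't normally have it installed:

$ sudo apt-get install maven

3. In the generated project, I deleted the default src/main/java/org/sample/MyBenchmark.java and created three files in that folder for benchmarking:

Common.java (really boring):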

package org.sample;

public class Common {
    public static final int LENGTH = 10001;
}
(I originally thought I'd need more in there.)

TestList.java

package org.sample;

import java.util.List;
import java.util.ArrayList;
import java.text.NumberFormat;
import java.text.DecimalFormat;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Scope;

public class TestList {

    // This state class lets us set up our list once and reuse it for tests in this test thread
    @State(Scope.Thread)
    public static class TestState {
        public final List<String> list;

        public TestState() {
            // Your code for creating the list
            NumberFormat formatter = new DecimalFormat("#0.00000");
            List<String> a = new ArrayList<String>();
            for (int i = 0; i < Common.LENGTH; ++i)
            {
                a.add("String:" + i);
            }
            this.list = a;
        }
    }

    // This is the test method JMH will run for us
    @Benchmark
    public void test(TestState state) {
        // Grab the list
        final List<String> strings = state.list;

        // Loop through it -- note that I'm doing work within the loop, but not I/O since
        // we don't want to measure I/O, we want to measure loop performance
        int l = 0;
        for (String s : strings) {
            l += s == null ? 0 : 1;
        }

        // I always do things like this to ensure that the test is doing what I expected
        // it to do, and so that I actually use the result of the work from the loop
        if (l != Common.LENGTH) {
            throw new RuntimeException("Test error");
        }
    }
}

TestStringSplit.java

package org.sample;

import java.text.NumberFormat;
import java.text.DecimalFormat;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Scope;

@State(Scope.Thread)
public class TestStringSplit {

    // This state class lets us set up our list once and reuse it for tests in this test thread
    @State(Scope.Thread)
    public static class TestState {
        public final StringBuffer sb;

        public TestState() {
            NumberFormat formatter = new DecimalFormat("#0.00000");

            StringBuffer b = new StringBuffer();        

            for (int i = 0; i < Common.LENGTH; ++i)
            {
                b.append("String:" + i + " ");
            }

            this.sb = b;
        }
    }

    // This is the test method JMH will run for us
    @Benchmark
    public void test(TestState state) {
        // Grab the StringBuffer, convert to string, split it into an array
        final String[] strings = state.sb.toString().split(" ");

        // Loop through it -- note that I'm doing work within the loop, but not I/O since
        // we don't want to measure I/O, we want to measure loop performance
        int l = 0;
        for (String s : strings) {
            l += s == null ? 0 : 1;
        }

        // I always do things like this to ensure that the test is doing what I expected
        // it to do, and so that I actually use the result of the work from the loop
        if (l != Common.LENGTH) {
            throw new RuntimeException("Test error");
        }
    }
}
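4. Now that we have our tests, we build the project:

$ mvn clean install

The loop-over-the-list version does over 65k operations per second, while the split-and-loop-over-the-array version does fewer than 5,000 operations per second.

So your initial expectation that the List version would be faster, because of the cost of doing the .toString().split(" "), was correct. Doing that and then looping over the result is markedly slower than using the List.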
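
To make the hidden work in .toString().split(" ") more visible, here is a minimal sketch. The class SplitCostSketch and its methods buildBuffer and manualSplit are hypothetical names introduced only for this illustration, and manualSplit is a hand-written stand-in, not the JDK's implementation of split. It only shows the kind of work involved: a scan over every character plus one new String per part.

```java
import java.util.ArrayList;
import java.util.List;

class SplitCostSketch {

    // Builds the same data as the question: "String:0 String:1 ... "
    static StringBuffer buildBuffer(int count) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < count; i++) {
            sb.append("String:").append(i).append(' ');
        }
        return sb;
    }

    // Hand-written stand-in for split(" ") on this particular data.
    // Assumption: single-character separator and no empty parts to drop;
    // the real String.split has more rules (regex handling, trailing empties).
    static List<String> manualSplit(String text, char separator) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {      // scan every character
            if (text.charAt(i) == separator) {
                parts.add(text.substring(start, i));   // one new String per part
                start = i + 1;
            }
        }
        if (start < text.length()) {
            parts.add(text.substring(start));          // trailing part, if any
        }
        return parts;
    }

    public static void main(String[] args) {
        StringBuffer sb = buildBuffer(10001);
        String whole = sb.toString();           // materializes the buffer's contents as a String
        String[] viaSplit = whole.split(" ");   // scans the text and allocates the parts
        List<String> viaManual = manualSplit(whole, ' ');
        System.out.println(viaSplit.length + " parts via split, "
                + viaManual.size() + " via the manual version");
    }
}
```

The list version pays this cost once, while the list is built; the split version pays it inside the measured code, before the loop even starts.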


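About the enhanced for on String[] vs. List: looping through a String[] is markedly faster than looping through a List, so the .toString().split(" ") must have cost us a lot. To test just the looping part, I used JMH with the earlier TestList class and this TestArray class: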

package org.sample;

import java.util.List;
import java.util.ArrayList;
import java.text.NumberFormat;
import java.text.DecimalFormat;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Scope;

public class TestArray {

    // This state class lets us set up our list once and reuse it for tests in this test thread
    @State(Scope.Thread)
    public static class TestState {
        public final String[] array;

        public TestState() {
            // Create an array with strings like the ones in the list
            NumberFormat formatter = new DecimalFormat("#0.00000");
            String[] a = new String[Common.LENGTH];
            for (int i = 0; i < Common.LENGTH; ++i)
            {
                a[i] = "String:" + i;
            }
            this.array = a;
        }
    }

    // This is the test method JMH will run for us
    @Benchmark
    public void test(TestState state) {
        // Grab the array
        final String[] strings = state.array;

        // Loop through it -- note that I'm doing work within the loop, but not I/O since
        // we don't want to measure I/O, we want to measure loop performance
        int l = 0;
        for (String s : strings) {
            l += s == null ? 0 : 1;
        }

        // I always do things like this to ensure that the test is doing what I expected
        // it to do, and so that I actually use the result of the work from the loop
        if (l != Common.LENGTH) {
            throw new RuntimeException("Test error");
        }
    }
}
After compiling, we can use javap -c Example to look at the bytecode of the two useXYZ functions. The loop portion is set slightly apart from the rest of the function:

useArray

public static void useArray(java.lang.String[]);
  Code:
     0: getstatic     #15  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #18  // String Using array:
     5: invokevirtual #17  // Method java/io/PrintStream.println:(Ljava/lang/String;)V

     8: aload_0
     9: astore_1
    10: aload_1
    11: arraylength
    12: istore_2
    13: iconst_0
    14: istore_3
    15: iload_3
    16: iload_2
    17: if_icmpge     39
    20: aload_1
    21: iload_3
    22: aaload
    23: astore        4
    25: getstatic     #15  // Field java/lang/System.out:Ljava/io/PrintStream;
    28: aload         4
    30: invokevirtual #17  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    33: iinc          3, 1
    36: goto          15

    39: return

So we can see that useArray operates directly on the array, and in useList (its bytecode is listed further down) we can see the two calls to Iterator methods.
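
As a rough illustration of what that bytecode corresponds to, the two enhanced for loops can be written out by hand. This is a sketch of the usual desugaring, not compiler output; the class and method names (ForEachDesugarSketch, countNonNullArray, countNonNullList) are made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ForEachDesugarSketch {

    // Roughly what "for (String s : array)" turns into: an indexed loop.
    static int countNonNullArray(String[] array) {
        int count = 0;
        String[] copy = array;               // the array reference is stored once
        int length = copy.length;            // arraylength, read once
        for (int i = 0; i < length; i++) {   // if_icmpge / iinc / goto
            String s = copy[i];              // aaload: direct indexed read
            if (s != null) {
                count++;
            }
        }
        return count;
    }

    // Roughly what "for (String s : list)" turns into: an Iterator loop.
    static int countNonNullList(List<String> list) {
        int count = 0;
        Iterator<String> it = list.iterator();   // interface call to List.iterator
        while (it.hasNext()) {                   // interface call on every pass
            String s = it.next();                // second interface call, plus a cast to String
            if (s != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] array = {"a", null, "b", "c"};
        List<String> list = new ArrayList<>();
        for (String s : array) {
            list.add(s);
        }
        System.out.println(countNonNullArray(array) + " " + countNonNullList(list));
    }
}
```

The JIT can often inline the iterator calls, which is one reason the gap measured in a real benchmark may differ from what the bytecode alone suggests.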

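Of course, in most cases this doesn't matter. Don't worry about these things unless you have identified the code you're optimizing as a bottleneck.



1 This answer has been thoroughly revised from its original version, because in the original I took the assertion that the split-then-loop-over-the-array version was faster as true. I completely failed to check that assertion and jumped straight into analyzing how the enhanced for loop is faster on arrays than on lists. My bad. Thanks again for making me take a closer look.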

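In the split case you operate directly on an array, so it is quite fast. ArrayList uses an array internally but adds some code around it, so it has to be slower than iterating over a plain array.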
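
To picture the "code around it", here is a deliberately simplified, hypothetical array-backed list. TinyArrayList is not the JDK's ArrayList source; it is a sketch, assuming a growable array plus a modification counter, of the kind of per-element checks an iterator performs that a plain array loop does not.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

class TinyArrayList implements Iterable<String> {
    private String[] elements = new String[8];   // backing array
    private int size = 0;
    private int modCount = 0;                    // bumped on every structural change

    void add(String s) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);   // grow when full
        }
        elements[size++] = s;
        modCount++;
    }

    int size() {
        return size;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int cursor = 0;
            private final int expectedModCount = modCount;

            @Override
            public boolean hasNext() {
                return cursor != size;            // one check per element
            }

            @Override
            public String next() {
                // Extra work on every element that a plain array loop does not do:
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return elements[cursor++];
            }
        };
    }

    public static void main(String[] args) {
        TinyArrayList list = new TinyArrayList();
        for (int i = 0; i < 20; i++) {
            list.add("String:" + i);
        }
        int count = 0;
        for (String s : list) {
            count++;
        }
        System.out.println("Iterated over " + count + " elements");
    }
}
```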

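That said, I wouldn't use this kind of micro-benchmark at all: after the JIT has run, the results may be different.


More importantly, do the thing that is more readable, and worry about performance when it becomes a problem, not before. Cleaner code is the better starting point.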
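
The JIT point can be observed with a crude sketch like the one below, which times the same loop over several rounds. WarmupSketch and its methods are hypothetical names for this illustration, and the approach is no substitute for a proper harness such as JMH; it only shows that timings of identical work can change from round to round within one JVM run.

```java
import java.util.ArrayList;
import java.util.List;

class WarmupSketch {

    static List<String> buildList(int count) {
        List<String> list = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            list.add("String:" + i);
        }
        return list;
    }

    // The work being timed: no I/O, and the result is returned and used,
    // so the loop is harder to discard as dead code.
    static int countNonNull(List<String> list) {
        int count = 0;
        for (String s : list) {
            if (s != null) {
                count++;
            }
        }
        return count;
    }

    // Runs the same work in several rounds and returns nanoseconds per round.
    static long[] timeRounds(List<String> list, int rounds, int callsPerRound) {
        long[] nanos = new long[rounds];
        long sink = 0;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerRound; c++) {
                sink += countNonNull(list);
            }
            nanos[r] = System.nanoTime() - start;
        }
        // Use the accumulated result so the timed work has an observable effect.
        if (sink != (long) rounds * callsPerRound * list.size()) {
            throw new IllegalStateException("unexpected count");
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(buildList(10001), 10, 200);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] / 1000 + " us");
        }
    }
}
```

On many JVMs the first rounds tend to be slower than later ones, which is why a single timed pass says little about steady-state speed.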

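Benchmarking in Java is hard because of the various optimizations and JIT compilation.

I'm sorry, but you cannot draw any conclusions from your test. At the very least you would have to create two separate programs, one per scenario, and run them separately. I extended your code and wrote the following: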

NumberFormat formatter = new DecimalFormat("#0.00000");

List<String> a = new ArrayList<String>();
StringBuffer b = new StringBuffer();

for (int i = 0;i <= 10000; i++)
{
    a.add("String:" + i);
    b.append("String:" + i + " ");
}

long startTime = System.currentTimeMillis();
for (String aInA : a)
{
    System.out.println(aInA);
}
long endTime   = System.currentTimeMillis();

long startTimeB = System.currentTimeMillis();
for (String part : b.toString().split(" ")) {

    System.out.println(part);
}
long endTimeB   = System.currentTimeMillis();

long startTimeC = System.currentTimeMillis();
for (String aInA : a)
{
    System.out.println(aInA);
}
long endTimeC   = System.currentTimeMillis();

System.out.println("Execution time List is " + formatter.format((endTime - startTime) / 1000d) + " seconds");
System.out.println("Execution time from StringBuilder is " + formatter.format((endTimeB - startTimeB) / 1000d) + " seconds");
System.out.println("Execution time List second time is " + formatter.format((endTimeC - startTimeC) / 1000d) + " seconds");
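Also, if I remove the System.out.println statements from the loops and just append the strings to a StringBuilder instead, the execution times are in single milliseconds rather than tens of milliseconds. That tells me that split versus list looping cannot be responsible for one approach taking twice as long as the other.

In general, I/O is relatively slow, so your code spends most of its time executing the println statements.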
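
The experiment described above can be sketched roughly as follows. PrintCostSketch, printEach, and appendEach are hypothetical names for this illustration; the timings are single-shot and only meant to show the order of magnitude, not to serve as a benchmark.

```java
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

class PrintCostSketch {

    static List<String> buildList(int count) {
        List<String> list = new ArrayList<>(count);
        for (int i = 0; i <= count - 1; i++) {
            list.add("String:" + i);
        }
        return list;
    }

    // Variant 1: one println per element, like the loops in the question.
    static void printEach(List<String> list, PrintStream out) {
        for (String s : list) {
            out.println(s);
        }
    }

    // Variant 2: the same loop with the I/O replaced by in-memory appends.
    // Returns the number of characters appended so the work is actually used.
    static int appendEach(List<String> list) {
        StringBuilder sb = new StringBuilder();
        for (String s : list) {
            sb.append(s).append('\n');
        }
        return sb.length();
    }

    public static void main(String[] args) {
        List<String> list = buildList(10001);

        long startPrint = System.nanoTime();
        printEach(list, System.out);
        long printNanos = System.nanoTime() - startPrint;

        long startAppend = System.nanoTime();
        int chars = appendEach(list);
        long appendNanos = System.nanoTime() - startAppend;

        // Report on stderr so the report is not mixed into the measured output.
        System.err.println("println loop: " + printNanos / 1_000_000.0 + " ms");
        System.err.println("append loop:  " + appendNanos / 1_000_000.0 + " ms (" + chars + " chars)");
    }
}
```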

Edit: OK, I've done my homework now. Inspired by the link provided by @StephenC, I created a benchmark using JMH. The methods being benchmarked are as follows:

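// Split version (reported as stringLoopBenchmark in the results).
// b is the StringBuffer; bh is presumably JMH's Blackhole. Both are set up outside this fragment.
public void loop() {
    for (String part : b.toString().split(" ")) {
        bh.consume(part);
    }
}

// List version (reported as listLoopBenchmark in the results).
// a is the List<String>, set up outside this fragment.
public void loop() {
    for (String aInA : a) {
        bh.consume(aInA);
    }
}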
The results are:

Benchmark                          Mode  Cnt    Score   Error  Units
BenchmarkLoop.listLoopBenchmark    avgt  200   55,992 ± 0,436  us/op
BenchmarkLoop.stringLoopBenchmark  avgt  200  290,515 ± 0,975  us/op

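So to me it looks like the list version is faster, which agrees with your initial intuition.

Do you have a link to a source where I can read up on this? This seems suspicious to me. I'd expect the I/O of System.out.println to take up the vast majority of the time, which would make the timings of the two scenarios very similar.

I don't trust the results you got from your benchmark. It is flawed. Problems: 1) no JVM warm-up, 2) you are not measuring what you think you are measuring. Read all of this ... and you'll start to understand what I'm getting at.

Benchmark        Mode  Cnt       Score      Error  Units
TestArray.test  thrpt   40  568328.087 ±  580.946  ops/s
TestList.test   thrpt   40   62069.305 ± 3793.680  ops/s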

Example.java (the class inspected with javap -c above):

import java.util.List;
import java.util.ArrayList;

public class Example {
    public static final void main(String[] args) throws Exception {
        String[] array = new String[10];
        List<String> list = new ArrayList<String>(array.length);
        for (int n = 0; n < array.length; ++n) {
            array[n] = "foo" + System.currentTimeMillis();
            list.add(array[n]);
        }

        useArray(array);
        useList(list);

        System.out.println("Done");
    }

    public static void useArray(String[] array) {
        System.out.println("Using array:");
        for (String s : array) {
            System.out.println(s);
        }
    }

    public static void useList(List<String> list) {
        System.out.println("Using list:");
        for (String s : list) {
            System.out.println(s);
        }
    }
}

useList

public static void useList(java.util.List);
  Code:
     0: getstatic       #15  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc             #19  // String Using list:
     5: invokevirtual   #17  // Method java/io/PrintStream.println:(Ljava/lang/String;)V

     8: aload_0
     9: invokeinterface #20, 1  // InterfaceMethod java/util/List.iterator:()Ljava/util/Iterator;
    14: astore_1
    15: aload_1
    16: invokeinterface #21, 1  // InterfaceMethod java/util/Iterator.hasNext:()Z
    21: ifeq            44
    24: aload_1
    25: invokeinterface #22, 1  // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
    30: checkcast       #2   // class java/lang/String
    33: astore_2
    34: getstatic       #15  // Field java/lang/System.out:Ljava/io/PrintStream;
    37: aload_2
    38: invokevirtual   #17  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    41: goto            15

    44: return

Output of the extended timing program (the version with the second List loop):

Execution time List is 0.04300 seconds
Execution time from StringBuilder is 0.03200 seconds
Execution time List second time is 0.01900 seconds