Best way to remove the first word from a string in Java

What is the fastest way to strip the first token from a string? So far I have tried:

String parentStringValue = this.stringValue.split(" ", 2)[1];

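It is very inefficient in both memory and speed (it is repeated millions of times on strings about 15 words long). Assume the string consists of space-separated tokens.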

You can use a combination of string.substring and string.indexOf, roughly like this:

// TODO check indexOf does not return -1
this.stringValue.substring(this.stringValue.indexOf(" ") + 1)
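
The TODO matters: when the string contains no space, indexOf returns -1, so the expression becomes substring(0) and silently returns the whole string. Below is a minimal sketch of one way to make that case explicit. The class name FirstWord, the method name removeFirstWord, and the choice to return an empty string for a lone token are assumptions for illustration, not part of the answer.

```java
final class FirstWord {
    private FirstWord() {}

    /** Returns everything after the first space, or "" if the string has no space. */
    static String removeFirstWord(String s) {
        int space = s.indexOf(' ');   // char overload of indexOf
        if (space < 0) {
            return "";                // assumption: a lone token leaves no remainder
        }
        return s.substring(space + 1);
    }

    public static void main(String[] args) {
        System.out.println(removeFirstWord("I want to remove I")); // want to remove I
        System.out.println(removeFirstWord("single").isEmpty());   // true
    }
}
```

If keeping the original string is the better fallback for your data, return s instead of "" in the no-space branch.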

There is no need to split and create an array; just use substring:

String str="I want to remove I";
String parentStringValue = str.substring(str.indexOf(" ")+1);
System.out.println(parentStringValue);
Output:

want to remove I

Try this:

  String s = "This is a test";

  System.out.println(s.replaceFirst("\\w+\\s", ""));
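
Two details of the regex approach are easy to miss. String.replaceFirst compiles the pattern on every call, and the pattern \w+\s is not anchored, so it removes the first run of word characters that is followed by whitespace, which is not always the first token. The sketch below shows a precompiled pattern and an anchored alternative; the class name RegexFirstWord and the anchored pattern ^\S+\s are assumptions for illustration.

```java
import java.util.regex.Pattern;

final class RegexFirstWord {
    private RegexFirstWord() {}

    // Compiled once and reused across calls.
    private static final Pattern UNANCHORED = Pattern.compile("\\w+\\s");
    // Assumption: "first token" means the leading run of non-whitespace characters.
    private static final Pattern ANCHORED = Pattern.compile("^\\S+\\s");

    static String unanchored(String s) {
        return UNANCHORED.matcher(s).replaceFirst("");
    }

    static String anchored(String s) {
        return ANCHORED.matcher(s).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(unanchored("This is a test"));  // is a test
        System.out.println(unanchored("e-mail me now"));   // e-me now
        System.out.println(anchored("e-mail me now"));     // me now
    }
}
```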

If you are not averse to using Apache Commons Lang, you can use StringUtils.

This means you don't have to deal with String.indexOf returning -1:

String parentStringValue = StringUtils.substringAfter(yourString, " ");

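When doing string manipulation, try using StringBuffer or StringBuilder so that you don't leave behind a large number of new unused objects and hurt memory efficiency, since, as you said, this is repeated millions of times.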
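
As a rough sketch of what this suggestion could look like when the text already lives in a StringBuilder: delete removes the leading token in place instead of allocating a new String. The class name BuilderFirstWord and the method name dropFirstWord are hypothetical.

```java
final class BuilderFirstWord {
    private BuilderFirstWord() {}

    /** Removes the first space-delimited token in place; no-op if there is no space. */
    static void dropFirstWord(StringBuilder sb) {
        int space = sb.indexOf(" ");   // StringBuilder.indexOf takes a String argument
        if (space >= 0) {
            sb.delete(0, space + 1);   // shifts the remaining characters to the front
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("This is a test");
        dropFirstWord(sb);
        System.out.println(sb);        // is a test
    }
}
```

This only helps if the data is in a StringBuilder to begin with; wrapping each String in a new StringBuilder first adds an allocation and a copy per string.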

StringBuilder vs substring(x) vs split(x) vs Regex (answer edited: major flaws corrected)

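After correcting some fairly serious flaws in my benchmark (as Jay Askren pointed out in the comments), the StringBuilder method came out fastest by a significant margin (although this assumes the StringBuilder objects are created beforehand), with substring in second place. split() came second to last, about 10 times slower than the StringBuilder method.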

  ArrayList<String> strings = new ArrayList<String>();
  ArrayList<StringBuilder> stringBuilders = new ArrayList<StringBuilder>();
  for(int i = 0; i < 1000; i++) strings.add("Remove the word remove from String "+i);
  for(int i = 0; i < 1000; i++) stringBuilders.add(new StringBuilder(i+" Remove the word remove from String "+i));
  Pattern pattern = Pattern.compile("\\w+\\s");

  // StringBuilder method
  // before/after are declared here and reused by the sections below
  long before = System.currentTimeMillis();
  for(int i = 0; i < 5000; i++){
      for(StringBuilder s : stringBuilders){
          s.delete(0, s.indexOf(" ") + 1);
      }
  }
  long after = System.currentTimeMillis() - before;
  System.out.println("StringBuilder Method Took "+after);

  // Substring method
  before = System.currentTimeMillis();
  for(int i = 0; i < 5000; i++){
      for(String s : strings){
          String newvalue = s.substring(s.indexOf(" ") + 1);
      }
  }
  after = System.currentTimeMillis() - before;
  System.out.println("Substring Method Took "+after); 

  //Split method
  before = System.currentTimeMillis();
  for(int i = 0; i < 5000; i++){
      for(String s : strings){
          String newvalue = s.split(" ", 2)[1];
          System.out.print("");
      }
  }
  after = System.currentTimeMillis() - before;
  System.out.println("Your Method Took "+after);

  // Regex method
  before = System.currentTimeMillis();
  for(int i = 0; i < 5000; i++){
      for(String s : strings){
          String newvalue = pattern.matcher(s).replaceFirst("");
      }
  }
  after = System.currentTimeMillis() - before;
  System.out.println("Regex Method Took "+after);

It is worth mentioning that calling split() with a string of length greater than 1 ...

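Rudi's benchmark had a number of problems, including unfairly and incorrectly favoring the split method, so I took his benchmark and improved on it. If you happen to already have a set of StringBuilders, the StringBuilder method is slightly faster, but if you first need to convert them from Strings, it is extremely slow. The substring method is the next fastest and is the one to use if you have Strings rather than StringBuilders. Commons Lang follows it, and both the substring method and the Commons Lang method are 4 to 5 times faster than split. String.replaceFirst() uses a regular expression and is very slow because it has to compile the regex on every call, which doubles the run time. Even without the compilation step, it is still much slower than the others.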

Below is the benchmark code. You need to add Apache Commons Lang to the classpath to run it.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

import org.apache.commons.lang3.StringUtils;

/**
 * Times several ways of removing the first space-delimited token from a String.
 */
public class StringTest {
    public static void main(String[] args) {
        int numIterations = 100000;
        int numRuns = 10;
        ArrayList<String> strings = new ArrayList<String>();
          for(int i = 0; i < 1000; i++) strings.add("Remove the word remove from String "+i);
          // Split method (the approach from the question)
          long before = 0;
          long after = 0;
          for(int j=0; j < numRuns; j++) {
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){
                  for(String s : strings){
                      String newvalue = s.split(" ", 2)[1];
    //                System.out.println("split " + newvalue);
                  }
              }
              after = System.currentTimeMillis() - before;
              System.out.println("Split Took "+after + " ms");
          }


          // Substring method
          for(int j=0; j < numRuns; j++) {
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){
                  for(String s : strings){
                      String newvalue = s.substring(s.indexOf(" ") + 1);
                  }
              }
              after = System.currentTimeMillis() - before;
              System.out.println("Substring Took "+after + " ms");
          }



          // Apache Commons Lang method
          for(int j=0; j < numRuns; j++) {
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){
                  for (String s : strings) {
                      String parentStringValue = StringUtils.substringAfter(s, " ");
                  }
              }
              after = System.currentTimeMillis() - before;
              System.out.println("CommonsLang Took "+after + " ms");
          }


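          // StringBuilder method: reports delete time and total time including String-to-StringBuilder conversion
          for(int j=0; j < numRuns; j++) {
              long deleteTime = 0L;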
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){

                  List<StringBuilder> stringBuilders = new ArrayList<StringBuilder>();
                  for (String s : strings) {
                      stringBuilders.add(new StringBuilder(s));
                  }
                  long beforeDelete = System.currentTimeMillis();
                  for (StringBuilder s : stringBuilders) {
                      s.delete(0, s.indexOf(" ") + 1);
                  }
                  deleteTime+=(System.currentTimeMillis() - beforeDelete);
              }
              after = System.currentTimeMillis() - before;
              System.out.println("StringBuilder Delete " + deleteTime + " ms out of " + after + " total ms");
          }

          // Faster Regex method
          Pattern pattern = Pattern.compile("\\w+\\s");
          for(int j=0; j < numRuns; j++) {
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){
                  for (String s : strings) {
                      String newvalue = pattern.matcher(s).replaceFirst("");
                  }
              }
              after = System.currentTimeMillis() - before;
              System.out.println("Faster Regex Took "+after + " ms");
          }

          // Slow Regex method
          for(int j=0; j < numRuns; j++) {
              before = System.currentTimeMillis();
              for(int i = 0; i < numIterations; i++){
                  for (String s : strings) {
                      String newvalue = s.replaceFirst("\\w+\\s", "");
                  }
              }
              after = System.currentTimeMillis() - before;
              System.out.println("Slow Regex Took " + after + " ms");
          }

    }
}
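
One caveat with hand-rolled loops like the ones above: the computed strings are never used, so the JIT compiler is free to skip some of the work, and currentTimeMillis has coarse resolution. The sketch below is one way to reduce both effects with plain JDK code, using a warm-up pass, System.nanoTime, and a checksum that consumes every result. The class name FirstWordTiming and its helper methods are hypothetical, and the round counts are arbitrary. For rigorous numbers a dedicated harness such as JMH is the usual choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class FirstWordTiming {
    private FirstWordTiming() {}

    static String bySubstring(String s) {
        return s.substring(s.indexOf(' ') + 1);
    }

    static String bySplit(String s) {
        return s.split(" ", 2)[1];
    }

    /** Builds test data of the same shape as the benchmark above. */
    static List<String> sampleInputs(int count) {
        List<String> inputs = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            inputs.add("Remove the word remove from String " + i);
        }
        return inputs;
    }

    /** Applies op to every input, rounds times, and returns the summed result lengths. */
    static long run(List<String> inputs, int rounds, UnaryOperator<String> op) {
        long checksum = 0;
        for (int r = 0; r < rounds; r++) {
            for (String s : inputs) {
                checksum += op.apply(s).length(); // consuming the result keeps the work observable
            }
        }
        return checksum;
    }

    public static void main(String[] args) {
        List<String> inputs = sampleInputs(1000);

        // Warm-up so the timed rounds are more likely to run compiled code.
        run(inputs, 500, FirstWordTiming::bySubstring);
        run(inputs, 500, FirstWordTiming::bySplit);

        long t0 = System.nanoTime();
        long a = run(inputs, 2000, FirstWordTiming::bySubstring);
        long t1 = System.nanoTime();
        long b = run(inputs, 2000, FirstWordTiming::bySplit);
        long t2 = System.nanoTime();

        System.out.println("substring: " + (t1 - t0) / 1_000_000 + " ms, checksum " + a);
        System.out.println("split:     " + (t2 - t1) / 1_000_000 + " ms, checksum " + b);
    }
}
```

The two checksums should be equal, which doubles as a sanity check that both methods produce the same strings.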

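Did you use substring? The question is not clear.

It's still slow. This seems worse than what I'm currently doing.

@user3639557 Why do you think it's slow?

I can see how fast I'm iterating over the lines in the file. This operation is done on every line. I generate and output every 10,000 lines.

I tried it with a loop without reading a file: 10,000 records in 405 ms. I don't think that's slow.

@user3639557 - If you are watching it, then it's your System.out.println() statements that are slowing down your process. Printing to the screen is very slow compared to most operations. Remove it and only output the start and end times.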