What is the fastest way in Java to find the last part of a URL path?

java regex

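I have a URL, for example:

"http://www.someco.com/news/2016-01-03/waterloo-station"

The URL never contains a query string.

What is the cleanest way to extract the string "waterloo-station"?

Of course, I could use the following code:

url.substring(url.lastIndexOf('/') + 1)

But I'm not entirely happy with it, because it has to search for the last index and then take the substring. I'm wondering whether there is a better way (using a regex?) to get the same result in a single step.

Of course, the solution should be significantly faster when executed billions of times.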

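I don't think it can be improved. The short answer is that searching for the last index is a simple operation, so it can be implemented with a fast algorithm (directly in the String class!), and a regular expression would struggle to match that speed. The second pass over the string, as you can see, cannot cost any less: it is just the initialization of the new String.

It could only be faster if there were a dedicated method implemented directly in the String class.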
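To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The class name `LastSegmentDemo`, the method names, and the pattern `[^/]*$` are illustrative assumptions, not something from the question. The sketch only shows that both return the same result; it makes no timing claim.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class LastSegmentDemo {
    // Compiled once; compiling inside the method would add cost on every call.
    private static final Pattern LAST_SEGMENT = Pattern.compile("[^/]*$");

    /** The approach from the question: one backward scan, then one substring. */
    static String viaSubstring(String url) {
        // lastIndexOf returns -1 when there is no '/', so +1 gives 0
        // and the whole string is returned.
        return url.substring(url.lastIndexOf('/') + 1);
    }

    /** Regex alternative, shown only for comparison. */
    static String viaRegex(String url) {
        Matcher m = LAST_SEGMENT.matcher(url);
        // "[^/]*$" can match the empty string at the end of input,
        // so find() succeeds for any input. group() still builds a new String.
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String url = "http://www.someco.com/news/2016-01-03/waterloo-station";
        System.out.println(viaSubstring(url));
        System.out.println(viaRegex(url));
    }
}
```

Note that the regex version does not avoid the second step: `group()` has to create the result string as well, so it adds matching overhead without removing any work.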

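If you want more details, you can look at the code in the JDK yourself. It is copied here for your convenience.

The following code is the implementation of the lastIndexOf() method in my JDK: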

public int lastIndexOf(int ch, int fromIndex) {
    int min = offset;
    char v[] = value;

    int i = offset + ((fromIndex >= count) ? count - 1 : fromIndex);

    if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
        // handle most cases here (ch is a BMP code point or a
        // negative value (invalid code point))
        for (; i >= min ; i--) {
            if (v[i] == ch) {
                return i - offset;
            }
        }
        return -1;
    }

    int max = offset + count;
    if (ch <= Character.MAX_CODE_POINT) {
        // handle supplementary characters here
        char[] surrogates = Character.toChars(ch);
        for (; i >= min; i--) {
            if (v[i] == surrogates[0]) {
                if (i + 1 == max) {
                    break;
                }
                if (v[i+1] == surrogates[1]) {
                    return i - offset;
                }
            }
        }
    }
    return -1;
}

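Note that it does not work on substrings: it scans the backing char array directly. Likewise, the substring method is very fast in this JDK version, because it does not create a new char array; it only creates a new String object with a different offset and count (since Java 7 update 6, substring copies the characters instead):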

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > count) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    if (beginIndex > endIndex) {
        throw new StringIndexOutOfBoundsException(endIndex - beginIndex);
    }
    return ((beginIndex == 0) && (endIndex == count)) ? this :
        new String(offset + beginIndex, endIndex - beginIndex, value);
}

// Package private constructor which shares value array for speed.
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}
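The offset/count sharing shown above is internal to String and depends on the JDK version. Where a copy-free view of the last segment is wanted explicitly, one option is `CharBuffer.wrap(CharSequence, int, int)`, which wraps the given sequence in a read-only buffer. The following is a minimal sketch under that assumption; the class and method names are hypothetical.

```java
import java.nio.CharBuffer;

// Hypothetical helper; names are illustrative only.
final class LastSegmentView {
    /**
     * Returns a read-only view of the text after the last '/'.
     * The buffer wraps the original string rather than copying its characters.
     */
    static CharSequence lastSegmentView(String url) {
        int start = url.lastIndexOf('/') + 1; // 0 when there is no '/'
        return CharBuffer.wrap(url, start, url.length());
    }

    public static void main(String[] args) {
        String url = "http://www.someco.com/news/2016-01-03/waterloo-station";
        CharSequence view = lastSegmentView(url);
        System.out.println(view.length());
        // toString() materializes a String only when one is really needed.
        System.out.println(view);
    }
}
```

The view keeps the whole original string reachable, which is the same trade-off the shared char array had.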

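I'd call this massive premature optimization.

Why do you think this could be faster? The delimiter has to be found, and a new string has to be constructed. There is no way around that. A regex is far more complex than simply iterating over 17 elements of a char array.

This is the fastest you can get, and also the simplest and most readable. It is unlikely to cause a performance problem in your application: if it has to be executed billions of times, those URLs have to be read from disk somehow, which is many orders of magnitude slower than the substring.

I don't need to read billions of times, because I don't have billions of URLs to read. I have millions, and I don't want to waste memory storing the substring results. I need to perform the substring many times because the algorithm requires it.

@JohnHenry - Even if such a method existed, it would not be faster than the code above.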
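On the memory concern raised in the comments: if the algorithm only needs to compare the last segment against a known value, rather than keep it, the substring can be skipped entirely with `String.regionMatches`. This is a sketch under that assumption, since the thread does not say what the algorithm actually does; the class and method names are hypothetical.

```java
// Hypothetical helper; names are illustrative only.
final class LastSegmentMatch {
    /**
     * Returns true if the text after the last '/' in url equals expected.
     * No substring is created; characters are compared in place.
     */
    static boolean lastSegmentEquals(String url, String expected) {
        int start = url.lastIndexOf('/') + 1; // 0 when there is no '/'
        int len = url.length() - start;
        // The length check prevents a match on a mere prefix of the segment.
        return len == expected.length()
                && url.regionMatches(start, expected, 0, len);
    }

    public static void main(String[] args) {
        String url = "http://www.someco.com/news/2016-01-03/waterloo-station";
        System.out.println(lastSegmentEquals(url, "waterloo-station"));
        System.out.println(lastSegmentEquals(url, "waterloo"));
    }
}
```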
