In Java, how do I determine whether a character is a letter?


How do you check whether a one-character String is a letter, including letters with accents?


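I had to work this out recently, so I'll answer it myself, after the recent VB6 question reminded me.

Simply checking whether the letter is in A-Z is not enough, because that range excludes letters with accents and letters from other alphabets.

I found that you can use the regular-expression class for "Unicode letter", or one of its case-sensitive variants: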

string.matches("\\p{L}"); // Unicode letter
string.matches("\\p{Lu}"); // Unicode upper-case letter
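To make the difference from an A-Z check concrete, here is a minimal sketch. The class and method names (UnicodeLetterRegex, isAsciiLetter, isLetter, isUpperCaseLetter) are made up for illustration, and the accented letter is assumed to be in precomposed form, that is, a single code point.

```java
import java.util.regex.Pattern;

class UnicodeLetterRegex {
    // Compiled once and reused. \p{L} is any Unicode letter,
    // \p{Lu} is any Unicode upper-case letter.
    private static final Pattern LETTER = Pattern.compile("\\p{L}");
    private static final Pattern UPPER_LETTER = Pattern.compile("\\p{Lu}");

    // Naive check: only the 52 unaccented Latin letters pass.
    static boolean isAsciiLetter(String s) {
        return s.matches("[A-Za-z]");
    }

    // True only when the whole string is exactly one Unicode letter.
    static boolean isLetter(String s) {
        return LETTER.matcher(s).matches();
    }

    // True only when the whole string is exactly one upper-case Unicode letter.
    static boolean isUpperCaseLetter(String s) {
        return UPPER_LETTER.matcher(s).matches();
    }

    public static void main(String[] args) {
        // U+00E9 is e with acute accent, U+0416 is Cyrillic capital Zhe.
        String[] samples = {"a", "Z", "\u00e9", "\u0416", "7", "ab"};
        for (String s : samples) {
            System.out.println(s + ": ascii=" + isAsciiLetter(s)
                    + " letter=" + isLetter(s)
                    + " upper=" + isUpperCaseLetter(s));
        }
    }
}
```

Note that matches() tests the whole string, so a two-letter string such as "ab" is rejected by all three checks.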
You can also do this with the Character class:

Character.isLetter(character);

However, this is less convenient if you need to check more than one letter.
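If several letters do need checking, one possible approach is to stream over the string's code points. This is only a sketch: the class and method names (AllLetters, isAllLetters, isAllLettersRegex) are hypothetical, and treating the empty string as "not all letters" is an assumption made here, not something the answer specifies.

```java
class AllLetters {
    // True when the string is non-empty and every code point is a letter.
    static boolean isAllLetters(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isLetter);
    }

    // The same check as a regular expression: one or more Unicode letters.
    static boolean isAllLettersRegex(String s) {
        return s.matches("\\p{L}+");
    }

    public static void main(String[] args) {
        // U+00E9 is e with acute accent.
        String[] samples = {"caf\u00e9", "abc1", "hello world", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\": isLetter=" + isAllLetters(s)
                    + " regex=" + isAllLettersRegex(s));
        }
    }
}
```

Both methods should agree on every input, since Character.isLetter and \p{L} cover the same Unicode general categories.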

Character.isLetter() is much faster than string.matches(), because string.matches() compiles a new Pattern every time. Even if you cache the Pattern, I think isLetter() would still beat it.


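EDIT: I just ran across this question again and thought I'd try to come up with some actual numbers. Here is my attempt at a benchmark that checks all three methods (matches() with and without a cached Pattern, and Character.isLetter()). I also made sure that both valid and invalid characters were checked, so as not to skew the results.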

import java.util.regex.*;

class TestLetter {
    private static final Pattern ONE_CHAR_PATTERN = Pattern.compile("\\p{L}");
    private static final int NUM_TESTS = 10000000;

    public static void main(String[] args) {
        long start = System.nanoTime();
        int counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatches(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of Pattern.matches() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
        /*********************************/
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testCharacter(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of isLetter() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
        /*********************************/
        start = System.nanoTime();
        counter = 0;
        for (int i = 0; i < NUM_TESTS; i++) {
            if (testMatchesNoCache(Character.toString((char) (i % 128))))
                counter++;
        }
        System.out.println(NUM_TESTS + " tests of String.matches() took " +
                (System.nanoTime()-start) + " ns.");
        System.out.println("There were " + counter + "/" + NUM_TESTS +
                " valid characters");
    }

    private static boolean testMatches(final String c) {
        return ONE_CHAR_PATTERN.matcher(c).matches();
    }
    private static boolean testMatchesNoCache(final String c) {
        return c.matches("\\p{L}");
    }
    private static boolean testCharacter(final String c) {
        return Character.isLetter(c.charAt(0));
    }
}
And my output:

10000000 tests of Pattern.matches() took 4325146672 ns.
There were 4062500/10000000 valid characters
10000000 tests of isLetter() took 546031201 ns.
There were 4062500/10000000 valid characters
10000000 tests of String.matches() took 11900205444 ns.
There were 4062500/10000000 valid characters
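So isLetter() is almost 8 times faster, even compared with a cached Pattern. (And the uncached version is nearly 3 times slower than the cached one.)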

You should use c.codePointAt(0) instead of c.charAt(0) in testCharacter(); otherwise, it will fail for characters outside the BMP.
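A small sketch of why this matters, using U+10400 (DESERET CAPITAL LETTER LONG I) as an example of a letter outside the BMP. The class and method names are hypothetical and chosen only for this illustration.

```java
class SupplementaryLetter {
    // Looks only at the first UTF-16 code unit. For a supplementary
    // character that unit is a lone surrogate, which is not a letter.
    static boolean isLetterByChar(String s) {
        return Character.isLetter(s.charAt(0));
    }

    // Reads the full code point, combining a surrogate pair if present.
    static boolean isLetterByCodePoint(String s) {
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        // U+10400 encoded as a surrogate pair: two chars, one code point.
        String deseret = "\uD801\uDC00";
        System.out.println("length = " + deseret.length());
        System.out.println("code points = " + deseret.codePointCount(0, deseret.length()));
        System.out.println("charAt check = " + isLetterByChar(deseret));
        System.out.println("codePointAt check = " + isLetterByCodePoint(deseret));
    }
}
```

For characters inside the BMP, such as "a" or an accented Latin letter in precomposed form, the two methods give the same answer; they only diverge on surrogate pairs.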