Java: Is a String in a list (given at compile time)? Is HashSet the fastest solution?


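Given a fixed list of strings known at compile time, e.g. the words "zero" through "nine".

Using a HashSet we have a very fast way (O(1)) to tell whether a String provided at runtime is in the list of strings.

For example: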

Set<String> SET = new HashSet<>(Arrays.asList( "zero", "one", "two", "three",
        "four", "five", "six", "seven", "eight", "nine"));

boolean listed = SET.contains("some-text");
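
Provide the fastest possible implementation, given that the values listed in Checker.VALUES will not change (so, for example, you may use those literals in your code if you want to).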

Demonstration: HashSetChecker implementation
The implementation using HashSet looks like this:

class HashSetChecker implements Checker {
    private final Set<String> set = new HashSet<>(Arrays.asList(VALUES));
    @Override
    public boolean contains(String s) {
        return set.contains(s);
    }
}

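Let's see some implementations:

First idea: HashSet with a large capacity
Some might say that multiple Strings may end up in the same HashSet bucket, so let's use a large initial capacity: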

class HashSetChecker2 implements Checker {
    private final Set<String> set = new HashSet<>(1000);
    { set.addAll(Arrays.asList(VALUES)); }

    @Override
    public boolean contains(String s) {
        return set.contains(s);
    }
}
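
Whether a larger table actually separates the ten strings can be checked directly. The sketch below assumes the index computation used by OpenJDK 8+ HashMap (spread the hash with h ^ (h >>> 16), then mask with tableSize - 1); that is an implementation detail rather than a documented contract, and the class BucketSketch and its methods are made up for illustration. It counts how many distinct buckets the values occupy in a table of 16 slots (what OpenJDK would typically allocate for a ten-element collection) and in 1024 slots (what a requested capacity of 1000 is rounded up to).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
class BucketSketch {
    static final String[] VALUES = { "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine" };

    // Assumption: mirrors the OpenJDK 8+ HashMap index computation.
    // tableSize must be a power of two.
    static int bucketIndex(String s, int tableSize) {
        int h = s.hashCode();
        return (h ^ (h >>> 16)) & (tableSize - 1);
    }

    // Number of distinct buckets occupied by VALUES in a table of the given size.
    static int distinctBuckets(int tableSize) {
        Set<Integer> used = new HashSet<>();
        for (String s : VALUES) {
            used.add(bucketIndex(s, tableSize));
        }
        return used.size();
    }

    public static void main(String[] args) {
        for (int size : new int[] { 16, 1024 }) {
            System.out.println(size + " buckets: " + distinctBuckets(size)
                    + " of " + VALUES.length + " values land in distinct buckets");
        }
    }
}
```

Even when two values do share a bucket, a lookup only adds one more equals() call for that bucket, which may be why the large-capacity variants differ so little from the plain HashSet in the timings.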
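
TreeSet
Some might say to give TreeSet a try too (it is sorted, so it might have a chance). I know it is O(log n), but n is small (10 in this case). The TreeSetChecker class is included in the complete program below.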

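Reference comparison, then reverting to short-circuit OR checks
Some might say that we should first check by reference whether we have the String and, if not, revert to short-circuit OR checks (the equals()-based OrChecker, which is also part of the complete program below):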

class RefOrChecker extends OrChecker {
    @Override
    public boolean contains(String s) {
        return "zero" == s || "one" == s || "two" == s || "three" == s
                || "four" == s || "five" == s || "six" == s || "seven" == s
                || "eight" == s || "nine" == s || super.contains(s);
    }
}
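
The reference check can only succeed when the caller passes the very same String object as the literal in the class. That holds for compile-time constants, which are interned, but not for strings built at runtime. A minimal illustration, with the class name IdentitySketch invented for this example:

```java
// Hypothetical demo class, not part of the original answer.
class IdentitySketch {
    // true only if both references point to the same String object
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "zero";
        String copy = new String("zero"); // distinct object with equal content

        System.out.println(sameObject("zero", literal));       // true: same interned constant
        System.out.println(sameObject("zero", copy));          // false: different object
        System.out.println("zero".equals(copy));               // true: content comparison
        System.out.println(sameObject("zero", copy.intern())); // true: intern() returns the constant
    }
}
```

Half of the inputs in the test data are created with new String(...), so for those RefOrChecker runs all ten reference comparisons before falling back to the equals() chain.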
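
Using switch: the fastest I've found so far
Since we have a fixed list of Strings known at compile time, we can take advantage of the possibility of using Strings in switch statements.

We can add a case for each String from the fixed list and return true, and add a default case that returns false.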

class SwitchChecker implements Checker {
    @Override
    public boolean contains(String s) {
        switch (s) {
            case "zero":
            case "one":
            case "two":
            case "three":
            case "four":
            case "five":
            case "six":
            case "seven":
            case "eight":
            case "nine":
                return true;
            default:
                return false;
        }
    }
}
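
A plausible reason the String switch beats a chain of equals() calls: javac typically compiles it into a switch on s.hashCode() followed by an equals() check in the matching branch, so only one candidate is compared unless hash codes collide. The hand-written sketch below mimics that shape for three of the values. It approximates the generated code rather than reproducing it, and DesugaredSwitchSketch is a made-up name. The integer labels are the String.hashCode() values of the literals, which follow from the formula documented for String.hashCode().

```java
// Hypothetical illustration of how a String switch is roughly lowered.
class DesugaredSwitchSketch {
    static boolean contains(String s) {
        // Like a real String switch, this throws NullPointerException for null.
        switch (s.hashCode()) {
            case 110182: // "one".hashCode()
                return s.equals("one");
            case 115276: // "two".hashCode()
                return s.equals("two");
            case 113890: // "six".hashCode()
                return s.equals("six");
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(contains("one"));             // true
        System.out.println(contains(new String("six"))); // true
        System.out.println(contains("ten"));             // false
    }
}
```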
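
New finding: embedded switches (2 switch blocks)
Maaartinus's post about perfect hashing made me think. Even if we have a perfect hash, it still has to run over the whole content of the String provided at runtime that we want to check. So instead we should use something that is readily available in the String: its length. We switch on the length of the String, and inside that switch we use an inner switch that lists only the strings of that length. This reduces the number of case statements inside each switch: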

class EmbeddedSwitchChecker implements Checker {
    @Override
    public boolean contains(String s) {
        switch (s.length()) {
        case 3:
            switch (s) {
                case "one":
                case "two":
                case "six":
                    return true;
                default:
                    return false;
            }
        case 4:
            switch (s) {
                case "zero":
                case "four":
                case "five":
                case "nine":
                    return true;
                default:
                    return false;
            }
        case 5:
            switch (s) {
                case "three":
                case "seven":
                case "eight":
                    return true;
                default:
                    return false;
            }
        default:
            return false;
        }
    }
}
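
New finding: CharSwitchChecker: the champion
This is basically a combination of the improved EmbeddedSwitchChecker and OldCurmudgeon's idea: here we use a switch on the first character of the String (but first we check its length), and based on that we either narrow down to one possible String or, if not, we also check the second character, in which case only one String remains possible (and we can decide by calling String.equals()). The CharSwitchChecker class is included in the complete program below.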

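The SwitchChecker solution is about 1.7 times faster than the HashSetChecker implementation, EmbeddedSwitchChecker is about 2 times faster, and the champion CharSwitchChecker is about 2.13 times faster (timings are listed in the results table further below). As expected, the HashSet with a large initial capacity and the HashMap solutions are slightly faster than the plain HashSet, and all other solutions fall behind.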

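Complete runnable test program (+ all listed solutions)
Here is the complete runnable test program plus all the listed solutions in one box, for those who want to try it or experiment with new implementations.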

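Edit: Following Luiggi Mendoza's suggestion, I changed the main() method used for testing. I execute the whole test twice and only analyze the second result. Also, since the tests don't create new objects in the loop, I see no reason to call System.gc().
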
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

interface Checker {
    String[] VALUES = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };

    boolean contains(String s);
}

class HashSetChecker implements Checker {
    private final Set<String> set = new HashSet<>(Arrays.asList(VALUES));

    @Override
    public boolean contains(String s) {
        return set.contains(s);
    }
}

class HashSetChecker2 implements Checker {
    private final Set<String> set = new HashSet<>(1000);
    {
        set.addAll(Arrays.asList(VALUES));
    }

    @Override
    public boolean contains(String s) {
        return set.contains(s);
    }
}

class HashMapChecker implements Checker {
    private final Map<String, Object> map = new HashMap<>(1000);
    {
        for (String s : VALUES)
            map.put(s, s);
    }

    @Override
    public boolean contains(String s) {
        return map.containsKey(s);
    }
}

class TreeSetChecker implements Checker {
    private final Set<String> set = new TreeSet<>(Arrays.asList(VALUES));

    @Override
    public boolean contains(String s) {
        return set.contains(s);
    }
}

class OrChecker implements Checker {
    @Override
    public boolean contains(String s) {
        return "zero".equals(s) || "one".equals(s) || "two".equals(s) || "three".equals(s)
                || "four".equals(s) || "five".equals(s) || "six".equals(s) || "seven".equals(s)
                || "eight".equals(s) || "nine".equals(s);
    }
}

class RefOrChecker extends OrChecker {
    @Override
    public boolean contains(String s) {
        return "zero" == s || "one" == s || "two" == s || "three" == s || "four" == s || "five" == s
                || "six" == s || "seven" == s || "eight" == s || "nine" == s || super.contains(s);
    }
}

class SwitchChecker implements Checker {
    @Override
    public boolean contains(String s) {
        switch (s) {
        case "zero":
        case "one":
        case "two":
        case "three":
        case "four":
        case "five":
        case "six":
        case "seven":
        case "eight":
        case "nine":
            return true;
        default:
            return false;
        }
    }
}

class EmbeddedSwitchChecker implements Checker {
    @Override
    public boolean contains(String s) {
        switch (s.length()) {
        case 3:
            switch (s) {
            case "one":
            case "two":
            case "six":
                return true;
            default:
                return false;
            }
        case 4:
            switch (s) {
            case "zero":
            case "four":
            case "five":
            case "nine":
                return true;
            default:
                return false;
            }
        case 5:
            switch (s) {
            case "three":
            case "seven":
            case "eight":
                return true;
            default:
                return false;
            }
        default:
            return false;
        }
    }
}

class CharSwitchChecker implements Checker {
    @Override
    public boolean contains(String s) {
        final int length = s.length();
        if (length < 3 || length > 5)
            return false;

        switch (s.charAt(0)) {
        case 'z':
            return "zero".equals(s);
        case 'o':
            return "one".equals(s);
        case 't':
            return s.charAt(1) == 'w' ? "two".equals(s) : "three".equals(s);
        case 'f':
            return s.charAt(1) == 'o' ? "four".equals(s) : "five".equals(s);
        case 's':
            return s.charAt(1) == 'i' ? "six".equals(s) : "seven".equals(s);
        case 'e':
            return "eight".equals(s);
        case 'n':
            return "nine".equals(s);
        }
        return false;
    }
}

public class CheckerTester {
    private static final String[] TESTS = { "zero", "one", "two", "three", "four", "five", "six", "seven",
            "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
            "seventeen", "eighteen", "nineteen",

            new String("zero"), new String("one"), new String("two"), new String("three"),
            new String("four"), new String("five"), new String("six"), new String("seven"),
            new String("eight"), new String("nine"), new String("ten"), new String("eleven"),
            new String("twelve"), new String("thirteen"), new String("fourteen"), new String("fifteen"),
            new String("sixteen"), new String("seventeen"), new String("eighteen"), new String("nineteen") };

    public static void test(Checker checker) {
        final int N = 1_000_000;

        long start = System.nanoTime();

        for (int i = 0; i < N; i++)
            for (String test : TESTS)
                checker.contains(test);

        long end = System.nanoTime();

        System.out.printf("%s: %d ms\n", checker.getClass().getName(), (end - start) / 1_000_000);
    }

    public static void main(String args[]) {
        for (int i = 1; i <= 2; i++) {
            System.out.println("---- Check #" + i);
            test(new HashSetChecker());
            test(new HashSetChecker2());
            test(new HashMapChecker());
            test(new TreeSetChecker());
            test(new OrChecker());
            test(new RefOrChecker());
            test(new SwitchChecker());
            test(new EmbeddedSwitchChecker());
            test(new CharSwitchChecker());
        }
    }

}
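
One caveat with the test() method above: it discards the return value of contains(), so the JIT compiler is in principle free to optimize away part of the work. A hedged variation, using nothing beyond the JDK, accumulates the results and returns the count so that every call stays observable. BenchSketch, countHits and timeMillis are names invented here, the test strings are a shortened subset, and a harness such as JMH would be the more robust choice for serious measurements.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical variation of the timing loop; not the original author's code.
class BenchSketch {
    static final String[] TESTS = { "zero", "five", "nine", "ten", "nineteen",
            new String("zero"), new String("ten") };

    // Runs all TESTS `rounds` times and counts positive answers,
    // so every call to the checker contributes to the returned value.
    static long countHits(Predicate<String> checker, int rounds) {
        long hits = 0;
        for (int i = 0; i < rounds; i++) {
            for (String test : TESTS) {
                if (checker.test(test)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Untimed warm-up pass first, then a timed pass over the same work.
    static long timeMillis(Predicate<String> checker, int rounds) {
        long warmUp = countHits(checker, rounds);
        long start = System.nanoTime();
        long measured = countHits(checker, rounds);
        long end = System.nanoTime();
        if (warmUp != measured) {
            throw new IllegalStateException("checker is not deterministic");
        }
        return (end - start) / 1_000_000;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>(Arrays.asList("zero", "one", "two", "three",
                "four", "five", "six", "seven", "eight", "nine"));
        System.out.println("HashSet: " + timeMillis(set::contains, 1_000_000) + " ms");
    }
}
```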

Test results (second run):

                         TIME        HOW FAST (compared to HashSetChecker)
-----------------------------------------------------------------------------
HashSetChecker:          929 ms       1.00x
HashSetChecker2:         892 ms       1.04x
HashMapChecker:          873 ms       1.06x
TreeSetChecker:         2265 ms       0.41x
OrChecker:              1815 ms       0.51x
RefOrChecker:           1708 ms       0.54x
SwitchChecker:           538 ms       1.73x
EmbeddedSwitchChecker:   467 ms       1.99x
CharSwitchChecker:       436 ms       2.13x
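
The "how fast" column is the HashSetChecker time divided by each implementation's time, so values above 1.00x are faster than the baseline. As a quick check of the arithmetic (the class name SpeedupSketch is made up; the millisecond values are copied from the table):

```java
import java.util.Locale;

// Hypothetical helper that recomputes the ratio column of the table.
class SpeedupSketch {
    // Speed-up relative to the baseline: values above 1.0 are faster.
    static double speedup(long baselineMs, long candidateMs) {
        return (double) baselineMs / candidateMs;
    }

    public static void main(String[] args) {
        long baseline = 929; // HashSetChecker
        String[] names = { "TreeSetChecker", "SwitchChecker", "CharSwitchChecker" };
        long[] times = { 2265, 538, 436 };
        for (int i = 0; i < names.length; i++) {
            System.out.printf(Locale.ROOT, "%s: %.2fx%n",
                    names[i], speedup(baseline, times[i]));
        }
    }
}
```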

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Test {

    interface Checker {

        Set<String> VALUES = new HashSet<>(Arrays.asList("zero", "one", "two", "three", "four", "five", "six",
                "seven", "eight", "nine"));

        boolean contains(String s);
    }

    public static class HatChecker implements Checker {

        // Can't think of a name. Each instance is one node of a 256-way trie.
        static class Hats {

            // All possible children.
            Hats[] hats = new Hats[256];
            // Are we at the end of a word.
            boolean end = false;
        }

        // Root hats - contains one entry for each possible first character.
        static Hats root = new Hats();

        /**
         * Where should it go?
         */
        private static Hats find(String s, boolean grow) {
            Hats hats = root;
            for (int i = 0; i < s.length(); i++) {
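                int ch = s.charAt(i);
                // Child tables only cover chars 0..255; any other char cannot be in the set
                // (all VALUES are ASCII), so stop instead of indexing out of bounds.
                if (ch >= hats.hats.length) {
                    return null;
                }
                Hats newHats = hats.hats[ch];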
                // Not seen this sequence yet?
                if (newHats == null) {
                    if (grow) {
                        // Allowed to grow.
                        newHats = hats.hats[ch] = new Hats();
                    } else {
                        // No growing - stop here.
                        return null;
                    }
                }
                hats = newHats;
            }
            return hats;
        }

        /**
         * Add to the structures.
         */
        private static void add(String s) {
            // Grow it and mark it good.
            find(s, true).end = true;
        }

        static {
            // Grow my structure.
            for (String s : VALUES) {
                add(s);
            }
        }

        @Override
        public boolean contains(String s) {
            // Find where it should be but don't grow.
            Hats found = find(s, false);
            // It's a match if it was there and was an end.
            return found != null && found.end;
        }

    }

    private static class Check {

        private final String s;
        private final boolean matches;

        public Check(String s) {
            this.s = s;
            this.matches = Checker.VALUES.contains(s);
        }

        public String toString() {
            return "(" + s + ")=" + matches;
        }
    }
    private static final Check[] TESTS = {
        new Check("zero"),
        new Check("one"),
        new Check("two"),
        new Check("three"),
        new Check("four"),
        new Check("five"),
        new Check("six"),
        new Check("seven"),
        new Check("eight"),
        new Check("nine"),
        new Check("ten"),
        new Check("eleven"),
        new Check("twelve"),
        new Check("thirteen"),
        new Check("fourteen"),
        new Check("fifteen"),
        new Check("sixteen"),
        new Check("seventeen"),
        new Check("eighteen"),
        new Check("nineteen"),
        new Check(new String("zero")),
        new Check(new String("one")),
        new Check(new String("two")),
        new Check(new String("three")),
        new Check(new String("four")),
        new Check(new String("five")),
        new Check(new String("six")),
        new Check(new String("seven")),
        new Check(new String("eight")),
        new Check(new String("nine")),
        new Check(new String("ten")),
        new Check(new String("eleven")),
        new Check(new String("twelve")),
        new Check(new String("thirteen")),
        new Check(new String("fourteen")),
        new Check(new String("fifteen")),
        new Check(new String("sixteen")),
        new Check(new String("seventeen")),
        new Check(new String("eighteen")),
        new Check(new String("nineteen"))};

    public void timeTest(Checker checker) {
        System.out.println("Time");
        final int N = 1_000_000;

        long start = System.nanoTime();

        for (int i = 0; i < N; i++) {
            for (Check check : TESTS) {
                checker.contains(check.s);
            }
        }

        long end = System.nanoTime();

        System.out.printf("%s: %d ms\n", checker.getClass().getName(),
                (end - start) / 1_000_000);
    }

    public void checkerTest(Checker checker) {
        System.out.println("Checker");
        for (Check check : TESTS) {
            if (checker.contains(check.s) != check.matches) {
                System.err.println("Check(" + check + ") failed");
            }
        }
    }

    public static void main(String args[]) {
        try {
            Checker checker = new HatChecker();
            Test test = new Test();
            test.checkerTest(checker);
            test.timeTest(checker);
        } catch (Throwable ex) {
            ex.printStackTrace(System.err);
        }
    }
}