Java: Need a faster Word Builder algorithm

java android algorithm simplecursoradapter

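I have an app that helps with studying for Scrabble. Most searches are much faster than in the desktop version written in C#, except for Word Builder. This search shows all words that can be made from a given set of letters A-Z or blanks. What can I do to make it run faster? I have considered using a Trie, but haven't found a way to support blanks. I use a SimpleCursorAdapter to populate the ListView, which is why I return a cursor.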

    public Cursor getCursor_subanagrams(String term, String filters, String ordering) {
    if (term.trim().isEmpty()) // == would compare references, not contents
        return null;
    // only difference between this and anagram is changing the length filter
    char[] a = term.toCharArray(); // anagram

    int[] first = new int[26]; // letter count of anagram
    int c; // array position
    int blankcount = 0;

    // initialize word to anagram
    for (c = 0; c < a.length; c++) {
        if (a[c] == '?') {
            blankcount++;
            continue;
        }
        first[a[c] - 'A']++;
    }

// gets pool of words to search through
    String lenFilter = String.format("Length(Word) <= %1$s AND Length(Word) <= %2$s", LexData.getMaxLength(), term.length());
    Cursor cursor = database.rawQuery("SELECT WordID as _id, Word, WordID, FrontHooks, BackHooks, " +
            "InnerFront, InnerBack, Anagrams, ProbFactor, OPlayFactor, Score \n" +
            "FROM     `" + LexData.getLexName() + "` \n" +
            "WHERE (" + lenFilter +
            filters +
            " ) " + ordering, null);

// creates new cursor to add valid words to
    MatrixCursor matrixCursor = new MatrixCursor(new String[]{"_id", "Word", "WordID", "FrontHooks", "BackHooks", "InnerFront", "InnerBack",
            "Anagrams", "ProbFactor", "OPlayFactor", "Score"});

// THIS NEEDS TO BE FASTER
    while (cursor.moveToNext()) {
        String word = cursor.getString(1);
        char[] b = word.toCharArray();
        if (isAnagram(first, b, blankcount)) {
            matrixCursor.addRow(get_CursorRow(cursor));
        }
    }
    cursor.close();
    return matrixCursor;
}


private boolean isAnagram(int[] anagram, char[] word, int blankcount) {
    int matchcount = blankcount; // blanks can stand in for any missing letter
    int[] second = new int[26];  // letter count of word

    for (int c = 0; c < word.length; c++)
        second[word[c] - 'A']++;

    // count the letters of word that the anagram's letters can cover
    for (int c = 0; c < 26; c++)
        matchcount += Math.min(anagram[c], second[c]);

    // >= rather than ==: a shorter word need not use every blank
    return matchcount >= word.length;
}

Focus on speeding up the most typical case: the word is not a (sub)anagram and you return false. If you can recognize as early as possible that word cannot be made from anagram, you avoid the expensive test.
One way to do this is with a bit mask of the letters in the word. You don't need to store letter counts for this: if the number of distinct letters in word that are not in anagram is greater than the number of blanks, the word cannot be made and you can return false right away. Otherwise, you proceed to the more expensive test that takes letter counts into account.
You can precompute the bit masks like this:
private int letterMask(char[] word)
{
    int c, mask = 0;
    for (c = 0; c < word.length; c++)
        mask |= (1 << (word[c] - 'A'));
    return mask;
}

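// Inside the cursor loop. termMask is the letterMask of the search term;
// column 8 is assumed here to hold the word's precomputed mask.
// compute mask of bits in the word's mask that are not in term:
int missingLettersMask = cursor.getInt(8) & ~termMask;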
if(missingLettersMask != 0)
{
    // check if we could possibly make up for these letters using blanks:
    int remainingBlanks = blankcount;
    while((remainingBlanks-- > 0) && (missingLettersMask != 0))
        missingLettersMask &= (missingLettersMask - 1); // remove one bit

    if(missingLettersMask != 0)
        continue; // move onto the next word
}

// word can potentially be made from anagram, call isAnagram:
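
As a rough illustration of the pre-filter described above, here is a minimal plain-Java sketch. The class name `MaskFilter` and the method `mayBeBuildable` are made up for this example and are not part of the original code. `Integer.bitCount` stands in for the loop that clears one bit per blank; both answer the same question, namely whether the number of distinct missing letters exceeds the number of blanks.

```java
final class MaskFilter {
    /** Bit i is set when the letter 'A' + i occurs; other characters such as '?' are ignored. */
    static int letterMask(String letters) {
        int mask = 0;
        for (int i = 0; i < letters.length(); i++) {
            char ch = letters.charAt(i);
            if (ch >= 'A' && ch <= 'Z') {
                mask |= 1 << (ch - 'A');
            }
        }
        return mask;
    }

    /**
     * Cheap necessary test: the word can only be buildable when its distinct
     * letters missing from the term do not outnumber the blanks. Passing this
     * test is not sufficient, because letter counts are not considered.
     */
    static boolean mayBeBuildable(int wordMask, int termMask, int blankCount) {
        return Integer.bitCount(wordMask & ~termMask) <= blankCount;
    }
}
```

Only words that pass `mayBeBuildable` would still need the full letter-count check in `isAnagram`. A word such as TACT passes the mask test for the term CAT even though it needs two Ts, which is why the second check remains necessary.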

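There are some ways to speed up your anagram checking function. Samgak has pointed out one. Another obvious optimization is to return false if the word is longer than the number of available letters plus blanks. In the end, these are all micro-optimizations, and you will still check the whole dictionary.
You said that you have considered using a trie. In my opinion, that is a good solution, because the structure of the trie makes you check only the relevant words. Build it like this: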
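Sort the letters of each word, so that "triangle" and "integral" both become "aegilnrt".
Insert the sorted word into the trie.
Where you would put an end marker in an ordinary trie, put a list of possible words.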
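
The construction steps above could be sketched in plain Java roughly as follows. The names `AnagramTrie`, `Node`, `sortLetters`, `insert` and `exactAnagrams` are hypothetical and chosen only for this illustration. The lookup method is included just to show that words with the same sorted letters end up in the same node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

final class AnagramTrie {
    static final class Node {
        final TreeMap<Character, Node> next = new TreeMap<>();
        // Takes the place of the end marker: every word whose sorted letters end here.
        final List<String> words = new ArrayList<>();
    }

    final Node root = new Node();

    /** "TRIANGLE" and "INTEGRAL" both become "AEGILNRT". */
    static String sortLetters(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    /** Inserts the word under the path spelled by its sorted letters. */
    void insert(String word) {
        Node node = root;
        for (char ch : sortLetters(word).toCharArray()) {
            node = node.next.computeIfAbsent(ch, k -> new Node());
        }
        node.words.add(word);
    }

    /** Returns all stored words that use exactly the given letters. */
    List<String> exactAnagrams(String letters) {
        Node node = root;
        for (char ch : sortLetters(letters).toCharArray()) {
            node = node.next.get(ch);
            if (node == null) {
                return new ArrayList<>();
            }
        }
        return node.words;
    }
}
```

A `TreeMap` keeps the children ordered by letter, which is convenient when a traversal needs to compare branch letters. An array of 26 children would be a reasonable alternative for a fixed A-Z alphabet.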

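If you want to find exact anagrams, you can sort the word to check, traverse the trie and print out the list of possible anagrams at the end. But here you have to deal with partial anagrams and with blanks: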

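Regular traversal means that you take the next letter of the word and descend the corresponding link in the tree, if it exists.
Partial anagrams can be found by ignoring the next letter without descending in the trie.
Blanks can be handled by descending all possible branches of the trie and decreasing the number of blanks.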

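When you have blanks, you will get duplicates. For example, if you have the letters A, B and C and a blank tile, you can make the word CAB, but you can make it in four different ways: CAB, _AB, C_B, CA_.
You can work around this by storing the result list in a data structure that eliminates duplicates, such as a set or an ordered set, but you will still walk down the same paths several times in order to create the duplicates.
A better solution is to keep track of which trie nodes you have visited with which parameters, that is, with which remaining unused letters and blanks. You can then cut those paths short. Here is a pseudocode implementation: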
function find_r(t, str, blanks, visited)
{
    // don't revisit explored paths
    key = make_key(t, str, blanks);

    if (key in visited) return [];
    visited ~= key;

    if (str.length == 0 and blanks == 0) {
        // all resources have been used: return list of anagrams
        return t.word;
    } else {
        res = [];
        c = 0;

        if (str.length > 0) {
            c = str[0];

            // regular traversal: use current letter and descend
            if (c in t.next) {
                res ~= find_r(t.next[c], str[1:], blanks, visited);
            }

            // partial anagrams: skip all copies of the current letter
            // and don't descend
            l = 1;
            while (l < str.length and str[l] == c) l++;

            res ~= find_r(t, str[l:], blanks, visited);
        }

        if (blanks > 0) {
            // blanks: decrease blanks and descend; while letters remain,
            // only branches below the current letter are tried, and once
            // str is empty a blank may stand for any letter
            for (i in t.next) {
                if (str.length == 0 or i < c) {
                    res ~= find_r(t.next[i], str, blanks - 1, visited);
                }
            }
        }

        return res;
    }
}

(Here, ~ denotes list concatenation or set insertion; [beg=0:end=length] denotes string slices; in tests whether a dictionary or set contains a key.)
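
For readers who prefer something that compiles, the pseudocode could be turned into plain Java along these lines. This is only a sketch under stated assumptions, not the answerer's actual implementation: the class `WordBuilder` and its members are hypothetical names, blanks in the rack are written as `?`, the visited key is built from a node id plus the position in the sorted rack, and a blank may stand for any letter once no rack letters remain. As in the pseudocode, a word is reported only when all blanks have been used, so words shorter than the number of blanks are not found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class WordBuilder {
    static final class Node {
        final TreeMap<Character, Node> next = new TreeMap<>();
        final List<String> words = new ArrayList<>();
        final int id; // distinguishes nodes in the visited key
        Node(int id) { this.id = id; }
    }

    private int nodeCount = 0;
    private final Node root = new Node(nodeCount++);

    /** Inserts the word under the path spelled by its sorted letters. */
    void insert(String word) {
        char[] key = word.toCharArray();
        Arrays.sort(key);
        Node node = root;
        for (char ch : key) {
            node = node.next.computeIfAbsent(ch, k -> new Node(nodeCount++));
        }
        node.words.add(word);
    }

    /** Finds words buildable from the rack, e.g. "ABC?" with '?' as a blank. */
    List<String> find(String rack) {
        StringBuilder letters = new StringBuilder();
        int blanks = 0;
        for (char ch : rack.toCharArray()) {
            if (ch == '?') blanks++; else letters.append(ch);
        }
        char[] sorted = letters.toString().toCharArray();
        Arrays.sort(sorted);
        List<String> result = new ArrayList<>();
        findRec(root, sorted, 0, blanks, new HashSet<>(), result);
        return result;
    }

    private void findRec(Node t, char[] str, int pos, int blanks,
                         Set<String> visited, List<String> result) {
        // don't revisit explored paths; pos identifies the remaining letters
        if (!visited.add(t.id + ":" + pos + ":" + blanks)) return;

        if (pos == str.length && blanks == 0) {
            result.addAll(t.words); // all letters and blanks are accounted for
            return;
        }
        if (pos < str.length) {
            char c = str[pos];
            // regular traversal: use the current letter and descend
            Node child = t.next.get(c);
            if (child != null) findRec(child, str, pos + 1, blanks, visited, result);
            // partial anagrams: skip all copies of the current letter
            int l = pos + 1;
            while (l < str.length && str[l] == c) l++;
            findRec(t, str, l, blanks, visited, result);
        }
        if (blanks > 0) {
            // blanks: spend one blank on a branch below the current letter,
            // or on any branch once no rack letters remain
            for (Map.Entry<Character, Node> e : t.next.entrySet()) {
                if (pos == str.length || e.getKey() < str[pos]) {
                    findRec(e.getValue(), str, pos, blanks - 1, visited, result);
                }
            }
        }
    }
}
```

Because words are collected only in the state where no letters and no blanks remain, and each state is visited once, every word appears at most once in the result without a separate de-duplication step.
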
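Once you have built the tree, this solution is fast when there are no blanks, but it gets worse with every blank and with larger letter pools. A test with one blank is still reasonably fast, but with two blanks it is on par with your existing solution.
Now, there are at most two blanks in a Scrabble game and the rack holds at most seven tiles, so it may not be that bad in practice. Another question is whether the search should consider words with two blanks at all. The result list will be very long and will contain all two-letter words. The player may be more interested in high-scoring words that can be played with a single blank.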
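If you are not already doing all of your "non-UI" work on a separate thread, try moving it to a separate thread.
This is running in a separate thread.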