Substrings that do not contain any of the given strings (large constraints) [string, substring, trie, suffix-array]

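I recently found an interesting problem online. Here is a short statement:

Note that the total running time should not exceed 1.00 s (time complexity < 10^8).

Student A has found a string consisting only of lowercase characters. He wants to cut out a substring as a present for student B. Student B has a list of strings he considers "ugly". Can you help student A find the number of ways to cut out a substring that does not contain any "ugly" string? (Note that equal substrings taken from different positions are counted separately.)

At first I thought this was a simple problem, but the constraints are quite large. Student A's string has a maximum length of 100000, and there may be up to 500000 ugly strings, each with a maximum length of 500000.

I tried to solve it with a suffix trie, but that failed badly because of the memory limit. Can anyone suggest a possible approach? This looks like a problem for an advanced data structure, such as a suffix array.

Code in any programming language is welcome, preferably with a proper explanation, because I find it easier to study an approach when there is real code to look at.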

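Because equal substrings starting at different positions count as different substrings, a string of length n has at most n*(n+1)/2 substrings (n substrings starting at position 0, n-1 substrings starting at position 1, and so on).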
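As a quick check of that bound against the stated limits (a worked calculation, not part of the original answer), summing the counts per start position and plugging in n = 100000 shows that the result no longer fits in a 32-bit signed integer, which is why the counter in the code should be a 64-bit `long`:

```latex
\[
\sum_{s=0}^{n-1} (n - s) = \frac{n(n+1)}{2},
\qquad
n = 100000 \;\Rightarrow\; \frac{100000 \cdot 100001}{2} = 5\,000\,050\,000 \;>\; 2^{31} - 1 = 2\,147\,483\,647 .
\]
```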

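If an ugly string is contained in the substring of length q starting at position p, then every substring that starts at p and has length > q contains that ugly string as well.

If an ugly string is longer than the substring itself, it cannot match.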
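The two observations above mean that for every start position there is a single cutoff: once a substring contains an ugly string, all longer ones from the same start do too. A minimal sketch of how that turns into a count, sweeping start positions from right to left, might look like this. The class name `CleanSubstringCount` and the variable `limit` are hypothetical, and the matching itself is deliberately naive (`startsWith` for every pattern at every position), so this only illustrates the counting step and is far too slow for the stated limits.

```java
final class CleanSubstringCount {
    // Counts substrings of student that contain none of the ugly strings.
    // Assumption: empty ugly strings are ignored.
    static long count(String student, String[] ugly) {
        int n = student.length();
        long total = 0;
        // Smallest exclusive end index of any ugly occurrence that starts at or
        // after the current start; n + 1 means "no occurrence seen yet".
        int limit = n + 1;
        for (int start = n - 1; start >= 0; --start) {
            for (String u : ugly) {
                if (!u.isEmpty() && student.startsWith(u, start)) {
                    limit = Math.min(limit, start + u.length());
                }
            }
            // Clean substrings from start have exclusive end in (start, limit).
            total += limit - 1 - start;
        }
        return total;
    }
}
```

For example, `CleanSubstringCount.count("abab", new String[] {"ba"})` counts the six substrings "a", "ab", "b", "a", "ab" and "b".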

My first attempt looks like this:

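import java.util.Arrays;
import java.util.Comparator;

class FirstAttempt {
    // ugly: provided somehow; at most 500000 strings with max length 500000
    // student: the string to cut into substrings, max length 100000
    static long count(String student, String[] ugly) {
        String[] byLength = ugly.clone();
        Arrays.sort(byLength, Comparator.comparingInt(String::length)); // by length
        long num = 0;
        for (int start = 0; start < student.length(); ++start) {
            for (int end = start + 1; end <= student.length(); ++end) {
                String s = student.substring(start, end); // end is exclusive
                boolean hit = false;
                // only uglies that are not longer than s can match
                for (int u = 0; u < byLength.length && byLength[u].length() <= s.length(); ++u) {
                    if (s.contains(byLength[u])) {
                        hit = true;
                        break;
                    }
                }
                if (hit) {
                    break; // leave the inner loop and start with the next position
                }
                ++num; // we checked all potentially matching uglies
            }
        }
        return num;
    }
}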

Note: this is basically just the idea. I have not tested the code; I only translated my plain-text idea into Java-like code.

In the worst case, your code has time complexity O(500000*500000*100000). I think it will exceed the time limit.

With n the length of the string to cut into substrings and m the number of ugly patterns, the number of operations is O(n²m), namely m*n*(n+1)/2. (I did not account for the search itself, which adds another factor of n, so O(n³m), or O(n^4) when m is comparable to n.) But finding an ugly pattern speeds things up considerably. You can do a few things to reduce the number of ugly patterns (for example, a longer pattern that contains a shorter one can be eliminated), but I might as well assume that the set of ugly patterns cannot be reduced. You can also take advantage of single-character ugly patterns: cut the original string at every such character, which yields a set of substrings. Each of those can be processed with my algorithm, and because they are shorter it will be much faster.

I think merely enumerating all possible substrings will already exceed the time limit (O(N^2)).

Well, you have to count them somehow. Obviously, if you have a substring of student of length p that matches no ugly pattern, you can add p(p+1)/2 to the result. So maybe going backwards (start with the entire substring from some position and, on a hit, shorten it using the position of the hit) would speed things up.
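To make the single-character idea from the comments concrete, here is a small sketch. It assumes, purely for illustration, that only single-character ugly patterns are present; the class `SingleCharCut` and its method names are hypothetical. It cuts the string at every banned character and adds p(p+1)/2 for each remaining piece of length p. With longer patterns present, the pieces would still need further processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class SingleCharCut {
    // Returns the maximal non-empty runs of student that avoid every banned character.
    static List<String> cut(String student, Set<Character> banned) {
        List<String> pieces = new ArrayList<>();
        int runStart = 0;
        for (int i = 0; i <= student.length(); i++) {
            if (i == student.length() || banned.contains(student.charAt(i))) {
                if (i > runStart) {
                    pieces.add(student.substring(runStart, i));
                }
                runStart = i + 1; // skip the banned character itself
            }
        }
        return pieces;
    }

    // Adds p(p+1)/2 for each piece; valid only if no other ugly pattern exists.
    static long countWithin(List<String> pieces) {
        long total = 0;
        for (String piece : pieces) {
            long p = piece.length(); // long avoids int overflow for long pieces
            total += p * (p + 1) / 2;
        }
        return total;
    }
}
```

For "abcab" with 'c' banned, `cut` returns the pieces "ab" and "ab", and `countWithin` gives 3 + 3 = 6.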

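import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SplitAttempt {
    // Splits the piece student[lo, hi) at every occurrence of pattern (assumed non-empty).
    // Pieces overlap: the left piece keeps all but the last character of the hit,
    // the right piece starts one character after the start of the hit.
    static void splitStr(String student, int lo, int hi, String pattern, List<int[]> result) {
        while (hi - lo >= pattern.length()) {
            int pos = student.indexOf(pattern, lo); // returns position, -1 if not found
            if (pos < 0 || pos + pattern.length() > hi) break; // no hit inside this piece
            result.add(new int[] {lo, pos + pattern.length() - 1});
            lo = pos + 1; // repeat the split on the rest
        }
        result.add(new int[] {lo, hi}); // the remainder is ok
    }

    // ugly and student as before
    static long count(String student, String[] ugly) {
        String[] byLength = ugly.clone();
        Arrays.sort(byLength, Comparator.comparingInt(String::length)); // by length
        List<int[]> substrs = new ArrayList<>();
        substrs.add(new int[] {0, student.length()});
        for (String ugly2test : byLength) {
            List<int[]> newSubstrs = new ArrayList<>();
            for (int[] t : substrs) {
                splitStr(student, t[0], t[1], ugly2test, newSubstrs);
            }
            substrs = newSubstrs;
        }
        // Pieces can overlap or be nested, so a plain sum of p(p+1)/2 would count
        // some substrings twice. Sort the pieces, skip nested ones and subtract
        // the substrings shared with the previous piece.
        substrs.sort(Comparator.<int[]>comparingInt(t -> t[0]).thenComparingInt(t -> -t[1]));
        long num = 0;
        int prevEnd = 0;
        for (int[] t : substrs) {
            if (t[1] <= prevEnd) continue; // nested in an earlier piece
            long p = t[1] - t[0];
            long overlap = Math.max(0, prevEnd - t[0]);
            num += p * (p + 1) / 2 - overlap * (overlap + 1) / 2;
            prevEnd = t[1];
        }
        return num;
    }
}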
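For comparison, here is a sketch of a different technique that is not from the thread: an Aho-Corasick automaton over the ugly strings combined with a single left-to-right sweep over the student string. For every end position it looks up the shortest ugly string ending there, which gives the smallest allowed start, so the sweep is linear in the length of student after the automaton is built. The class name `AhoCorasickCount` is hypothetical, and the sketch assumes that all strings contain only 'a' to 'z' and that the patterns no longer than student fit in memory as a trie with 26 child slots per node. Whether this fits the actual memory limit of the problem has not been verified.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class AhoCorasickCount {
    private static final int NONE = Integer.MAX_VALUE;

    private static int[] newNode() {
        int[] node = new int[26];
        java.util.Arrays.fill(node, -1);
        return node;
    }

    static long count(String student, String[] ugly) {
        int n = student.length();
        List<int[]> next = new ArrayList<>();        // next.get(v)[c] = child of node v
        List<Integer> shortest = new ArrayList<>();  // shortest pattern ending at node v
        next.add(newNode());
        shortest.add(NONE);
        for (String u : ugly) {
            if (u.isEmpty() || u.length() > n) continue; // cannot occur in student
            int node = 0;
            for (int i = 0; i < u.length(); i++) {
                int c = u.charAt(i) - 'a';
                if (next.get(node)[c] == -1) {
                    next.get(node)[c] = next.size();
                    next.add(newNode());
                    shortest.add(NONE);
                }
                node = next.get(node)[c];
            }
            shortest.set(node, Math.min(shortest.get(node), u.length()));
        }
        // Breadth-first pass: compute failure links, complete the transitions and
        // propagate the shortest match along the failure links.
        int[] fail = new int[next.size()];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < 26; c++) {
            int child = next.get(0)[c];
            if (child == -1) next.get(0)[c] = 0;
            else queue.add(child); // fail[child] is already 0
        }
        while (!queue.isEmpty()) {
            int node = queue.poll();
            shortest.set(node, Math.min(shortest.get(node), shortest.get(fail[node])));
            for (int c = 0; c < 26; c++) {
                int child = next.get(node)[c];
                int viaFail = next.get(fail[node])[c];
                if (child == -1) {
                    next.get(node)[c] = viaFail;
                } else {
                    fail[child] = viaFail;
                    queue.add(child);
                }
            }
        }
        // Sweep: lo is the smallest start such that student[lo, end) is clean.
        long total = 0;
        int node = 0;
        int lo = 0;
        for (int end = 1; end <= n; end++) {
            node = next.get(node)[student.charAt(end - 1) - 'a'];
            int len = shortest.get(node);
            if (len != NONE) lo = Math.max(lo, end - len + 1);
            total += end - lo;
        }
        return total;
    }
}
```

On small inputs this agrees with a brute-force count, for example `AhoCorasickCount.count("abc", new String[] {"abc"})` gives 5, since only the full string is excluded.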