Java: How to implement wildcard matching?

java algorithm

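I am researching how to find the k values in a BST that are closest to a target, and came across the following implementation with these rules:

'?' matches any single character.

'*' matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be: bool isMatch(const char *s, const char *p)

Some examples:

isMatch("aa", "a") → false

isMatch("aa", "aa") → true

isMatch("aaa", "aa") → false

isMatch("aa", "*") → true

isMatch("aa", "a*") → true

isMatch("ab", "?*") → true

isMatch("aab", "cab") → false

Code: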

import java.util.*;

public class WildcardMatching {
    boolean isMatch(String s, String p) {
        int i=0, j=0;
        int ii=-1, jj=-1;

        while(i<s.length()) {
            if(j<p.length() && p.charAt(j)=='*') {
                ii=i;
                jj=j;
                j++;
            } else if(j<p.length() && 
                      (s.charAt(i) == p.charAt(j) ||
                       p.charAt(j) == '?')) {
                i++;
                j++;
            } else {
                if(jj==-1) return false;

                j=jj;
                i=ii+1;
            }
        }

        while(j<p.length() && p.charAt(j)=='*') j++;

        return j==p.length();
    }

    public static void main(String args[]) {
        String s = "aab";
        String p = "a*";

        WildcardMatching wcm = new WildcardMatching();
        System.out.println(wcm.isMatch(s, p));
    }
}
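
As a cross-check on the rules and examples above, here is a minimal sketch of a table-based (dynamic programming) matcher. It is a different technique from the two-pointer loop in the question, and the class name WildcardDp is made up for this illustration.

```java
final class WildcardDp {
    // dp[i][j] is true when the first i characters of s match the first j characters of p.
    static boolean isMatch(String s, String p) {
        int n = s.length(), m = p.length();
        boolean[][] dp = new boolean[n + 1][m + 1];
        dp[0][0] = true; // empty pattern matches empty string

        // An empty string is matched only by a pattern prefix made entirely of '*'.
        for (int j = 1; j <= m; j++) {
            dp[0][j] = dp[0][j - 1] && p.charAt(j - 1) == '*';
        }

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                char c = p.charAt(j - 1);
                if (c == '*') {
                    // '*' matches nothing (drop the '*') or one more character of s.
                    dp[i][j] = dp[i][j - 1] || dp[i - 1][j];
                } else {
                    // '?' matches any single character; anything else must be equal.
                    dp[i][j] = dp[i - 1][j - 1] && (c == '?' || c == s.charAt(i - 1));
                }
            }
        }
        return dp[n][m];
    }

    public static void main(String[] args) {
        String[][] examples = {
            {"aa", "a"}, {"aa", "aa"}, {"aaa", "aa"}, {"aa", "*"},
            {"aa", "a*"}, {"ab", "?*"}, {"aab", "cab"}
        };
        for (String[] e : examples) {
            System.out.printf("isMatch(\"%s\", \"%s\") -> %s%n", e[0], e[1], isMatch(e[0], e[1]));
        }
    }
}
```

The table costs O(n*m) time and space, so it is mainly useful as a reference to compare the faster loop against.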

It looks like ii and jj are used to handle the wildcard "*", which matches any sequence. Their initialization to -1 acts as a flag: it tells us whether we have hit a mismatch while no "*" is currently being evaluated. We can walk through your examples one at a time.

Note that i relates to the parameter s (the original string) and j relates to the parameter p (the pattern).

isMatch("aa", "a"):
This returns false because j reaches the end of p (j == p.length()) while i is still inside s, and no "*" has been recorded (jj is still -1), so the else branch returns false.
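
To follow a walkthrough like this one, it can help to print the four indices on every pass. The sketch below is a copy of the question's loop with one log line per iteration; TraceMatch and the log parameter are names invented for this example.

```java
final class TraceMatch {
    // Same loop as the question's isMatch, plus one log line per iteration.
    static boolean isMatch(String s, String p, StringBuilder log) {
        int i = 0, j = 0;      // i indexes s, j indexes p
        int ii = -1, jj = -1;  // -1 means no '*' has been recorded yet

        while (i < s.length()) {
            log.append(String.format("i=%d j=%d ii=%d jj=%d%n", i, j, ii, jj));
            if (j < p.length() && p.charAt(j) == '*') {
                ii = i;        // remember where in s the '*' was met
                jj = j;        // remember where the '*' sits in p
                j++;
            } else if (j < p.length()
                    && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '?')) {
                i++;
                j++;
            } else {
                if (jj == -1) return false; // mismatch and no '*' to fall back on
                j = jj;
                i = ii + 1;
            }
        }
        while (j < p.length() && p.charAt(j) == '*') j++;
        return j == p.length();
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        boolean result = isMatch("aa", "a", log);
        System.out.print(log);
        System.out.println("result = " + result);
    }
}
```

For isMatch("aa", "a") the log has two lines: the first pass consumes the matching "a", and on the second pass j is already at the end of p with jj still -1.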
Let's look at this slightly out of order.

First, this is a parallel iteration of the string (s) and the wildcard pattern (p), using the variable i to index s and the variable j to index p.

The while loop stops iterating when the end of s is reached. When that happens, hopefully the end of p has been reached too, in which case it returns true (j==p.length()).

However, it is also valid for p to end with a * (e.g. isMatch("ab", "ab*")), and that is what the final while(j<p.length() && p.charAt(j)=='*') j++; loop handles.
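
The role of that final loop can be seen by switching it off. This is a small sketch, assuming the question's loop is otherwise unchanged; the class TrailingStarDemo and the skipTrailingStars flag are hypothetical additions for the demonstration.

```java
final class TrailingStarDemo {
    // The question's two-pointer loop; skipTrailingStars toggles the final loop.
    static boolean isMatch(String s, String p, boolean skipTrailingStars) {
        int i = 0, j = 0;
        int ii = -1, jj = -1;

        while (i < s.length()) {
            if (j < p.length() && p.charAt(j) == '*') {
                ii = i;
                jj = j;
                j++;
            } else if (j < p.length()
                    && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '?')) {
                i++;
                j++;
            } else {
                if (jj == -1) return false;
                j = jj;
                i = ii + 1;
            }
        }

        if (skipTrailingStars) {
            // s is exhausted; any '*' left in p can match the empty sequence.
            while (j < p.length() && p.charAt(j) == '*') j++;
        }
        return j == p.length();
    }

    public static void main(String[] args) {
        System.out.println(isMatch("ab", "ab*", true));
        System.out.println(isMatch("ab", "ab*", false));
    }
}
```

With the loop enabled, "ab" matches "ab*"; with it disabled, j stops on the trailing "*" and the final comparison fails.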
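The code looks buggy to me. (See below.)

The ostensible purpose of ii and jj is to implement a form of backtracking.

For example, when you try to match "abcde" against the pattern "a*e", the algorithm will first match the "a" in the pattern against the "a" in the input string. Then it will eagerly match the "*" against the rest of the string ... and find that it has made a mistake. At that point, it needs to backtrack and try an alternative.

ii and jj record the point to backtrack to, and the places where these variables are used either record a new backtrack point or perform the backtrack.

Or at least, that was probably the author's intent at some point.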
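
To make the backtracking visible, the sketch below counts how often the else branch falls back to the recorded "*". It reuses the question's loop; the class BacktrackCount and the backtracks counter are made up for this example. In this particular loop each fallback lets the "*" absorb one more character of s.

```java
final class BacktrackCount {
    // The question's loop; backtracks[0] counts fallbacks to the last '*'.
    static boolean isMatch(String s, String p, int[] backtracks) {
        int i = 0, j = 0;
        int ii = -1, jj = -1;

        while (i < s.length()) {
            if (j < p.length() && p.charAt(j) == '*') {
                ii = i;   // backtrack point in s
                jj = j;   // backtrack point in p
                j++;
            } else if (j < p.length()
                    && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '?')) {
                i++;
                j++;
            } else {
                if (jj == -1) return false; // nothing to backtrack to
                backtracks[0]++;
                j = jj;       // go back to the '*'
                i = ii + 1;   // and let it cover one more character
            }
        }
        while (j < p.length() && p.charAt(j) == '*') j++;
        return j == p.length();
    }

    public static void main(String[] args) {
        int[] backtracks = new int[1];
        boolean matched = isMatch("abcde", "a*e", backtracks);
        System.out.println("matched = " + matched + ", backtracks = " + backtracks[0]);
    }
}
```

For "abcde" against "a*e" the counter ends at 3: one fallback each for "b", "c" and "d" before "e" finally lines up with the pattern.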
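I know you are asking about a BST, but honestly there is also a way to do this with regex (not for competitive programming, but stable and fast enough for use in a production environment); see the WildCardMatcher class below.

There is still a list of optimizations that could be implemented (lower/upper case handling, setting a maximum length for the string being checked so that an attacker cannot make you check against a 4GB string, ...).

Are you trying to reinvent the RegExp wheel? Or would the pre-built RegExp that Java provides be fine?

@Arvind for algo practice, this is how I implemented it. If ...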
} else {
    if(jj==-1) return false;

    i=++ii;
    j=jj+1;
}

    if(j<p.length() && p.charAt(j)=='*') {
        ii=i;
        jj=j;
        j++;
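
The else fragment above uses i=++ii; j=jj+1; where the question's code uses j=jj; i=ii+1;. A small sketch, assuming both are meant to be dropped into the same loop, can run the two side by side. The class ElseBranchComparison and the incrementVariant flag are invented for this check. On the inputs tried here the two agree, since resetting j to jj sends the next iteration through the "*" branch, which sets ii=i and advances j.

```java
final class ElseBranchComparison {
    // incrementVariant = true uses the fragment's else branch,
    // false uses the else branch from the question's code.
    static boolean isMatch(String s, String p, boolean incrementVariant) {
        int i = 0, j = 0;
        int ii = -1, jj = -1;

        while (i < s.length()) {
            if (j < p.length() && p.charAt(j) == '*') {
                ii = i;
                jj = j;
                j++;
            } else if (j < p.length()
                    && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '?')) {
                i++;
                j++;
            } else {
                if (jj == -1) return false;
                if (incrementVariant) {
                    i = ++ii;     // advance the backtrack point directly
                    j = jj + 1;   // resume just after the '*'
                } else {
                    j = jj;       // revisit the '*' on the next iteration
                    i = ii + 1;
                }
            }
        }
        while (j < p.length() && p.charAt(j) == '*') j++;
        return j == p.length();
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"aa", "a"}, {"aa", "*"}, {"ab", "?*"}, {"abcde", "a*e"},
            {"abcabc", "*abc"}, {"abab", "a*a"}
        };
        for (String[] c : cases) {
            System.out.printf("%s / %s: question=%s fragment=%s%n",
                    c[0], c[1], isMatch(c[0], c[1], false), isMatch(c[0], c[1], true));
        }
    }
}
```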

import java.util.regex.Pattern;
import java.util.regex.Matcher;
public class WildCardMatcher{

    public static void main(String []args){
        // Test
        String urlPattern = "http://*.my-webdomain.???",
               urlToMatch = "http://webmail.my-webdomain.com";
        WildCardMatcher wildCardMatcher = new WildCardMatcher(urlPattern);
        System.out.printf("\"%s\".matches(\"%s\") -> %s%n", urlToMatch, wildCardMatcher, wildCardMatcher.matches(urlToMatch));
    }
     
    private final Pattern p;
    public WildCardMatcher(final String urlPattern){
       // Group 1: a (possibly empty) run of literal characters; group 2: a (possibly empty) run of wildcards.
       // Using "*" instead of "+" for group 1 also covers patterns that start with a wildcard, such as "*" or "?*".
       Pattern charsToEscape = Pattern.compile("([^*?]*)([*?]*)");

       // Quote every run of literal characters, and translate "?" to "." (exactly one character) and "*" to ".*" (any sequence)
       Matcher m = charsToEscape.matcher(urlPattern);
       StringBuffer sb = new StringBuffer();
       String replacement, g1, g2;
       while(m.find()){
           g1 = m.group(1);
           g2 = m.group(2);
           // First quote the literal run (it may contain regex metacharacters), then escape the '\' characters, which have a special meaning in replacement strings
           replacement = Matcher.quoteReplacement(Pattern.quote(g1)) +
                         g2.replace("?", ".").replace("*", ".*");
           m.appendReplacement(sb, replacement);
       }
       m.appendTail(sb);
       p = Pattern.compile(sb.toString());
    }
     
    @Override
    public String toString(){
        return p.toString();
    }
     
    public boolean matches(final String urlToMatch){
        return p.matcher(urlToMatch).matches();
    }
}
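
Two of the optimizations mentioned in the answer, case handling and a maximum input length, could look roughly like the sketch below. It is one possible arrangement, not the answer's own code: BoundedWildcardMatcher, MAX_INPUT_LENGTH and the 2048 limit are assumptions made for illustration, and "?" is translated to "." (exactly one character) to follow the rules in the question.

```java
import java.util.regex.Pattern;

final class BoundedWildcardMatcher {
    // Assumed limit for this sketch; pick a value that suits the application.
    private static final int MAX_INPUT_LENGTH = 2048;

    private final Pattern pattern;

    BoundedWildcardMatcher(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int k = 0; k < wildcard.length(); k++) {
            char c = wildcard.charAt(k);
            if (c == '*' || c == '?') {
                flushLiteral(literal, regex);
                regex.append(c == '*' ? ".*" : "."); // any sequence / exactly one character
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        // CASE_INSENSITIVE covers ASCII letters by default; DOTALL lets '.' match line breaks too.
        pattern = Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    // Appends the pending literal run as a quoted regex fragment and clears it.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    boolean matches(String input) {
        // Reject oversized inputs before the regex engine sees them.
        if (input.length() > MAX_INPUT_LENGTH) return false;
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        BoundedWildcardMatcher matcher = new BoundedWildcardMatcher("http://*.my-webdomain.???");
        System.out.println(matcher.matches("HTTP://WebMail.My-WebDomain.COM"));
    }
}
```

Rejecting long inputs up front keeps the cost of a single check bounded regardless of what the caller passes in.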