Java: reject if regular expression is found

java regex

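I have already received some help here, but I have a slightly different problem. I am looking for cases where a DocumentBuilderFactory is created but ExpandEntityReferences is not restricted. I have the following regular expression: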

(?x)

# finds DocumentBuilderFactory creation and pulls out the variable name
# of the form DocumentBuilderFactory VARNAME = DocumentBuilderFactory.newInstance
# then checks if that variable name has one of three acceptable ways to stop XXE attacks
# matches any instance where the variable is initialized, but not restricted

(?:
   # This is for DocumentBuilderFactory VARNAME = DocumentBuilderFactory.newInstance with many possible alternates
   DocumentBuilderFactory
   [\s]+?
   (\w+)
   [\s]*?
   =
   [\s]*?
   (?:.*?DocumentBuilderFactory)
   [.\s]+
   newInstance.*

   # checks that the var name is NOT (using ?!) using one of the acceptable rejection methods
   (?!\1[.\s]+
      (?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
        |setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
        |setExpandEntityReferences\s*\(\s*false\s*\))
   )
)

A test file might look like this:

// Set the parser properties
  javax.xml.parsers.DocumentBuilderFactory factory = 
    javax.xml.parsers.DocumentBuilderFactory.newInstance();
  factory.setNamespaceAware(true);
  factory.setValidating(false);
  factory.setExpandEntityReferences(false);
  factory.setIgnoringComments(true);
  factory.setIgnoringElementContentWhitespace(true);
  factory.setCoalescing(true);
  javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();

Is there a way to run this regex against this file and have the match fail, because the file correctly sets factory.setExpandEntityReferences(false)?

Update:

(?:
   DocumentBuilderFactory
   \s+
   (\w+)
   \s*
   =
   \s*
   (?:.*?DocumentBuilderFactory)
   \s*.\s*
   newInstance.*
   (?:[\s\S](?!
      \1\s*.\s*
      (?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
      |setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
      |setExpandEntityReferences\s*\(\s*false\s*\))
   ))*$
)

However, if I misspell factory.setExpandEntityReferences(false) as factory.setExpandEntity###References(false), I expect the regex to find a match, but it does not. Is there a way to make this work?

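Test for the absence of a string: a pattern of the form (?:.(?!xyz))*$ basically means "from this point on, no character may be followed by xyz". Because . does not match newlines, you will probably want to generalize it to: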

(?:[\s\S](?!xyz))*$
   ^^^^^^

(It is the union of two complementary sets, so it really matches every character.)
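As a minimal sketch of this idiom in Java, the snippet below anchors the absence check after a literal "start". The class name, the "start" prefix, and the sample inputs are made up for illustration; only the (?:[\s\S](?!xyz))*$ part comes from the answer above. Note that the backslashes are doubled because the pattern is written as a Java string literal.

```java
import java.util.regex.Pattern;

class AbsenceCheck {
    // "start", then any characters (newlines included) up to the end of the
    // input, where no consumed character is immediately followed by "xyz".
    static final Pattern START_WITHOUT_XYZ =
            Pattern.compile("start(?:[\\s\\S](?!xyz))*$");

    static boolean matches(String input) {
        return START_WITHOUT_XYZ.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("start\nfoo\nbar"));      // true: no "xyz" later on
        System.out.println(matches("start\nfoo xyz\nbar"));  // false: "xyz" appears later
    }
}
```

The lookahead is evaluated after each consumed character, so an xyz that begins exactly where the loop starts is not rejected; whether that matters depends on what precedes the loop in the full pattern.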

To apply this to your case, just replace xyz with whatever you do not want to appear anywhere:

   # checks that the var name is NOT (using ?!) using one of the acceptable rejection methods
   (?:[\s\S](?!
       \1[.\s]+
       (?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
         |setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
         |setExpandEntityReferences\s*\(\s*false\s*\))
   ))*$

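Use word boundaries to match whole words (such as identifiers): of course, when working with, say, factory, you do not want to match oldfactory! Use word boundaries to make sure whole words are captured. In your case, just add a \b before the \1: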

\b\1
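The effect of the word boundary can be sketched with a small comparison. The Factory type, the make() call, and the setX method are hypothetical names chosen for illustration; the two patterns differ only in the \b placed before the backreference.

```java
import java.util.regex.Pattern;

class WordBoundaryBackref {
    // Captures the declared variable name, then looks for NAME.setX later on.
    static final Pattern LOOSE = Pattern.compile(
            "Factory\\s+(\\w+)\\s*=[\\s\\S]*?\\1\\s*\\.\\s*setX");
    // Same pattern, but the backreference must start at a word boundary.
    static final Pattern STRICT = Pattern.compile(
            "Factory\\s+(\\w+)\\s*=[\\s\\S]*?\\b\\1\\s*\\.\\s*setX");

    static boolean loose(String source) {
        return LOOSE.matcher(source).find();
    }

    static boolean strict(String source) {
        return STRICT.matcher(source).find();
    }

    public static void main(String[] args) {
        String other = "Factory factory = make();\noldfactory.setX(false);\n";
        String same = "Factory factory = make();\nfactory.setX(false);\n";
        System.out.println(loose(other));   // true: "factory" is found inside "oldfactory"
        System.out.println(strict(other));  // false: no word boundary before "factory"
        System.out.println(strict(same));   // true: the call is on the declared variable
    }
}
```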

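Simplify character classes and escape literal dots: as mentioned in the comments, \s already includes \r and \n, so you can rewrite [\s\r\n] as \s (without the brackets).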

In addition, you will want to change instances such as

newInstance.*

to

The wildcard . does not behave inside a character class the way \s or \w do: there it denotes only a literal dot.
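The difference between a dot inside and outside a character class, and the fact that \s already covers \r and \n, can be checked with a few full-string matches. This is only an illustrative sketch; the class name, the helper method, and the sample strings are made up.

```java
import java.util.regex.Pattern;

class DotAndWhitespace {
    // True when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Outside a character class an unescaped dot is a wildcard.
        System.out.println(fullMatch("a.b", "axb"));      // true
        // Escaped, or inside a character class, it is a literal dot.
        System.out.println(fullMatch("a\\.b", "axb"));    // false
        System.out.println(fullMatch("a[.]b", "axb"));    // false
        System.out.println(fullMatch("a[.]b", "a.b"));    // true
        // \s already matches \r and \n.
        System.out.println(fullMatch("\\s+", "\r\n"));    // true
        // [.\s]+ accepts any run of dots and whitespace, including "..";
        // \s*\.\s* accepts exactly one dot with optional whitespace around it.
        System.out.println(fullMatch("f[.\\s]+g", "f..g"));           // true
        System.out.println(fullMatch("f\\s*\\.\\s*g", "f..g"));       // false
        System.out.println(fullMatch("f\\s*\\.\\s*g", "f\n  .g"));    // true
    }
}
```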

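Shouldn't you use double backslashes instead of single ones in \r, \n, \s and \w, since this is Java? Note that you do not need to escape a dot inside a character class, so [.] is valid and will match a dot :)

To add to @HamZa's comment: you can simply write \s instead of [\s\r\n].

@HamZa Yes, you are correct. I removed them to make it a pure regex, but in my file the backslashes are doubled.

@acheong87 Awesome! I have made the change, added your information, and updated my question. It sort of works, but I may have misunderstood your explanation. Could you take a look?

Can you try substituting (?![\s\S])?

Your regex works for me. Make sure you escape your backslashes, and also try escaping your forward slashes. Here is an online example in which the word is misspelled (####), so the segment matches. Here is the same example with the word spelled correctly, so the match fails.

I'm confused. It works in plain Java but not in my Java use case. I suspect a difference in the regex engine. Either way, you are right. I will figure out why my results are not the same in every Java case. Thanks.

Never mind. I forgot to put (?x) first. You are right. Awesome. Thanks for your help.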

newInstance.*

newInstance[.]*
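Putting the pieces of the thread together, the following is a hedged sketch of how the updated pattern might be compiled and run in Java. It assumes Pattern.COMMENTS (which has the same effect as a leading (?x)), escapes the literal dots, adds the \b before \1, and, for brevity, keeps only the setExpandEntityReferences alternative out of the three in the question. The class name, method names, and the shortened sample source are hypothetical.

```java
import java.util.regex.Pattern;

class XxeRegexCheck {
    // Backslashes are doubled for the Java string literal. Without comments
    // mode, the spaces and # comments below would be matched literally.
    static final Pattern UNRESTRICTED_FACTORY = Pattern.compile(String.join("\n",
            "# DocumentBuilderFactory VARNAME = ...DocumentBuilderFactory.newInstance",
            "DocumentBuilderFactory \\s+ (\\w+) \\s* = \\s*",
            "(?: .*? DocumentBuilderFactory ) \\s* \\. \\s* newInstance .*",
            "# no later character may be followed by the accepted call on VARNAME",
            "(?: [\\s\\S] (?!",
            "    \\b \\1 \\s* \\. \\s* setExpandEntityReferences \\s* \\( \\s* false \\s* \\)",
            ") )* $"),
            Pattern.COMMENTS);

    // True when a factory is created and never restricted afterwards.
    static boolean isFlagged(String source) {
        return UNRESTRICTED_FACTORY.matcher(source).find();
    }

    // Builds a shortened version of the test file around one configurable line.
    static String sample(String call) {
        return String.join("\n",
                "javax.xml.parsers.DocumentBuilderFactory factory =",
                "    javax.xml.parsers.DocumentBuilderFactory.newInstance();",
                "factory.setNamespaceAware(true);",
                call,
                "javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();",
                "");
    }

    public static void main(String[] args) {
        // Correctly restricted: the pattern should not match.
        System.out.println(isFlagged(sample("factory.setExpandEntityReferences(false);")));
        // Misspelled call: the pattern should match.
        System.out.println(isFlagged(sample("factory.setExpandEntity###References(false);")));
    }
}
```

The first call prints false and the second prints true. Dropping the Pattern.COMMENTS flag reproduces the situation described in the last comment, where the same pattern text no longer behaves as expected.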