Where is the error in this regexp (Java)?

java regex


My XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
            <head>
                <meta charset="UTF-8" content="text/html" http-equiv="Content-Type"/>
            </head>
            <body>


<div>&nbsp;</div>

            Hello body content !!

            </body>
        </html>
    </xsl:template>
    <xsl:template name="br-replace">
        <xsl:param name="word"/>
        <xsl:choose>
            <xsl:when test="contains($word,'&#xA;')">
                <xsl:value-of select="substring-before($word,'&#xA;')"/>
                <br xmlns="http://www.w3.org/1999/xhtml"/>
                <xsl:call-template name="br-replace">
                    <xsl:with-param name="word" select="substring-after($word,'&#xA;')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$word"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="format-date">
        <xsl:param name="word"/>
        <xsl:value-of select="substring($word, 1, 10)"/>
    </xsl:template>
</stylesheet>


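I am trying to split it into three parts:

the text before <body>
the text between <body> and </body>
the text after </body>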



Java code:
Matcher before = Pattern.compile("(.*?)<body>", Pattern.MULTILINE | Pattern.DOTALL | Pattern.CASE_INSENSITIVE)
        .matcher(input);
String beforeStr = null;
if (before.find()) {
    beforeStr = before.group(1);
}

Matcher after = Pattern.compile("</body>(.*?)", Pattern.MULTILINE | Pattern.DOTALL | Pattern.CASE_INSENSITIVE)
        .matcher(input);
String afterStr = null;
if (after.find()) {
    afterStr = after.group(1);
}

Matcher body = Pattern.compile("<body>(.*?)</body>", Pattern.MULTILINE | Pattern.DOTALL | Pattern.CASE_INSENSITIVE)
        .matcher(input);
String bodyStr = null;
if (body.find()) {
    bodyStr = body.group(1);
}


Any idea why the string 'afterStr' is empty? Is there something wrong with the pattern?

The non-greedy quantifier has nothing to its right:
"</body>(.*?)"
           ^matches as little as possible. In this case, 0 characters.

Just use a greedy match:
</body>(.*)

That does what you need: with Pattern.DOTALL set, the greedy .* runs to the end of the input.
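
To make the difference concrete, here is a minimal sketch that runs both patterns against a shortened input. The class name AfterBodyDemo, the helper firstGroup, and the sample string are illustrative assumptions, not part of the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not taken from the question.
class AfterBodyDemo {
    // Returns group 1 of the first match, or null if the pattern does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL | Pattern.CASE_INSENSITIVE)
                .matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the XML in the question.
        String input = "<html>\n<body>\nHello body content !!\n</body>\n</html>\n";

        // Lazy: nothing follows the group, so zero characters already satisfy it.
        String lazy = firstGroup("</body>(.*?)", input);
        // Greedy: with DOTALL, .* consumes everything up to the end of the input.
        String greedy = firstGroup("</body>(.*)", input);

        System.out.println("lazy is empty: " + lazy.isEmpty());
        System.out.println("greedy: [" + greedy + "]");
    }
}
```

Note that the sketch drops Pattern.MULTILINE, which only affects the ^ and $ anchors and so has no effect on these patterns.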
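
If you're going to do this textually rather than with an XML parser, isn't it easier to just use indexOf and substring? Regex is the wrong tool, but if you're going to use the wrong tool, you can at least pick a better wrong tool. :-)

Compare your code with this (assuming input is a String):

int indexOfBodyStart = input.indexOf("<body>");
int indexOfBodyEnd = input.indexOf("</body>");
String beforeBody = input.substring(0, indexOfBodyStart);
// 6 is the length of "<body>", 7 the length of "</body>"
String body = input.substring(indexOfBodyStart + 6, indexOfBodyEnd);
String afterBody = input.substring(indexOfBodyEnd + 7);

This fails in more or less the same ways the regex solution does. (For example, if the text <body> appears inside quotes before the actual body, or </body> appears before the real end of the body, both solutions fail.)

Marked this CW because you specifically asked about regex.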
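
The snippet above assumes both tags are present; indexOf returns -1 otherwise and substring then throws. A slightly more defensive sketch of the same idea follows. The class name BodySplitDemo and the method split are hypothetical, and unlike the CASE_INSENSITIVE regex this version is case-sensitive.

```java
// Hypothetical helper illustrating the indexOf/substring approach with guards.
class BodySplitDemo {
    static final String OPEN = "<body>";
    static final String CLOSE = "</body>";

    // Returns {before, body, after}, or null if either tag is missing
    // or the closing tag does not come after the opening tag.
    static String[] split(String input) {
        int start = input.indexOf(OPEN);
        if (start < 0) {
            return null;
        }
        // Search for the closing tag only after the opening tag.
        int end = input.indexOf(CLOSE, start + OPEN.length());
        if (end < 0) {
            return null;
        }
        return new String[] {
            input.substring(0, start),
            input.substring(start + OPEN.length(), end),
            input.substring(end + CLOSE.length())
        };
    }

    public static void main(String[] args) {
        String[] parts = split("<html><body>Hello body content !!</body></html>");
        System.out.println("before: " + parts[0]);
        System.out.println("body:   " + parts[1]);
        System.out.println("after:  " + parts[2]);
    }
}
```

Using the tag lengths instead of the literals 6 and 7 keeps the offsets correct if the tag strings ever change.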

Why not use an XML parser instead of parsing XML with a regex?

You don't need Pattern.MULTILINE, by the way: your regex doesn't contain any ^ or $ anchors. You also don't need to add ? after *, since * already allows zero occurrences.

I disagree with that solution. It is simpler to use a greedy quantifier: .* will automatically match the rest of the string. [\d\D] is only needed in JavaScript, which has no DOTALL option.

@TimPietzcker I was confused about that. Thanks.
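
For the XML-parser route suggested in the comments, a rough sketch using the JDK's built-in DOM parser might look like the following. The class name BodyTextDemo and the method bodyText are hypothetical, and the sample is a cut-down version of the stylesheet from the question. It leaves out the &nbsp; line, because that entity is not predefined in XML and a parser would reject it without a DTD that declares it. The sketch returns only the text inside the body element; it does not reproduce the before/after split.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; not taken from the thread.
class BodyTextDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Returns the text content of the first XHTML body element, or null if none exists.
    static String bodyText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Needed so that elements can be looked up by namespace URI.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList bodies = doc.getElementsByTagNameNS(XHTML_NS, "body");
            return bodies.getLength() == 0 ? null : bodies.item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse input as XML", e);
        }
    }

    public static void main(String[] args) {
        // Simplified stand-in for the stylesheet in the question.
        String xml =
            "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"2.0\">"
            + "<xsl:template match=\"/\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head/><body>Hello body content !!</body>"
            + "</html></xsl:template></xsl:stylesheet>";
        System.out.println(bodyText(xml));
    }
}
```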