PHP: Advice on implementing a simple regular expression (for BBCode/GeSHi parsing)

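I built a personal note-taking application in PHP so that I can store and organize my notes, and I would like a simple format to write them in.

I tried Markdown, but found it a bit confusing, and it offered no easy syntax highlighting. I have worked with BBCode before, so I would like to implement that instead.

For GeSHi (the syntax highlighter I really want to use), the simplest code it needs looks like this: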

$geshi = new GeSHi($sourcecode, $language);
$geshi->parse_code();

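That is the easy part. What I want is to let my BBCode call it.

My current regular expression matches a made-up [syntax=cpp][/syntax] BBCode tag as follows:

preg_replace('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi(\\2,\\1)????', $text);

As you can see, I capture both the language and the content. How on earth do I hook that up to the GeSHi code?

preg_replace seems to be able to replace a match only with a string, not with an "expression", and I don't know how to run those two lines of GeSHi code on the captured data.

I'm really excited about this project and would love to get past this.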

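Use preg_match. Note that it returns the number of matches (0 or 1); the captured groups are written to the third argument:

preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $match);
$geshi = new GeSHi($match[2], $match[1]);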
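
To make the capture groups concrete, here is a small sketch that runs the same pattern against a made-up note. The sample text and variable names are assumptions for illustration, and GeSHi itself is not called, so the snippet runs on its own.

```php
<?php
// Sample note text; purely illustrative.
$text = "Intro\n[syntax=cpp]int main() { return 0; }[/syntax]\nOutro";

// preg_match() returns 1 when the pattern matches and fills $match by reference:
// $match[0] is the whole tag, $match[1] the language, $match[2] the code.
$found = preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $match);

$language = $match[1];
$source   = $match[2];
```

Keep in mind that preg_match only finds the first [syntax] block and does not replace anything in $text, so on its own it does not cover notes with several blocks.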

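Your regular expression looks correct to me already. Your problem is the invocation, so I suggest writing a wrapper function:

function geshi($src, $l) {
    // $src is the code to highlight, $l is the GeSHi language name
    $geshi = new GeSHi($src, $l);
    // parse_code() returns the highlighted HTML as a string
    return $geshi->parse_code();
}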

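That would usually be enough, but the source code itself may contain single or double quotes. So you cannot simply write

preg_replace("…/e", "geshi('$2', '$1')", …)

as you would need to. (Note that '$1' and '$2' need the quotes because preg_replace only substitutes the $1 and $2 placeholders, and the result has to be valid inline PHP code.)

That is why you need to use

preg_replace_callback

to avoid the escaping problems in the /e (eval) replacement code; the /e modifier was also deprecated in PHP 5.5 and removed in PHP 7.0. For example: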

preg_replace_callback('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi_replace', $text);

I would write a second wrapper for the callback, but you could also combine it with the original function:

function geshi_replace($uu) {
    return geshi($uu[2], $uu[1]);
}
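
To try the callback approach end to end without GeSHi installed, here is a rough sketch. `highlight_stub` and `syntax_replace` are hypothetical names, and the stub only escapes the code and wraps it in a `<pre>` tag; with GeSHi loaded you would call the `geshi()` wrapper there instead.

```php
<?php
// Hypothetical stand-in for the geshi() wrapper so this runs without the
// GeSHi library. It escapes the code and tags it with the language name.
function highlight_stub($src, $lang) {
    return '<pre class="' . htmlspecialchars($lang, ENT_QUOTES) . '">'
         . htmlspecialchars($src, ENT_QUOTES) . '</pre>';
}

// Callback for preg_replace_callback(): it receives the match array,
// where [1] is the language and [2] is the code between the tags.
function syntax_replace($m) {
    return highlight_stub($m[2], $m[1]);
}

// The code inside the tag contains double quotes and a "<" on purpose.
$text = "See: [syntax=php]echo \"a < b\";[/syntax] done";

$html = preg_replace_callback(
    '#\[syntax=(.*?)\](.*?)\[/syntax\]#si',
    'syntax_replace',
    $text
);
```

Because the captured code reaches the callback as a plain string in an array, quotes inside it never have to survive being pasted into PHP source, which is the problem the /e approach runs into.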

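I wrote this class a while back; the point of it was to allow easy customization and parsing. It may be a bit overkill, but it works well and I needed it for my application. Usage is pretty simple: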

$geshiH = new Geshi_Helper();
$text = $geshiH->geshi($text); // this assumes that the text should be parsed (ie inline syntaxes)

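---- OR ----

$text = $geshiH->geshi($text, $lang); // here $text is a single code block and $lang is its language

I had to do some chopping to separate it from other custom code I have, but as long as the chopping introduced no syntax errors it should work. Feel free to use it.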

<?php

require_once 'Geshi/geshi.php';

class Geshi_Helper 
{
    /**
     * @var array Array of matches from the code block.
     */
    private $_codeMatches = array();

    private $_token = "";

    private $_count = 1;

    public function __construct()
    {
        /* Generate a unique hash token for replacement) */
        $this->_token = md5(time() . rand(9999,9999999));
    }

    /**
     * Performs syntax highlights using geshi library to the content.
     *
     * @param string $content - The context to parse
     * @return string Syntax Highlighted content
     */
    public function geshi($content, $lang=null)
    {
        if (!is_null($lang)) {
            /* Given the returned results 0 is not set, adding the "" should make this compatible */
            $content = $this->_highlightSyntax(array("", strtolower($lang), $content));
        }else {
            /* Need to replace this prior to the code replace for nobbc */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                /* A callback replaces the /e modifier, which was removed in PHP 7 */
                return '[nobbc]' . strtr($m[1], array('[' => '&#91;', ']' => '&#93;', ':' => '&#58;', '@' => '&#64;')) . '[/nobbc]';
            }, $content);

            /* For multiple content we have to handle the br's, hence the replacement filters */
            $content = $this->_preFilter($content);

            /* Reverse the nobbc markup */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                /* A callback replaces the /e modifier, which was removed in PHP 7 */
                return strtr($m[1], array('&amp;#91;' => '[', '&amp;#93;' => ']', '&amp;#58;' => ':', '&amp;#64;' => '@'));
            }, $content);

            $content = $this->_postFilter($content);
        }

        return $content;
    }

    /**
     * Performs syntax highlights using geshi library to the content.
     * If it is unknown the number of blocks, use highlightContent
     * instead.
     *
     * @param string $content - The code block to parse
     * @param string $language - The language to highlight with
     * @return string Syntax Highlighted content
     * @todo Add any extra / customization styling here.
     */
    private function _highlightSyntax($contentArray)
    {
        $codeCount = $contentArray[1];

        /* If the count is 2 we are working with the filter */
        if (count($contentArray) == 2) {
            $contentArray = $this->_codeMatches[$contentArray[1]];
        }

        /* for default [syntax] */
        if ($contentArray[1] == "")
            $contentArray[1] = "php";

        /* Grab the language */
        $language = (isset($contentArray[1]))?$contentArray[1]:'text';

        /* Remove leading spaces to avoid problems */
        $content = ltrim($contentArray[2]);

        /* Parse the code to be highlighted */
        $geshi = new GeSHi($content, strtolower($language));
        return $geshi->parse_code();
    }

    /**
     * Substitute the code blocks for formatting to be done without
     * messing up the code.
     *
     * @param array $match - Array of regex matches to substitute
     * @return string Substituted content
     */
    private function _substitute($match)
    {
        $index = sprintf("%02d", $this->_count++);
        $this->_codeMatches[$index] = $match;
        return "----" . $this->_token . $index . "----";
    }

    /**
     * Removes the code from the rest of the content to apply other filters.
     *
     * @param string $content - The content to filter out the code lines
     * @return string Content with code removed.
     */
    private function _preFilter($content)
    {
        return preg_replace_callback("#\s*\[syntax=(.*?)\](.*?)\[/syntax\]\s*#siU", array($this, "_substitute"), $content);
    }

    /**
     * Replaces the code after the filters have been ran.
     *
     * @param string $content - The content to replace the code lines
     * @return string Content with code re-applied.
     */
    private function _postFilter($content)
    {
        /* using dashes to prevent the old filtered tag being escaped */
        return preg_replace_callback("/----\s*" . $this->_token . "(\d{2})\s*----/si", array($this, "_highlightSyntax"), $content);
    }
}
?>
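
The core idea of the class (swap the code blocks out for placeholders, run the other formatting, then swap the highlighted code back in) can be sketched in a few lines. This is only an illustration under assumptions: the variable names and the `[b]` rule are made up, and a plain escaped `<pre>` stands in for the GeSHi call so the snippet runs without the library.

```php
<?php
$stash = array();
$token = md5(uniqid('', true)); // marker that is unlikely to occur in user text

$text = "[b]Title[/b]\n[syntax=c]s = \"[b]x[/b]\";[/syntax]";

// 1. Replace each [syntax] block with a placeholder and remember the match.
$text = preg_replace_callback(
    '#\[syntax=(.*?)\](.*?)\[/syntax\]#si',
    function ($m) use (&$stash, $token) {
        $stash[] = $m;
        return '----' . $token . (count($stash) - 1) . '----';
    },
    $text
);

// 2. Other BBCode rules can now run without touching the stored code.
$text = preg_replace('#\[b\](.*?)\[/b\]#si', '<strong>$1</strong>', $text);

// 3. Put the code back; a real version would call GeSHi here instead.
$text = preg_replace_callback(
    '#----' . $token . '(\d+)----#',
    function ($m) use ($stash) {
        list(, $lang, $code) = $stash[(int) $m[1]];
        return '<pre class="' . $lang . '">' . htmlspecialchars($code, ENT_QUOTES) . '</pre>';
    },
    $text
);
```

Note how the `[b]x[/b]` inside the code block comes through untouched, while the one in the surrounding text is converted.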

I wish I could +10 this; I will definitely accept and use this class. Thanks. For one of the regular expressions I had to change the delimiter from / to #, and now it works great :)

Which one? I'll fix it so that others don't run into the same confusion.

Excuse the poor formatting, but it is this function: private function _preFilter($content) { return preg_replace_callback('#\s*\[syntax=(.*?)\](.*?)\[/syntax\]\s*#siU', array($this, "_substitute"), $content); } Otherwise PHP complains about a bunch of unknown modifiers, like y, b, etc.

This answered my original question, thanks! I wrote this so late at night that I couldn't find any solution, and this makes sense :P
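
The delimiter problem from the comments can be reproduced with a short sketch. It assumes a shortened pattern that only matches the closing tag, and it uses a temporary error handler (illustrative helper code, not part of the class) to capture the warning instead of printing it.

```php
<?php
// Capture the warning text instead of letting PHP print it.
$warning = '';
set_error_handler(function ($errno, $errstr) use (&$warning) {
    $warning = $errstr;
    return true; // mark the warning as handled
});

// With "/" as the delimiter, the unescaped "/" in "[/syntax]" ends the
// pattern early, and the rest ("syntax\]/si") is read as modifiers.
$bad = preg_match('/\[/syntax\]/si', '[/syntax]');

restore_error_handler();

// Two ways out: pick a delimiter that is not in the pattern, or escape the slash.
$withHash    = preg_match('#\[/syntax\]#si', '[/syntax]');
$withEscaped = preg_match('/\[\/syntax\]/si', '[/syntax]');
```

Here the first call fails with an "Unknown modifier" warning and returns false, while both alternatives match. Using # (or ~) as the delimiter is usually the more readable choice for BBCode patterns, since closing tags always contain a slash.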