PHP: advice on implementing a simple regex (for BBCode/GeSHi parsing)
I built a personal note-taking application in PHP so that I can store and organize my notes, and I wanted a simple format to write them in. I tried Markdown but found it a bit confusing, and it has no easy syntax highlighting. I have done BBCode before, so I would like to implement that instead. Now, for GeSHi (the syntax highlighter), which is the part I really want to get working, the simplest possible code is:
$geshi = new GeSHi($sourcecode, $language);
$geshi->parse_code();
That is the easy part; what I want is to let my BBCode call it.
My current regex for matching a made-up [syntax=cpp][/syntax] BBCode tag is:
preg_replace('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi(\\2,\\1)????', $text);
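You will notice that I capture both the language and the content. How do I hook those up to the GeSHi code?
preg_replace only seems able to replace a match with a string, not with an "expression", and I don't know how to apply those two lines of GeSHi code to the captured data.
I'm really excited about this project and hope to get past this.

Use preg_match: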
// preg_match() returns 1, 0 or false; the captured groups are written to the third argument
preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $match);
$geshi = new GeSHi($match[2], $match[1]);
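To make the preg_match approach concrete, here is a minimal sketch. `highlight_stub()` is a hypothetical stand-in for GeSHi (it is not part of the library) so that the snippet runs without GeSHi installed; the sample text is also invented for illustration.

```php
<?php
// Hypothetical stand-in for GeSHi so the sketch runs without the library.
function highlight_stub($code, $lang) {
    return '<pre class="' . htmlspecialchars($lang, ENT_QUOTES) . '">'
         . htmlspecialchars($code, ENT_NOQUOTES) . '</pre>';
}

$text = "Intro [syntax=cpp]int main() { return 0; }[/syntax] outro";
$lang = null;
$html = null;

// The return value only says whether there was a match;
// the captured groups are filled into $match.
if (preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $match)) {
    $lang = $match[1];                      // language from [syntax=...]
    $html = highlight_stub($match[2], $lang); // code between the tags
}
```

Note that preg_match stops at the first block and does not put the highlighted result back into `$text`, so on its own it probably only suits notes that contain a single code block.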

Your regex already looks correct to me. Your problem is the invocation, so I would suggest writing a wrapper function:
function geshi($src, $l) {
    $geshi = new GeSHi($src, $l);
    // parse_code() returns the highlighted source as a string
    return $geshi->parse_code();
}
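Normally that would be enough, but the source code itself may contain single or double quotes. That means you cannot write preg_replace(".../e", "geshi('$2','$1')", ...) as you would need to. (Note that '$1' and '$2' need the quotes: preg_replace only substitutes the $1 and $2 placeholders, and the result has to be valid inline PHP code.)
That is why you need preg_replace_callback: it avoids the escaping problems of the /e exec replacement code.
For example: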
preg_replace_callback('#\[syntax=(.*?)\](.*?)\[/syntax\]#si' , 'geshi_replace', $text);
I would write a second wrapper for this, but you could also merge it into the original function:
function geshi_replace($uu) {
return geshi($uu[2], $uu[1]);
}
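As a rough end-to-end sketch of this approach, the example below uses an anonymous function as the callback (available since PHP 5.3) in place of the named `geshi_replace` wrapper. `highlight_stub()` and `render_syntax_tags()` are hypothetical names introduced here; the stub stands in for the `geshi()` wrapper so the snippet runs without GeSHi. It may also be worth knowing that the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so the callback form is the only one that still works on current versions.

```php
<?php
// Hypothetical stand-in for the geshi() wrapper, so this runs without GeSHi.
function highlight_stub($code, $lang) {
    return '<pre class="' . htmlspecialchars($lang, ENT_QUOTES) . '">'
         . htmlspecialchars($code, ENT_NOQUOTES) . '</pre>';
}

// Replace every [syntax=lang]...[/syntax] block with its highlighted form.
function render_syntax_tags($text) {
    return preg_replace_callback(
        '#\[syntax=(.*?)\](.*?)\[/syntax\]#si',
        function ($m) {
            // $m[1] is the language, $m[2] the source code. Both arrive as
            // plain strings, so quotes in the code need no escaping.
            return highlight_stub($m[2], $m[1]);
        },
        $text
    );
}

$out = render_syntax_tags(
    "a [syntax=php]echo 'hi';[/syntax] b [syntax=c]puts(\"x\");[/syntax]"
);
```

Because the captured text is handed to the callback as data rather than spliced into PHP code, single and double quotes in the source pass through untouched, and all blocks in the text are replaced in one call.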
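
I wrote this class a while back; the reason for it was to allow easy customization and parsing. It may be a bit overkill, but it works well and my application needed it. Usage is very simple: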
$geshiH = new Geshi_Helper();
$text = $geshiH->geshi($text); // this assumes that the text should be parsed (ie inline syntaxes)
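---- OR ----
$text = $geshiH->geshi($text, 'php'); // passes the language as the second parameter of geshi(), so $text is highlighted as one block
I had to chop this out of some other customizations I have, but as long as the chopping introduced no syntax errors it should work fine. Feel free to use it.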
<?php
require_once 'Geshi/geshi.php';

class Geshi_Helper
{
    /**
     * @var array Array of matches from the code block.
     */
    private $_codeMatches = array();
    private $_token = "";
    private $_count = 1;

    public function __construct()
    {
        /* Generate a unique hash token for replacement */
        $this->_token = md5(time() . rand(9999,9999999));
    }

    /**
     * Performs syntax highlights using geshi library to the content.
     *
     * @param string $content - The content to parse
     * @param string|null $lang - Language for a single block, or null to parse inline [syntax] tags
     * @return string Syntax Highlighted content
     */
    public function geshi($content, $lang=null)
    {
        if (!is_null($lang)) {
            /* Given the returned results 0 is not set, adding the "" should make this compatible */
            $content = $this->_highlightSyntax(array("", strtolower($lang), $content));
        } else {
            /* Need to replace this prior to the code replace for nobbc.
               A callback is used because the /e modifier no longer exists in PHP 7+. */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                return '[nobbc]' . strtr($m[1], array('[' => '&#91;', ']' => '&#93;', ':' => '&#58;', '@' => '&#64;')) . '[/nobbc]';
            }, $content);

            /* For multiple content we have to handle the br's, hence the replacement filters */
            $content = $this->_preFilter($content);

            /* Reverse the nobbc markup */
            $content = preg_replace_callback('~\[nobbc\](.+?)\[/nobbc\]~i', function ($m) {
                return strtr($m[1], array('&#91;' => '[', '&#93;' => ']', '&#58;' => ':', '&#64;' => '@'));
            }, $content);

            $content = $this->_postFilter($content);
        }

        return $content;
    }

    /**
     * Performs syntax highlights using geshi library to the content.
     * If it is unknown the number of blocks, use highlightContent
     * instead.
     *
     * @param array $contentArray - Either array("", language, code), or a
     *                              regex match array(full match, block index) from _postFilter
     * @return string Syntax Highlighted content
     * @todo Add any extra / customization styling here.
     */
    private function _highlightSyntax($contentArray)
    {
        /* If the count is 2 we are working with the filter */
        if (count($contentArray) == 2) {
            $contentArray = $this->_codeMatches[$contentArray[1]];
        }

        /* for default [syntax] */
        if ($contentArray[1] == "")
            $contentArray[1] = "php";

        /* Grab the language */
        $language = (isset($contentArray[1])) ? $contentArray[1] : 'text';

        /* Remove leading spaces to avoid problems */
        $content = ltrim($contentArray[2]);

        /* Parse the code to be highlighted */
        $geshi = new GeSHi($content, strtolower($language));
        return $geshi->parse_code();
    }

    /**
     * Substitute the code blocks for formatting to be done without
     * messing up the code.
     *
     * @param array $match - Array of matched items to substitute
     * @return string Substituted content
     */
    private function _substitute($match)
    {
        /* Taken by value: preg_replace_callback does not pass matches by reference */
        $index = sprintf("%02d", $this->_count++);
        $this->_codeMatches[$index] = $match;
        return "----" . $this->_token . $index . "----";
    }

    /**
     * Removes the code from the rest of the content to apply other filters.
     *
     * @param string $content - The content to filter out the code lines
     * @return string Content with code removed.
     */
    private function _preFilter($content)
    {
        /* No U flag here: it inverts greediness, which would make (.*?) greedy */
        return preg_replace_callback("#\s*\[syntax=(.*?)\](.*?)\[/syntax\]\s*#si", array($this, "_substitute"), $content);
    }

    /**
     * Replaces the code after the filters have been run.
     *
     * @param string $content - The content to replace the code lines
     * @return string Content with code re-applied.
     */
    private function _postFilter($content)
    {
        /* using dashes to prevent the old filtered tag being escaped;
           {2,} also covers indexes above 99 */
        return preg_replace_callback("/----\s*" . $this->_token . "(\d{2,})\s*----/si", array($this, "_highlightSyntax"), $content);
    }
}
?>
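The core idea of the class (pull the code blocks out, run the other formatting, then put the highlighted code back) can be sketched in a few lines. This is a simplified illustration, not the class itself: `render_notes()` and `highlight_stub()` are hypothetical names, the stub stands in for GeSHi, and `nl2br()` is just an assumed example of "other formatting".

```php
<?php
// Hypothetical stand-in for GeSHi so the sketch runs without the library.
function highlight_stub($code, $lang) {
    return '<pre class="' . htmlspecialchars($lang, ENT_QUOTES) . '">'
         . htmlspecialchars($code, ENT_NOQUOTES) . '</pre>';
}

function render_notes($text) {
    $blocks = array();
    $token  = md5(uniqid('', true)); // marker that is unlikely to occur in a note

    // 1. Swap each [syntax] block for a placeholder and remember the match.
    $text = preg_replace_callback(
        '#\[syntax=(.*?)\](.*?)\[/syntax\]#si',
        function ($m) use (&$blocks, $token) {
            $blocks[] = $m;
            return '----' . $token . (count($blocks) - 1) . '----';
        },
        $text
    );

    // 2. Other formatting now runs without touching the code.
    $text = nl2br($text);

    // 3. Put the highlighted blocks back in place of the placeholders.
    return preg_replace_callback(
        '/----' . $token . '(\d+)----/',
        function ($m) use ($blocks) {
            $block = $blocks[(int) $m[1]];
            return highlight_stub($block[2], $block[1]);
        },
        $text
    );
}

$out = render_notes("line1\n[syntax=php]a\nb[/syntax]");
```

The newline between `a` and `b` survives without a `<br />` because the code was replaced by a placeholder while `nl2br()` ran, which is the problem the pre-filter and post-filter steps are there to solve.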

I wish I could +10 this; I will definitely go with this class and use it. Thanks. For one of the regexes I had to change the delimiter from / to #, and now it works great :)
Which one? I'll fix it so that others don't run into the same confusion.
Excuse the terrible formatting, but it was this function: private function _preFilter($content) { return preg_replace_callback('#\s*\[syntax=(.*?)\](.*?)\[/syntax\]\s*#siU', array($this, "_substitute"), $content); } Otherwise PHP complains about a bunch of unknown modifiers such as y, b, etc.
This answered my original question, thanks! I wrote it so late that I couldn't find any solution; this makes sense :P
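The delimiter problem from the comments can be reproduced with a short sketch (the sample text is invented for illustration). With / as the delimiter, the unescaped / inside [/syntax] ends the pattern early, and PHP then tries to read the rest as modifiers.

```php
<?php
$text = 'x [syntax=php]echo 1;[/syntax] y';

// "/" delimiter with an unescaped "/" in the pattern: everything after the
// second "/" is read as modifiers, so PHP warns about an unknown modifier
// and preg_match() returns false. The @ only hides the warning here.
$bad = @preg_match('/\[syntax=(.*?)\](.*?)\[/syntax\]/si', $text);

// Option 1: use a delimiter that does not occur in the pattern.
$ok1 = preg_match('#\[syntax=(.*?)\](.*?)\[/syntax\]#si', $text, $m1);

// Option 2: keep "/" as the delimiter and escape it inside the pattern.
$ok2 = preg_match('/\[syntax=(.*?)\](.*?)\[\/syntax\]/si', $text, $m2);
```

Either option works; choosing # is probably the more readable one for BBCode patterns, since closing tags always contain a slash.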