PHP: Detecting HTML tags in a string


I need to detect whether a string contains HTML tags.

if(!preg_match('(?<=<)\w+(?=[^<]*?>)', $string)){ 
    return $string;
}

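I don't know regular expressions well, so I'm not sure where the problem is. I tried escaping the \ character, but that did nothing.

Is there a better solution than a regular expression? If not, what is the correct regular expression to use with preg_match?

You need to "delimit" the regular expression with some character. Try this: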

if(!preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string)){ 
    return $string;
}
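
To make the delimiter point concrete, here is a minimal sketch. Without delimiters, PHP should take the leading `(` as a bracket-style delimiter, fail to compile the pattern, and make preg_match() return false with a warning, so the negated condition in the question is always true. The sample string and variable names below are only illustrative.

```php
<?php
$string = '<strong>hello</strong>';

// Any non-alphanumeric, non-backslash, non-whitespace character can delimit
// the pattern. With '/', a literal slash inside the pattern would need escaping;
// this pattern contains none, so both forms behave the same.
$withHash  = preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string, $m1);
$withSlash = preg_match('/(?<=<)\w+(?=[^<]*?>)/', $string, $m2);

var_dump($withHash);  // int(1)
var_dump($m1[0]);     // string(6) "strong"

// A string without tags gives 0 (no match), not false (error).
$plain = preg_match('#(?<=<)\w+(?=[^<]*?>)#', 'hello world');
var_dump($plain);     // int(0)
```

Note that preg_match() returns 1, 0, or false, so `!preg_match(...)` cannot tell "no match" apart from "pattern error".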

A simple solution is:
if($string != strip_tags($string)) {
    // contains HTML
}
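
The same check can be wrapped in a small function. This is only a sketch: the name `contains_html` is hypothetical, and the strict comparison `!==` is my choice to avoid PHP's loose string comparison rules.

```php
<?php
// Hypothetical helper: reports true when strip_tags() changes the string,
// i.e. when PHP found something it considers a tag.
function contains_html(string $string): bool
{
    return $string !== strip_tags($string);
}

var_dump(contains_html('<strong>hello</strong>')); // bool(true)
var_dump(contains_html('hello world'));            // bool(false)
```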

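Its advantage over a regular expression is that it is easier to understand, but I can't comment on the execution speed of either solution.
Parsing HTML is a hard problem in general, and there is some good material on it here: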




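But regarding your question (a "better" solution): can you be more specific about what you want to achieve and which tools you can use?
If you are not good with regular expressions (like me), I find that regular expression libraries often help me get the job done.
Here is a small tutorial that will
Here is the one I mean.
I would use strlen(),
because otherwise a character-by-character comparison is performed, which can be slow, although I would expect the comparison to stop as soon as a difference is found.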
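
A sketch of the strlen() variant suggested here, assuming it is meant as a replacement for the string comparison in the strip_tags() answer. It relies on strip_tags() only ever removing characters, so the lengths differ exactly when something was stripped. The function name is hypothetical, and no speed difference has been measured.

```php
<?php
// Hypothetical helper: compares lengths instead of the strings themselves.
function contains_html_by_length(string $string): bool
{
    return strlen($string) !== strlen(strip_tags($string));
}

var_dump(contains_html_by_length('<strong>hello</strong>')); // bool(true)
var_dump(contains_html_by_length('hello world'));            // bool(false)
```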
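If you only want to detect or replace certain tags: this function searches for certain HTML tags and wraps them in square brackets. That is pointless in itself; just change it to do whatever you want with the tags.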
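$html = preg_replace_callback(
    '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|',
    function ($found) {
        // Tags to act on; $found[1] is the tag name, $found[0] the whole tag.
        $tags = array('div','p','span','b','a','strong','center','br','h1','h2','h3','h4','h5','h6','hr');
        if (in_array($found[1], $tags, true)) {
            return '[' . $found[0] . ']';
        }
        // Return other tags unchanged; returning nothing would delete them.
        return $found[0];
    },
    $html
);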


Explanation of the regular expression:
\< ... \>   //start and ends with tag brackets
\</?        //can start with a slash for closing tags
([a-zA-Z]+[1-6]?)    //the tag itself (for example "h1")
(\s[^>]*)? //anything such as class=... style=... etc.
(\s?/)?     //allow self-closing tags such as <br />
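
If the goal is detection only, the same pattern can be used with preg_match_all() to list which tag names occur. This is a sketch under that assumption; the sample markup and variable names are made up for illustration.

```php
<?php
$html = '<div><p class="x">hi</p><br /></div>';

// Same pattern as above; capture group 1 holds the tag name.
preg_match_all('|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|', $html, $matches);

// Collapse opening and closing tags into a list of distinct names.
$tagNames = array_values(array_unique(array_map('strtolower', $matches[1])));

print_r($tagNames); // div, p, br
```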


If the goal is only to check whether a string contains HTML tags, regardless of whether those tags are valid, you can try this:
function is_html($string) {
  // Check if string contains any html tags.
  return preg_match('/<\s?[^\>]*\/?\s?>/i', $string);
}


This works for all HTML tags, valid or invalid. You can verify it here
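
Because the pattern accepts anything between `<` and `>`, it also matches text that is not markup at all. A short sketch with made-up sample strings shows this:

```php
<?php
$pattern = '/<\s?[^\>]*\/?\s?>/i';

$tagged     = preg_match($pattern, '<p>text</p>');     // a real tag
$plain      = preg_match($pattern, 'no markup');       // nothing to match
$comparison = preg_match($pattern, '1 < 2 and 3 > 2'); // "< 2 and 3 >" matches

var_dump($tagged, $plain, $comparison); // int(1) int(0) int(1)
```

Whether the last result counts as a false positive depends on what you want to do with strings that contain stray angle brackets.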
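I recommend that you only allow defined tags! You don't want users to type a <script> tag, which could lead to an XSS vulnerability.
Try the following: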
$string = '<strong>hello</strong>';
// Allowed tags are: <p>, <span>, <b>, <strong>, <i> and <u>
// \b stops "<b" from matching "<body>"; \1 requires the closing tag to match the opening one.
$pattern = '/<(p|span|b|strong|i|u)\b[^>]*>(.*?)<\/\1>/s';
preg_match($pattern, $string, $matches);

if (!empty($matches)) {
    echo 'Good, you have used an HTML tag.';
}
else {
    echo 'You didn\'t use an HTML tag or it is not allowed.';
}
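
An alternative sketch for the same allowlist idea uses the second parameter of strip_tags(), which names the tags to keep. The helper name `has_disallowed_tags` is hypothetical. It answers a slightly different question from the regex above: not "is an allowed tag present?" but "is any tag outside the allowlist present?".

```php
<?php
// Hypothetical helper: true when $string contains tags outside the allowlist.
// strip_tags() keeps the tags listed in $allowed and removes all others.
function has_disallowed_tags(string $string, string $allowed = '<p><span><b><strong><i><u>'): bool
{
    return strip_tags($string, $allowed) !== $string;
}

var_dump(has_disallowed_tags('<strong>hello</strong>'));     // bool(false)
var_dump(has_disallowed_tags('<script>alert(1)</script>'));  // bool(true)
```

Keep in mind that strip_tags() leaves the attributes of allowed tags in place, so an allowlist of tag names alone does not rule out XSS.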

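Add / to the beginning and end of the regular expression string.
+1, this is the simplest way to detect the presence of tags. You don't even need strlen.
Great answer! Much simpler, although I think regular expressions are usually very fast.
This will also report HTML tags if the string contains any control characters such as \n or \r.
@R1CHY_-RICH: Can you give an example of the false positive you describe? The following prints "no html" for me: $s = "hello\r\nworld"; if (strip_tags($s) != $s) { echo 'contains html'; } else { echo 'no html'; }
Test it with this sentence: "The weight of a cherry raspberry. It is (?)?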