PHP: getting the HTML doctype and charset with a regular expression


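I am working on a project in which I need to get the doctype and charset of an HTML document.

I have actually already obtained them another way. However, getting the doctype and charset with a regular expression is proving very difficult, because they can be written in many different ways in HTML.

Can anyone help me with this?

I need to get all of the text that comes after <!DOCTYPE in the declaration.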

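To capture the following:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Strict//EN">

use this regular expression:

<!DOCTYPE[ ]+([^ ][^>]+[^ />]+)[ /]*>

The (first) back-reference (\1) will hold:

HTML PUBLIC "-//W3C//DTD HTML Strict//EN"

For the charset, a pattern that matches the <meta> tag works the same way; here the third back-reference holds the charset value:

$pattern='#<meta[ ]+([^>]*|)(charset=[\'" ]*([^\'"> ][^\'">]+[^\'"> ])[\'" ]*|charset=[ ]*([^\'"> ][^\'">]+[^\'"> ]))([^>]*|)>#i';
$subject="<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>";
$result=preg_match($pattern, $subject, $matches);
/* if $result===1, then a match was found */
/* and the captured text can be found in $matches[3] */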

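Kevin Fegan, thank you for your answer. Unfortunately, I could not get your regular expressions to work with the PHP preg_match() function, because it shows the following: Warning: preg_match() [function.preg-match]: Unknown modifier ']'. I get this message for both regular expressions. Can you help me with this?

@razi223 - I believe you received that warning because you may not have supplied proper delimiters in the pattern passed to the preg_match() function. I have added some additional information and sample code to my answer.

@razi223 - If this answer worked for you, you can mark it as accepted so that others know this solution works. If not, let me know and I will be happy to keep helping.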
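
The delimiter problem described in these comments can be reproduced with a short sketch. It assumes the bare doctype regex was passed straight to preg_match(); the variable names are illustrative only. Without delimiters, PHP treats the leading "<" as the opening delimiter and the first unescaped ">" as the closing one, so the rest of the regex, starting with "]", is read as modifiers.

```php
<?php
// Sketch: the same doctype regex with and without delimiters.
$regex   = '<!DOCTYPE[ ]+([^ ][^>]+[^ />]+)[ /]*>';
$subject = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Strict//EN">';

// No delimiters: "<" ... ">" is taken as the delimiter pair, and the
// remainder ("]+[^ />]+)[ /]*>") is parsed as modifiers, which fails.
// The @ only silences the warning for this demonstration.
$withoutDelimiters = @preg_match($regex, $subject, $m1);

// Wrapping the regex in "#...#" and appending the "i" modifier fixes it.
$withDelimiters = preg_match('#' . $regex . '#i', $subject, $m2);

var_dump($withoutDelimiters); // false: the pattern could not be compiled
var_dump($withDelimiters);    // 1: a match was found
```

Any character that does not occur in the regex can serve as the delimiter; "#" is convenient here because the pattern contains "/" characters.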
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Strict//EN">
HTML PUBLIC "-//W3C//DTD HTML Strict//EN"
<meta[ ]+([^>]*|)(charset=['" ]*([^'"> ][^'">]+[^'"> ])['" ]*|charset=[ ]*([^'"> ][^'">]+[^'"> ]))([^>]*|)>
<meta charset=utf-8>
<meta charset='utf-8'>
<meta charset="utf-8">
<meta charset='utf-8 '>
<meta charset=" utf-8">
<meta charset=" utf-8 ">

<meta charset=utf-8>
<meta charset='utf-8' something='value'>
<meta something='value' charset="utf-8">
<meta something='value' charset='utf-8 ' somethingelse='value'>

<meta http-equiv='Content-Type' content=text/html; charset=utf-8>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
<meta http-equiv='Content-Type' content="text/html; charset=utf-8">
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' >
<meta http-equiv='Content-Type' content='text/html; charset=utf-8 ' >
<meta http-equiv='Content-Type' content='text/html; charset= utf-8' >
<meta http-equiv='Content-Type' content='text/html; charset= utf-8 ' >

<meta http-equiv='Content-Type' content=text/html; charset=utf-8>
<meta http-equiv='Content-Type' content='text/html; charset=utf-8' something='value'>
<meta http-equiv='Content-Type' something='value' content="text/html; charset=utf-8">
<meta something='value' http-equiv='Content-Type' content='text/html; charset=utf-8' >
<meta http-equiv='Content-Type' something='value' content='text/html; charset=utf-8 ' something='value' >
utf-8
preg_match($pattern, $subject, $matches)
$pattern="#<!DOCTYPE[ ]+([^ ][^>]+[^ />]+)[ /]*>#i";
$subject="<!DOCTYPE HTML or more can be here>";
$result=preg_match($pattern, $subject, $matches);
/* if $result===1, then a match was found */
/* and the captured text can be found in $matches[1] */
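
As a rough sketch, the doctype pattern above could be wrapped in a small function. The name extractDoctype and the nullable return type (which needs PHP 7.1 or later) are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: returns the text that follows "<!DOCTYPE",
// or null when no doctype declaration is found.
function extractDoctype(string $html): ?string
{
    $pattern = '#<!DOCTYPE[ ]+([^ ][^>]+[^ />]+)[ /]*>#i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[1]; // first capture group holds the doctype text
    }
    return null;
}

// Prints the captured doctype text of a sample document.
echo extractDoctype('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Strict//EN"><html></html>'), "\n";

// A document without a doctype yields null.
var_dump(extractDoctype('<html></html>'));
```

Note that the pattern requires at least three characters after "<!DOCTYPE ", which is enough for every doctype in common use.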
$pattern='#<meta[ ]+([^>]*|)(charset=[\'" ]*([^\'"> ][^\'">]+[^\'"> ])[\'" ]*|charset=[ ]*([^\'"> ][^\'">]+[^\'"> ]))([^>]*|)>#i';
$pattern="#<meta[ ]+([^>]*|)(charset=['\" ]*([^'\"> ][^'\">]+[^'\"> ])['\" ]*|charset=[ ]*([^'\"> ][^'\">]+[^'\"> ]))([^>]*|)>#i";
$pattern="#<meta[ ]+([^>]*|)(charset=['\x22 ]*([^'\x22> ][^'\x22>]+[^'\x22> ])['\x22 ]*|charset=[ ]*([^'\x22> ][^'\x22>]+[^'\x22> ]))([^>]*|)>#i";
$pattern='#<meta[ ]+([^>]*|)(charset=[\'" ]*([^\'"> ][^\'">]+[^\'"> ])[\'" ]*|charset=[ ]*([^\'"> ][^\'">]+[^\'"> ]))([^>]*|)>#i';
$subject="<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>";
$result=preg_match($pattern, $subject, $matches);
/* if $result===1, then a match was found */
/* and the captured text can be found in $matches[3] */
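
To check the charset pattern above against several of the listed <meta> variants at once, a sketch along these lines may help. The function name extractCharset and the sample selection are assumptions made for illustration, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the charset named in the first <meta> tag
// that carries one, or null if no such tag is found.
function extractCharset(string $html): ?string
{
    $pattern = '#<meta[ ]+([^>]*|)(charset=[\'" ]*([^\'"> ][^\'">]+[^\'"> ])[\'" ]*|charset=[ ]*([^\'"> ][^\'">]+[^\'"> ]))([^>]*|)>#i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return $matches[3]; // third capture group holds the charset value
    }
    return null;
}

// A few of the <meta> variants from the list in the answer.
$samples = [
    '<meta charset=utf-8>',
    '<meta charset=" utf-8 ">',
    "<meta something='value' charset='utf-8 ' somethingelse='value'>",
    "<meta http-equiv='Content-Type' content=text/html; charset=utf-8>",
    "<meta http-equiv='Content-Type' content='text/html; charset= utf-8 ' >",
];

foreach ($samples as $tag) {
    // Each sample should yield the trimmed value "utf-8".
    var_dump(extractCharset($tag));
}
```

A regular expression of this kind handles the common spellings but is not a full HTML parser; unusual markup may still slip through.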