PHP: How to check and correct invalid HTML in array elements in the following scenario?


I have an array named $comments, as shown below:

Array
(
    [0] => Array
        (
            [text] => Second Comment Added                
        )

    [1] => Array
        (
            [text] => This is the long comment added to check thwe size of the comment on the device,if the size is more then add the hyperlink button to go on to the next page
        )

    [2] => Array
        (
            [text] => This comment is of two lines need to check more about it                
        )

    [3] => Array
        (
            [text] => This comment is of two lines need to check more                
        )

    [4] => Array
        (
            [text] => Uploading Photo  for comment <div title="comment_attach_image">

<a title="" title="colorbox" href="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" ><img src="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" height="150px" width="150px" /></a>

<a href="https://www.filepicker.io/api/file/CnYTVQdATAOQTkMxpAq4" class="comment_attach_image_link_dwl">Download</a>

</div>                
        )

    [5] => Array
        (
            [text] => test                
        )

    [6] => Array
        (
            [text] => Amit&#039;s pic<div class="comment_attach_image">
            <a class="group1 cboxElement" href="http://52.1.47.143/file/attachment/2015/03/e55f0f3080eb9828270a7963648a5826.jpeg" ><img src="http://52.1.47.143/file/attachment/2015/03/e55f0f3080eb9828270a7963648a5826.jpeg" height="150px" width="150px" /></a>

            <a class="comment_attach_image_link_dwl"  href="http://52.1.47.143/feed/download/year_2015/month_03/file_e55f0f3080eb9828270a7963648a5826.jpeg" >Download</a>
            </div>
        )

    [7] => Array
        (
            [text] => PDF file added<div class="comment_attach_file">
            <a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>

            <a class="comment_attach_file_link_dwl"  href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >Download</a>
            </div>                
        )

    [8] => Array
        (
            [text] => Just did it...                
        )

    [9] => Array
        (
            [text] => Akki <div title="comment_attach_image">

<a title="" title="colorbox" href="https://www.filepicker.io/api/file/NJqijbKTIOA0ZJBNknsm" ><img src="https://www.filepicker.io/api/file/NJqijbKTIOA0ZJBNknsm" height="150px" width="150px" /></a>

<a href="https://www.filepicker.io/api/file/NJqijbKTIOA0ZJBNknsm" class="comment_attach_image_link_dwl">Download</a>

</div>                
        )

) 
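If you look closely, <div title="comment_attach_image"> needs to be changed to <div class="comment_attach_image">, and the extra title attribute with an empty value needs to be removed.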

How can I detect this invalid HTML in PHP and correct it?

Thanks in advance.

Here is my parsing code:

foreach($comments as $key=>$comment) {
    $text = strstr($comment['text'], '<div');
    if (strlen($text) <= 0) {
      $comments[$key]['type_id'] =  'text';
      $comments[$key]['url'] =  '';
      $comments[$key]['text'] =  $comment['text'];
    } else if($xml = @simplexml_load_string($text)) { 
      $comments[$key]['type_id'] =  substr(strrchr($xml['class'], '_'), 1);
      $comments[$key]['url'] = str_replace(array('href=','"'), '',$xml->a['href']->asXML());
      $comments[$key]['text'] =  strtok($comment['text'], '<');           
    } else {
      continue;
    }    
  }
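
A likely reason the final else branch is reached for elements [4] and [9]: an element that declares the same attribute twice (title="" title="colorbox") is not well-formed XML, so simplexml_load_string() returns false. The following is a minimal sketch to confirm this, assuming a shortened fragment of the comment HTML rather than the full text; the variable names $fragment and $errors are illustrative only.

```php
<?php
// Shortened, hypothetical fragment modelled on elements [4] and [9].
$fragment = '<div title="comment_attach_image">'
          . '<a title="" title="colorbox" href="https://www.filepicker.io/api/file/NJqijbKTIOA0ZJBNknsm">Download</a>'
          . '</div>';

// Collect libxml parse errors instead of emitting PHP warnings
// (an alternative to silencing them with the @ operator).
libxml_use_internal_errors(true);

$xml    = simplexml_load_string($fragment);
$errors = libxml_get_errors();
libxml_clear_errors();

if ($xml === false) {
    // Print each parser message so the cause of the failure is visible.
    foreach ($errors as $error) {
        echo trim($error->message), PHP_EOL;
    }
}
```

Reading the collected errors, rather than suppressing them with @, shows which comments fail to parse and why.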
Try this:

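// $string holds the text of one comment, e.g. $comments[$key]['text'].
$original = array('<div title="comment_attach_image">', 'title=""');
$changedText = array('<div class="comment_attach_image">', '');
// str_replace() returns the modified string, so the result must be assigned.
$string = str_replace($original, $changedText, $string);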

It replaces <div title="comment_attach_image"> with <div class="comment_attach_image">, and title="" with an empty string.
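Hi, if I understand your question correctly, you want to change <div title="comment_attach_image"> to <div class="comment_attach_image">, right? You can use the str_replace() function to replace title with class.

@phpfresher: Yes, you are right. But the code should run only when this kind of invalid HTML is present. One more thing: the additional title attribute with an empty value has to be removed from the first anchor tag. It would be great if you could post the code as an answer.

OK, sure. I'll post it as an answer...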
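
The condition raised in the comment above (apply the fix only when the invalid markup is present) could be sketched as follows. The helper name fixCommentHtml() and the two-element sample array are assumptions made for illustration and are not part of the original thread. Note that str_replace() is already a no-op when nothing matches, so the explicit check mainly makes the intent visible.

```php
<?php
/**
 * Hypothetical helper: repairs the two known defects in a comment's HTML
 * and returns well-formed comments unchanged.
 */
function fixCommentHtml(string $text): string
{
    $badDiv = '<div title="comment_attach_image">';

    // Leave the text alone unless the invalid wrapper div is present.
    if (strpos($text, $badDiv) === false) {
        return $text;
    }

    // The leading space in ' title=""' avoids leaving a double space behind.
    $original    = array($badDiv, ' title=""');
    $changedText = array('<div class="comment_attach_image">', '');

    return str_replace($original, $changedText, $text);
}

// Shortened sample data modelled on elements [5] and [9] of $comments.
$comments = array(
    array('text' => 'test'),
    array('text' => 'Akki <div title="comment_attach_image">'
        . '<a title="" title="colorbox" href="https://www.filepicker.io/api/file/NJqijbKTIOA0ZJBNknsm">Download</a>'
        . '</div>'),
);

foreach ($comments as $key => $comment) {
    $comments[$key]['text'] = fixCommentHtml($comment['text']);
}

echo $comments[1]['text'], PHP_EOL;
```

Running the repair before the parsing loop from the question means the div carries a class attribute and the anchor no longer declares title twice, so the fragment can be handed to simplexml_load_string().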