PHP: cleaning up HTML (tags: php, html, codeigniter)


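I need to clean up a large chunk of HTML code (containing [...], [...], [...]) into more standard HTML, without the styling of the original HTML code and without [...]. One way to clean it up would be to remove all of the original HTML tags except [...], and to replace [...] with [...]. The tags that remain would be stripped of their classes, IDs and attributes.

How can I do this? At the moment I am using strip_tags(), which removes all the tags but squashes the remaining content onto a single line, making it hard to read.

Example of the HTML code to clean up: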

    <table cellpadding="0" cellspacing="0" width="100%">
    <tr><td align="center">
    <table cellpadding="0" cellspacing="0" width="850">
    <tr valign="top">
    <td width="50%" style="padding-left:30px;">
    <img border="0" src="http://www.newpads.info/l/AC-000-078.gif"><br>2000 Massachusetts Ave., Cambridge, MA 02140<br>Phone: (617) 498-0011 - Fax: (617) 498-0044<br><a href="http://www.windsorrealty.net" rel="nofollow">http://www.windsorrealty.net</a></td>
    <td width="35%" style="border-left: 1px solid gray; padding-left: 15px">
    <div><span style="font-weight:bold;">Sugandha Singh</span></div>
    <div style="padding:10px;">
    <div style="padding:2px;"><img src="http://www.newpads.info/img/phone.gif"> 781 985 4489</div>
    <div style="padding:2px;"><img src="http://www.newpads.info/img/email2.gif"> ssinghrealty@gmail.com</div>
    <div><img src="http://www.newpads.info/img/question.gif"> <a href="http://ag006436.speedhatch.com/rentals/CAM-058-197/inquiry" rel="nofollow"><font size="3">Ask Me A Question</font></a></div><div><img src="http://www.newpads.info/img/magnet.png">  <a href="http://ag006436.speedhatch.com" rel="nofollow"><font size="3">Search My Apartments</font></a></div></div>
    </td>
    </tr>
    </table>
    <br><table width="850">
    <tr><td colspan="2" height="2" bgcolor="#275c7d"></td></tr><tr><tr><td colspan="2"><div style="font-weight:bold;"><font size="3">HARVARD LAW / SQUARE. HEAT+HOTWATER INCL. JAN 1. 1/2 FEE</font></div></td></tr></tr><tr valign="top"><td><img src="http://maps.google.com/maps/api/staticmap?center=42.38047,-71.121008&amp;path=weight:4|42.37847,-71.118008|42.37847,-71.124008|42.38247,-71.124008|42.38247,-71.118008|42.37847,-71.118008&amp;zoom=15&amp;size=335x225&amp;sensor=false" style="width:275px;"></td><td><font size="2"><table style="width:100%;height:100%;"><tr valign="top"><td width="50%"><table cellpadding="3" style="width:100%;"><tr><td colspan="2" style="font-weight:bold;">Basic Info</td></tr><tr><td style="width:45%;">Referral ID:</td><td>CAM-058-197</td></tr><tr><td>Beds: 1</td><td>Baths: 1</td></tr><tr><td>Rent:</td><td>$1800</td></tr><tr><td>Broker Fee:</td><td>Half Month</td></tr><tr><td>Date Avail:</td><td>January 1st</td></tr><tr><td>Rent Includes:</td><td>Heat, Hot Water</td></tr><tr><td>Pet Policy:</td><td>Cat Ok</td></tr><tr><td colspan="2">on Langdon St., Cambridge - Harvard Square</td></tr></table></td><td width="50%"><table cellpadding="5" style="width:100%;"><tr><td colspan="2" style="font-weight:bold;">Apartment Features</td></tr><tr><td width="50%">- Gas Range</td><td width="50%">- HT&HW</td></tr><tr><td width="50%">- Modern Bath</td><td width="50%">- Modern Kitchen</td></tr><tr><td width="50%">- Storage - Basement</td><td width="50%"></td></tr></table></td></tr><tr><td colspan="2"></td></tr></table></font></td></tr><tr><td colspan="3"><table width="100%" border="0" cellspacing="0" cellpadding="3"><tr><td colspan="2" align="center"><b>Transportation options</b></td></tr><tr><td width="50%"><div><div><div style="text-align:center;text-decoration:underline;">Subway Lines and Stops</div><ul><li>RED - Harvard Square (11 min)</li></ul></td><td width="50%"><div style="text-align:center;text-decoration:underline;">Bus Routes and Stops</div><ul><li>74 
- Waterhouse St & Massachusetts Ave (5 min)</li><li>72 - Waterhouse St & Massachusetts Ave (5 min)</li><li>77 - Massachusetts Ave & Waterhouse St (5 min)</li><li>75 - Waterhouse St & Massachusetts Ave (5 min)</li><li>71 - Waterhouse St & Massachusetts Ave (5 min)</li><li>And More...</li></ul></div></div></td></tr></table></td></tr><tr><td colspan="3"><div><b><font size="2">Apartment Description:</font></b></div><div style="padding:5px;"><font size="2">Recent Renovations. Great Location. Easy Walk to Harvard Law School or Harvard Square.<br>All Hardwood Floors, Kitchen w/Dining Area, Good Closet Space, Laundry Facilities.<br>(pics. of similar unit in the bldg)<br>HEAT and HOT WATER is INCLUDED in the RENT!<br>Available January 1.</font></div><br></td></tr><tr><td colspan="3"><div><strong>Similar Properties</strong></div><div>1 Bd on Huron Ave., $1835, NO FEE, Include Util., Avail Now</div><div>1 Bd on Huron Ave., $1810, Include Util., NO FEE, Avail Now</div></td></tr><tr><td colspan="2" height="2" bgcolor="#275c7d"></td></tr></table><br><table width="850" cellpadding="0" cellspacing="0" border="0"><tr><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373415.jpg" width="400" border="0"></td><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373416.jpg" width="400" border="0"></td></tr><tr><tr><td height="10"></td></tr><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373417.jpg" width="400" border="0"></td><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373418.jpg" width="400" border="0"></td></tr><tr><tr><td height="10"></td></tr><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373419.jpg" width="400" border="0"></td><td style="text-align:center;width:50%;"><img src="http://www.newpads.info/p/2373420.jpg" width="400" border="0"></td></tr><tr><tr><td height="10"></td></tr></tr></table><table width="100%"><tr><td 
height="20"></td></tr><tr><td align="center"><font size="4">Contact <strong>Sugandha Singh</strong> at 781 985 4489 or ssinghrealty@gmail.com.</font></td></tr></table><table width="100%" cellspacing="0" cellpadding="0"><tr><td height="25"></td></tr><tr><td align="center"><div style="font-family: Verdana, sans-serif;"><font size="0.6">Equal Housing Opportunity - Windsor Realty is not responsible for any errors or omissions. Terms, conditions and rent are subject to change without prior notice. The information gathered is from third party sources including the owner and public records and is not guaranteed.</font></div></td></tr></table></td></tr></table><img src="http://www.newpads.info/CLAD/904329.gif">
    
Note: I am using Codeigniter, in case it helps with any of the parsing.

You can specify which tags to keep. That way, the only problem left to solve is replacing the [...] elements with [...].
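
As a rough illustration of keeping selected tags: strip_tags() accepts an optional second argument that lists the tags to leave in place. The sketch below uses a made-up input string, and the list of kept tags is an assumption, since the tag names did not survive in this copy of the question. Attributes on the kept tags are not removed by strip_tags(), so they would still need separate handling.

```php
<?php
// Hypothetical input; the real markup would be the listing page to clean.
$html = '<table><tr><td><b>Rent:</b> $1800<br><ul class="x"><li>Gas Range</li></ul></td></tr></table>';

// The second argument names the tags strip_tags() should leave in place.
// This particular list is an assumption for illustration.
$kept = strip_tags($html, '<ul><ol><li><br><strong><b>');

// Attributes on kept tags survive, so class="x" is still present.
echo $kept, "\n";
```

The table, row and cell tags are dropped while their text content stays, which is why a further pass is needed for attributes and for any tag renaming.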

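    // $html is assumed to hold the markup to clean (the argument was elided in the original).
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep malformed markup from emitting warnings
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//*");
    
    // Collect the text content of the elements worth keeping.
    // Note that a <ul> entry also contains the text of its <li> children.
    $rtn = array();
    foreach ($nodes as $node)
    {
        switch ($node->nodeName)
        {
            case "ul":
            case "li":
            case "ol":
            case "br":
            case "strong":
                $rtn[] = $node->nodeValue;
                break;
        }
    }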
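
The question also asks for classes, IDs and other attributes to be dropped from the tags that remain. A minimal sketch of one way to approach that with DOMDocument follows; the input string is invented for illustration, and this is not necessarily what the answer's author had in mind.

```php
<?php
// Hypothetical input fragment.
$html = '<ul class="stops" id="s1"><li style="color:red">RED - Harvard Square</li></ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate messy real-world markup
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// "//@*" selects every attribute node in the document.
// Snapshot the matches into an array before modifying the tree.
foreach (iterator_to_array($xpath->query('//@*')) as $attr) {
    $attr->ownerElement->removeAttributeNode($attr);
}

// loadHTML() wraps fragments in <html><body>, so serialize the body's children only.
$body = $doc->getElementsByTagName('body')->item(0);
$out = '';
foreach ($body->childNodes as $child) {
    $out .= $doc->saveHTML($child);
}
echo $out, "\n";
```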
    
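This can be done with the [...] class by running an XPath query to select the [...] elements and replacing them with [...] elements, while taking over the children of the original element.

Some related code for moving the children can be found in (question: [...]), and (question: [...]) shows [...].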
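
The select, replace and move-children steps described above can be sketched roughly as follows. The class and tag names were lost in this copy, so the sketch assumes DOMDocument with DOMXPath and a `<b>` to `<strong>` replacement; the input string is also invented for illustration.

```php
<?php
// Hypothetical input fragment.
$html = '<div><b class="t">Transportation <i>options</i></b></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate messy real-world markup
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Snapshot the matches into an array before modifying the tree.
foreach (iterator_to_array($xpath->query('//b')) as $old) {
    $new = $doc->createElement('strong');
    // Move every child across; attributes are deliberately not copied.
    while ($old->firstChild) {
        $new->appendChild($old->firstChild);
    }
    $old->parentNode->replaceChild($new, $old);
}

// loadHTML() wraps fragments in <html><body>, so serialize the body's children only.
$body = $doc->getElementsByTagName('body')->item(0);
$out = '';
foreach ($body->childNodes as $child) {
    $out .= $doc->saveHTML($child);
}
echo $out, "\n";
```

Because appendChild() moves a node that is already in the document, the loop drains the old element's children into the new one without cloning them.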

[...] is a good choice in this situation. Highly recommended.

[...] for replacing tags.

Don't forget to exclude the tags you need.


I landed on this page because I googled a phone number that had called me.