How to extract data from HTML using PHP


I have fetched this HTML code:

<td valign="top" style="padding:3px">

    <p>

    <b>Release Year: </b>2005

    <br />

    <b>
    Genre: 
    <a href=/genres/Animation>Animation</a>, 
    <a href=/genres/Comedy>Comedy</a>
    </b>

    <br />

    <b>External Links: </b> 
    <a href="http://www.imdb.com/title/tt0397306/" target="_blank">IMDB</a> 
    <br />

    <b>No. of episodes: </b> 23 episodes 

    <br />

    <b>Latest Episode With Links: </b> 

    <a title="Watch American Dad! Latest Episode (American Dad! Season 1 Episode 23)" href="/episode/american_dad_s1_e23.html">
    Season 1 Episode 23 Tears of a Clooney (14/05/2006)
    </a>

    <br />

    <div style="float: left; height: 30px; overflow: hidden; width: 100px;">

    <div class="fb-like" data-href="http://watchseries.ag/season-1/american_dad" data-send="false" data-layout="button_count" data-show-faces="false"></div>

    </div>

    <a href="https://twitter.com/share" class="twitter-share-button" data-url="http://watchseries.ag/season-1/american_dad">Tweet</a>

    <script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>

    <br clear="all" />

    <b>Description :</b> The random escapades of Stan Smith, an extreme right wing CIA agent dealing with family life and keeping America safe, all in the most absurd way possible.<br>

    </p>
    </td>
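I also wrote a PHP function that filters the HTML code and returns an array, but it does not return all the fields shown in the HTML above. When I print the array, I get this result: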

Output

Array
(
    [release year] => 2005Genre
)
Function

function do_html_array($td,$dlm='<br>'){
    if(!empty($td)){
        $td = html_entity_decode($td);
        $td = preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', "", $td);
        $html_array = explode($dlm,$td);
        $html_key_array = array();
        foreach($html_array as $key=>$html){
                $html = explode(':',trim(strip_tags($html)));
                if(trim($html[0])!=''){
                    if(count($html)<1) $html[1] = '';                   
                    if(strtolower(trim($html[0]))=='description') $html[1] = str_ireplace('[+]more','',$html[1]);
                    $html_key_array[strtolower(trim($html[0]))] = trim($html[1]);
                    switch(trim(strtolower($html[0]))){
                        case'external links':
                             preg_match_all('~<a\s+.*?</a>~is',$html_array[$key],$html_key_array['imdb_link']);                          
                        break;
                        case'genre':
                             preg_match_all('~<a\s+.*?</a>~is',$html_array[$key],$html_key_array['genre_link']);                             
                        break;
                        // further define here...
                    }
                }
        }
        return $html_key_array;
    }
    return false; 
}
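
A likely cause of the `2005Genre` value: the default delimiter is the literal string `<br>`, while the markup mostly uses `<br />`, so `explode($dlm,$td)` leaves almost the whole cell in one chunk. `explode(':', ...)` then cuts that chunk at every colon and only index 1 is kept, which is the text between the first and second colon. Note also that `count($html)<1` can never be true, because `explode()` always returns at least one element. The following is a minimal sketch of one way to split more tolerantly, not a definitive fix; the helper name `split_fields` and the shortened sample string are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the original post): split a cell's HTML into
// "label: value" pairs, tolerating <br>, <br/>, <br /> and <br clear="all" />.
function split_fields(string $td): array
{
    // Drop <script> blocks first so their contents are not read as text.
    $td = preg_replace('~<script\b[^>]*>.*?</script>~is', '', $td);

    $fields = [];
    // Split on any form of the <br> tag instead of one literal string.
    foreach (preg_split('~<br\b[^>]*>~i', $td) as $chunk) {
        $text = trim(html_entity_decode(strip_tags($chunk)));
        // A limit of 2 keeps any later colons inside the value.
        $parts = explode(':', $text, 2);
        if (count($parts) < 2 || trim($parts[0]) === '') {
            continue; // this chunk has no "label: value" shape
        }
        // Collapse runs of whitespace left behind by the removed tags.
        $value = trim(preg_replace('~\s+~', ' ', $parts[1]));
        $fields[strtolower(trim($parts[0]))] = $value;
    }
    return $fields;
}

// Shortened sample based on the HTML in the question.
$td = '<td><p><b>Release Year: </b>2005<br />'
    . '<b>Genre: <a href=/genres/Animation>Animation</a>, '
    . '<a href=/genres/Comedy>Comedy</a></b><br />'
    . '<b>No. of episodes: </b> 23 episodes <br /></p></td>';

print_r(split_fields($td));
```

For the shortened sample this yields three keys: `release year` with `2005`, `genre` with `Animation, Comedy`, and `no. of episodes` with `23 episodes`. Link extraction (the `switch` in the original function) would still need the raw chunk, since `strip_tags()` discards the `<a>` elements.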
Comments:

What kind of error would that be?

As @Bulk already asked, what error? Tell us the error message, if there is one. If there is none, tell us what happens and why it is not what you want to happen.

Use a standard parser such as the PHP DOM parser; it makes your parsing easy.

Thanks for all your replies. I have edited my question to show what I get from this PHP function. @Makesh, I am already using Simple HTML DOM. Please help, thanks in advance.
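
The parser suggestion in the comments can be sketched with PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) rather than Simple HTML DOM. This is only an illustration under stated assumptions: the sample string is a shortened copy of the question's HTML wrapped in a hypothetical `<table><tr>` so the fragment parses cleanly, and the XPath expressions are one possible choice that depends on the page keeping its current layout.

```php
<?php
// Shortened sample based on the question's HTML (the table/tr wrapper is an assumption).
$html = '<table><tr><td valign="top"><p><b>Release Year: </b>2005<br />'
      . '<b>Genre: <a href=/genres/Animation>Animation</a>, '
      . '<a href=/genres/Comedy>Comedy</a></b><br />'
      . '<b>External Links: </b>'
      . '<a href="http://www.imdb.com/title/tt0397306/" target="_blank">IMDB</a><br />'
      . '</p></td></tr></table>';

$doc = new DOMDocument();
// Scraped markup is rarely valid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Release year: the first text node after the <b> whose text starts with the label.
$year = null;
$node = $xpath->query(
    '//b[starts-with(normalize-space(.), "Release Year")]/following-sibling::text()[1]'
)->item(0);
if ($node !== null) {
    $year = trim($node->textContent);
}

// Genres: every link whose href starts with /genres/.
$genres = [];
foreach ($xpath->query('//a[starts-with(@href, "/genres/")]') as $a) {
    $genres[] = trim($a->textContent);
}

// IMDB link: the first anchor pointing at imdb.com, if any.
$imdb = null;
$link = $xpath->query('//a[contains(@href, "imdb.com")]')->item(0);
if ($link !== null) {
    $imdb = $link->getAttribute('href');
}

print_r(['release year' => $year, 'genre' => $genres, 'imdb_link' => $imdb]);
```

Selecting by element and attribute avoids depending on how the `<br>` tags are written, which is the main weakness of splitting the markup as a string.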