Wrapping elements in divs with PHP DOM


Can anyone help me?

I'm trying to scrape information from a page whose HTML looks like this:

<div class="block">
  <h2>Season 1</h2>
  <div class="episode"><a href="somelink.com">Episode 1</a></div>
  <div class="episode"><a href="somelink.com">Episode 2</a></div>
  <h2>Season 2</h2>
  <div class="episode"><a href="somelink.com">Episode 1</a></div>
</div>

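What I'm stuck on is that I want to wrap each season in its own div, so that every season heading and its episodes sit inside one div, like this: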

<div class="block">
    <div class="season">
      <h2>Season 1</h2>
      <div class="episode"><a href="somelink.com">Episode 1</a></div>
      <div class="episode"><a href="somelink.com">Episode 2</a></div>
    </div>
    <div class="season">
      <h2>Season 2</h2>
      <div class="episode"><a href="somelink.com">Episode 1</a></div>
    </div>
</div>

And here is the PHP code I'm using:

$page = "someurl.com";

$page = $this->curl->get($page);
$dom = new DOMDocument();
@$dom->loadHTML($page);

$divs = $dom->getElementsByTagName('div');
for($i=0;$i<$divs->length;$i++){
  if ($divs->item($i)->getAttribute("class")=="block") {
    $h2s = $divs->item($i)->getElementsByTagName('h2');
    if ($h2s->length > 0) {
      foreach ($h2s as $h2) {
      // Stuck at this point
      }
    }
  }
}

How can I do this with PHP DOM? Can anyone give me an example? Thanks.

The code below wraps each <h2> and its .episode siblings in a <div class="season"> container:

   $page = '<div class="block">
      <h2>Season 1</h2>
      <div class="episode"><a href="s1ep1.com">Episode 1</a></div>
      <div class="episode"><a href="s1ep2.com">Episode 2</a></div>
      <h2>Season 2</h2>
      <div class="episode"><a href="s2ep1.com">Episode 1</a></div>
      <div class="episode"><a href="s2ep1.com">Episode 2</a></div>
    </div>';

  $dom = new DOMDocument();

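  //collect libxml parse warnings internally instead of emitting them, then restore the previous setting
  $origVal = libxml_use_internal_errors(true);
  $dom->loadHTML($page);
  libxml_clear_errors();
  libxml_use_internal_errors($origVal);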

  //create a template 'season' div
  $season = $dom->createElement('div');
  $season->setAttribute('class', 'season');

  //get all '.block' divs using xpath
  $xpath = new DOMXPath($dom);
  $divs = $xpath->query("//*[@class='block']");

  foreach($divs as $currDiv) {

     //start fresh for every 'block' so seasons collected from one block are not appended to the next
     $clones = array();
     $clone = null;

     //check if the 'block' contains any <h2> elements, if not, skip this block
     if(!$currDiv->getElementsByTagName('h2')->length) {
        continue;
     }

     foreach($currDiv->childNodes as $child) {

        if(in_array($child->nodeName, array(
                                           '#text',
                                           '#comment'
                                      ))
        ) {
           //ignore white space (and text content), and comments in 'block' div
           continue;
        }

        if($child->nodeName == 'h2') {
           if($clone) {
              //save all clones of 'season' template div in an array for further use
              $clones[] = $clone;
           }

           $clone = $season->cloneNode(true);
        }

        //this is the tricky part: appendChild() moves a node that is already in the document, so appending $child itself would pull it out of the 'block' div and disrupt the current loop
        //so we append a clone of the child to the 'season' div instead
        if($child->nodeName == 'h2' || $child->getAttribute('class') == 'episode') {
           $clone->appendChild($child->cloneNode(true));
        }
     }
     $clones[] = $clone;

     //remove all children of current 'block' div
     while($currDiv->childNodes->length) {
        $currDiv->removeChild($currDiv->firstChild);
     }

     //insert all 'season' nodes in it
     foreach($clones as $c) {
        $currDiv->appendChild($c);
     }
  }

  echo $dom->saveHTML();
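
As an alternative to rebuilding the block from clones, the nodes can also be moved in place. The following is only a sketch under a few assumptions: `wrapSeasons` is a hypothetical helper name, the `<h2>` elements are assumed to be direct children of the `.block` div, and everything between one `<h2>` and the next (whitespace included) is moved into the wrapper, rather than only the `.episode` divs as in the code above.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps each <h2> and
// the siblings that follow it in a new <div class="season">, in place.
function wrapSeasons(string $html): string
{
    $dom = new DOMDocument();

    // Keep libxml parse warnings internal, then restore the previous setting.
    $prev = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath = new DOMXPath($dom);

    // Copy the headings into a plain array before changing the tree.
    $headings = array();
    foreach ($xpath->query("//div[@class='block']/h2") as $h2) {
        $headings[] = $h2;
    }

    foreach ($headings as $h2) {
        $season = $dom->createElement('div');
        $season->setAttribute('class', 'season');

        // Put the empty wrapper where the heading currently sits.
        $h2->parentNode->insertBefore($season, $h2);

        // Move the heading, then each following sibling up to the next <h2>.
        $node = $h2;
        while ($node !== null && ($node === $h2 || $node->nodeName !== 'h2')) {
            $next = $node->nextSibling;   // remember the sibling before moving
            $season->appendChild($node);  // appendChild() moves an attached node
            $node = $next;
        }
    }

    return $dom->saveHTML();
}
```

Calling `echo wrapSeasons($page);` with the `$page` string from the answer should print the document with one `season` div per heading. Because the next sibling is read before each move, relocating nodes does not disturb the traversal, which is the problem the cloning approach works around.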

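Regardless of who might solve this for you, we all expect you to try it yourself and show your attempt in the question. That way you can learn what you were doing wrong.

What are you using to represent/parse the DOM structure?

I really appreciate the time you took to write this up. Very nice, thanks :)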