Wrapping elements in divs with PHP DOM
Tags: php, dom

Can anybody help me? I am trying to get information from a page whose HTML looks like this:
<div class="block">
<h2>Season 1</h2>
<div class="episode"><a href="somelink.com">Episode 1</a></div>
<div class="episode"><a href="somelink.com">Episode 2</a></div>
<h2>Season 2</h2>
<div class="episode"><a href="somelink.com">Episode 1</a></div>
</div>
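What I am stuck on is that I want to wrap each season in its own div, so that each season's heading and episodes sit inside one div, like this: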
<div class="block">
<div class="season">
<h2>Season 1</h2>
<div class="episode"><a href="somelink.com">Episode 1</a></div>
<div class="episode"><a href="somelink.com">Episode 2</a></div>
</div>
<div class="season">
<h2>Season 2</h2>
<div class="episode"><a href="somelink.com">Episode 1</a></div>
</div>
</div>
And here is the PHP code I am using:
$page = "someurl.com";
$page = $this->curl->get($page);
$dom = new DOMDocument();
@$dom->loadHTML($page);
$divs = $dom->getElementsByTagName('div');
for ($i = 0; $i < $divs->length; $i++) {
    if ($divs->item($i)->getAttribute("class") == "block") {
        $h2s = $divs->item($i)->getElementsByTagName('h2');
        if (count($h2s) > 0) {
            foreach ($h2s as $h2) {
                // Stuck at this point
            }
        }
    }
}
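How can I achieve this with PHP DOM? Can anyone give me an example? Thanks.

The code below wraps each <h2> and its .episode siblings in a <div class="season"> container: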
$page = '<div class="block">
<h2>Season 1</h2>
<div class="episode"><a href="s1ep1.com">Episode 1</a></div>
<div class="episode"><a href="s1ep2.com">Episode 2</a></div>
<h2>Season 2</h2>
<div class="episode"><a href="s2ep1.com">Episode 1</a></div>
<div class="episode"><a href="s2ep1.com">Episode 2</a></div>
</div>';
$dom = new DOMDocument();
// collect libxml parse warnings internally instead of printing them
$origVal = libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
libxml_use_internal_errors($origVal);
// create a template 'season' div
$season = $dom->createElement('div');
$season->setAttribute('class', 'season');
// get all '.block' divs using xpath
$xpath = new DOMXPath($dom);
$divs = $xpath->query("//*[@class='block']");
foreach ($divs as $currDiv) {
    // check if the 'block' contains any <h2> elements; if not, skip this block
    if (!$currDiv->getElementsByTagName('h2')->length) {
        continue;
    }
    // reset the state for every block so seasons from one block do not leak into the next
    $clones = array();
    $clone = null;
    foreach ($currDiv->childNodes as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            // ignore white space (and text content) and comments in the 'block' div
            continue;
        }
        if ($child->nodeName == 'h2') {
            if ($clone) {
                // save all clones of the 'season' template div in an array for further use
                $clones[] = $clone;
            }
            $clone = $season->cloneNode(true);
        }
        // this is the tricky part: appending the original child would move it into $clone,
        // which changes the HTML structure and disrupts the current loop,
        // so we append a clone of the child to the 'season' div instead
        // (children that appear before the first <h2> are skipped)
        if ($clone && ($child->nodeName == 'h2' || $child->getAttribute('class') == 'episode')) {
            $clone->appendChild($child->cloneNode(true));
        }
    }
    if ($clone) {
        $clones[] = $clone;
    }
    // remove all children of the current 'block' div
    while ($currDiv->childNodes->length) {
        $currDiv->removeChild($currDiv->firstChild);
    }
    // insert all 'season' nodes into it
    foreach ($clones as $c) {
        $currDiv->appendChild($c);
    }
}
echo $dom->saveHTML();
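The cloning above is only needed because childNodes is a live list. As an alternative, here is a minimal sketch (not the answer's original method) that first copies the child list into a plain array with iterator_to_array() and then moves the original nodes into new wrappers. The sample markup and the variable names ($html, $block, $children) are illustrative assumptions. Note that this variant wraps every element that follows a heading, not only those with class "episode", and leaves anything before the first <h2> where it is.

```php
<?php
// Hypothetical sample markup, mirroring the structure from the question.
$html = '<div class="block">'
      . '<h2>Season 1</h2>'
      . '<div class="episode"><a href="s1ep1.com">Episode 1</a></div>'
      . '<div class="episode"><a href="s1ep2.com">Episode 2</a></div>'
      . '<h2>Season 2</h2>'
      . '<div class="episode"><a href="s2ep1.com">Episode 1</a></div>'
      . '</div>';

$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($dom);

foreach ($xpath->query("//div[@class='block']") as $block) {
    // Snapshot the live child list so that moving nodes does not disturb the loop.
    $children = iterator_to_array($block->childNodes);
    $season = null;

    foreach ($children as $child) {
        if ($child->nodeType !== XML_ELEMENT_NODE) {
            continue; // leave whitespace text and comments where they are
        }
        if ($child->nodeName === 'h2') {
            // Start a new wrapper at the position of the heading.
            $season = $dom->createElement('div');
            $season->setAttribute('class', 'season');
            $block->insertBefore($season, $child);
        }
        if ($season !== null) {
            // appendChild() moves the node out of the block and into the wrapper.
            $season->appendChild($child);
        }
    }
}

echo $dom->saveHTML();
```

Because the nodes are moved rather than cloned, there is no need to empty the block and re-append the wrappers afterwards.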
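Regardless of who may solve this for you, we all expect you to try it yourself and show your attempt in the question. That way you can learn what you did wrong or incorrectly.
What are you using to represent/parse the DOM structure?
I really appreciate that you took so much time to write this. Very nice, thank you :)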