Parsing XML with XQuery when the XML tree contains an HTML table
I have the following XML, which comes from a URL:
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<channel>
<title>Videos</title>
<link>https://www.example.com/r/videos/</link>
<description>A long description of the video.</description>
<image>...</image>
<atom:link rel="self" href="http://www.example.com/videos/.xml" type="application/rss+xml"/>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
<item>
<title>The most used Jazz lick in history.</title>
<link>
http://www.example.com/
</link>
<guid isPermaLink="true">
http://www.example.com/
</guid>
<pubDate>Mon, 07 Sep 2015 14:43:34 +0000</pubDate>
<description>
<table>
<tr>
<td>
<a href="http://www.example.com/">
<img src="http://www.example.com/.jpg" alt="The most used Jazz lick in history." title="The most used Jazz lick in history." />
</a>
</td>
<td> submitted by
<a href="http://www.example.com/"> jcepiano </a>
<br/>
<a href="http://www.youtube.com/">[link]</a>
<a href="http://www.example.com/">
[508 comments]
</a>
</td>
</tr>
</table>
</description>
<media:title>The most used Jazz lick in history.</media:title>
<media:thumbnail url="http://example.jpg"/>
</item>
</channel>
</rss>
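I get an error:
Catchable fatal error: Object of class DOMElement could not be converted to string in /home/thanksbelieve/public_html/vsi/trend_vids.php on line 16
I think I need a way to convert the object to a string, and then it should work. In the second foreach loop I also tried calling saveHTML() before loadHTML((string)$desc), but without success.
I could not find an easy-to-follow tutorial online.
Any help would be greatly appreciated.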
Thanks :)

I finally got it working with the code below:
<?php
$url = "https://www.example.com/r/videos/.xml";

// Load the RSS feed. preserveWhiteSpace must be set before load() to have any effect.
$feed_dom = new DOMDocument;
$feed_dom->preserveWhiteSpace = false;
$feed_dom->load($url);

$items = $feed_dom->getElementsByTagName('item');
foreach ($items as $item) {
    // item(0) is the first matching DOMElement; nodeValue is its text content as a string.
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    $desc_table = $item->getElementsByTagName('description')->item(0)->nodeValue;
    echo $title . "<br>";

    // Parse the HTML table from the description as a separate document.
    $table_dom = new DOMDocument;
    $table_dom->preserveWhiteSpace = false;
    $table_dom->loadHTML($desc_table);
    $xpath = new DOMXPath($table_dom);

    // The second anchor in the second cell is the "[link]" anchor.
    $yt_link_node = $xpath->query("//table/tr/td[2]/a[2]");
    foreach ($yt_link_node as $yt_link) {
        $yt = $yt_link->getAttribute('href');
        echo $yt . "<br>";
        echo "<br>";
    }
}
?>
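The "could not be converted to string" error from the question typically appears when a DOMElement object is used where a string is expected, for example with a `(string)` cast. The following is a minimal sketch of the difference, not the asker's original code: the inline one-item feed and the variable names are made up for illustration, and it assumes the description carries its HTML as text (here inside CDATA).

```php
<?php
// Hypothetical one-item feed standing in for the real URL.
$xml = <<<XML
<rss version="2.0">
<channel>
<item>
<title>The most used Jazz lick in history.</title>
<description><![CDATA[<table><tr><td>thumb</td></tr></table>]]></description>
</item>
</channel>
</rss>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// item(0) returns a DOMElement object, not a string.
$descNode = $dom->getElementsByTagName('description')->item(0);

// Casting with (string)$descNode would fail, because DOMElement
// has no string conversion. Read a string property instead:
$descText = $descNode->nodeValue;        // text content of the element
$descMarkup = $dom->saveXML($descNode);  // the element serialized, tags included

echo $descText, "\n";
echo $descMarkup, "\n";
```

Which of the two strings is useful depends on what the next step expects: `nodeValue` yields the embedded HTML text that can be handed to loadHTML(), while `saveXML()` with a node argument keeps the surrounding `<description>` tags.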
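Thank you Abel, your comments were very helpful in getting the code working! :)

I'm sorry to see that you apparently did not get this working afterwards. Unfortunately, "can you write the complete code to achieve the same goal?" is not the kind of question that fits here. If you have a specific problem, such as an error, please show your code (see ) and we can try to help. If you really need someone to do the work for you, plenty of consulting firms exist, but this is not the place for that. Apart from that: if you are starting out with a new technology, start small. Take a minimal piece of code (that is, just Hello world) and expand from there. If you get stuck on such a simple example, it is also easier to ask specific questions about where you went wrong or ran into trouble, and easier to answer them.

As you suggested, I have added my code, in case you can help! I also tried your approach with XQuery. This works fine: //channel/item/title/text()
But this does not work at all: //channel/item/description/table/tr/td/a[.="[link]"][1]/@href
So I had to nest a foreach inside the loop over the XML feed to access the HTML table.

Glad you got it working. I had to step away and only saw your latest comment now. I see you found ->item(0)->nodeValue, which is indeed the way to get a property (in this case the node value) from an entry in a DOM node list.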
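One plausible reason the second XPath expression from the comments finds nothing is that the feed delivers the description as escaped text rather than as child elements, so there are no `table` nodes for the path to descend into. The sketch below works under that assumption: it reads the title and description with XPath on the feed, then parses the description as HTML and applies the `a[.="[link]"]` test there. The sample feed, the function name `extractLinks` and the CDATA wrapping are illustrative assumptions, not part of the original feed or code.

```php
<?php
// Hypothetical sample feed; the description carries its HTML as CDATA text.
$xml = <<<XML
<rss version="2.0">
<channel>
<item>
<title>The most used Jazz lick in history.</title>
<description><![CDATA[<table><tr><td><a href="http://www.example.com/">thumb</a></td><td> submitted by <a href="http://www.example.com/"> jcepiano </a><br/><a href="http://www.youtube.com/">[link]</a> <a href="http://www.example.com/">[508 comments]</a></td></tr></table>]]></description>
</item>
</channel>
</rss>
XML;

// Returns an array mapping each item title to the href of its "[link]" anchor.
function extractLinks(string $xml): array
{
    $feed = new DOMDocument();
    $feed->loadXML($xml);
    $feedXpath = new DOMXPath($feed);

    $links = [];
    foreach ($feedXpath->query('//channel/item') as $item) {
        // string(...) makes evaluate() return plain strings relative to $item.
        $title = $feedXpath->evaluate('string(title)', $item);
        $html = $feedXpath->evaluate('string(description)', $item);

        // Parse the embedded HTML separately; collect parser warnings quietly.
        $fragment = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $fragment->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $fragmentXpath = new DOMXPath($fragment);
        $links[$title] = $fragmentXpath->evaluate('string(//a[. = "[link]"][1]/@href)');
    }
    return $links;
}

print_r(extractLinks($xml));
```

Matching the anchor by its text instead of by position (`td[2]/a[2]`) keeps the query working if the number of links in the cell changes, at the cost of depending on the literal label `[link]`.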