PHP: How to break down and parse specific Wikipedia text


I am using the following working example to retrieve a specific Wikipedia page, returned as a SimpleXMLElement object:

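// Send a user agent string with the request.
ini_set('user_agent', 'michael@example.com');

// Load the XML response from the MediaWiki API.
$doc = new DOMDocument();
$doc->load('http://en.wikipedia.org/w/api.php?action=parse&page=Main%20Page&format=xml');

// Convert the DOM tree into a SimpleXMLElement.
$xml = simplexml_import_dom($doc);

// Dump the structure for inspection.
print '<pre>';
print_r($xml);
print '</pre>';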
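
One thing worth checking in the dump is where the page markup actually sits. In the XML format of `action=parse`, the rendered HTML is carried as escaped character data inside a `text` element, so SimpleXML exposes it as one long string rather than as child elements. The sketch below illustrates this with a small hand-written stand-in for the API response (the `$sample` string and its contents are assumptions, not real API output), so it runs without network access.

```php
<?php
// Hand-written stand-in for an API response; the real response is much larger.
$sample = <<<'XML'
<api>
  <parse title="Main Page">
    <text xml:space="preserve">&lt;div id="mp-topbanner"&gt;Welcome to Wikipedia&lt;/div&gt;</text>
  </parse>
</api>
XML;

$xml = simplexml_load_string($sample);

// The escaped markup is decoded into a plain string when the element is cast.
$htmlString = (string) $xml->parse->text;

// SimpleXML sees no child elements under <text>, only character data.
$childCount = count($xml->parse->text->children());

echo $htmlString, "\n";
echo $childCount, "\n";
```

This is why the HTML has to be parsed a second time before individual elements can be addressed.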

After a cup of fresh tea and a banana, I came up with a solution:

echo "Some value: ";
echo $html->getElementById('someid')->nodeValue;

This way I can get the value of a single element, which is what I wanted. For example:

Maybe you are looking for this?

$doc->loadHTMLFile('http://en.wikipedia.org/');

SimpleXMLElement Object
(
    [body] => SimpleXMLElement Object
        (
            [table] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => mp-topbanner
                            [style] => width:100% ...
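
A structure like the one in this dump can be navigated with SimpleXML's property and array syntax. The following is a minimal sketch, assuming a hand-written `$sample` document that mirrors only the visible part of the dump; the table row and cell are invented for illustration.

```php
<?php
// Hand-written stand-in mirroring the visible part of the dump; not real page markup.
$sample = '<html><body><table id="mp-topbanner" style="width:100%">'
        . '<tr><td>Welcome</td></tr></table></body></html>';

$xml = simplexml_load_string($sample);

// Child elements are read as properties; attributes use array syntax.
// Casting to string turns the SimpleXMLElement wrapper into its text value.
$id    = (string) $xml->body->table['id'];
$style = (string) $xml->body->table['style'];
$cell  = (string) $xml->body->table->tr->td;

echo $id, "\n";
echo $style, "\n";
echo $cell, "\n";
```

Note that `simplexml_load_string` only accepts well-formed XML, so real-world HTML usually has to go through `DOMDocument::loadHTML` first.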

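// Send a user agent string with the request.
ini_set('user_agent', 'michael@example.com');

// Stage 1: load the API response, which is XML.
$doc = new DOMDocument();
$doc->load('http://en.wikipedia.org/w/api.php?action=parse&page=Main%20Page&format=xml');

// The rendered page is stored as an HTML string inside the <text> element.
$nodes = $doc->getElementsByTagName('text');
$str = $nodes->item(0)->nodeValue;

// Stage 2: parse that string as HTML in a second DOMDocument.
$html = new DOMDocument();
$html->loadHTML($str);

// 'someid' is a placeholder for the id of the element you want.
echo "Some value: ";
echo $html->getElementById('someid')->nodeValue;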
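
The two-stage approach above fails with an error if the `text` element or the requested id is missing, and `loadHTML` tends to emit warnings on real-world markup. A more defensive variant might look like the sketch below. The helper name `extractElementText` and the `$sample` response are hypothetical and exist only for illustration; the sample is passed in as a string so the example runs without network access.

```php
<?php
/**
 * Hypothetical helper: returns the trimmed text of the element with the
 * given id from a parse-style XML response, or null if anything is missing.
 */
function extractElementText(string $apiXml, string $id): ?string
{
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $result = null;

    $doc = new DOMDocument();
    if ($apiXml !== '' && $doc->loadXML($apiXml)) {
        // Stage 1: pull the HTML string out of the <text> element.
        $textNode = $doc->getElementsByTagName('text')->item(0);

        if ($textNode !== null && trim($textNode->nodeValue) !== '') {
            // Stage 2: parse that string as HTML and look up the id.
            $html = new DOMDocument();
            $html->loadHTML($textNode->nodeValue);
            $element = $html->getElementById($id);

            if ($element !== null) {
                $result = trim($element->textContent);
            }
        }
    }

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $result;
}

// Hand-written stand-in for an API response.
$sample = '<api><parse><text>&lt;div id="intro"&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/div&gt;</text></parse></api>';

var_dump(extractElementText($sample, 'intro'));
var_dump(extractElementText($sample, 'missing'));
```

With a live request, the string returned by `file_get_contents` on the API URL could be passed as the first argument, though that path is not exercised here.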