PHP Simple HTML DOM Parser: get all plain text, not the text of a specific element
I tried all the solutions posted in the related question. Although that question is very similar to mine, its solutions did not work for me.
The plain text I am trying to get sits outside the nested element but inside the element I query (see the example markup below).
Use this XPath query: //div[@id="maindiv"]/text()[2]
Explanation:
// - selects nodes regardless of their position in tree
div - selects elements whose node name is 'div'
[@id="maindiv"] - selects only those divs having the attribute id="maindiv"
/ - sets focus to the div element
text() - selects only text nodes
[2] - selects the second text node (the first is whitespace)
Note! The actual position of the text element may depend on
your preserveWhitespace setting.
Manual: http://www.php.net/manual/de/class.domdocument.php#domdocument.props.preservewhitespace
Example:
$html = <<<EOF
<div id="maindiv">
<b>I dont want this text</b>
I want this text
</div>
EOF;
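// Parse the fragment; loadHTML() wraps it in <html><body> automatically.
$doc = new DOMDocument();
$doc->loadHTML($html);
// Query the second text node that is a direct child of the div.
$selector = new DOMXPath($doc);
$node = $selector->query('//div[@id="maindiv"]/text()[2]')->item(0);
echo trim($node->nodeValue); // I want this text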
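Because the position of the wanted text node may shift with the whitespace handling noted above, a variant that does not rely on the index [2] can be useful. The following is only a sketch: the helper name directText is hypothetical (it is not part of the original answer), and it assumes that every non-blank text node directly inside the matched element is wanted. It filters with the XPath predicate [normalize-space()], which keeps only text nodes that contain something other than whitespace.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text that
// sits directly inside the nodes matched by $xpath, skipping blank text nodes.
function directText(string $html, string $xpath): string
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $selector = new DOMXPath($doc);
    $nodes = $selector->query($xpath);
    if ($nodes === false) {
        return ''; // the XPath expression could not be evaluated
    }

    $parts = [];
    foreach ($nodes as $node) {
        $parts[] = trim($node->nodeValue);
    }
    return implode(' ', $parts);
}

$html = <<<EOF
<div id="maindiv">
<b>I dont want this text</b>
I want this text
</div>
EOF;

// text()[normalize-space()] matches only text nodes that are not pure whitespace.
echo directText($html, '//div[@id="maindiv"]/text()[normalize-space()]');
```

With this predicate the leading whitespace node is ignored whether or not the parser keeps it, so the query should not need adjusting when the whitespace settings change.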
Comment: Thanks for the quick reply. Could you explain it to me?
Alternatively, with Simple HTML DOM, remove the first <b> element and read what is left:
// $part is assumed to be the Simple HTML DOM node for the div.
$part->find('b', 0)->outertext = '';
echo $part->innertext; // I want this text
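For readers who do not have Simple HTML DOM installed, the same "remove the first <b>, then read the rest" idea can be sketched with PHP's built-in DOM extension. This is an illustration under assumptions, not the answer's own code: the function name textWithoutFirstChild and its parameters are hypothetical, and the container is assumed to be identifiable by an id that contains no double quotes.

```php
<?php
// Hypothetical helper: removes the first <$tag> descendant of the element
// with the given id and returns the remaining text content.
function textWithoutFirstChild(string $html, string $id, string $tag): string
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumption: $id contains no double quotes, so it can be embedded as-is.
    $selector = new DOMXPath($doc);
    $container = $selector->query('//*[@id="' . $id . '"]')->item(0);
    if ($container === null) {
        return ''; // no element with that id
    }

    // Drop the first matching descendant, if there is one.
    $unwanted = $container->getElementsByTagName($tag)->item(0);
    if ($unwanted !== null) {
        $unwanted->parentNode->removeChild($unwanted);
    }

    return trim($container->textContent);
}

$html = <<<EOF
<div id="maindiv">
<b>I dont want this text</b>
I want this text
</div>
EOF;

echo textWithoutFirstChild($html, 'maindiv', 'b');
```

Unlike the XPath query on text nodes, this approach also keeps text that sits inside other nested elements, which may or may not be desired.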