PHP Simple HTML DOM: finding text between divs

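I need to extract the text between the divs here ("The third of four...") using the Simple HTML DOM PHP library.

I think I have tried everything: next_sibling() returns the comment, and next_sibling()->next_sibling() returns the <br /> tag. Ideally, I would like to get all the text from the end of the first comment up to the next tag.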

<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning Melodies series features...<a href='index.php?page=tickets&month=20140201'>&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->

My next step would be to take the innertext of div.left and then strip out all the divs inside it, but that seems like a lot of hassle. Is there a simpler way?
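The fallback described above (take div.left and ignore the elements nested inside it) can be sketched as follows. This is only an illustration under stated assumptions: it uses PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM, and the variable names are hypothetical.

```php
<?php
// Sample markup copied from the question.
$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201">&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->';

// Assumption: PHP's bundled DOM extension is used here, not Simple HTML DOM.
// Collect libxml warnings internally (e.g. for the bare "&" in the href).
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$left  = $xpath->query('//div[@class="left"]')->item(0);

$text = '';
if ($left !== null) {
    foreach ($left->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE && $child->nodeName === 'div') {
            // A nested div was passed: forget any text collected before it.
            $text = '';
        } elseif ($child->nodeType === XML_TEXT_NODE) {
            // Keep only text that sits directly inside div.left.
            $text .= $child->nodeValue;
        }
    }
}
$text = trim($text);

echo $text, PHP_EOL;
```

Because only direct text children are collected, the link text inside the <a> element is left out, and resetting the buffer at each nested div drops the leading "Bla-bla.." text.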

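Why not use ->plaintext on the div selected by its class? It outputs the text as needed. Note that find() returns an array unless an index is passed, so the index 0 is needed to get a single element:

$html->find("div[class=left]", 0)->plaintext;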

Martti

Use find('text', $index) to get a text block, where $index is the index of the wanted text among all text blocks.

So in this case, it is:

echo $html->find('text', 3);

// OUTPUT:
The third of four performances in the Society's Morning Melodies series features...

You can read more online.

Edit:

Here is working code:

$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201">&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->';

//Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($input);

// Using $index
echo $html->find('text', 3);

echo "<hr>";

// Or, it's the 3rd element starting from the end
$text = $html->find('text');
echo $text[count($text)-3];

// Clear DOM object
$html->clear();
unset($html);

// OUTPUT
The third of four performances in the Society's Morning Melodies series features...
The third of four performances in the Society's Morning Melodies series features...

Actually, I don't think Simple HTML DOM provides the tools to do this, since there are no "get-before" or "get-after" type commands. Please let me know if I am wrong.

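Can't you put this text in a span with an id and select it via jQuery or document.getElementById?

Have you used jQuery's closest() and children?

Are you trying to do this in PHP only? Is the document valid XHTML? If so, load it as an XML document using something like SimpleXML and then use XPath for the win.

If you can help it, don't force it. Like I said, if it is valid XHTML, it is much more efficient to use an XML parser like SimpleXML.

In DOMXPath I would do

//text()[preceding::comment()[contains(., "/end of div.float")] and following::comment()[contains(., "/end of div.left")]]

or something like that... I don't know what the equivalent would be in "Simple" HTML DOM...

Because I only need the part from that point on, without the whole page of content before it. I wish I could somehow hook onto that comment tag. Unfortunately, there is no way of knowing how many text blocks there will be on other pages. Useful tip, though.

@Natalia isn't this the last piece of text in your element? Then you know where it is!

Sorry, but echo $html->find('text', 3); actually outputs nothing here. Is there anything else I should try?

@Natalia, demo added :)

Cool. I wish it worked on the real page. I just looped through all the text blocks: there are more than 300 of them, including all kinds of JavaScript snippets, and of course the position of the wanted text block will differ on every page. [count($text)-3] will come in handy some day! :)

Besides previous sibling and next sibling, CSS has the + and ~ combinators. Unfortunately, Simple HTML DOM does not support them. You might consider switching to a library that does.

@pguardiario, wow, this is very new (4 days old) and seems to support CSS3 perfectly... I'll wait a while until it gets enough feedback before replacing Simple HTML DOM, and I'll try to test it soon too... Thanks for the link :)

@pguardiario - I'd like to play some more with "Advanced HTML Dom". Is there a list of the selectors it supports?

It's the list of CSS3 selectors. Some of them make no sense here, like E:hover or a:visited, but the ones that can work should work.

I can't even get started. I posted a message on the SourceForge forum.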
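The idea of hooking onto the comment, raised in the comments above, can be sketched with PHP's bundled DOMXPath. This is a minimal illustration under assumptions, not a Simple HTML DOM solution: the query is one possible variant of the XPath suggested above, it relies on the "/end of div.float" comment being present on every page, and the variable names are hypothetical.

```php
<?php
// Sample markup copied from the question.
$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201">&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->';

// Assumption: PHP's bundled DOM extension is used here, not Simple HTML DOM.
// Collect libxml warnings internally (e.g. for the bare "&" in the href).
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Start at the comment that closes div.float, then take the first
// non-whitespace text node among its following siblings.
$query = '//comment()[contains(., "/end of div.float")]'
       . '/following-sibling::text()[normalize-space()][1]';
$nodes = $xpath->query($query);

// $result stays null when the comment or the text node is missing.
$result = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;

echo $result, PHP_EOL;
```

Anchoring on the comment avoids depending on a fixed text-block index, which the asker notes varies from page to page.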