PHP: get the string between two delimiters

php html regex string parsing

I am trying to get the href values from the following string:

<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/>
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/>

In this case, I should get "/path/to/somewhere" and "/path/to/somewhere2".

I tried the following, but I only get empty strings:

$htmlc = str_replace(' ', '', $htmlc);
//$htmlc contains the string I am parsing with the spaces removed
preg_match_all('/width=\"300\"class=\"topborder\"><ahref=\"([^\"class=\"bigger\"]+)/', $htmlc, $hrefvals);

Try a pattern like this:
/width=\"300\"class=\"topborder\"><ahref=\"(.*?)"/
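A minimal sketch of why the suggested pattern behaves differently from the original attempt, assuming the input is the two-row sample from the question with spaces stripped as the asker describes. A negated character class such as [^\"class=\"bigger\"] excludes individual characters, not the substring class="bigger", so the match stops at the first "a" in "path". The variable names $broken and $fixed are illustrative only.

```php
<?php
// Sample input from the question; spaces are removed exactly as the asker does.
$htmlc = '<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/>
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/>';
$htmlc = str_replace(' ', '', $htmlc);

// Original attempt: the bracket expression is a set of single characters
// (", c, l, a, s, =, b, i, g, e, r), so capturing stops after "/p".
preg_match_all('/width=\"300\"class=\"topborder\"><ahref=\"([^\"class=\"bigger\"]+)/', $htmlc, $broken);

// Suggested pattern: a lazy group that stops at the next double quote.
preg_match_all('/width=\"300\"class=\"topborder\"><ahref=\"(.*?)"/', $htmlc, $fixed);

var_dump($broken[1], $fixed[1]);
```

Index 1 of each result array holds the captured group; index 0 holds the full match.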

$(document).ready(function(){
  $("button").click(function(){
    alert($("#blah").attr("href"));
  });
});

Then:
<a href="http://www.blah.com" id="blah">Blah</a></p>
<button>Show href Value</button>


Is this what you mean?
You only need DOM and XPath. Regular expressions are not designed for parsing HTML.
<?php
$html = <<<HTML
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/>
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html);
// replace with @$dom->loadHTMLFile('http://...') if you want to parse a URL
$xpath = new DOMXPath($dom);
$links = array_map(function ($node) {
        return $node->getAttribute('href');
    }, iterator_to_array($xpath->query("//td[@class='topborder']/a[@class='bigger']")));
var_dump($links);

Or you can try:
$htmlc = '
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/>
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/>
';

preg_match_all('~(?<=<a\shref=")[^"]*~', $htmlc, $hrefvals);
var_dump($hrefvals);
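The lookbehind pattern above matches every href that directly follows "<a ", whatever class the link has. As a hedged sketch only, the variant below also requires class="bigger" right after the href attribute. It assumes the attributes always appear in that order, which the sample rows satisfy but real pages may not; the extra "/unwanted" row is hypothetical test data.

```php
<?php
// The two rows from the question plus one hypothetical unwanted link.
$htmlc = '
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/>
<td valign="top" width="300"class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/>
<td valign="top" width="300"class="topborder"><a href="/unwanted" class="small">other</a><br/>
';

// Capture the href only when class="bigger" follows it in the same tag.
preg_match_all('~<a\s+href="([^"]*)"\s+class="bigger"~', $htmlc, $hrefvals);

// $hrefvals[1] holds the captured href values.
var_dump($hrefvals[1]);
```

If the attribute order can vary, a DOM parser is the safer choice.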

Your ahref= should be a href=
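Are you sure? There is no space between a and href in ahref; is that intentional?
I actually removed the spaces beforehand; sorry, I forgot to copy that into the question.
Removing the spaces is a bad idea (and a bad idea for what you are trying to do, too).
Hmm, not quite. I am trying to parse a string that contains another page's HTML and get href values from that HTML. The problem is that the string I posted is only a small part of the whole string. The whole string contains many other href values with the topborder class that I don't want. The href values I need are those where the td has the "topborder" class and the a has the "bigger" class. Do you know how to work that into your answer?
@SKLAK I have updated my answer. Just add [@class='bigger'] to the end of the $xpath->query() expression. Can you tell me whether it works for you? :)
You're welcome! Glad I could help. Cheers! :)
Never mind, I copied the pattern incorrectly.
This works too, but it picks up some unwanted href values. I think the DOM and XPath answer is more efficient and cleaner. Either way, you did answer my question correctly. Thanks.
I don't know about efficiency, but it is definitely cleaner and easier to refactor and to modify or reuse in the future ;)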
array(2) {
  [0]=>
  string(18) "/path/to/somewhere"
  [1]=>
  string(19) "/path/to/somewhere2"
}
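The class filter discussed in the comments can be sketched as follows. This is an illustration under assumptions, not the answerer's exact code: the third row with "/unwanted" is hypothetical, the markup is wrapped in a table for clarity, and libxml_use_internal_errors() is added only in case the parser reports warnings on loose markup.

```php
<?php
// Hypothetical input: the two rows from the question plus one link
// whose <a> does not have the "bigger" class.
$html = <<<HTML
<table><tr>
<td valign="top" width="300" class="topborder"><a href="/path/to/somewhere" class="bigger">random1</a><br/></td>
<td valign="top" width="300" class="topborder"><a href="/path/to/somewhere2" class="bigger">random2</a><br/></td>
<td valign="top" width="300" class="topborder"><a href="/unwanted" class="small">other</a></td>
</tr></table>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Without the predicate on <a>, every link inside a topborder cell is returned.
$all = [];
foreach ($xpath->query("//td[@class='topborder']/a") as $node) {
    $all[] = $node->getAttribute('href');
}

// With [@class='bigger'], only links carrying that class are returned.
$filtered = [];
foreach ($xpath->query("//td[@class='topborder']/a[@class='bigger']") as $node) {
    $filtered[] = $node->getAttribute('href');
}

var_dump($all, $filtered);
```

Note that @class='bigger' compares the whole attribute value, so an element with several classes would need a contains()-style test instead.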