PHP: getting <a> tags with preg_match_all and cURL




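I've been looking for a solution to my problem for days. I'm using cURL to fetch the content of a web page and then preg_match_all to restyle that content, but I ran into a problem when looking for certain tags in the document.

I want preg_match_all to find every <a> tag that is followed by a <strong> tag, and then store the href values of those <a> tags in an array variable.
Here is what I came up with: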

preg_match_all("~(<a href=\"(.*)\"><strong>\w+<\/strong>)~iU", $result, $link);

Can anyone help me?

I'd strongly recommend going with DOMDocument.
This code should do the trick:

<?php
/**
 * @author Jay Gilford
 * @edited KHMKShore:stackoverflow
 */

/**
 * get_links()
 *
 * @param string $url
 * @return array
 */
function get_links($url)
{
    // Create a new DOM Document to hold our webpage structure
    $xml = new DOMDocument();

    // Load the URL's contents into the DOM (the @ suppresses any errors from invalid HTML)
    @$xml->loadHTMLFile($url);

    // Empty array to hold all links to return
    $links = array();

    // Loop through each <a> element in the DOM
    foreach ($xml->getElementsByTagName('a') as $link) {
        // If it has a <strong> tag in it, save the href link.
        // DOMNodeList exposes ->length; count() on it only works from PHP 7.2.
        if ($link->getElementsByTagName('strong')->length > 0) {
            $links[] = array('url' => $link->getAttribute('href'), 'text' => $link->nodeValue);
        }
    }

    // Return the links
    return $links;
}
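Since the question already fetches the page with cURL, the same idea can be applied to the HTML string instead of a URL. The following is only a sketch under a few assumptions: the helper names fetch_html() and links_with_strong() are hypothetical, the cURL options shown are one reasonable choice rather than the asker's actual setup, and the sample markup in $result is made up for illustration.

```php
<?php
/**
 * Fetch a page body with cURL, or return null on failure.
 * Hypothetical helper: the name and the options are assumptions.
 */
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

/**
 * Return the href and text of every <a> element that contains a <strong>.
 * Hypothetical helper operating on an HTML string such as the cURL result.
 */
function links_with_strong(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        if ($a->getElementsByTagName('strong')->length > 0) {
            $links[] = ['url' => $a->getAttribute('href'), 'text' => $a->nodeValue];
        }
    }
    return $links;
}

// Made-up sample standing in for the cURL response
$result = '<p><a href="one.php"><strong>One</strong></a> <a href="two.php">Two</a></p>';
print_r(links_with_strong($result));
```

With a real page, $result would come from fetch_html($url) instead of the inline sample; only the first link in the sample is returned, because the second one has no <strong> inside it.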

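First, a regular expression can easily fail on something like

<a alt="cow > moo" href="cow.php">moo</a>

Second, your regex is slightly off; the following will work:

~(<a href="(.*)"><strong>\w+</strong></a>)~

Third, and most importantly, if you want to be sure of extracting what you want without failures, DOMDocument is the best route, as @KHMKShore pointed out.

This won't hold up; too late.

Make sure $result and $link are the right way around... Beyond that, we'd need to see some sample HTML to write the regex…

Thanks a lot! You're right, DOMDocument is the right choice! Thank you very much!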
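To make the first point of the regex answer concrete, here is a small sketch of how a pattern of this kind behaves. The sample markup is hypothetical, and the pattern uses a lazy (.*?) in place of the U modifier, which is a substitution on my part rather than the exact expression from the thread.

```php
<?php
// Hypothetical sample: one plain link and one with an extra attribute before href
$result = '<a href="one.php"><strong>One</strong></a>'
        . '<a alt="cow > moo" href="cow.php"><strong>moo</strong></a>';

// (.*?) is lazy, so it stops at the first closing quote; i ignores case
preg_match_all('~<a href="(.*?)"><strong>\w+</strong></a>~i', $result, $link);

// $link[0] holds the full matches, $link[1] the captured href values
print_r($link[1]);
```

Only one.php is captured here. The second link is skipped because href is not the first attribute, which is the kind of silent miss that a DOM parser avoids.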