PHP regex to replace a link if it has no data attribute

php html regex parsing replace

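I need to loop through a bunch of HTML code and remove the links that do not have a data attribute. The text looks like this:

\r\n
\nOffer – Get up to a £400 deposit bonus when you sign up with Fanduel.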

If you are set on a regex and do not want to use a parser, try this:

<a (?!data-link=)[^>]*>((?!<\/a>).*?)<\/a>

Let me know if you need further explanation.
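
To show how the pattern above might be applied, here is a minimal sketch using preg_replace. The sample string and variable names are assumptions made for illustration, since the question's original HTML was not preserved. The pattern is the one from the answer, written with ~ delimiters (so the slash needs no escaping) and the s flag so that link text spanning several lines still matches.

```php
<?php
// Hypothetical sample input; not taken from the original question.
$html = '<p><a href="/offer">Offer</a> and <a data-link="keepLink" href="/keep">Keep</a></p>';

// Pattern from the answer, with ~ as the delimiter and the s (dotall) flag.
$pattern = '~<a (?!data-link=)[^>]*>((?!</a>).*?)</a>~s';

// Replace each matching link with its inner content (capture group 1).
$result = preg_replace($pattern, '$1', $html);

echo $result, PHP_EOL;
// prints: <p>Offer and <a data-link="keepLink" href="/keep">Keep</a></p>
```

The first link is reduced to its text, while the link that starts with data-link= is left alone.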
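The DOMDocument extension is available in PHP by default. It is probably faster, and it is designed for exactly what you are trying to achieve. You can use it to load the document and find every link whose data-link attribute is not set to 'keepLink', like this: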
$dom = new DOMDocument;
$dom->loadHTMLFile('http://www.example.com'); // load the file

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//a[not(@data-link=\'keepLink\')]'); // search for links that do not have the 'data-link' attribute set to 'keepLink'

foreach($nodes as $element){
    $textInside = $element->nodeValue; // get the text inside the link
    $parentNode = $element->parentNode; // save parent node
    $parentNode->replaceChild(new DOMText($textInside), $element); // replace the link with a plain text node
}

$myNewHTML = $dom->saveHTML(); // see http://php.net/manual/ro/domdocument.savehtml.php for limitations such as auto-adding of doc-type

echo $myNewHTML;

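Proof of concept:
Keep in mind that this only carries over the text value of each element that lacks the data-link='keepLink' attribute value; any markup nested inside those links is dropped.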
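
If nested markup inside the removed links should survive, one possible variation is to move each link's child nodes in front of it and then drop the empty link. This is only a sketch under assumptions: the sample string is hypothetical, the XPath //a[not(@data-link)] targets links with no data-link attribute at all (rather than comparing against 'keepLink'), and the libxml flags are used to keep saveHTML() from adding a doctype and html/body wrappers.

```php
<?php
// Hypothetical sample markup; the question's original HTML was not preserved.
$html = '<p><a href="/offer"><b>Offer</b> now</a> or '
      . '<a href="/keep" data-link="keepLink">keep me</a></p>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// The XPath result is a static list, so it is safe to modify the tree here.
foreach ($xpath->query('//a[not(@data-link)]') as $link) {
    // Move every child node out of the link, keeping the original order.
    while ($link->firstChild !== null) {
        $link->parentNode->insertBefore($link->firstChild, $link);
    }
    // The link is now empty and can be removed.
    $link->parentNode->removeChild($link);
}

$output = $dom->saveHTML();
echo $output;
```

With this input, the bold element and the text from the first link should remain in place, and the second link should be untouched.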
This looks like a task that should not be done with a regex; rely on a proper DOM parser instead. Use DOMDocument to edit the HTML and DOMXPath to find the targets.
@CasimiritHippolyte This?
Yes, that one.
@CasimiritHippolyte Thanks, I'll take a look. Is there any reason I should use DOMDocument rather than regex? Sorry if this is a stupid question, but I'm new to this.
Won't this remove the contents of --> Fanduel? You need to replace the element with its text content rather than simply removing it.
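
One concrete reason to prefer the parser, sketched here on a made-up input: the lookahead in the regex only inspects the text immediately after "<a ", so a link whose data-link attribute is not the first attribute is still stripped. An XPath attribute test does not depend on attribute order. The sample string and variable names below are assumptions for illustration.

```php
<?php
// Hypothetical input: data-link is the second attribute, not the first.
$html = '<p><a href="/keep" data-link="keepLink">keep me</a></p>';

// Regex from the answer: the lookahead fails to notice data-link here,
// so the link is stripped even though it is marked to be kept.
$viaRegex = preg_replace('~<a (?!data-link=)[^>]*>((?!</a>).*?)</a>~s', '$1', $html);

// XPath checks for the attribute wherever it appears in the tag.
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
$unmarked = $xpath->query('//a[not(@data-link)]')->length;

var_dump($viaRegex);  // string(14) "<p>keep me</p>"
var_dump($unmarked);  // int(0)
```

The regex result has lost the link, while the XPath query finds no unmarked links to remove.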