PHP: grab content between two strings

php regex parsing

// get CONTENT from united domains footer
$content = file_get_contents('http://www.uniteddomains.com/index/footer/');

// remove spaces from CONTENT
$content = preg_replace('/\s+/', '', $content);

// match all tld tags
$regex = '#target="_parent">.(.*?)</a></li><li>#';
preg_match($regex, $source, $matches);

print_r($matches);

I want to match all TLDs.

Each TLD is preceded by

target="_parent">.

and followed by

</a></li><li>

I want to end up with an array like

array('africa', 'amsterdam', 'bnc' ...etc)

What am I doing wrong?

Note: the second step, removing all whitespace, is only there to simplify things.

Here is a regex that will do this for that page:

\.\w+(?=</a></li>)

Here are the results:

.africa, .amsterdam, .boston, .brussels, .berlin, .boston, .budapest, .gent, .hamburg, .koeln, .london, .madrid, .melbourne, .moscow, .miami, .nagoya, .nyc, .okinawa, .osaka, .paris, .quebec, .roma, .ryukyu, .stockholm, .sydney, .tokyo, .vegas, .wien, .yokohama, .africa, .arab, .bayern, .bzh, .cymru, .kiwi, .lat, .scot, .vlaanderen, .wales, .app, .blog, .chat, .cloud, .digital, .email, .mobile, .online, .site, .mls, .secure, .web, .wiki, .associates, .business, .car, .careers, .contractors, .Cloth, .design, .equipment, .estate, .gallery, .graphics, .hotel, .immo, .Inv, .solutions, .sucks, .Taxis, .trade, .archi, .adult, .bio, .center, .city, .club, .cool, .date, .earth, .energy, .family, .free, .green, .live, .lol, .love, .med, .ngo, .news, .phone, .pictures, .radio, .reviews, .rip, .team, .technology, .today, .voting, .buy, .eus, .gay, .eco, .hiv, .irish, .one, .pics, .porn, .sex, .singles, .vin, .vip, .bar, .pizza, .wine, .bike, .book, .holiday, .horse, .film, .music, .party, .email, .pets, .play, .Rock, .rugby, .ski, .sport, .surf, .tour, .video

Using:

$content = file_get_contents('http://www.uniteddomains.com/index/footer/');
preg_match_all('/\.\w+(?=<\/a><\/li>)/m', $content, $matches);
print_r($matches);
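
To try the pattern without fetching the live page, here is a minimal sketch run against a small hand-written fragment. The `$html` sample is an assumption about the footer's markup, not a copy of it, and the capture group is an addition so that the names come back without the leading dot, as the question asked.

```php
<?php
// Hypothetical sample markup; the real footer may differ.
$html = '<ul><li><a href="/africa" target="_parent">.africa</a></li>'
      . '<li><a href="/amsterdam" target="_parent">.amsterdam</a></li>'
      . '<li><a href="/berlin" target="_parent">.berlin</a></li></ul>';

// preg_match_all collects every match; preg_match stops after the first one.
// The lookahead (?=<\/a><\/li>) requires the closing tags without consuming them.
preg_match_all('/\.(\w+)(?=<\/a><\/li>)/', $html, $matches);

// $matches[0] holds the full matches (with the dot), $matches[1] the captured names.
$tlds = $matches[1];
print_r($tlds);
```

Note that the code in the question calls `preg_match` on `$source`, while the fetched page is stored in `$content`; both differences matter when comparing it with this sketch.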

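This is still HTML parsing and should be done with an HTML parser rather than a regex.

This is not HTML parsing; it is finding a specific pattern in a string that happens to be HTML.

How would I make it match lowercase only? With this exact code, it also extracts "Geography and Travel" and the other heading text.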
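
On the lowercase question, one possible approach is to replace `\w` with the character class `[a-z]`, so entries that contain uppercase letters are skipped. This is only a sketch: the sample `$html`, including the capitalized `.Travel` entry, is made up for illustration, and TLDs containing digits or hyphens would also be skipped by this class.

```php
<?php
// Hypothetical sample: one capitalized, heading-like entry among the TLD links.
$html = '<li><a target="_parent">.Travel</a></li>'
      . '<li><a target="_parent">.tokyo</a></li>'
      . '<li><a target="_parent">.wien</a></li>';

// [a-z]+ accepts lowercase ASCII letters only; there is no "i" flag, so case matters.
preg_match_all('/\.([a-z]+)(?=<\/a><\/li>)/', $html, $matches);

// $matches[1] holds the captured names without the leading dot.
$lowercaseTlds = $matches[1];
print_r($lowercaseTlds);
```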

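// Load the footer into a DOM tree; @ suppresses warnings from malformed HTML
$doc = new DOMDocument();
@$doc->loadHTMLFile('http://www.uniteddomains.com/index/footer/');

// Select the link text of the nested list items that have no class attribute
$xpath = new DOMXPath($doc);
$items = $xpath->query('/html/body/div/ul/li/ul/li[not(@class)]/a[@target="_parent"]/text()');

// Concatenate the ".tld" strings, then split on the dots
$result = '';
foreach ($items as $item) {
    $result .= $item->nodeValue;
}
$result = explode('.', $result);
array_shift($result); // drop the empty element before the first dot
print_r($result);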
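
The DOM approach can also be tried offline. The following is a sketch under assumptions, not the answer's exact method: the sample markup and its `heading` class are invented, the XPath is shortened to `//li[not(@class)]`, and `libxml_use_internal_errors` stands in for the `@` operator. It collects each link's text into the array directly instead of concatenating and splitting.

```php
<?php
// Hypothetical sample markup; the real footer's structure may differ.
$html = '<ul><li class="heading"><a target="_parent">Geography</a></li>'
      . '<li><a href="#" target="_parent">.africa</a></li>'
      . '<li><a href="#" target="_parent">.tokyo</a></li></ul>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Every <a target="_parent"> whose parent <li> has no class attribute.
$links = $xpath->query('//li[not(@class)]/a[@target="_parent"]');

$names = [];
foreach ($links as $link) {
    // Strip surrounding whitespace, then the leading dot.
    $names[] = ltrim(trim($link->textContent), '.');
}
print_r($names);
```

Because the heading item carries a class attribute in this sample, the `not(@class)` predicate filters it out before any string handling happens.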