PHP XPath: what is wrong with preg_match? Undefined offset: 1
I am trying to get the ID from the "Property ID :" text. My code is as follows:
<?php
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
/*echo $xpath->evaluate("normalize-space(substring-before(substring-after(//p[contains(text(),'Property ID:')][1], 'Property ID:'), '–'))");*/
$id = $xpath->evaluate('//div[contains(@class,"property-table")]')->item(0)->nodeValue;
preg_match("/Property ID :(.*)/", $id, $matches);
echo $matches[1];
What is wrong? If I create the string myself like this:
$id ="Property Details Property Type : Apartment Price $ 350 pm Building Size 72 Sqms Property ID : 1001192296";
and substitute it in my code, it works. So what is the difference between the data I created myself and the data fetched through XPath?
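Thanks in advance for your help.

You need to check whether preg_match() actually found anything. If there is no result, there will be no $matches[1]. You should use if (count($matches) > 1) { … } to get around the problem you are having.

Your preg_match() does not work because the nodeValue you get from XPath looks exactly like this: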
Property Details
Property Type :
Apartment
Price
$ 350 pm
Building Size
72 Sqms
Property ID
:
1001192296
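To see the effect of those line breaks in isolation, here is a minimal sketch. The two strings are hypothetical stand-ins (one shaped like the nodeValue above, one like the hand-made string from the question), and the access to the capture group is guarded by the return value of preg_match().

```php
<?php
// Hypothetical stand-ins: same words, different whitespace between them.
$fromXpath = "Property ID\n:\n1001192296";   // line breaks, as in the nodeValue
$handMade  = "Property ID : 1001192296";     // single spaces, as typed by hand

$pattern = "/Property ID :(.*)/";

// preg_match() returns 1 on a match, 0 on no match and false on error.
$foundInXpath = preg_match($pattern, $fromXpath, $m1);
$foundInHand  = preg_match($pattern, $handMade, $m2);

// Only read index 1 when a match was reported; otherwise it does not exist.
echo $foundInXpath === 1 ? trim($m1[1]) : "no match", "\n";
echo $foundInHand  === 1 ? trim($m2[1]) : "no match", "\n";
```

The literal spaces in the pattern cannot match a newline, so the first call finds nothing and leaves the match array empty, which is what triggers "Undefined offset: 1" in the unguarded code.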
So you have to try it like this:
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
/*echo $xpath->evaluate("normalize-space(substring-before(substring-after(//p[contains(text(),'Property ID:')][1], 'Property ID:'), '–'))");*/
$id = $xpath->evaluate('//div[contains(@class,"property-table")]')->item(0)->nodeValue;
$id = preg_replace('!\s+!', ' ', $id);
preg_match("/Property ID :(.*)/", $id, $matches);
echo $matches[1];
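This line ($id = preg_replace('!\s+!', ' ', $id);) collapses every run of whitespace between words (tabs, line breaks, repeated spaces) into a single space.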
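As a sketch of the two ways to deal with the whitespace, the snippet below compares collapsing it first with letting the pattern absorb it through \s*. The sample text is hypothetical, and the (\d+) group assumes the ID consists of digits only.

```php
<?php
// Hypothetical sample shaped like the nodeValue (tabs and line breaks included).
$raw = "Property Details\n  Property Type :\n Apartment\n Property ID\n\t:\n  1001192296\n";

// Option 1: collapse whitespace first, then match with literal spaces.
$flat = trim(preg_replace('/\s+/', ' ', $raw));
$idFromFlat = preg_match('/Property ID :\s*(\d+)/', $flat, $m) === 1 ? $m[1] : null;

// Option 2: leave the text untouched and let \s* absorb any whitespace.
$idFromRaw = preg_match('/Property ID\s*:\s*(\d+)/', $raw, $m) === 1 ? $m[1] : null;

var_dump($idFromFlat, $idFromRaw);
```

Capturing (\d+) instead of (.*) also avoids the leading space that the original pattern would include in the result.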
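Update:
Following the comments below, I now use $xpath->evaluate() to get the full text of the HTML and try to match all property IDs (for example, digits only, or "P" followed by digits).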
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
// this only returns the text of the whole page without html tags
$id = $xpath->evaluate( "//html" )->item(0)->nodeValue;
$id = preg_replace('!\s+!', ' ', $id);
// not a good regex, but matches the property IDs
preg_match_all("/Property ID( |):[ |]((\w{0,1}[-]|)\d*)/", $id, $matches);
// because of the extra groups, the IDs are now in $matches[2]
foreach( $matches[2] as $property_id ) {
echo $property_id."<br>";
}
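Since the answer itself calls its regex "not a good regex", here is one possible simplification, offered as a sketch rather than a drop-in replacement. The sample text and the "P-12345" form are hypothetical, and with a single capturing group the IDs land in $matches[1] rather than $matches[2].

```php
<?php
// Hypothetical page text with several IDs, already collapsed to single spaces.
$text = "Property ID : 1001192296 ... Property ID: P-12345 ... Property ID :7788";

// \s* tolerates optional whitespace around the colon;
// (?:[A-Za-z]-)? optionally allows one letter plus a hyphen before the digits.
preg_match_all('/Property ID\s*:\s*((?:[A-Za-z]-)?\d+)/', $text, $matches);

// One capturing group, so the IDs are in $matches[1].
foreach ($matches[1] as $propertyId) {
    echo $propertyId, "\n";
}
```

Requiring at least one digit with \d+ also prevents empty matches, which the \d* in the original pattern would allow.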
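Yes, $matches[1] is undefined. But if I create a string with that data, it works. Why does it not work with the data fetched through XPath?

If preg_match does not match, the array $matches has no index 1.

The notice comes from echo $matches[1];

Is there a way to solve this? What should I use to get the data instead of XPath?

The error is in preg_match(), not in XPath.

Apparently you are not parsing the HTML correctly; you need to recheck your XPath expression.

But with this XPath expression I can get the data I need.

I will upvote this, because you are right about the error.

I use this: preg_match("/Property ID:(.*?(=\s)/", $id, $matches); It does not work, but when I tested it, it worked. I would like to know what goes wrong when I substitute it in my code.

The problem is that what you test is never what you actually get. The ID you are trying to match is not in the variable $id, because you only select this div: //div[contains(@class,"property-table")]. I will update my answer so that it gets the full HTML and matches all property IDs.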
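On the question of whether XPath can still be used: one option, sketched below, is to let XPath do the whitespace cleanup with normalize-space(), as the commented-out line in the question already hints. The markup here is a hypothetical stand-in for the real page; only the class name and the ID are taken from the thread.

```php
<?php
// Hypothetical markup; only "property-table" and the ID come from the question.
$html = <<<HTML
<div class="property-table">
  <span>
      Property ID
  </span><span>
      :
  </span><span>
      1001192296
  </span>
</div>
HTML;

libxml_use_internal_errors(true); // collect parser warnings instead of using @
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// normalize-space() trims and collapses whitespace; evaluate() returns a string here.
$text = $xpath->evaluate('normalize-space(//div[contains(@class,"property-table")])');

if (preg_match('/Property ID\s*:\s*(\d+)/', $text, $matches) === 1) {
    echo $matches[1], "\n";
} else {
    echo "Property ID not found\n";
}
```

This keeps the div-level XPath from the question and removes the need for a separate preg_replace() step, while the if/else still covers pages where the label is missing.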