PHP XPath table scraping
This is my table. I want to get the trading code and news title from each table and store them in my database. Please help me scrape this data.
$html="
<table border='0' cellspacing='3' width='100%'>
<tr>
<td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
<td width='85%'>RCL</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News Title:</b></font></td>
<td>DSENEWS: Withdrawal of Authorized Representative</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News:</b></font></td>
<td align='justify'>Withdrawal of Authorized Representative: Royal Capital Ltd., DSE TREC No. 21, has withdrawn one of its Authorized Representatives, Mr. Md. Zikrul Haque, with immediate effect.</td>
</tr>
</table>
<br>
<table border='0' cellspacing='3' width='100%'>
<tr>
<td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
<td width='85%'>ISL</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News Title:</b></font></td>
<td>DSENEWS: Withdrawal of Authorized Representative</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News:</b></font></td>
<td align='justify'>Withdrawal of Authorized Representative: IDLC Securities Ltd., DSE TREC No. 58, has withdrawn one of its Authorized Representatives, Mr. Mohammad Ziaur Rahman, with immediate effect.</td>
</tr>
</table>
";
A useful reference when constructing XPath expressions (and CSS selectors) can be found here.
$html="<table border='0' cellspacing='3' width='100%'>
<tr>
<td width='15%'>Trading Code:</td>
<td width='85%'>RCL</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News Title:</b></font></td>
<td>DSENEWS: Withdrawal of Authorized Representative</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News:</b></font></td>
<td align='justify'>Withdrawal of Authorized Representative: Royal Capital Ltd., DSE TREC No. 21, has withdrawn one of its Authorized Representatives, Mr. Md. Zikrul Haque, with immediate effect.</td>
</tr>
</table>
<br>
<table border='0' cellspacing='3' width='100%'>
<tr>
<td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
<td width='85%'>ISL</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News Title:</b></font></td>
<td>DSENEWS: Withdrawal of Authorized Representative</td>
</tr>
<tr>
<td><font color='#3366FF'><b>News:</b></font></td>
<td align='justify'>Withdrawal of Authorized Representative: IDLC Securities Ltd., DSE TREC No. 58, has withdrawn one of its Authorized Representatives, Mr. Mohammad Ziaur Rahman, with immediate effect.</td>
</tr>
</table>";
/* Construct XPath expression to find required data*/
$query='//td[contains( . , "Trading Code" )]/following-sibling::td|//td[contains( . , "News Title" )]/following-sibling::td';
/* create the DOMDocument & DOMXPath objects */
$dom=new DOMDocument;
$dom->loadHTML( $html );
$xp=new DOMXPath( $dom );
/* Run the query to find nodes */
$col=$xp->query( $query );
/* Process the nodes */
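/* query() returns false on an invalid expression, so check before iterating */
if( $col !== false && $col->length > 0 ){
    foreach( $col as $td ){
        /* do something with data found */
        echo trim( $td->nodeValue ) . PHP_EOL;
    }
}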
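The query above returns all matching cells as one flat list, so codes and titles are not yet grouped by table. To store them in a database, it may help to walk each table separately and read both values relative to it. The following is only a minimal sketch under some assumptions: the function name `extractNews`, the `news` table and its column names are hypothetical, and an in-memory SQLite database (used only if the `pdo_sqlite` extension is loaded) stands in for whatever database is really in use.

```php
<?php
// Sketch: pair each table's trading code with its news title.
// extractNews() and the `news` table/columns are hypothetical names.

function extractNews(string $html): array
{
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often not strictly valid.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xp = new DOMXPath($dom);
    $rows = [];

    foreach ($xp->query('//table') as $table) {
        // string(...) yields the text of the first matching node, or '' if none.
        // The leading "." makes each expression relative to the current table.
        $code = $xp->evaluate(
            'string(.//td[contains(., "Trading Code")]/following-sibling::td[1])',
            $table
        );
        $title = $xp->evaluate(
            'string(.//td[contains(., "News Title")]/following-sibling::td[1])',
            $table
        );

        if ($code !== '' && $title !== '') {
            $rows[] = ['code' => trim($code), 'title' => trim($title)];
        }
    }

    return $rows;
}

// Shortened sample input with the same row layout as the tables above.
$html = <<<'HTML'
<table>
<tr><td><b>Trading Code:</b></td><td>RCL</td></tr>
<tr><td><b>News Title:</b></td><td>DSENEWS: Withdrawal of Authorized Representative</td></tr>
</table>
<table>
<tr><td><b>Trading Code:</b></td><td>ISL</td></tr>
<tr><td><b>News Title:</b></td><td>DSENEWS: Withdrawal of Authorized Representative</td></tr>
</table>
HTML;

$rows = extractNews($html);

foreach ($rows as $row) {
    echo $row['code'] . ' | ' . $row['title'] . PHP_EOL;
}

// Assumption: in-memory SQLite via PDO as a stand-in for the real database.
if (extension_loaded('pdo_sqlite')) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE news (trading_code TEXT, title TEXT)');

    // A prepared statement keeps the scraped text out of the SQL string itself.
    $stmt = $pdo->prepare(
        'INSERT INTO news (trading_code, title) VALUES (:code, :title)'
    );
    foreach ($rows as $row) {
        $stmt->execute([':code' => $row['code'], ':title' => $row['title']]);
    }
}
```

For another database, only the PDO connection string and the table definition should need to change; the extraction step stays the same. If the page nests tables inside other tables, the `//table` loop would also visit the outer ones, so the expressions may need tightening for such markup.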