PHP XPath table scraping


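These are my tables. I want to get the trading code and news title from each table and store them in my database. Please help me scrape this data.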

$html="
<table border='0' cellspacing='3' width='100%'>
   <tr>
      <td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
      <td width='85%'>RCL</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News Title:</b></font></td>
      <td>DSENEWS: Withdrawal of Authorized Representative</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News:</b></font></td>
      <td align='justify'>Withdrawal of Authorized Representative: Royal Capital Ltd., DSE TREC No. 21, has withdrawn one of its Authorized Representatives, Mr. Md. Zikrul Haque, with immediate effect.</td>
   </tr>
</table>
<br>
<table border='0' cellspacing='3' width='100%'>
   <tr>
      <td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
      <td width='85%'>ISL</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News Title:</b></font></td>
      <td>DSENEWS: Withdrawal of Authorized Representative</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News:</b></font></td>
      <td align='justify'>Withdrawal of Authorized Representative: IDLC Securities Ltd., DSE TREC No. 58, has withdrawn one of its Authorized Representatives, Mr. Mohammad Ziaur Rahman, with immediate effect.</td>
   </tr>
</table>";
An XPath (and CSS selector) reference is useful when constructing expressions such as the one used here.

$html="<table border='0' cellspacing='3' width='100%'>
   <tr>
      <td width='15%'>Trading Code:</td>
      <td width='85%'>RCL</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News Title:</b></font></td>
      <td>DSENEWS: Withdrawal of Authorized Representative</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News:</b></font></td>
      <td align='justify'>Withdrawal of Authorized Representative: Royal Capital Ltd., DSE TREC No. 21, has withdrawn one of its Authorized Representatives, Mr. Md. Zikrul Haque, with immediate effect.</td>
   </tr>
</table>
<br>
<table border='0' cellspacing='3' width='100%'>
   <tr>
      <td width='15%'><font color='#3366FF'><b>Trading Code:</b></font></td>
      <td width='85%'>ISL</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News Title:</b></font></td>
      <td>DSENEWS: Withdrawal of Authorized Representative</td>
   </tr>
   <tr>
      <td><font color='#3366FF'><b>News:</b></font></td>
      <td align='justify'>Withdrawal of Authorized Representative: IDLC Securities Ltd., DSE TREC No. 58, has withdrawn one of its Authorized Representatives, Mr. Mohammad Ziaur Rahman, with immediate effect.</td>
   </tr>
</table>";




/* Construct XPath expression to find required data*/
$query='//td[contains( . , "Trading Code" )]/following-sibling::td|//td[contains( . , "News Title" )]/following-sibling::td';

/* create the DOMDocument & DOMXPath objects */
$dom=new DOMDocument;
$dom->loadHTML( $html );
$xp=new DOMXPath( $dom );

/* Run the query to find nodes */
$col=$xp->query( $query );

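/* Process the nodes */
/* query() returns false for an invalid expression; a DOMNodeList is an object, so empty() never reports it as empty */
if( $col !== false && $col->length > 0 ){
    foreach( $col as $td ){
        /* do something with data found */
        echo trim( $td->nodeValue ), PHP_EOL;
    }
}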
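
The union query above returns the matching cells as one flat list, so the code and title of each table still have to be paired up before they can be stored. A minimal sketch of one way to do that is to loop over each table and run the two lookups relative to it. The names extractNews and storeNews, the array keys, and the news table with its trading_code and news_title columns are assumptions made for illustration; they are not part of the original question.

```php
<?php
declare(strict_types=1);

/**
 * Return one ['code' => ..., 'title' => ...] row per <table> in $html.
 * Sketch only: the function name and array keys are illustrative.
 */
function extractNews(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xp   = new DOMXPath($dom);
    $rows = [];

    foreach ($xp->query('//table') as $table) {
        // The leading "." keeps each lookup inside the current table.
        // string() yields the text of the first matching cell, or "" if none.
        $code  = $xp->evaluate('string(.//td[contains(., "Trading Code")]/following-sibling::td[1])', $table);
        $title = $xp->evaluate('string(.//td[contains(., "News Title")]/following-sibling::td[1])', $table);

        if ($code !== '') {
            $rows[] = ['code' => trim($code), 'title' => trim($title)];
        }
    }

    return $rows;
}

/**
 * Insert the rows with a prepared statement.
 * Assumes a table news(trading_code, news_title) already exists (hypothetical schema).
 */
function storeNews(PDO $pdo, array $rows): void
{
    $stmt = $pdo->prepare('INSERT INTO news (trading_code, news_title) VALUES (:code, :title)');
    foreach ($rows as $row) {
        $stmt->execute([':code' => $row['code'], ':title' => $row['title']]);
    }
}

// Small sample in the same shape as the tables in the question.
$sample = "<table><tr><td><b>Trading Code:</b></td><td>RCL</td></tr>"
        . "<tr><td><b>News Title:</b></td><td>DSENEWS: Withdrawal of Authorized Representative</td></tr></table>"
        . "<table><tr><td><b>Trading Code:</b></td><td>ISL</td></tr>"
        . "<tr><td><b>News Title:</b></td><td>DSENEWS: Withdrawal of Authorized Representative</td></tr></table>";

$rows = extractNews($sample);
print_r($rows);
```

To store the result, pass the rows to storeNews together with a PDO connection to your own database, for example storeNews($pdo, extractNews($html)). The prepared statement keeps the scraped text out of the SQL string itself.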