Scraping a website with JavaScript code in PHP

php web-scraping

I need to scrape this website;
here is the link .

Can anyone guide me or give me some hints on how to scrape it in PHP?

Follow this approach:

The page to be scraped

  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>This is the Title</title>
    <meta name="description" content="Meta Description" /> 
    <meta name="keywords" content="Meta Keywords" /> 
  </head>
  <body>
    <ul>
      <li><a href="link1.html">Link 1</a></li>
      <li><a href="link2.html">Link 2</a></li>
      <li><a href="link3.html">Link 3</a></li>
      <li><a href="link4.html">Link 4</a></li>
      <li><a href="link5.html">Link 5</a></li>
    </ul>
  </body>
  </html>


The scraping code

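  <?php
  // Fetch the page; replace 'page_to_scrape' with the URL or path of the page.
  $file_string = file_get_contents('page_to_scrape');
  if ($file_string === false) {
      die('Could not fetch the page.');
  }

  // Non-greedy captures (.*?) stop at the first closing delimiter.
  // Each value falls back to an empty string when the pattern does not match.
  $title_out = preg_match('/<title>(.*?)<\/title>/is', $file_string, $title) ? $title[1] : '';
  $keywords_out = preg_match('/<meta name="keywords" content="(.*?)"\s*\/?>/i', $file_string, $keywords) ? $keywords[1] : '';
  $description_out = preg_match('/<meta name="description" content="(.*?)"\s*\/?>/i', $file_string, $description) ? $description[1] : '';

  // $links[1] holds the href values, $links[2] the link texts.
  preg_match_all('/<li><a href="(.*?)">(.*?)<\/a><\/li>/i', $file_string, $links);
  ?>

  <p><strong>Title:</strong> <?php echo htmlspecialchars($title_out); ?></p>
  <p><strong>Keywords:</strong> <?php echo htmlspecialchars($keywords_out); ?></p>
  <p><strong>Description:</strong> <?php echo htmlspecialchars($description_out); ?></p>
  <p><strong>Links:</strong> <em>(Name - Link)</em></p>
  <?php
  // Print each link as "name - href"; the list sits outside the <p> to keep the HTML valid.
  echo '<ol>';
  foreach ($links[1] as $i => $href) {
      echo '<li>' . htmlspecialchars($links[2][$i]) . ' - ' . htmlspecialchars($href) . '</li>';
  }
  echo '</ol>';
  ?>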
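
The regular expressions above depend on the exact markup of the page (attribute order, spacing, one link per line), so they can break when the page changes slightly. As a possible alternative, here is a minimal sketch that extracts the same fields with PHP's built-in DOMDocument and DOMXPath classes. The helper name scrape_page() and the inline sample markup are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Sketch: parse the sample page with DOMDocument/DOMXPath instead of regexes.
// scrape_page() is a hypothetical helper name, not from the original answer.
function scrape_page(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally so imperfect real-world HTML stays quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // string(...) yields an empty string when the node is missing.
    $result = [
        'title'       => trim($xpath->evaluate('string(//title)')),
        'keywords'    => $xpath->evaluate('string(//meta[@name="keywords"]/@content)'),
        'description' => $xpath->evaluate('string(//meta[@name="description"]/@content)'),
        'links'       => [],
    ];

    // Every <a> inside a list item becomes a name/href pair.
    foreach ($xpath->query('//ul/li/a') as $a) {
        $result['links'][] = [
            'name' => trim($a->textContent),
            'href' => $a->getAttribute('href'),
        ];
    }
    return $result;
}

// Shortened copy of the sample page, used only to demonstrate the helper.
$html = <<<HTML
<html><head><title>This is the Title</title>
<meta name="keywords" content="Meta Keywords" />
</head><body><ul>
<li><a href="link1.html">Link 1</a></li>
<li><a href="link2.html">Link 2</a></li>
</ul></body></html>
HTML;

$page = scrape_page($html);
echo $page['title'], "\n";
foreach ($page['links'] as $link) {
    echo $link['name'], ' - ', $link['href'], "\n";
}
```

To use it on a real page, pass the string returned by file_get_contents() to scrape_page(). The XPath expressions would need to be adapted to the markup of the page actually being scraped.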

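What should I put in link1.html? The website link? Sorry if this is a dumb question.

Oh, OK, I get it: I have to put in the links that contain the information I need. So if I want to scrape flight data, I have to use the flight-related links?

I don't need the title and keywords. I want the flight information for a specific date and city. How do I scrape this page?

You need to change file_get_contents('page_to_scrape') to file_get_contents('http://flights.makemytrip.com/makemytrip');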
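
For the date and city case raised in the comments, one possible approach is to put the search criteria into the query string of the URL before fetching it. The following is only a sketch under stated assumptions: build_search_url() and fetch_page() are hypothetical helper names, and the parameter names from, to and date (and their values) are invented for illustration; the real site may expect entirely different parameters. Note also that file_get_contents() does not execute JavaScript, so if the flight results are filled in by scripts in the browser, they will not appear in the fetched HTML.

```php
<?php
// Hypothetical helper: append search parameters to a base URL.
function build_search_url(string $base, array $params): string
{
    // Use '&' if the base URL already carries a query string.
    $separator = (strpos($base, '?') === false) ? '?' : '&';
    return $base . $separator . http_build_query($params);
}

// Hypothetical helper: fetch a URL with a timeout and a User-Agent header.
// Returns the response body, or null when the request fails.
function fetch_page(string $url): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'timeout'    => 10,
            'user_agent' => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)',
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// The parameter names and values below are assumptions for illustration only.
$url = build_search_url('http://flights.makemytrip.com/makemytrip', [
    'from' => 'DEL',
    'to'   => 'BOM',
    'date' => '2013-01-15',
]);
echo $url, "\n";

// $file_string = fetch_page($url); // uncomment to perform the actual request
```

The string returned by fetch_page() can then be passed to whichever extraction code is used on the page. Checking for null first avoids running the extraction on a failed request.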