PHP: read a list of web addresses from a file and search each site's source for a string


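I created this script to read ny2.txt, which contains a list of web URLs (currently only one line).

I then want to loop over each line and fetch that site's source. Finally, I check whether a given piece of text is present in the site.

This is from the error log: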

[08-May-2014 08:10:55 America/Denver] PHP Warning:  file_get_contents(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /home4/millipg7/public_html/limitedtee/test/test.php on line 8
[08-May-2014 08:10:55 America/Denver] PHP Warning:  file_get_contents(http://campersbarn.com
): failed to open stream: php_network_getaddresses: getaddrinfo failed: Name or service not known in /home4/millipg7/public_html/limitedtee/test/test.php on line 8

If the content of ny2.txt is "http the url.com" or 'http the url.com', PHP runs very quickly but nothing happens.

<?php

$lines = file('ny2.txt');
$fh = fopen("result.txt", 'w');

foreach ($lines as $line_num => $url) {
    $html = file_get_contents($url);
    if (strpos($html, 'krgrpowered') !== false) {
        fwrite($fh, $url . "\n");
    }
}
fclose($fh);
?>

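Each element returned by file() still ends with its newline. You could strip the trailing newline from each line yourself, but I would suggest simply using the FILE_IGNORE_NEW_LINES flag so that the newlines are never appended in the first place: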

$lines = file('ny2.txt', FILE_IGNORE_NEW_LINES);

They are rarely useful anyway.
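To make the effect of the flag visible, here is a minimal sketch that reads the same file with and without it. The temporary file and its contents are made up for illustration and stand in for ny2.txt; FILE_SKIP_EMPTY_LINES is an additional real flag that the answer itself does not use, included here on the assumption that blank lines should be dropped too.

```php
<?php
// Hypothetical demo file standing in for ny2.txt (contents are made up).
$path = tempnam(sys_get_temp_dir(), 'urls');
file_put_contents($path, "http://campersbarn.com\n\nhttp://example.com\n");

// Default: every element keeps its trailing "\n", and the blank line is kept.
$raw = file($path);

// With flags: newlines are stripped and empty lines are dropped.
$clean = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

var_dump($raw[0]);  // the closing quote appears on the next line
var_dump($clean);   // two elements, no trailing newlines

unlink($path);
```

The first dump shows the same stray line break that appears before the closing parenthesis in the error log.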

Check a var_dump of the $url that is being produced. It may be as simple as whitespace or a newline breaking the URL. Try:

$html = file_get_contents(trim($url));

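Or, if you want to be really thorough, you might also URL-encode it:

$html = file_get_contents(rawurlencode(trim($url)));

Be careful with this form, though: rawurlencode() also encodes the :// and every /, so a full URL such as http://campersbarn.com becomes http%3A%2F%2Fcampersbarn.com and is no longer treated as a URL. It is only appropriate for individual path or query components.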

I noticed that in your error message there is a line break after the URL, before the closing parenthesis, so that may well be it.
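As a rough illustration of that diagnosis, the sketch below inspects a single value shaped like the one in the log. The string is hard-coded here as an assumption about what file() returned; json_encode() is used only as a convenient way to make control characters visible.

```php
<?php
// Simulates one element as returned by file() without flags.
$url = "http://campersbarn.com\n";

var_dump($url);  // the closing quote lands on the next line

// Escapes control characters, so the newline is printed as \n.
echo json_encode($url, JSON_UNESCAPED_SLASHES), "\n";

// After trimming, the value is the bare URL.
var_dump(trim($url) === "http://campersbarn.com");
```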

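What do you mean by "http the URL.com" or 'http the URL.com'? Are you quoting the URL?

Please check that allow_url_fopen is enabled in your php.ini, and whether your server can actually resolve campersbarn.com: nslookup campersbarn.com

Please don't edit the question to strip the log formatting (the {} feature). Proper formatting makes it more readable and highlights where the problem is.

You could also add skipping of empty lines.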
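Putting the suggestions from the answers and comments together, the loop might look roughly like the sketch below. The function name findMatchingUrls, the $fetch callback, and the fake page contents are all hypothetical; the callback exists only so the loop can be tried without network access.

```php
<?php
/**
 * Returns the URLs whose fetched source contains $needle.
 * $fetch is a hypothetical injection point so the loop can be tried offline;
 * in real use it would wrap file_get_contents().
 */
function findMatchingUrls(array $lines, string $needle, callable $fetch): array
{
    $matches = [];
    foreach ($lines as $line) {
        $url = trim($line);   // drop the trailing newline and stray spaces
        if ($url === '') {
            continue;         // skip empty lines
        }
        $html = $fetch($url);
        if ($html === false) {
            continue;         // fetch failed (DNS, timeout, ...): skip this URL
        }
        if (strpos($html, $needle) !== false) {
            $matches[] = $url;
        }
    }
    return $matches;
}

// Offline stand-in for the network; the page contents are made up.
$fakePages = [
    'http://campersbarn.com' => '<html>krgrpowered</html>',
    'http://example.com'     => '<html>nothing here</html>',
];
$fetch = function (string $url) use ($fakePages) {
    return $fakePages[$url] ?? false;
};

$lines = ["http://campersbarn.com\n", "\n", "http://example.com\n"];
$result = findMatchingUrls($lines, 'krgrpowered', $fetch);
print_r($result);
```

For a real run, the callback could be something like function ($url) { return @file_get_contents($url); }, which returns false on failure and requires allow_url_fopen to be enabled; the matches could then be written to result.txt as in the original script.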
