What is the best way to read the last lines (i.e. "tail") of a file using PHP?


php performance logging


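In my PHP application I need to read multiple lines, starting from the end, from many files (mostly logs). Sometimes I need only the last one, sometimes I need tens or hundreds. Basically, I want something as flexible as the Unix tail command.

There are questions here about how to get the single last line from a file (but I need N lines), and different solutions were given. I'm not sure which one is the best and which performs better.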
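Methods overview

Searching on the internet, I came across different solutions. I can group them into three approaches:

- naive ones that use the file() PHP function;
- cheating ones that run the tail command on the system;
- mighty ones that use fseek().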
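The first two approaches are short enough to sketch. The function names tailNaive and tailShell are illustrative and do not come from the original post, and the shell variant assumes a Unix-like host where exec() is permitted.

```php
<?php
// Naive approach: load the whole file, keep the last $n lines.
// Simple, but memory use grows with the size of the file.
function tailNaive($path, $n = 1)
{
    if ($n < 1) {
        return '';
    }
    $all = file($path, FILE_IGNORE_NEW_LINES);
    if ($all === false) {
        return false; // file could not be read
    }
    return implode("\n", array_slice($all, -$n));
}

// "Cheating" approach: delegate to the system tail command.
// Assumption: tail is installed and exec() is not disabled.
function tailShell($path, $n = 1)
{
    $out = array();
    $status = 0;
    exec('tail -n ' . (int) $n . ' ' . escapeshellarg($path), $out, $status);
    return $status === 0 ? implode("\n", $out) : false;
}
```

Both return the requested lines joined by newlines, or false on failure, so either can be swapped in for the other while experimenting.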
I ended up choosing (or writing) five solutions: a naive one, a cheating one and three mighty ones.

1. The most concise naive solution, using built-in array functions.
2. The only possible solution based on the tail command, which has a little big problem: it does not run if tail is not available, i.e. on non-Unix systems (Windows) or in restricted environments that don't allow system functions.
3. The solution in which single bytes are read from the end of the file, searching for (and counting) newline characters.
4. The multi-byte buffered solution optimized for large files.
5. A variant of solution #4 in which the buffer length is dynamic, decided according to the number of lines to retrieve.

All solutions work, in the sense that they return the expected result from any file and for any number of lines we ask for (except for solution #1, which can break PHP memory limits on large files and return nothing). But which one is better?

Performance tests

To answer the question I ran tests. That's how these things are done, isn't it?

I prepared a sample 100 KB file by joining together different files found in my /var/log directory. Then I wrote a PHP script that uses each of the five solutions to retrieve 1, 2, ..., 10, 20, ..., 100, 200, ..., 1000 lines from the end of the file. Each single test is repeated ten times (that is about 5 × 28 × 10 = 1400 tests), measuring the average elapsed time in microseconds.

I ran the script on my local development machine (Xubuntu 12.04, PHP 5.3.10, 2.70 GHz dual-core CPU, 2 GB RAM) using the PHP command-line interpreter. The results:

Solutions #1 and #2 seem to be the worst ones. Solution #3 is good only when we need to read a few lines. Solutions #4 and #5 seem to be the best ones. Note how the dynamic buffer size optimizes the algorithm: execution time is a little smaller for few lines, because of the reduced buffer.

Let's try with a bigger file. What if we have to read a 10 MB log file?

Now solution #1 is by far the worst one: loading the whole 10 MB file into memory is not a great idea. I also ran the tests on 1 MB and 100 MB files, and the situation is practically the same.

And for tiny log files? With a 10 KB file:

Solution #1 is the best one now! Loading 10 KB into memory isn't a big deal for PHP. #4 and #5 also perform well. However, this is an edge case: a 10 KB log means something like 150/200 lines.

All my test files, sources and results were made available for download.

Final thoughts

Solution #5 is heavily recommended for the general use case: it works well with every file size and performs particularly well when reading a few lines.

Avoid solution #1 if you have to read files bigger than 10 KB.

Solutions #2 and #3 are not the best ones in any test I ran: #2 never runs in less than 2 ms, and #3 is heavily influenced by the number of lines you ask for (it works well only with 1 or 2 lines).

This can also work:

    $file = new SplFileObject("/path/to/file");
    $file->seek(PHP_INT_MAX); // cheap trick to seek to EoF
    $total_lines = $file->key(); // last line number

    // output the last twenty lines
    // (max() keeps the offset non-negative for files shorter than 20 lines)
    $reader = new LimitIterator($file, max(0, $total_lines - 20));
    foreach ($reader as $line) {
        echo $line; // includes newlines
    }

Or without the LimitIterator:

    $file = new SplFileObject($filepath);
    $file->seek(PHP_INT_MAX);
    $total_lines = $file->key();
    $file->seek(max(0, $total_lines - 20));
    while (!$file->eof()) {
        echo $file->current();
        $file->next();
    }

Unfortunately your test case fails on my machine, so I cannot tell how it performs.

This is a modified version which can also skip the last lines:

    /**
     * Modified version of http://www.geekality.net/2011/05/28/php-tail-tackling-large-files/ and of https://gist.github.com/lorenzos/1711e81a9162320fde20
     * @author Kinga the Witch (Trans-dating.com), Torleif Berger, Lorenzo Stanco
     * @link http://stackoverflow.com/a/15025877/995958
     * @license http://creativecommons.org/licenses/by/3.0/
     */
    function tailWithSkip($filepath, $lines = 1, $skip = 0, $adaptive = true)
    {
        // Open file (check the handle before trying to lock it)
        $f = @fopen($filepath, "rb");
        if ($f === false) return false;
        if (@flock($f, LOCK_SH) === false) { fclose($f); return false; }

        if (!$adaptive) $buffer = 4096;
        else {
            // Sets buffer size, according to the number of lines to retrieve.
            // This gives a performance boost when reading a few lines from the file.
            $max = max($lines, $skip);
            $buffer = ($max < 2 ? 64 : ($max < 10 ? 512 : 4096));
        }

        // Jump to last character
        fseek($f, -1, SEEK_END);

        // Read it and adjust line number if necessary
        // (Otherwise the result would be wrong if file doesn't end with a blank line)
        if (fread($f, 1) == "\n") {
            if ($skip > 0) { $skip++; $lines--; }
        } else {
            $lines--;
        }

        // Start reading
        $output = '';
        $chunk = '';

        // While we would like more
        while (ftell($f) > 0 && $lines >= 0) {
            // Figure out how far back we should jump
            $seek = min(ftell($f), $buffer);

            // Do the jump (backwards, relative to where we are)
            fseek($f, -$seek, SEEK_CUR);

            // Read a chunk
            $chunk = fread($f, $seek);

            // Calculate chunk parameters
            $count = substr_count($chunk, "\n");
            $strlen = mb_strlen($chunk, '8bit');

            // Move the file pointer
            fseek($f, -$strlen, SEEK_CUR);

            if ($skip > 0) { // There are some lines to skip
                if ($skip > $count) {
                    // Chunk contains fewer newline symbols than lines to skip
                    $skip -= $count;
                    $chunk = '';
                } else {
                    $pos = 0;

                    while ($skip > 0) {
                        if ($pos > 0) $offset = $pos - $strlen - 1; // Calculate the offset - NEGATIVE position of last new line symbol
                        else $offset = 0; // First search (without offset)

                        $pos = strrpos($chunk, "\n", $offset); // Search for last (including offset) new line symbol

                        if ($pos !== false) $skip--; // Found new line symbol - skip the line
                        else break; // Protection against infinite loop (just in case)
                    }

                    $chunk = substr($chunk, 0, $pos); // Truncated chunk
                    $count = substr_count($chunk, "\n"); // Count new line symbols in truncated chunk
                }
            }

            if (strlen($chunk) > 0) {
                // Add chunk to the output
                $output = $chunk . $output;
                // Decrease our line counter
                $lines -= $count;
            }
        }

        // While we have too many lines
        // (Because of buffer size we might have read too many)
        while ($lines++ < 0) {
            // Find first newline and remove all text before that
            $output = substr($output, strpos($output, "\n") + 1);
        }

        // Close file and return
        @flock($f, LOCK_UN);
        fclose($f);
        return trim($output);
    }

Another implementation, which reads backwards in chunks and supports a regex separator:

    $last_rows_array = file_get_tail('logfile.log', 100, array(
        'regex'             => true,       // use regex
        'separator'         => '#\n{2,}#', // separator: at least two newlines
        'typical_item_size' => 200,        // line length
    ));

    // public domain
    function file_get_tail( $file, $requested_num = 100, $args = array() ){
        // default arg values
        $regex             = true;
        $separator         = null;
        $typical_item_size = 100;  // estimated size
        $more_size_mul     = 1.01; // +1%
        $max_more_size     = 4000;
        extract( $args );
        if( $separator === null ) $separator = $regex ? '#\n+#' : "\n";

        if( is_string( $file )) $f = fopen( $file, 'rb');
        else if( is_resource( $file ) && in_array( get_resource_type( $file ), array('file', 'stream'), true ))
            $f = $file;
        else throw new \Exception( __METHOD__.': file must be either filename or a file or stream resource');

        // get file size
        fseek( $f, 0, SEEK_END );
        $fsize = ftell( $f );
        $fpos = $fsize;
        $bytes_read = 0;

        $all_items = array(); // array of array
        $all_item_num = 0;
        $remaining_num = $requested_num;
        $last_junk = '';

        while( true ){
            // calc size and position of next chunk to read
            $size = $remaining_num * $typical_item_size - strlen( $last_junk );
            // reading a bit more can't hurt
            $size += (int)min( $size * $more_size_mul, $max_more_size );
            if( $size < 1 ) $size = 1;

            // set and fix read position
            $fpos = $fpos - $size;
            if( $fpos < 0 ){
                $size -= -$fpos;
                $fpos = 0;
            }

            // read chunk + add junk from prev iteration
            fseek( $f, $fpos, SEEK_SET );
            $chunk = fread( $f, $size );
            if( strlen( $chunk ) !== $size ) throw new \Exception( __METHOD__.": read error?");
            $bytes_read += strlen( $chunk );
            $chunk .= $last_junk;

            // chunk -> items, with at least one element
            $items = $regex ? preg_split( $separator, $chunk ) : explode( $separator, $chunk );

            // first item is probably cut in half, use it in next iteration ("junk") instead
            // also skip very first '' item
            if( $fpos > 0 || $items[0] === ''){
                $last_junk = $items[0];
                unset( $items[0] );
            } // … else noop, because this is the last iteration

            // ignore last empty item. end( empty [] ) === false
            if( end( $items ) === '') array_pop( $items );

            // if we got items, push them
            $num = count( $items );
            if( $num > 0 ){
                $remaining_num -= $num;
                // if we read too much, use only needed items
                if( $remaining_num < 0 ) $items = array_slice( $items, - $remaining_num );
                // don't fix $remaining_num, we will exit anyway

                $all_items[] = array_reverse( $items );
                $all_item_num += $num;
            }

            // are we ready?
            if( $fpos === 0 || $remaining_num <= 0 ) break;

            // calculate a better estimate
            if( $all_item_num > 0 ) $typical_item_size = (int)max( 1, round( $bytes_read / $all_item_num ));
        }

        fclose( $f );

        //tr( $all_items );
        return call_user_func_array('array_merge', $all_items );
    }

A shorter version that steps backwards one byte at a time:

    <?php
    // Note: this relies on the file ending with a newline; if the file has
    // $lines lines or fewer, the first line can be missed.
    function lastLines($file, $lines) {
        $size = filesize($file);
        $fd = fopen($file, 'rb'); // read-only is enough (was 'r+')
        $pos = $size;
        $n = 0;
        // Walk backwards until $lines + 1 newlines have been seen
        while ($n < $lines + 1 && $pos > 0) {
            fseek($fd, $pos);
            $a = fread($fd, 1);
            if ($a === "\n") {
                ++$n;
            }
            $pos--;
        }
        // The pointer now sits just after the last newline found; read forward
        $ret = array();
        for ($i = 0; $i < $lines; $i++) {
            $line = fgets($fd);
            if ($line === false) break; // end of file reached early
            array_push($ret, $line);
        }
        fclose($fd);
        return $ret;
    }
    print_r(lastLines('hola.php', 4));
    ?>

A follow-mode variant (like tail -f), written as methods of a class:

    /**
     * @param $pathname
     */
    private function tail($pathname)
    {
        $realpath = realpath($pathname);
        $fp = fopen($realpath, 'r', FALSE);
        $lastline = '';
        // Start from the offset of the last line, then keep polling for new data
        fseek($fp, $this->tailonce($pathname, 1, false), SEEK_END);
        do {
            $line = fread($fp, 1000);
            if ($line == $lastline) {
                usleep(50);
            } else {
                $lastline = $line;
                echo $lastline;
            }
        } while ($fp);
    }

    /**
     * @param $pathname
     * @param $lines
     * @param bool $echo
     * @return int
     */
    private function tailonce($pathname, $lines, $echo = true)
    {
        $realpath = realpath($pathname);
        $fp = fopen($realpath, 'r', FALSE);
        $flines = 0;
        $a = -1;
        while ($flines <= $lines) {
            // Stop at the start of the file instead of looping forever
            if (fseek($fp, $a--, SEEK_END) === -1) {
                rewind($fp);
                break;
            }
            $char = fread($fp, 1);
            if ($char == "\n") $flines++;
        }
        $out = fread($fp, 1000000);
        fclose($fp);
        if ($echo) echo $out;
        return $a + 2;
    }

One-liners that load the whole file:

    echo join(array_slice(file("path/to/file"), -5));

    echo join("\n", array_slice(explode("\n", file_get_contents("path/to/file")), -5));

    echo join("<br>", array_slice(explode(PHP_EOL, file_get_contents("path/to/file")), -5));

    echo join(PHP_EOL, array_slice(explode("\n", file_get_contents("path/to/file")), -5));
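To compare any of the approaches on this page against your own files, a small timing harness in the spirit of the benchmark described above (ten repetitions, average time in microseconds) might look like the following. This is only a sketch: the helper name averageMicroseconds and the generated sample file are assumptions, not the script used for the published results.

```php
<?php
// Hypothetical helper: average wall-clock time of $fn over $repeat runs,
// returned in microseconds.
function averageMicroseconds($fn, $repeat = 10)
{
    $total = 0.0;
    for ($i = 0; $i < $repeat; $i++) {
        $start = microtime(true);
        call_user_func($fn);
        $total += microtime(true) - $start;
    }
    return ($total / $repeat) * 1000000;
}

// Build a throwaway sample "log" of 5000 short lines.
$path = tempnam(sys_get_temp_dir(), 'tail');
file_put_contents($path, implode("\n", range(1, 5000)) . "\n");

// Time the naive file()-based approach for several line counts.
foreach (array(1, 10, 100, 1000) as $n) {
    $avg = averageMicroseconds(function () use ($path, $n) {
        return array_slice(file($path), -$n);
    });
    printf("%4d lines: %.1f us\n", $n, $avg);
}

unlink($path);
```

Swapping the closure body for a call to another implementation gives a like-for-like comparison on the same file; absolute numbers will vary by machine and PHP version.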