How can I tail a log file in Python?

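I'd like to make the output of tail -F, or something similar, available in Python without blocking or locking. I've found some really old code that does this, but I assume there must be a better way or a library for it by now. Does anyone know of one?

Ideally, I'd have something like tail.getNewData() that I could call every time I want more data.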

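Use the sh module (pip install sh), which can run tail -f and iterate over its output line by line.

[Update]

Since sh.tail with _iter=True is a generator, you can: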

import sh
tail = sh.tail("-f", "/var/log/some_log_file.log", _iter=True)

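You can then "get new data" by calling next(tail).

Note that if the tail buffer is empty, the call blocks until there is more data (from your question, it is not clear what you want to do in that case).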

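[Update]

This works if you replace -f with -F, but in Python it would be locking. I'd be more interested in a function I could call to get new data when I want it, if that's possible. - Eli

A container generator that places the tail call inside a while True loop and catches any I/O exceptions has almost the same effect as -F: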

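import sh

def tail_F(some_file):
    while True:
        try:
            # tail -f blocks here until a new line is available
            for line in sh.tail("-f", some_file, _iter=True):
                yield line
        except sh.ErrorReturnCode_1:
            # tail exited with status 1, e.g. the file is not accessible
            yield None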

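If the file becomes inaccessible, the generator yields None. However, if the file is accessible, it still blocks until there is new data. It remains unclear to me what you want to do in that case.

Raymond Hettinger's approach seems pretty good: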

import os

def tail_F(some_file):
    first_call = True
    while True:
        try:
            with open(some_file) as log_file:
                if first_call:
                    # Start at the end of the file, as tail does
                    log_file.seek(0, 2)
                    first_call = False
                latest_data = log_file.read()
                while True:
                    if '\n' not in latest_data:
                        latest_data += log_file.read()
                        if '\n' not in latest_data:
                            # No complete line yet
                            yield ''
                            if not os.path.isfile(some_file):
                                # File is gone: reopen it in the outer loop
                                break
                            continue
                    latest_lines = latest_data.split('\n')
                    if latest_data[-1] != '\n':
                        # Keep the unfinished last line for the next round
                        latest_data = latest_lines[-1]
                    else:
                        latest_data = log_file.read()
                    for line in latest_lines[:-1]:
                        yield line + '\n'
        except IOError:
            # File cannot be opened or read right now
            yield ''

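This generator yields '' if the file becomes inaccessible or if there is no new data.

[Update]

The second to last answer seems to circle back to the top of the file whenever it runs out of data. - Eli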

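I think the second one will output the last ten lines whenever the tail process ends, which with -f is whenever there is an I/O error. In most cases I can think of in Unix-like environments, the behavior of tail --follow --retry is not far from this.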

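Perhaps if you update your question to explain what your real goal is (the reason you want to mimic tail --retry), you will get a better answer.

The last answer does not actually follow the tail; it merely reads what is available at run time. - Eli

Of course, tail displays the last 10 lines by default... You can position the file pointer at the end of the file using file.seek; I will leave a proper implementation as an exercise for the reader.

IMHO the file.read() approach is far more elegant than a subprocess-based solution.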
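The comments above mention tail's default of showing the last 10 lines and positioning the file pointer with file.seek. As a rough sketch of that idea (not code from any of the answers), the function below reads backwards from the end of the file in blocks. The name last_lines, the block_size parameter, and the choice of UTF-8 decoding are assumptions made for illustration.

```python
import os

def last_lines(path, n=10, block_size=4096):
    """Return the last n lines of a file as a list of strings (no newlines)."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read backwards block by block until more than n newlines have been
        # seen (so the first kept line is complete) or the start is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    # Assumption: the log is UTF-8; undecodable bytes are replaced.
    return [line.decode("utf-8", errors="replace")
            for line in data.splitlines()[-n:]]
```

After a call such as last_lines(path), a follower would typically seek to the end of the file and continue from there.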

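The only portable way to tail -f a file appears to be, in fact, to read from it and retry (after a sleep) if the read returns 0. The tail utilities on various platforms use platform-specific tricks (e.g. kqueue on BSD) to follow a file forever efficiently, without needing to sleep.
Therefore, implementing a good tail -f purely in Python is probably not a good idea, since you would have to use the least-common-denominator implementation (without resorting to platform-specific hacks). By using a simple subprocess to open tail -f and iterating over the lines in a separate thread, you can easily implement a non-blocking tail operation in Python.
Example implementation: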
import queue
import subprocess
import threading

fn = "/var/log/some_log_file.log"  # placeholder path of the file to follow
tailq = queue.Queue(maxsize=10)  # buffer at most 10 lines

def tail_forever(fn):
    p = subprocess.Popen(["tail", "-f", fn], stdout=subprocess.PIPE)
    while True:
        line = p.stdout.readline()  # bytes; b"" means tail has exited
        tailq.put(line)
        if not line:
            break

threading.Thread(target=tail_forever, args=(fn,)).start()

print(tailq.get())  # blocks
print(tailq.get_nowait())  # raises queue.Empty if there are no lines to read
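To get closer to the tail.getNewData() call the question asks for, the thread-and-queue idea above can be wrapped in two small helpers. This is only a sketch: the names start_reader and get_new_data are hypothetical, and the reader accepts any binary stream with a readline() method, so the stdout pipe of a tail -f subprocess is one possible input.

```python
import queue
import threading

def start_reader(stream, maxsize=100):
    """Pump lines from a binary stream into a queue on a background thread."""
    q = queue.Queue(maxsize=maxsize)

    def pump():
        # readline() returns b"" once the stream is closed or exhausted
        for line in iter(stream.readline, b""):
            q.put(line)  # blocks while the queue is full

    t = threading.Thread(target=pump, daemon=True)
    t.start()
    return q, t

def get_new_data(q):
    """Return all lines queued so far without blocking (possibly an empty list)."""
    lines = []
    while True:
        try:
            lines.append(q.get_nowait())
        except queue.Empty:
            return lines
```

With a subprocess, the stream would be the stdout attribute of subprocess.Popen(["tail", "-F", path], stdout=subprocess.PIPE), and get_new_data(q) can then be called whenever more data is wanted.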

Non-blocking
If you are on Linux (Windows does not support calling select on files), you can use the subprocess module together with the select module.
import time
import subprocess
import select

# filename is the path of the log file to follow
f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p = select.poll()
p.register(f.stdout)

while True:
    if p.poll(1):  # wait up to 1 ms for output
        print(f.stdout.readline())
    time.sleep(1)

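This polls the output pipe for new data and prints it when it is available. Normally, the time.sleep(1) call and the print of f.stdout.readline() would be replaced with useful code.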
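The polling step can also be isolated into a helper that returns immediately when nothing is waiting. The sketch below is an illustration under stated assumptions, not part of the answer: read_if_ready is a hypothetical name, it works on a raw file descriptor with os.read (so data sitting in a Python-level buffer cannot hide from the poller), and it relies on select.poll, which is not available on Windows. The timeout passed to poll() is in milliseconds.

```python
import os
import select

def read_if_ready(poller, fd, timeout_ms=0, max_bytes=4096):
    """Return bytes waiting on fd, or b"" if nothing arrives within timeout_ms."""
    for ready_fd, event in poller.poll(timeout_ms):
        if ready_fd == fd and event & select.POLLIN:
            return os.read(fd, max_bytes)
    return b""

if __name__ == "__main__":
    # Demonstration on a plain pipe; with tail, fd would be f.stdout.fileno()
    read_end, write_end = os.pipe()
    poller = select.poll()
    poller.register(read_end, select.POLLIN)
    os.write(write_end, b"new log line\n")
    print(read_if_ready(poller, read_end, timeout_ms=100))
```

The returned chunk may contain several lines or a partial line, so a caller that needs whole lines has to split and buffer the data itself.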
Blocking
You can use the subprocess module without the extra select module calls.
import subprocess

# filename is the path of the log file to follow
f = subprocess.Popen(['tail', '-F', filename],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = f.stdout.readline()  # blocks until tail writes a line
    print(line)

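This also prints new lines as they are added, but it blocks until the tail program is closed, probably with f.kill().

"Ideally, I'd have something like tail.getNewData() that I could call every time I wanted more data."
Just call f.read() whenever you want more data. It starts reading where the previous read left off and reads through to the end of the data stream: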
f = open('somefile.log')
p = 0
while True:
    f.seek(p)                # go back to where the last read stopped
    latest_data = f.read()   # '' if nothing new has been written
    p = f.tell()
    if latest_data:
        print(latest_data)
        print(str(p).center(10).center(80, '='))

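To read line by line, use f.readline(). Sometimes the file being read will end with a partially written line. Handle that case by using f.tell() to find the current file position and f.seek() to move the file pointer back to the beginning of the incomplete line. For working code, see the recipe linked from the original answer.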
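The tell/seek handling of partial lines described above can be sketched as follows. This is an illustration, not the code the answer refers to; the function name read_complete_lines is hypothetical, and the file is assumed to be opened in text mode.

```python
def read_complete_lines(f):
    """Return the complete lines written since the last call.

    An unfinished trailing line is left in the file for the next call.
    """
    lines = []
    while True:
        pos = f.tell()          # remember where this line starts
        line = f.readline()
        if not line:            # end of file, nothing more to read
            break
        if not line.endswith("\n"):
            f.seek(pos)         # partial line: rewind to its beginning
            break
        lines.append(line)
    return lines
```

Calling the function repeatedly on the same open file object returns only the new, complete lines each time, and it returns an empty list rather than blocking when nothing has been added.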
You can also use the awk command.

awk can be used to tail the last line, the last few lines, or any line in a file.

It can be called from Python.
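So, this is coming quite late, but I ran into the same problem again, and there is a much better solution now. Just use Pygtail:
Pygtail reads log file lines that have not been read yet. It even handles log files that have been rotated. It is based on logcheck's logtail2.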
If you are on Linux, you can get a non-blocking tail from Python in the following way, by running tail -f in a separate xterm window:
import subprocess
subprocess.call('xterm -title log -hold -e \"tail -f filename\"&', shell=True, executable='/bin/csh')
print("Done")

You can use the tailer library.
It has an option to get the last few lines:
import tailer

# Get the last 3 lines of the file
tailer.tail(open('test.txt'), 3)
# ['Line 9', 'Line 10', 'Line 11']

It can also follow a file:
import tailer

# Follow the file as it grows
for line in tailer.follow(open('test.txt')):
    print(line)
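If you want tail-like behavior, that one seems to be a good option.

Another option is a library that provides Python versions of the tail and head utilities, along with an API that can be used in your own modules.
It was originally based on the tailer module. Its main advantage is that it can follow files by path, i.e. it can handle the case where the file is recreated. It also fixes some bugs in various edge cases.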
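Following a file by path, as described above, comes down to noticing that the path now points at a different file than the one currently open. The sketch below shows one way this could work on POSIX systems by comparing inode and device numbers; it is not the implementation of the library mentioned above, and the names was_recreated and PathFollower are hypothetical. It does not handle partially written lines.

```python
import os

def was_recreated(f, path):
    """True if path no longer refers to the file that f has open."""
    try:
        current = os.stat(path)
    except FileNotFoundError:
        return True
    opened = os.fstat(f.fileno())
    return (current.st_ino, current.st_dev) != (opened.st_ino, opened.st_dev)

class PathFollower:
    """Follow a log file by path, reopening it after rotation or recreation."""

    def __init__(self, path):
        self.path = path
        self.f = open(path)

    def poll(self):
        """Return the lines added since the last call, without blocking."""
        lines = self.f.readlines()  # drain what is left in the current file
        if was_recreated(self.f, self.path):
            try:
                new_f = open(self.path)
            except FileNotFoundError:
                return lines        # not recreated yet; try again next time
            self.f.close()
            self.f = new_f          # start at the top of the new file
            lines += self.f.readlines()
        return lines
```

A caller would invoke poll() in a loop with a short sleep, which matches the read-and-retry pattern used elsewhere in this thread.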
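Python is "batteries included", and there is a nice solution for this in Pygtail (a package that is installed separately):
It reads log file lines that have not been read yet. It remembers where it finished last time and continues from there.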
import sys
from pygtail import Pygtail

for line in Pygtail("some.log"):
    sys.stdout.write(line)

None of the answers that use tail -f are Pythonic.
Here is the Pythonic way (no external tools or libraries):
import time

def follow(thefile):
    while True:
        pos = thefile.tell()
        line = thefile.readline()
        if not line or not line.endswith('\n'):
            # Nothing new, or only part of a line: rewind and retry later
            thefile.seek(pos)
            time.sleep(0.1)
            continue
        yield line


if __name__ == '__main__':
    logfile = open("run/foo/access-log", "r")
    loglines = follow(logfile)
    for line in loglines:
        print(line, end='')

Adapting Ijaz Ahmad Khan's answer so that it yields lines only once they are complete:
import time
from typing import Iterator

def follow(file) -> Iterator[str]:
    """ Yield each line from a file as it is written. """
    line = ''
    while True:
        tmp = file.readline()
        if tmp:  # readline() returns '' at end of file, never None
            line += tmp
            if line.endswith("\n"):
                yield line
                line = ''
        else:
            time.sleep(0.1)


if __name__ == '__main__':
    for line in follow(open("test.txt", 'r')):
        print(line, end='')