Monitoring file system events in Python without using other low-level libraries


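For example, I need to catch file deletion and file creation events in a directory on Linux. I found libraries for this such as inotify and its Python wrappers, but if I want to use plain Python code, should I poll the output of
os.listdir(path)
every second, or is there some other way to accomplish this?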

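Source:

The watch_directories() function takes a list of paths and a callable, then repeatedly traverses the directory trees rooted at those paths, watching for files that are deleted or whose modification time changes. It then passes the callable two lists: the files that changed and the files that were removed.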

import os
import time

def watch_directories(paths, func, delay=1.0):
    """(paths:[str], func:callable, delay:float)
    Continuously monitors the paths and their subdirectories
    for changes.  If any files or directories are added, modified
    or removed, the callable 'func' is called with two lists: the
    changed (new or modified) paths and the removed paths.
    'func' can return a Boolean value
    for rescanning; if it returns True, the directory tree will be
    rescanned without calling func() for any found changes.
    (This is so func() can write changes into the tree and prevent itself
    from being immediately called again.)
    """

    # Basic principle: all_files is a dictionary mapping paths to
    # modification times.  We repeatedly crawl through the directory
    # trees rooted at 'paths', doing a stat() on each entry and comparing
    # the modification time.

    all_files = {}

    def scan(root):
        # Traversal function for one directory tree.  os.walk() is used
        # because os.path.walk() no longer exists in Python 3.
        for dirname, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                path = os.path.join(dirname, name)

                try:
                    t = os.stat(path)
                except OSError:
                    # The entry vanished between os.walk() listing the
                    # directory and now.  Skip it: a previously known
                    # path stays in remaining_files and is reported
                    # as removed.
                    continue

                mtime = remaining_files.get(path)
                if mtime is not None:
                    # Record this path as having been seen
                    del remaining_files[path]
                    # mtime has advanced since we last looked at it.
                    if t.st_mtime > mtime:
                        changed_list.append(path)
                else:
                    # No recorded modification time, so it must be
                    # a brand new file or directory.
                    changed_list.append(path)

                # Record current mtime of the entry.
                all_files[path] = t.st_mtime

    # Main loop
    rescan = False
    while True:
        changed_list = []
        remaining_files = all_files.copy()
        all_files = {}
        for path in paths:
            scan(path)
        # Whatever was not seen during this pass has been removed.
        removed_list = list(remaining_files)
        if rescan:
            rescan = False
        elif changed_list or removed_list:
            rescan = func(changed_list, removed_list)

        time.sleep(delay)

if __name__ == '__main__':
    def report(changed_files, removed_files):
        print(changed_files)
        print('Removed', removed_files)

    watch_directories(['.'], report, 1)
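This is useful when you want some way to send jobs to a daemon but don't want to use an IPC mechanism such as sockets or pipes. Instead, the daemon can sit and watch a submission directory, and jobs are submitted by dropping a file or directory into it.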
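
For a single, flat submission directory where only added and removed entries matter (the case the question describes), the same polling idea reduces to comparing successive os.listdir() snapshots. The following is only a rough sketch under that assumption; the helper names diff_listing and poll_once are made up for illustration and are not part of the recipe above.

```python
import os


def diff_listing(previous, current):
    """Return (added, removed) entry names between two directory listings."""
    previous, current = set(previous), set(current)
    added = sorted(current - previous)
    removed = sorted(previous - current)
    return added, removed


def poll_once(path, previous):
    """Take a fresh os.listdir() snapshot and compare it with the previous one.

    Returns the new snapshot plus the added and removed names, so the
    caller can feed the snapshot back in on the next poll.
    """
    current = os.listdir(path)
    added, removed = diff_listing(previous, current)
    return current, added, removed
```

A caller would keep the returned snapshot, sleep for the polling interval (for example time.sleep(1)), and call poll_once() again. Unlike the mtime-based recipe, this sketch does not notice modifications to existing files, and it does not descend into subdirectories.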

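Locking is not taken into account. The watch_directories() function itself doesn't need any locking; if it misses a modification on one pass, it will notice it on the next pass. However, if jobs are written directly into a watched directory, the callable may start running while a job file is only half written. To solve this, you can use a lock file: the callable must acquire the lock while it runs, and submitters must acquire it when they want to add a new job. A simpler approach is to rely on the atomicity of the rename() system call: write the job into a temporary directory that isn't being watched and, once the file is complete, use os.rename() to move it into the submission directory.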
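
The rename-based approach could look roughly like the sketch below. The function name submit_job and its parameters are hypothetical, chosen only for illustration. It assumes a POSIX system such as Linux and that the staging directory sits on the same file system as the submission directory; across file systems os.rename() fails with an OSError instead of moving the file atomically.

```python
import os
import tempfile


def submit_job(data, staging_dir, submit_dir, name):
    """Write `data` (bytes) to an unwatched staging file, then rename it
    into the watched submission directory in one step."""
    # Create a uniquely named file in the staging directory.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the contents to disk before publishing
        final_path = os.path.join(submit_dir, name)
        # The watcher sees either no file or the complete file.
        os.rename(tmp_path, final_path)
    except BaseException:
        # Do not leave a partial file behind in the staging directory.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

Because the file only appears in the watched directory once it is complete, the callable passed to watch_directories() never sees a half-written job, and no lock file is needed for this particular race.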


What's wrong with those wrappers?

I just have some tasks where I'm not allowed to use external libraries.