Python: what is the best way to determine whether an image is part of a sequence?

python regex


I have an image file, and I want to use Python to check whether it is part of an image sequence.

For example, I start with this file:

/projects/image_0001.jpg

I want to check whether the file is part of a sequence, i.e.

/projects/image_0001.jpg
/projects/image_0002.jpg
/projects/image_0003.jpg
...

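If I can determine whether the filename could be part of a sequence, i.e. whether the filename contains a run of digits, then checking whether an image sequence actually exists seems fairly simple.

My first idea was to ask the user to put `######` in the file path where the number should go, and to enter start and end frame numbers to substitute for the hashes, but that is obviously not very user friendly. Is there a way to check for a run of digits in a string with a regular expression or something similar?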

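It is relatively easy to use Python's `re` module to see whether a string contains a run of digits. You could do something like this:

    mo = re.findall(r'\d+', filename)

This returns a list of all the digit runs in `filename`. If:

there is only one result (that is, the filename contains only a single digit run), and
the subsequent filename has a single digit run of the same length, and
the second digit run is 1 greater than the previous one

...then they are probably part of a sequence.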
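The three conditions above can be sketched as a small helper. This is only an illustration: the function name `is_next_in_sequence` is hypothetical, it expects bare filenames rather than full paths (directory names may contain digits of their own), and the extra check that the text around the digits is identical is an added assumption that the answer does not state explicitly.

```python
import re

def is_next_in_sequence(first, second):
    """Return True if `second` looks like the frame right after `first`."""
    first_runs = re.findall(r'\d+', first)
    second_runs = re.findall(r'\d+', second)

    # condition 1: each name contains exactly one run of digits
    if len(first_runs) != 1 or len(second_runs) != 1:
        return False

    # added assumption: the text around the digits must be identical
    if re.split(r'\d+', first) != re.split(r'\d+', second):
        return False

    a, b = first_runs[0], second_runs[0]

    # condition 2: same number of digits; condition 3: value is one larger
    return len(a) == len(b) and int(b) == int(a) + 1
```

For example, `is_next_in_sequence('image_0001.jpg', 'image_0002.jpg')` is true, while a name with two digit runs such as `shot2_0001.jpg` is rejected by the first condition.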


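I'm assuming the question is more about being able to tell sequences of files apart on disk than about knowing anything specific about the filename itself.

If that is the case, then what you are looking for is something smart enough to take a list like this:

/path/to/file_1.png
/path/to/file_2.png
/path/to/file_3.png
/path/to/file_10.png
/path/to/image_1.png
/path/to/image_2.png
/path/to/image_10.png

and return a result that says: I have two file sequences, /path/to/file_##.png and /path/to/image_##.png. You need two passes: the first determines a valid expression for a file, and the second determines which of the other files satisfy that expression.

You also need to know whether you want to support gaps (does a sequence have to be contiguous?):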

/path/to/file_1.png
/path/to/file_2.png
/path/to/file_3.png
/path/to/file_5.png
/path/to/file_6.png
/path/to/file_7.png

Is this 1 sequence (/path/to/file_##.png) or 2 sequences (/path/to/file_1-3.png, /path/to/file_5-7.png)?
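If gaps should split a sequence, the frame numbers can be grouped into contiguous runs once they have been extracted. The following is a minimal sketch of that idea; the name `split_runs` is hypothetical and it works on plain integers, not on filenames.

```python
def split_runs(frames):
    """Group frame numbers into (start, end) pairs of consecutive runs."""
    runs = []
    for frame in sorted(set(frames)):
        if runs and frame == runs[-1][1] + 1:
            # frame continues the current run, so extend its end
            runs[-1] = (runs[-1][0], frame)
        else:
            # a gap (or the very first frame) starts a new run
            runs.append((frame, frame))
    return runs

print(split_runs([1, 2, 3, 5, 6, 7]))  # [(1, 3), (5, 7)]
```

If gaps are acceptable, the same list would simply be treated as one sequence from 1 to 7 with frame 4 missing.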

Also, how do you want to handle files whose names contain other numbers besides the frame number?

/path/to/file2_1.png
/path/to/file2_2.png
/path/to/file2_3.png

and so on.

With that in mind, this is how I would do it:

    import functools
    import os.path
    import projex.sorting
    import re

    def find_sequences(filenames):
        """
        Parse a list of filenames into a dictionary of sequences.  Filenames not
        part of a sequence are returned in the None key

        :param      filenames | [<str>, ..]

        :return     {<str> sequence: [<str> filename, ..], ..}
        """
        sequence_patterns = {}
        sequences         = {None: []}

        # sort a copy of the files (by natural order) so we always generate a
        # pattern based on the first potential file in a sequence; the projex
        # function is used as a comparison function, so it is wrapped with
        # cmp_to_key because Python 3 only sorts by key
        natural_key     = functools.cmp_to_key(projex.sorting.natural)
        local_filenames = sorted(filenames, key=natural_key)

        # create the expression to determine if a sequence is possible
        # we are going to assume that it's always going to be the
        # last set of digits that makes a sequence, i.e.
        #
        #    test2_1.png
        #    test2_2.png
        #
        # test2 will be treated as part of the name
        #
        #    test1.png
        #    test2.png
        #
        # whereas here the 1 and 2 are part of the sequence
        #
        # more advanced expressions would be needed to support
        #
        #    test_01_2.png
        #    test_02_2.png
        #    test_03_2.png

        # the lazy (.*?) lets (\d+) capture the whole last run of digits,
        # so padded numbers such as 0001 are not split
        pattern_expr = re.compile(r'^(.*?)(\d+)(\D*)$')

        # process the input files (in sorted order) for sequences
        for filename in local_filenames:
            dirname, basename = os.path.split(filename)

            # first, check to see if this filename matches a sequence
            # that was already found in the same directory
            found = False
            for key, (seq_dirname, pattern) in sequence_patterns.items():
                if seq_dirname != dirname or not pattern.match(basename):
                    continue

                sequences[key].append(filename)
                found = True
                break

            # if we've already been matched, then continue on
            if found:
                continue

            # next, see if this filename should start a new sequence
            pattern_match = pattern_expr.match(basename)
            if pattern_match:
                prefix = pattern_match.group(1)
                suffix = pattern_match.group(3)
                key    = os.path.join(dirname, '%s#%s' % (prefix, suffix))

                # create a new pattern based on the filename, escaping the
                # literal parts so that '.' is not treated as a wildcard
                sequence_pattern = re.compile(
                    r'^%s\d+%s$' % (re.escape(prefix), re.escape(suffix)))

                sequence_patterns[key] = (dirname, sequence_pattern)
                sequences[key] = [filename]
                continue

            # otherwise, add it to the list of non-sequences
            sequences[None].append(filename)

        # now that we have grouped everything, we'll merge back filenames
        # that were potential sequences, but only contain a single file to the
        # non-sequential list (iterate over a copy, since we pop while looping)
        for key, members in list(sequences.items()):
            if key is None or len(members) > 1:
                continue

            sequences.pop(key)
            sequences[None] += members

        return sequences

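One of the calls in there involves natural sorting, which is a separate topic. I just used the natural sort method from the projex library. It is open source, in case you want to use it or look at it.

That topic has already been covered elsewhere on the forum, though, so I simply used the method from the library.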
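If you would rather not depend on an external library, a natural sort key can be sketched with the standard library alone. This is not the projex implementation, just a common approach, and the name `natural_key` is hypothetical: the string is split into text and digit chunks so that digit runs compare as numbers.

```python
import re

def natural_key(text):
    """Split text into str and int chunks so digit runs compare numerically."""
    chunks = re.split(r'(\d+)', text)
    # re.split with a capturing group alternates text and digits,
    # so the odd positions always hold the captured digit runs
    return [int(chunk) if index % 2 else chunk.lower()
            for index, chunk in enumerate(chunks)]

names = ['file_10.png', 'file_2.png', 'file_1.png']
print(sorted(names, key=natural_key))
# ['file_1.png', 'file_2.png', 'file_10.png']
```

A plain `sorted(names)` would put `file_10.png` before `file_2.png`, which is why the natural ordering matters when the first file of a sequence is used to build the pattern.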


What is an image sequence? Can you give an example? Are all the filenames of the form `picture_xxxx`, or are there any old filenames mixed in?

It could be pic.xxxx.jpg or pic-xxx.jpg and so on. I want the script to be as flexible as possible, to allow for different people's preferences.
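Because the approaches above only look at the last run of digits, the separator in front of it (`_`, `.` or `-`) does not matter. Starting from a single file, as in the question, a rough sketch could list the directory and collect every file that differs only in that digit run. The name `find_siblings` is hypothetical, and the sketch assumes the frame number is the last digit run in the filename.

```python
import os
import re

def find_siblings(path):
    """List files next to `path` that differ only in the last digit run."""
    dirname, basename = os.path.split(path)
    match = re.match(r'^(.*?)(\d+)(\D*)$', basename)
    if not match:
        return []  # no digits, so this cannot be a numbered frame

    prefix, _digits, suffix = match.groups()
    pattern = re.compile(
        r'^%s(\d+)%s$' % (re.escape(prefix), re.escape(suffix)))

    found = []
    for name in os.listdir(dirname or '.'):
        sibling = pattern.match(name)
        if sibling:
            # keep the frame number so the result can be sorted numerically
            found.append((int(sibling.group(1)), os.path.join(dirname, name)))
    return [full_path for _, full_path in sorted(found)]
```

If the returned list contains more than one entry, the file has at least one sibling that follows the same naming pattern; whether gaps are acceptable is a separate decision.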