Python: what is the best way to determine whether an image is part of a sequence?
I have an image file, and I want to check in Python whether it is part of an image sequence. For example, I start with the following file:
/projects/image_0001.jpg
I want to check whether the file is part of a sequence, i.e.
/projects/image_0001.jpg
/projects/image_0002.jpg
/projects/image_0003.jpg
...
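If I could determine whether the filename can be part of a sequence, i.e. whether the filename contains a run of digits, then checking whether an image sequence actually exists seems fairly simple.
My first idea was to ask the user to put ###### in the file path where the number should go, and to enter start and end frame numbers to replace the hashes, but that is obviously not very user-friendly. Is there a way to check for a sequence of digits in a string with a regular expression or something similar?

It is relatively easy to use Python's re module to see whether a string contains a sequence of digits. You could do something like this: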
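mo = re.findall(r'\d+', filename)
This returns a list of all the digit sequences in filename. If:
- there is only one result (i.e. the filename contains only one sequence of digits), and
- a subsequent filename also has a single digit sequence of the same length, and
- the second digit sequence is 1 greater than the first
…then they are probably part of a sequence.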
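The three checks above can be sketched as a small helper. This is only an illustration: the function names `digit_runs` and `looks_consecutive` are made up here, and the sketch assumes it is given bare filenames (no digits in the directory part). It also does not compare the non-digit parts of the two names, which a real check would probably want to do.

```python
import re


def digit_runs(name):
    """Return every run of digits in name, as strings (zero padding kept)."""
    return re.findall(r'\d+', name)


def looks_consecutive(first, second):
    """Apply the three checks listed above to a pair of filenames."""
    a, b = digit_runs(first), digit_runs(second)
    # 1. each filename contains exactly one run of digits
    if len(a) != 1 or len(b) != 1:
        return False
    # 2. both runs have the same length (same zero padding)
    if len(a[0]) != len(b[0]):
        return False
    # 3. the second number is exactly one more than the first
    return int(b[0]) == int(a[0]) + 1


print(looks_consecutive('image_0001.jpg', 'image_0002.jpg'))  # True
print(looks_consecutive('image_0001.jpg', 'image_0003.jpg'))  # False
```

Note that the same-length check rejects unpadded names that roll over a digit boundary (for example `file_9.png` followed by `file_10.png`), so it suits zero-padded frame numbers best.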

I'm assuming the problem is more about being able to tell sequenced files apart on disk than about knowing anything specific about the filenames themselves. If that's the case, then what you're looking for is something smart enough to take a list like:
- /path/to/file_1.png
- /path/to/file_2.png
- /path/to/file_3.png
- /path/to/file_10.png
- /path/to/image_1.png
- /path/to/image_2.png
- /path/to/image_10.png
- /path/to/file_1.png
- /path/to/file_2.png
- /path/to/file_3.png
- /path/to/file_5.png
- /path/to/file_6.png
- /path/to/file_7.png
- /path/to/file2_1.png
- /path/to/file2_2.png
- /path/to/file2_3.png
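One of the cases in the list above has a gap (file_1 to file_3, then file_5 to file_7). Whether that counts as one sequence or two is a decision the caller has to make. As a rough sketch, assuming the frame numbers have already been extracted as integers, contiguous runs can be split like this (`split_runs` is a hypothetical helper name, not part of any library):

```python
def split_runs(frames):
    """Split frame numbers into lists of consecutive values."""
    runs = []
    for n in sorted(set(frames)):
        if runs and n == runs[-1][-1] + 1:
            # n continues the current run
            runs[-1].append(n)
        else:
            # first frame, or a gap: start a new run
            runs.append([n])
    return runs


print(split_runs([1, 2, 3, 5, 6, 7]))  # [[1, 2, 3], [5, 6, 7]]
```

If gaps are acceptable, the runs can simply be concatenated again and treated as a single sequence.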
import functools
import os.path
import projex.sorting
import re

def find_sequences(filenames):
    """
    Parse a list of filenames into a dictionary of sequences.  Filenames not
    part of a sequence are returned in the None key

    :param      filenames | [<str>, ..]

    :return     {<str> sequence: [<str> filename, ..], ..}
    """
    local_filenames = filenames[:]
    sequence_patterns = {}
    sequences = {None: []}

    # sort the files (by natural order) so we always generate a pattern
    # based on the first potential file in a sequence; natural is used as a
    # cmp-style function, so it is wrapped because list.sort() in Python 3
    # only accepts a key
    local_filenames.sort(key=functools.cmp_to_key(projex.sorting.natural))

    # create the expression to determine if a sequence is possible
    # we are going to assume that its always going to be the
    # last set of digits that makes a sequence, i.e.
    #
    #    test2_1.png
    #    test2_2.png
    #
    # test2 will be treated as part of the name
    #
    #    test1.png
    #    test2.png
    #
    # whereas here the 1 and 2 are part of the sequence
    #
    # more advanced expressions would be needed to support
    #
    #    test_01_2.png
    #    test_02_2.png
    #    test_03_2.png
    #
    # the prefix group is non-greedy so the whole last run of digits is
    # captured as the frame number, not just its final digit
    pattern_expr = re.compile(r'^(.*?)(\d+)(\D*)$')

    # process the sorted files for sequences
    for filename in local_filenames:
        # patterns are built from the basename, so match against it too
        basename = os.path.basename(filename)

        # first, check to see if this filename matches a sequence
        found = False
        for key, pattern in sequence_patterns.items():
            if not pattern.match(basename):
                continue

            sequences[key].append(filename)
            found = True
            break

        # if we've already been matched, then continue on
        if found:
            continue

        # next, see if this filename should start a new sequence
        pattern_match = pattern_expr.match(basename)
        if pattern_match:
            opts = (pattern_match.group(1), pattern_match.group(3))
            key = '%s#%s' % opts

            # create a new pattern based on the filename, escaping the
            # literal parts so characters such as '.' are not treated as regex
            escaped = tuple(re.escape(opt) for opt in opts)
            sequence_pattern = re.compile(r'^%s\d+%s$' % escaped)

            sequence_patterns[key] = sequence_pattern
            sequences[key] = [filename]
            continue

        # otherwise, add it to the list of non-sequences
        sequences[None].append(filename)

    # now that we have grouped everything, we'll merge back filenames
    # that were potential sequences, but only contain a single file to the
    # non-sequential list (iterate over a copy, since we pop while looping)
    for key, names in list(sequences.items()):
        if key is None or len(names) > 1:
            continue

        sequences.pop(key)
        sequences[None] += names

    return sequences
One of the steps in there involves natural sorting, which is a separate topic. I just used the natural sort method from the projex library. It is open source, so you can use it or look through it if you like.
That topic has already been covered elsewhere on the forum, though, so I just used the library's method.
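For readers who would rather not depend on projex, natural ordering can be approximated with the standard library alone. The sketch below is not the projex implementation; `natural_key` is a hypothetical name, and the approach (splitting on digit runs and comparing them as integers) is just one common way to do it.

```python
import re


def natural_key(name):
    """Sort key that compares digit runs numerically, other text case-insensitively."""
    # re.split with a capturing group keeps the digit runs, and they always
    # land at the odd indices of the result
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


names = ['file_10.png', 'file_2.png', 'file_1.png']
print(sorted(names, key=natural_key))
# ['file_1.png', 'file_2.png', 'file_10.png']
```

A plain `sorted(names)` would put `file_10.png` before `file_2.png`, which is why a natural ordering matters when the first file in each group is used to build the pattern.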
What is an image sequence? Can you give an example? Are all the filenames of the form picture_xxxx, or are there any old filenames mixed in?
It could be pic.xxxx.jpg or pic-xxx.jpg, etc. I want to make the script as flexible as possible to account for different people's preferences.
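To tie this back to the original question (start from one file and find the rest of its sequence), here is a minimal sketch that does not care which separator is used, so pic.xxxx.jpg and pic-xxx.jpg style names are both handled. It assumes the last run of digits in the basename is the frame number; `sequence_siblings` and `FRAME_RE` are names invented for this example.

```python
import os
import re

# Assumption: the last run of digits in the basename is the frame number.
FRAME_RE = re.compile(r'^(.*?)(\d+)(\D*)$')


def sequence_siblings(path):
    """Return the files next to path that share its prefix and suffix,
    ordered by frame number. Returns [] if path has no digits in its name."""
    directory, basename = os.path.split(path)
    match = FRAME_RE.match(basename)
    if match is None:
        return []

    prefix, _, suffix = match.groups()
    # Escape the literal parts so '.' or '-' are not treated as regex syntax.
    sibling_re = re.compile(
        '^' + re.escape(prefix) + r'(\d+)' + re.escape(suffix) + '$')

    found = []
    for name in os.listdir(directory or '.'):
        m = sibling_re.match(name)
        if m:
            found.append((int(m.group(1)), os.path.join(directory, name)))
    return [p for _, p in sorted(found)]


# Example call; the result depends on what is actually on disk:
# frames = sequence_siblings('/projects/image_0001.jpg')
# is_sequence = len(frames) > 1
```

If the returned list has more than one entry, the file can reasonably be treated as part of a sequence; whether gaps in the numbering are acceptable is left to the caller.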