Python: reverse-engineering folder variables from a path based on a predefined folder structure
Our dynamic folder structure syntax is set up as follows:
:projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/obj
:projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/fbx
:projectRoot:/asset/shots/:parentHierarchy:/animation/:assetName:/scenes
:projectRoot:/asset/shots/:parentHierarchy:/rendering/:assetName:/scenes
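Here, a word between two colons ":" is a variable. Given a single path, I want to retrieve projectRoot, parentHierarchy and assetName from it.
The projectRoot and parentHierarchy variables may span one or more folders, so they can hold nested subfolders. The assetName variable is limited to a single folder. These constraints are defined in my attempt below.
Let's say I pass in: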
C:/Projects/foo/dev/model/props/furniture/couch/data/
This should return:
projectRoot = C:/Projects/foo/
parentHierarchy = props/furniture
assetName = couch
After that, we could also check whether all the paths defined in the rule set exist (but that part is relatively simple).
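The existence check mentioned above could be sketched roughly as follows. The helper names `expand_rule` and `missing_paths` are hypothetical and not part of the original code; the sketch assumes the three variables have already been recovered and simply substitutes them back into every rule.

```python
import os
import re

RULES = [
    ":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/obj",
    ":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/fbx",
    ":projectRoot:/asset/shots/:parentHierarchy:/animation/:assetName:/scenes",
    ":projectRoot:/asset/shots/:parentHierarchy:/rendering/:assetName:/scenes",
]


def expand_rule(rule, variables):
    """Replace every :name: placeholder in the rule with its value (hypothetical helper)."""
    # Trailing slashes are dropped so "C:/Projects/foo/" does not produce "//".
    # A KeyError is raised if the rule uses a variable that was not supplied.
    return re.sub(r":(\w+):", lambda m: variables[m.group(1)].rstrip("/"), rule)


def missing_paths(rules, variables):
    """Return the expanded rule paths that do not exist as directories (hypothetical helper)."""
    expanded = [expand_rule(rule, variables) for rule in rules]
    return [path for path in expanded if not os.path.isdir(path)]


if __name__ == "__main__":
    variables = {
        "projectRoot": "C:/Projects/foo/",
        "parentHierarchy": "props/furniture",
        "assetName": "couch",
    }
    for path in missing_paths(RULES, variables):
        print("missing:", path)
```

An empty list from `missing_paths` would mean the workspace is complete for that asset.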
Here is my current test implementation:
import os
import re

# Matches a variable placeholder such as :projectRoot: inside a rule.
variableRegex = re.compile(r":\w*:")

def getRules():
    return [":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/obj",
            ":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/fbx",
            ":projectRoot:/asset/shots/:parentHierarchy:/animation/:assetName:/scenes",
            ":projectRoot:/asset/shots/:parentHierarchy:/rendering/:assetName:/scenes"]

def getVarRules():
    """
    These rules define the hierarchy depth each variable represents.
    (This is simplified from the actual code I'm working with to ease the example for the question)
    -1 defines that the variable can hold one or more folders (nested folders), thus never zero
    1 defines that it can't have subfolders and represents a single folder
    """
    return {":projectRoot:": -1,
            ":parentHierarchy:": -1,
            ":assetName:": 1}

def reversePath(path, rule):
    """
    Returns the variables within rule by getting them from the path.
    This will only work if the given path is valid for the given rule.
    This is currently a dummy function.
    This is a part where I get stuck. How to slice it up based on the rules?
    """
    varHierarchyDepth = getVarRules()
    return path

def reverseEngineerWorkspaces(path):
    """
    This function should check if the given path is valid for any of the rules.
    Note that static parts (end parts of the rule not defined by a variable) may be omitted from the input path.
    That input path can still be validated (only if the folder exists and contains the required static end as
    subdirectories/files.)
    """
    rules = getRules()
    varHierarchyDepth = getVarRules()
    path = os.path.realpath(path)
    path = path.replace("\\", "/")  # force forward slashes so it's similar to our rules definitions.
    path = path.rstrip("/")  # remove any trailing slashes
    for rule in rules:
        # 1.
        # First we check if any of the static parts that are in front of the last variables are present in the path.
        # If not present it certainly isn't the correct path.
        # We skip checking the end static part because we could easily check whether those exist within the current folder
        staticParts = [part for part in variableRegex.split(rule) if part != "" and part != "/"]
        if not all(x in path for x in staticParts[:-1]):
            continue
        if rule.endswith(staticParts[-1]):
            # If this rule ends with a static part we can use that to check if the given path is fully valid
            # Or if the path concatenated with that static part exists. If so we have a valid path for the rule.
            if path.endswith(staticParts[-1]):
                return reversePath(path, rule)
            else:
                lastPartSubfolders = [f for f in staticParts[-1].split("/") if f]
                for x in range(len(lastPartSubfolders)):
                    # Assume the path already ends with the first x folders of the static end
                    # and check whether appending the remaining folders gives an existing path.
                    if x and not path.endswith("/" + "/".join(lastPartSubfolders[:x])):
                        continue
                    tempPath = "/".join([path] + lastPartSubfolders[x:])
                    if os.path.exists(tempPath):
                        return reversePath(tempPath, rule)
        else:
            raise NotImplementedError("There's no implementation for rules without a static end part.")

print(reverseEngineerWorkspaces("""C:/Projects/foo/dev/model/props/furniture/couch/data/"""))
print(reverseEngineerWorkspaces("""C:/Projects/foo/dev/model/props/furniture/couch/data/fbx"""))
print(reverseEngineerWorkspaces("""C:/Projects/foo/dev/model/props/furniture/couch/data/obj"""))
print(reverseEngineerWorkspaces("""C:/Projects/foo/asset/shots/props/furniture/animation/couch/scenes"""))
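Currently it only finds paths based on the static parts (the variable rules are not checked, and I don't know how to add that here).
It also doesn't parse the variables out of a full path that follows one of the rules.

I think you can do all of this with a single regular expression: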
In [24]: re.search(r'(.+)(?:dev/model/|asset/shots/)(.+)/(.+?)(?:/data|/scenes)', path).groups()
Out[24]: ('C:/Projects/foo/', 'props/furniture', 'couch')
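For reference, the one-liner above can be turned into a small standalone script. This is only a sketch: the named groups and the `parse` helper are additions for readability and are not part of the answer, and the pattern itself is unchanged.

```python
import re

# Same pattern as in the answer, with named groups added for readability.
PATTERN = re.compile(
    r"(?P<projectRoot>.+)(?:dev/model/|asset/shots/)"
    r"(?P<parentHierarchy>.+)/(?P<assetName>.+?)(?:/data|/scenes)"
)


def parse(path):
    """Return the captured variables as a dict, or None if the path does not match."""
    match = PATTERN.search(path)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse("C:/Projects/foo/dev/model/props/furniture/couch/data/"))
    print(parse("C:/Projects/foo/asset/shots/props/furniture/animation/couch/scenes"))
```

Note that the pattern knows nothing about the animation/ and rendering/ segments of the shot rules, so for the second path parentHierarchy comes back as props/furniture/animation rather than props/furniture.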
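I was looking for something as neat as that! It isn't quite as safe as I had hoped, but based on the rules I can generate a regex that fits each scenario. This is a great example. I have to run now, but I'll check whether I can get it working once I'm home. Thanks!
You could do this with string unpacking.
@Asad: Could you give an example?
Sorry, I think I spoke too soon. Apparently Python doesn't have a scanf the way it has format.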
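The idea raised in the first comment, generating a regex from each rule, might look like the sketch below. It is not the asker's final implementation. The names `rule_to_regex` and `reverse_path` are hypothetical, the depth table mirrors getVarRules() without the surrounding colons, and the sketch assumes the full path is given, so paths that omit the static ending are not handled.

```python
import re

RULES = [
    ":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/obj",
    ":projectRoot:/dev/model/:parentHierarchy:/:assetName:/data/fbx",
    ":projectRoot:/asset/shots/:parentHierarchy:/animation/:assetName:/scenes",
    ":projectRoot:/asset/shots/:parentHierarchy:/rendering/:assetName:/scenes",
]

# -1: one or more nested folders, 1: exactly one folder (as in getVarRules()).
VAR_DEPTH = {"projectRoot": -1, "parentHierarchy": -1, "assetName": 1}


def rule_to_regex(rule):
    """Compile a rule into a regex with one named group per variable (hypothetical helper)."""
    # Splitting on a capturing group alternates static text and variable names.
    parts = re.split(r":(\w+):", rule)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)  # static text, matched literally
        elif VAR_DEPTH[part] == 1:
            pattern += "(?P<%s>[^/]+)" % part  # a single folder
        else:
            pattern += "(?P<%s>[^/]+(?:/[^/]+)*)" % part  # one or more folders
    return re.compile(pattern)


def reverse_path(path):
    """Return the variables of the first rule the whole path matches, else None."""
    path = path.replace("\\", "/").rstrip("/")
    for rule in RULES:
        match = rule_to_regex(rule).fullmatch(path)
        if match:
            return match.groupdict()
    return None


if __name__ == "__main__":
    print(reverse_path("C:/Projects/foo/dev/model/props/furniture/couch/data/obj"))
```

Unlike the expected output in the question, projectRoot comes back without a trailing slash here (C:/Projects/foo), because the slash belongs to the static part of the rule.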