Validating filenames in Python


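I'm writing a personal wiki-style program in Python that stores text files in a user-configurable directory.

The program should be able to take a string from the user (e.g. `foo`) and create a file named `foo.txt`. The user should only be able to create files inside the wiki directory, and a slash creates a subdirectory (e.g. `foo/bar` becomes `(path-to-wiki)/foo/bar.txt`).

What is the best way to check that the input is as safe as possible? What do I need to watch out for? I know some common pitfalls are:

Directory traversal:
```
../
```
Null bytes:
```
\0
```

I realize that taking user input for filenames is never 100% safe, but the program will only be run locally, and I just want to guard against common errors and glitches.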

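You can force the user to create files and directories inside the wiki by normalizing the path with `os.path.normpath` and then checking that the normalized path starts with, say, `(path-to-wiki)`.

To make sure the path or filename the user enters contains nothing nasty, you can restrict it to upper- and lower-case letters, digits, and perhaps the hyphen and underscore.

You can then check the normalized filename with a regular expression like this: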

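import os, re
userpath = os.path.normpath('(path-to-wiki)/foo/bar.txt')
# Lists every character outside the whitelist; note that "/" and "." are outside it too.
re.findall(r'[^A-Za-z0-9_\-\\]', userpath)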

To summarize, given

userpath = os.path.normpath('(path-to-wiki)/foo/bar.txt')

the check becomes:

if (not userpath.startswith('(path-to-wiki)')
        or re.search(r'[^A-Za-z0-9_\-\\]', userpath)):
    ...  # do whatever you want with an invalid path
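One caveat with a plain `startswith` test is that `(path-to-wiki)-evil` also starts with `(path-to-wiki)`. A minimal sketch of a component-wise containment check follows. It assumes Python 3.5 or later for `os.path.commonpath`, and both the helper name `is_inside` and the `/srv/wiki` directory are made up for illustration.

```python
import os


def is_inside(base_dir, user_path):
    """Return True if user_path, joined onto base_dir, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks.
    target = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so "/srv/wiki-evil"
    # is not mistaken for a child of "/srv/wiki".
    return os.path.commonpath([base, target]) == base


print(is_inside("/srv/wiki", "foo/bar.txt"))         # True
print(is_inside("/srv/wiki", "../wiki-evil/x.txt"))  # False
print(is_inside("/srv/wiki", "/etc/passwd"))         # False
```

The last call is rejected because `os.path.join` discards the base when the second argument is absolute, so the target ends up outside the wiki directory.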

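You could simply validate that every character is a printable ASCII alphanumeric, with ' ', '.' and '/' also allowed, and then remove every instance of a bad combination. (Note that on Python 3, `str.isalnum()` also accepts non-ASCII letters and digits.)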

safe_string = str()
for c in user_supplied_string:
    if c.isalnum() or c in [' ','.','/']:
        safe_string = safe_string + c

while safe_string.count("../"):
    # I use a loop because only replacing once would 
    # leave a hole in that a bad guy could enter ".../"
    # which would be replaced to "../" so the loop 
    # prevents tricks like this!
    safe_string = safe_string.replace("../","./")
# Get rid of leading "./" combinations...
safe_string = safe_string.lstrip("./")

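This is what I would do. I don't know how Pythonic it is, but it should leave you pretty safe. If you want to validate rather than convert, you can test for equality afterwards, like so:

valid = safe_string == user_supplied_string
if not valid:
    raise Exception("Sorry the string %s contains invalid characters" % user_supplied_string)

In the end both approaches would probably work. I find this one feels more explicit, and it should also screen out weird or inappropriate characters such as '\t', '\r' or '\n'. Cheers!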
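If validation is preferred over conversion, the same whitelist idea can be applied to each path component. The sketch below is one possible reading, not the answer's own code: the name `is_valid_wiki_name` is hypothetical, and allowing `-` and `_` in addition to letters, digits, space and `.` is an assumption.

```python
import string

# Assumed whitelist: ASCII letters, digits, space, dot, underscore, hyphen.
ALLOWED = set(string.ascii_letters + string.digits + " ._-")


def is_valid_wiki_name(name):
    """Return True if every "/"-separated component of name looks harmless."""
    if not name or "\0" in name:
        return False
    for part in name.split("/"):
        # Empty parts come from a leading, trailing or doubled slash;
        # "." and ".." are the traversal components.
        if part in ("", ".", ".."):
            return False
        if not set(part) <= ALLOWED:
            return False
    return True


for candidate in ["foo/bar", "../etc/passwd", "/etc/passwd", "foo\0bar", "a\tb"]:
    print(repr(candidate), is_valid_wiki_name(candidate))
```

Rejecting a name outright avoids the question of what a silently rewritten name should map to.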

Armin Ronacher (and others) implemented these ideas as a function in Flask:

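import os
import posixpath

from werkzeug.exceptions import NotFound

# In Flask this is a module-level list of the OS's path separators other than "/".
_os_alt_seps = [sep for sep in (os.path.sep, os.path.altsep) if sep not in (None, '/')]


def safe_join(directory, filename):
    """Safely join `directory` and `filename`.

    Example usage::

        @app.route('/wiki/<path:filename>')
        def wiki_page(filename):
            filename = safe_join(app.config['WIKI_FOLDER'], filename)
            with open(filename, 'rb') as fd:
                content = fd.read()  # Read and process the file content...

    :param directory: the base directory.
    :param filename: the untrusted filename relative to that directory.
    :raises: :class:`~werkzeug.exceptions.NotFound` if the resulting path
             would fall out of `directory`.
    """
    filename = posixpath.normpath(filename)
    for sep in _os_alt_seps:
        if sep in filename:
            raise NotFound()
    if os.path.isabs(filename) or filename.startswith('../'):
        raise NotFound()
    return os.path.join(directory, filename)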
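For a local program that does not use Flask, a similar join-then-check step can be sketched with `pathlib`. This is only an illustration under stated assumptions: `wiki_file` and `create_page` are hypothetical names, Python 3.6 or later is assumed (so that `Path.resolve()` accepts paths that do not exist yet), and a `ValueError` stands in for Flask's `NotFound`.

```python
from pathlib import Path


def wiki_file(wiki_root, name):
    """Map a page name such as "foo/bar" to <wiki_root>/foo/bar.txt."""
    if "\0" in name:
        raise ValueError("page name contains a null byte")
    root = Path(wiki_root).resolve()
    target = (root / (name + ".txt")).resolve()
    # relative_to raises ValueError when target is not located under root.
    target.relative_to(root)
    return target


def create_page(wiki_root, name, text=""):
    """Create the page file, making intermediate directories as needed."""
    path = wiki_file(wiki_root, name)
    path.parent.mkdir(parents=True, exist_ok=True)  # slashes become subdirectories
    path.write_text(text, encoding="utf-8")
    return path
```

Calling `create_page(wiki_dir, "foo/bar", "hello")` would write `foo/bar.txt` under the wiki directory, while a name such as `../evil` or `/etc/passwd` raises `ValueError` before anything is written.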

There is now an entire library for validating such strings, pathvalidate:

from pathvalidate import sanitize_filepath

fpath = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))

fpath = "\0_a*b:c<d>e%f/(g)h+i_0.txt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))

Output:

fi:l*e/p"a?t>h|.t<xt -> file/path.txt
_a*b:c<d>e%f/(g)h+i_0.txt -> _abcde%f/(g)h+i_0.txt

What is the target OS? Which versions of Python?

@IgnacioVazquez-Abrams: Yes, but plain text files in the filesystem have other benefits.

@g.d.d.c: Python 2.7 and/or 3.2, mainly MacOS/Linux.

Your regex is a bit extreme! Far more characters are allowed in paths! Also, it disallows forward slashes in paths, and non-ASCII filenames are possible.