Sanitizing file paths in PHP without realpath()

php security


Is there a way to safely sanitize path input without using realpath()?

The aim is to prevent malicious input such as ../../../../../../path/to/file

 $handle = fopen($path . '/' . $filename, 'r');

Not sure why you wouldn't want to use realpath, but path name sanitization is a very simple concept, along the following lines:

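1. If the path is relative (does not start with `/`), prefix it with the current working directory and `/`, making it an absolute path.
2. Replace all sequences of more than one `/` with a single one (a).
3. Replace all occurrences of `/./` with `/`.
4. Remove `/.` if at the end.
5. Replace `/anything/../` with `/`.
6. Remove `/anything/..` if at the end.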

The text `anything` in this case means the longest sequence of characters that are not `/`.

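Note that these rules should be applied continuously until none of them results in a change. In other words, do all six (one pass). If the string changed, go back and do all six again (another pass). Keep doing that until the string is the same as before the pass just executed.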

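Once those steps are done, you have a canonical path name that can be checked for a valid pattern. Most likely that will be anything that doesn't start with `../` (in other words, it doesn't try to move above the starting point). There may be other rules you want to apply, but that's outside the scope of this question.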

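(a) If you are working on a system that treats `//` at the start of a path as special, make sure you replace multiple `/` characters at the start with two of them. This is the only place where POSIX allows (but does not mandate) special handling for multiples; in all other cases, multiple `/` characters are equivalent to a single one.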
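The six rules and the "repeat until nothing changes" loop can be sketched roughly as below. This is only an illustration, not a hardened implementation: `canonicalize_path()` and `is_within()` are hypothetical names, the working directory is passed in as a parameter (rather than read from `getcwd()`) to keep the sketch deterministic, and footnote (a) is ignored. One deliberate deviation: in rules 5 and 6 the sketch refuses to treat `..` itself as `anything`, so an attempt to climb above the root stays visible in the result.

```php
<?php
// Hypothetical helper: applies the six rewrite rules until a pass changes nothing.
function canonicalize_path(string $path, string $cwd = '/'): string
{
    // Rule 1: make a relative path absolute by prefixing the working directory.
    if ($path === '' || $path[0] !== '/') {
        $path = rtrim($cwd, '/') . '/' . $path;
    }

    do {
        $before = $path;
        // Rule 2: collapse runs of "/" into a single "/".
        $path = preg_replace('~/{2,}~', '/', $path);
        // Rule 3: "/./" becomes "/".
        $path = str_replace('/./', '/', $path);
        // Rule 4: remove a trailing "/.".
        $path = preg_replace('~/\.$~', '', $path);
        // Rule 5: "/anything/../" becomes "/" (assumption: "anything" is not "..").
        $path = preg_replace('~/(?!\.\./)[^/]+/\.\./~', '/', $path);
        // Rule 6: remove a trailing "/anything/.." (same assumption).
        $path = preg_replace('~/(?!\.\./)[^/]+/\.\.$~', '', $path);
    } while ($path !== $before);

    // An emptied path means everything was stripped: that is the root.
    return $path === '' ? '/' : $path;
}

// Hypothetical check: is the canonical path the base directory or below it?
function is_within(string $canonical, string $base): bool
{
    $base = rtrim($base, '/');
    return $canonical === $base
        || strncmp($canonical, $base . '/', strlen($base) + 1) === 0;
}
```

With a base directory of `/var/www`, the input `x/../../etc/passwd` canonicalizes to `/var/etc/passwd`, which `is_within()` rejects.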

Simple form:

$filename = str_replace('..', '', $filename);

if (file_exists($path . '/' . $filename)) {
    $handle = fopen($path . '/' . $filename, 'r');
}

Complex form (source):

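RFC 3986 describes an algorithm that is used to interpret and remove the special `.` and `..` complete path segments from a referenced path during relative URI reference resolution.

You can use this algorithm for file system paths as well: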

// as per RFC 3986
// @see http://tools.ietf.org/html/rfc3986#section-5.2.4
function remove_dot_segments($input) {
    // 1.  The input buffer is initialized with the now-appended path
    //     components and the output buffer is initialized to the empty
    //     string.
    $output = '';

    // 2.  While the input buffer is not empty, loop as follows:
    while ($input !== '') {
        // A.  If the input buffer begins with a prefix of "`../`" or "`./`",
        //     then remove that prefix from the input buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 3)) == '../' ||
            ($prefix = substr($input, 0, 2)) == './'
           ) {
            $input = substr($input, strlen($prefix));
        } else

        // B.  if the input buffer begins with a prefix of "`/./`" or "`/.`",
        //     where "`.`" is a complete path segment, then replace that
        //     prefix with "`/`" in the input buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 3)) == '/./' ||
            ($prefix = $input) == '/.'
           ) {
            $input = '/' . substr($input, strlen($prefix));
        } else

        // C.  if the input buffer begins with a prefix of "/../" or "/..",
        //     where "`..`" is a complete path segment, then replace that
        //     prefix with "`/`" in the input buffer and remove the last
        //     segment and its preceding "/" (if any) from the output
        //     buffer; otherwise,
        if (
            ($prefix = substr($input, 0, 4)) == '/../' ||
            ($prefix = $input) == '/..'
           ) {
            $input = '/' . substr($input, strlen($prefix));
            $output = substr($output, 0, strrpos($output, '/'));
        } else

        // D.  if the input buffer consists only of "." or "..", then remove
        //     that from the input buffer; otherwise,
        if ($input == '.' || $input == '..') {
            $input = '';
        } else

        // E.  move the first path segment in the input buffer to the end of
        //     the output buffer, including the initial "/" character (if
        //     any) and any subsequent characters up to, but not including,
        //     the next "/" character or the end of the input buffer.
        {
            $pos = strpos($input, '/');
            if ($pos === 0) $pos = strpos($input, '/', $pos+1);
            if ($pos === false) $pos = strlen($input);
            $output .= substr($input, 0, $pos);
            $input = (string) substr($input, $pos);
        }
    }

    // 3.  Finally, the output buffer is returned as the result of remove_dot_segments.
    return $output;
}

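The following function canonicalizes file system paths and the path components of URIs.

Notes

It does not strip sequences of multiple `/`, since that would not comply with […].
Obviously, this does not work with `..\backslash\path`.
I am not sure this function is 100% secure, but I have not been able to come up with an input that compromises its output.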
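The function these notes describe is not reproduced here. As a rough stand-in, a stack-based canonicalizer that behaves the way the notes say (keeps runs of `/`, ignores backslashes) might look like the sketch below. The name `canonicalize_segments()` and the rule that protects the leading empty segment are assumptions of this sketch, not the original answer's code.

```php
<?php
// Hypothetical reconstruction; NOT the original answer's function.
function canonicalize_segments(string $path): string
{
    $stack = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            // Step up one level, but never pop the leading empty segment
            // that marks an absolute path (assumption of this sketch).
            $onlyRootLeft = count($stack) === 1 && $stack[0] === '';
            if ($stack !== [] && !$onlyRootLeft) {
                array_pop($stack);
            }
        } elseif ($segment !== '.') {
            // Everything else is kept, including empty segments, so "//" survives.
            $stack[] = $segment;
        }
    }
    return implode('/', $stack);
}
```

For example, `/a/./b/../c` becomes `/a/c`, `a//b` is left alone, and `..\backslash\path` passes through unchanged because it contains no `/` at all.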

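Since you asked only for sanitizing, maybe all you need is a "fail on tricky paths" check. If your path input normally doesn't contain anything like ../../stuff/../like/this, you only have to check for this: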

function isTricky($p) {
    if(strpos("/$p/","/../")===false) return false;
    return true;
}

or just

function isTricky($p) {return strpos("-/$p/","/../");}

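This quick and dirty approach blocks any backward moves, and in most cases that is sufficient. (The second version returns a non-zero integer instead of true, but hey, why not! The dash is a hack so that a match can never land on index 0 of the string.)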

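Side note: also remember slashes vs. backslashes. I would recommend converting backslashes to plain slashes first, but that is platform dependent.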
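Putting the check and the side note together, a variant that folds backslashes first and returns a strict boolean could look like this. `is_tricky_path()` is a hypothetical name, and treating `\` as a separator is an assumption that suits Windows-style input (on POSIX a backslash is a legal filename character).

```php
<?php
// Hypothetical variant of isTricky(): strict boolean, backslashes folded first.
function is_tricky_path(string $p): bool
{
    // Treat "\" like "/" so that "..\" is caught as well (assumption, see above).
    $p = str_replace('\\', '/', $p);
    // Padding with "/" on both sides lets a single search cover leading,
    // trailing and embedded ".." segments.
    return strpos("/$p/", '/../') !== false;
}
```

Names that merely contain two dots, such as `b..c`, are not flagged because only a complete `..` segment matches.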

Since the functions above somehow did not work for me (or were rather long), I tried my own code:

function clean_path( $A_path = "", $A_echo = false )
{
    // IF YOU WANT LEANER CODE, REMOVE ALL "if ( $A_echo )" LINES AND $A_echo FROM THE ARGS
    // BUILT EXPLICITLY: func_get_args() WOULD OMIT ARGUMENTS LEFT AT THEIR DEFAULTS
    $_p = array( $A_path, $A_echo );
    // HOW IT WORKS:
    // REMOVING EMPTY ELEMENTS AT THE END ALLOWS FOR "BUFFERS" AND HANDLING START & END SPECIAL SEQUENCES
    // BLANK ELEMENTS AT START & END MAKE SURE WE COVER SPECIALS AT BEGIN & END
    // REPLACING ":" WITH "://" MAKES AN EMPTY ELEMENT TO ALLOW FOR CORRECT x:/../<path> USE (which, in principle, is faulty)

    // 1.) "normalize" TO "slashed" AND MAKE SOME SPECIALS, ALSO DUMMY ELEMENTS AT BEGIN & END
    $_s          = array( "\\", ":", ":./", ":../" );
    $_r          = array( "/", "://", ":/", ":/" );
    $_p['sr']    = "/" . str_replace( $_s, $_r, $_p[0] ) . "/";
    $_p['arr']   = explode( '/', $_p['sr'] );
    if ( $A_echo ) $_p['arr1'] = $_p['arr'];

    // 2.) GET KEYS OF ".." ELEMENTS, BLANK THEM AND THE ONE BEFORE (!) AS ".." MEANS "UP" AND CANCELS THE STEP BEFORE
    //     NOTE: IN A RUN LIKE a/b/../../c ONLY ONE PARENT IS REMOVED (RESULT a/c), SO ".." NEVER SURVIVES
    //     BUT THE RESULT IS NOT ALWAYS THE TRUE CANONICAL PATH
    $_p['pp']    = array_keys( $_p['arr'], '..' );
    foreach ( $_p['pp'] as $_pos )
    {
        $_p['arr'][ $_pos - 1 ] = $_p['arr'][ $_pos ] = "";
    }
    if ( $A_echo ) $_p['arr2'] = $_p['arr'];

    // 3.) REMOVE ALL "/./" PARTS AS THEY ARE SIMPLY SUPERFLUOUS
    $_p['p']     = array_keys( $_p['arr'], '.' );
    foreach ( $_p['p'] as $_pos )
    {
        unset( $_p['arr'][ $_pos ] );
    }
    if ( $A_echo ) $_p['arr3'] = $_p['arr'];

    // 4.) CLEAN OUT EMPTY ONES INCLUDING OUR DUMMIES (array_filter() ALSO DROPS A SEGMENT NAMED "0")
    $_p['arr']   = array_filter( $_p['arr'] );

    // 5.) MAKE FINAL STRING
    $_p['clean'] = implode( DIRECTORY_SEPARATOR, $_p['arr'] );
    if ( $A_echo ) { echo "arr=="; print_r( $_p ); }
    return $_p['clean'];
}

I prefer an implode/explode solution:

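public function sanitize(?string $path = null, string $separator = DIRECTORY_SEPARATOR) : string
{
    // Cast so that a null path becomes '' instead of being passed to explode()
    $pathArray = explode($separator, (string) $path);
    foreach ($pathArray as $key => $value)
    {
        // Drop "." and ".." segments entirely (they are removed, not resolved)
        if ($value === '.' || $value === '..')
        {
            $pathArray[$key] = null;
        }
    }
    // array_filter() without a callback also drops empty segments (and a literal "0")
    return implode($separator, array_map('trim', array_filter($pathArray)));
}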

The previous version looked like this:

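public function sanitize(?string $path = null, string $separator = DIRECTORY_SEPARATOR) : string
{
    // Spaces are stripped first, then every "..", so ". ." cannot turn into ".." afterwards
    $output = str_replace(
    [
        ' ',
        '..',
    ], '', (string) $path);
    // preg_quote() keeps a backslash separator from being read as a regex escape
    $output = preg_replace('~' . preg_quote($separator, '~') . '+~', $separator, $output);
    $output = ltrim($output, '.');
    $output = trim($output, $separator);
    return $output;
}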

Both methods have been tested successfully against a data provider. Enjoy!

Why don't you want to use realpath()? You could realpath() the filename and then check whether it starts with […]
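For comparison, the realpath() route suggested here might be sketched as follows. The comment does not say which prefix to compare against, so this sketch assumes it is the resolved base directory; `resolve_inside()` is a hypothetical name. Note that realpath() returns false for paths that do not exist, so this only works for files already on disk.

```php
<?php
// Hypothetical helper; comparing against the resolved base directory is an assumption.
function resolve_inside(string $baseDir, string $filename): ?string
{
    $base = realpath($baseDir);
    $real = realpath($baseDir . DIRECTORY_SEPARATOR . $filename);
    if ($base === false || $real === false) {
        return null; // base or target does not exist
    }
    // Append the separator so "/base" does not also match "/base-other".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($real, $prefix, strlen($prefix)) === 0 ? $real : null;
}
```

A caller would then pass the returned path, if it is not null, to fopen() instead of the raw concatenation.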
