
PHP: Sanitizing strings to make them URL and filename safe?


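I am trying to come up with a function that does a good job of sanitizing certain strings so that they are safe to use in a URL (like a post slug) and also safe to use as filenames. For example, when someone uploads a file I want to make sure that all dangerous characters are removed from the name.

So far I have come up with the following function, which I hope solves the problem and also allows foreign UTF-8 data.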

/**
 * Convert a string to the file/URL safe "slug" form
 *
 * @param string $string the string to clean
 * @param bool $is_filename TRUE will allow additional filename characters
 * @return string
 */
function sanitize($string = '', $is_filename = FALSE)
{
 // Replace all weird characters with dashes
 $string = preg_replace('/[^\w\-'. ($is_filename ? '~_\.' : ''). ']+/u', '-', $string);

 // Only allow one dash separator at a time (and make string lowercase)
 return mb_strtolower(preg_replace('/--+/u', '-', $string), 'UTF-8');
}
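Does anyone have any tricky sample data I can run against this, or know of a better way to protect our applications from malicious input?

$is_filename allows some additional characters, such as those found in temporary vim files.

Update: removed the star character, since I could not think of a valid use for it.

I found this larger function in some existing code: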

/**
 * Function: sanitize
 * Returns a sanitized string, typically for URLs.
 *
 * Parameters:
 *     $string - The string to sanitize.
 *     $force_lowercase - Force the string to lowercase?
 *     $anal - If set to *true*, will remove all non-alphanumeric characters.
 */
function sanitize($string, $force_lowercase = true, $anal = false) {
    $strip = array("~", "`", "!", "@", "#", "$", "%", "^", "&", "*", "(", ")", "_", "=", "+", "[", "{", "]",
                   "}", "\\", "|", ";", ":", "\"", "'", "‘", "’", "“", "”", "–", "—",
                   "—", "–", ",", "<", ".", ">", "/", "?");
    $clean = trim(str_replace($strip, "", strip_tags($string)));
    $clean = preg_replace('/\s+/', "-", $clean);
    $clean = ($anal) ? preg_replace("/[^a-zA-Z0-9]/", "", $clean) : $clean;
    return ($force_lowercase) ?
        (function_exists('mb_strtolower')) ?
            mb_strtolower($clean, 'UTF-8') :
            strtolower($clean) :
        $clean;
}
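And this one, from another codebase:

/**
 * Sanitizes a filename replacing whitespace with dashes
 *
 * Removes special characters that are illegal in filenames on certain
 * operating systems and special characters requiring special escaping
 * to manipulate at the command line. Replaces spaces and consecutive
 * dashes with a single dash. Trim period, dash and underscore from beginning
 * and end of filename.
 *
 * @since 2.1.0
 *
 * @param string $filename The filename to be sanitized
 * @return string The sanitized filename
 */
function sanitize_file_name( $filename ) {
    $filename_raw = $filename;
    $special_chars = array("?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}");
    $special_chars = apply_filters('sanitize_file_name_chars', $special_chars, $filename_raw);
    $filename = str_replace($special_chars, '', $filename);
    $filename = preg_replace('/[\s-]+/', '-', $filename);
    $filename = trim($filename, '.-_');
    return apply_filters('sanitize_file_name', $filename, $filename_raw);
}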
Update, September 2012: he has done some incredible work in this area. His phunction framework includes several great text filters and transformations.


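I don't think keeping a list of characters to remove is safe. I would rather use the following:

For filenames: use an internal ID or a hash of the file's content. Save the document name in a database. That way you keep the original filename and can still find the file.

For URL parameters: use urlencode() to encode any special characters.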
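
As a rough illustration of the two suggestions above, here is a minimal sketch. The helper name `storage_name_for` and the sample values are made up for this example; only `hash()` and `urlencode()` are real PHP functions.

```php
<?php
// Hypothetical helper: derive the on-disk name from the file's content,
// never from the user-supplied name.
function storage_name_for(string $contents): string
{
    return hash('sha256', $contents);   // 64 lowercase hex characters
}

$originalName = "../../etc/passwd\0.jpg";   // hostile user input
$contents     = 'example file body';

$storedAs = storage_name_for($contents);
// $originalName would be saved in a database row next to $storedAs,
// so it never touches the filesystem.

// For URL parameters, encode instead of stripping characters.
$link = '/download?name=' . urlencode('año nuevo & más.txt');
echo $link, "\n";
```

For files already on disk, `hash_file('sha256', $path)` avoids loading the whole file into a string.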

Try this:

function normal_chars($string)
{
    $string = htmlentities($string, ENT_QUOTES, 'UTF-8');
    $string = preg_replace('~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', $string);
    $string = html_entity_decode($string, ENT_QUOTES, 'UTF-8');
    $string = preg_replace(array('~[^0-9a-z]~i', '~[ -]+~'), ' ', $string);

    return trim($string, ' -');
}

Examples:

echo normal_chars('Álix----_Ãxel!?!?'); // Alix Axel
echo normal_chars('áéíóúÁÉÍÓÚ'); // aeiouAEIOU
echo normal_chars('üÿÄËÏÖÜŸåÅ'); // uyAEIOUYaA

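Based on the selected answer in this thread:

Some observations on your solution:

  • The 'u' at the end of your pattern means that the pattern, and not the text it is matching, will be interpreted as UTF-8 (I presume you assumed the latter?).
  • \w matches the underscore character. You specifically include it for files, which suggests that you don't want underscores in URLs, yet in the code you have, URLs are permitted to include an underscore.
  • The inclusion of "foreign UTF-8" seems to be locale-dependent. It is not clear whether this is the locale of the server or of the client. From the PHP docs:

    A "word" character is any letter or digit or the underscore character, that is, any character which can be part of a Perl "word". The definition of letters and digits is controlled by PCRE's character tables, and may vary if locale-specific matching is taking place. For example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.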
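
The points about \w above can be checked with a few lines. This is only a sketch with made-up sample strings, and it assumes the source file is saved as UTF-8:

```php
<?php
// \w always includes the underscore, with or without the u modifier.
$underscoreIsWord = preg_match('/^\w+$/', 'my_slug');

// With the u modifier, PHP compiles the pattern in UTF-8 mode and
// letters such as "é" count as word characters.
$accentIsWord = preg_match('/^\w+$/u', 'café');

// An explicit ASCII class sidesteps any locale or Unicode dependence.
$asciiOnly = preg_match('/^[a-z0-9-]+$/', 'café');

var_dump($underscoreIsWord, $accentIsWord, $asciiOnly);
```

Spelling out the allowed characters explicitly, as in the last pattern, makes the result independent of how \w happens to be defined on a given server.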

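Creating the slug

You probably shouldn't include accented and similar characters in your post slug since, technically, they should be percent-encoded (per URL encoding rules), so you would end up with ugly URLs.

So, if I were you, after lowercasing I would convert any "special" characters to their equivalents (e.g. é -> e) and replace non-[a-z] characters with '-', limiting runs to a single '-' as you have done. There is an implementation of converting special characters here: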
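
The recipe described above (lowercase, map accents, dash everything else, collapse runs) might look roughly like this. `make_slug` is a hypothetical name, the accent map is deliberately tiny, digits are kept as an extra assumption, and the mbstring extension is assumed to be available:

```php
<?php
// Hypothetical helper; extend the accent map for real content.
function make_slug(string $text): string
{
    $text = mb_strtolower($text, 'UTF-8');
    $text = strtr($text, array(
        'á' => 'a', 'à' => 'a', 'â' => 'a', 'ä' => 'a',
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'í' => 'i', 'ó' => 'o', 'ö' => 'o', 'ú' => 'u',
        'ü' => 'u', 'ñ' => 'n', 'ç' => 'c',
    ));
    // Anything outside a-z and 0-9 becomes a dash; a run collapses to one.
    $text = preg_replace('/[^a-z0-9]+/', '-', $text);
    // No leading or trailing dashes.
    return trim($text, '-');
}

echo make_slug("J'étudie le français"), "\n";          // j-etudie-le-francais
echo make_slug('Lo siento, no hablo español.'), "\n";  // lo-siento-no-hablo-espanol
```

Characters missing from the map are not transliterated; they simply turn into a dash, which is safe but lossy.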

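Sanitization in general

OWASP has a PHP implementation of its Enterprise Security API which, among other things, includes methods for safely encoding and decoding input and output in your application.

The Encoder interface provides: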

    canonicalize (string $input, [bool $strict = true])
    decodeFromBase64 (string $input)
    decodeFromURL (string $input)
    encodeForBase64 (string $input, [bool $wrap = false])
    encodeForCSS (string $input)
    encodeForHTML (string $input)
    encodeForHTMLAttribute (string $input)
    encodeForJavaScript (string $input)
    encodeForOS (Codec $codec, string $input)
    encodeForSQL (Codec $codec, string $input)
    encodeForURL (string $input)
    encodeForVBScript (string $input)
    encodeForXML (string $input)
    encodeForXMLAttribute (string $input)
    encodeForXPath (string $input)
    
I have always thought this did a pretty good job of it:

public static function title($title, $separator = '-', $ascii_only = FALSE)
{
    if ($ascii_only === TRUE)
    {
        // Transliterate non-ASCII characters
        $title = UTF8::transliterate_to_ascii($title);

        // Remove all characters that are not the separator, a-z, 0-9, or whitespace
        $title = preg_replace('![^'.preg_quote($separator).'a-z0-9\s]+!', '', strtolower($title));
    }
    else
    {
        // Remove all characters that are not the separator, letters, numbers, or whitespace
        $title = preg_replace('![^'.preg_quote($separator).'\pL\pN\s]+!u', '', UTF8::strtolower($title));
    }

    // Replace all separator characters and whitespace by a single separator
    $title = preg_replace('!['.preg_quote($separator).'\s]+!u', $separator, $title);

    // Trim separators from the beginning and end
    return trim($title, $separator);
}

The handy UTF8::transliterate_to_ascii() will turn things like ñ into n.

Of course, you could replace the other UTF8::* calls with mb_* functions.

Depending on how you will use it, you might want to add a length limit to protect against buffer overflows.

For file uploads, you will be safest if you prevent the user from controlling the filename. As has already been hinted, store the canonicalized filename in a database along with a randomly chosen, unique name that you use as the actual filename.

Using OWASP ESAPI, these names could be generated like this:

$userFilename   = ESAPI::getEncoder()->canonicalize($input_string);
$safeFilename   = ESAPI::getRandomizer()->getRandomFilename();

$safeForURL     = ESAPI::getEncoder()->encodeForURL($input_string);
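
If ESAPI is not available, the same advice (a random stored name plus a length cap on the name kept for display) can be sketched in plain PHP. This is not the ESAPI API. Both helper names, the `bin` fallback extension and the 100-character limit are assumptions for illustration; `random_bytes()` requires PHP 7 or later.

```php
<?php
// Hypothetical helper: the stored name is random and never derived from
// user input; only a whitelisted extension is carried over.
function random_storage_name(string $userName, array $allowedExt): string
{
    $ext = strtolower(pathinfo($userName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowedExt, true)) {
        $ext = 'bin';   // assumption: unknown types get a neutral extension
    }
    return bin2hex(random_bytes(16)) . '.' . $ext;
}

// Hypothetical helper: cap the display name kept in the database.
function cap_display_name(string $name, int $maxChars = 100): string
{
    return mb_substr($name, 0, $maxChars, 'UTF-8');
}

$stored  = random_storage_name('Holiday Photo.JPG', array('jpg', 'png'));
$display = cap_display_name(str_repeat('ñ', 300));
echo $stored, "\n";
```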
    
    $string = preg_replace(array('/\s/', '/\.[\.]+/', '/[^\w_\.\-]/'), array('_', '.', ''), $string);
    
    // Remove special accented characters - ie. sí.
    $clean_name = strtr($string, array('Š' => 'S','Ž' => 'Z','š' => 's','ž' => 'z','Ÿ' => 'Y','À' => 'A','Á' => 'A','Â' => 'A','Ã' => 'A','Ä' => 'A','Å' => 'A','Ç' => 'C','È' => 'E','É' => 'E','Ê' => 'E','Ë' => 'E','Ì' => 'I','Í' => 'I','Î' => 'I','Ï' => 'I','Ñ' => 'N','Ò' => 'O','Ó' => 'O','Ô' => 'O','Õ' => 'O','Ö' => 'O','Ø' => 'O','Ù' => 'U','Ú' => 'U','Û' => 'U','Ü' => 'U','Ý' => 'Y','à' => 'a','á' => 'a','â' => 'a','ã' => 'a','ä' => 'a','å' => 'a','ç' => 'c','è' => 'e','é' => 'e','ê' => 'e','ë' => 'e','ì' => 'i','í' => 'i','î' => 'i','ï' => 'i','ñ' => 'n','ò' => 'o','ó' => 'o','ô' => 'o','õ' => 'o','ö' => 'o','ø' => 'o','ù' => 'u','ú' => 'u','û' => 'u','ü' => 'u','ý' => 'y','ÿ' => 'y'));
    $clean_name = strtr($clean_name, array('Þ' => 'TH', 'þ' => 'th', 'Ð' => 'DH', 'ð' => 'dh', 'ß' => 'ss', 'Œ' => 'OE', 'œ' => 'oe', 'Æ' => 'AE', 'æ' => 'ae', 'µ' => 'u'));
    
    $clean_name = preg_replace(array('/\s/', '/\.[\.]+/', '/[^\w_\.\-]/'), array('_', '.', ''), $clean_name);
    
    $clean_name = strtolower($clean_name);
    
    /**
     * Convert a string into a url safe address.
     *
     * @param string $unformatted
     * @return string
     */
    public function formatURL($unformatted) {
    
        $url = strtolower(trim($unformatted));
    
        //replace accent characters, foreign languages
        $search = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ', 'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī', 'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'IJ', 'ij', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ʼn', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ', 'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ', 'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ'); 
        $replace = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o'); 
        $url = str_replace($search, $replace, $url);
    
        //replace common characters
        $search = array('&', '£', '$'); 
        $replace = array('and', 'pounds', 'dollars'); 
        $url= str_replace($search, $replace, $url);
    
        // remove - for spaces and union characters
        $find = array(' ', '&', "\r\n", "\n", '+', ',', '//');
        $url = str_replace($find, '-', $url);
    
        //delete and replace rest of special chars
        $find = array('/[^a-z0-9\-<>]/', '/[\-]+/', '/<[^>]*>/');
        $replace = array('', '-', '');
        $uri = preg_replace($find, $replace, $url);
    
        return $uri;
    }
    
    $file_name = trim(basename(stripslashes($name)), ".\x00..\x20");
    
    /var/www/uploads/123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345/
    
    (0 + 0 + 244 + 11 chars) C:\1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234\1234567.txt
    (0 + 3 + 240 + 11 chars) C:\123\123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890\1234567.txt
    (3 + 3 + 236 + 11 chars) C:\123\456\12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456\1234567.txt
    
    (255 chars) E:\12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901.txt
    
    "*/:<>?\|
    
    Warning: touch(): Unable to create file ... because No error in ... on line ...
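
The character list and long-path samples above concern names that Windows refuses. A minimal sketch of guarding against both follows. `strip_reserved` is a hypothetical helper, the 255-byte cap is an assumed limit rather than a rule taken from this thread, and the mbstring extension is assumed:

```php
<?php
// Hypothetical helper: drop the characters " * / : < > ? \ | and ASCII
// control characters, then cap the byte length.
function strip_reserved(string $name, int $maxBytes = 255): string
{
    $name = preg_replace('/["*\/:<>?\\\\|\x00-\x1F]/', '', $name);
    // mb_strcut cuts on a byte budget without splitting a UTF-8 character.
    return mb_strcut($name, 0, $maxBytes, 'UTF-8');
}

echo strip_reserved('a<b>:c?"d*e|f\\g/h.txt'), "\n";   // abcdefgh.txt
```

The cap applies to a single name; the total path length still depends on the directory it is stored in.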
    
    /**
     * Sanitize Filename
     *
     * @param   string  $str        Input file name
     * @param   bool    $relative_path  Whether to preserve paths
     * @return  string
     */
    public function sanitize_filename($str, $relative_path = FALSE)
    {
        $bad = array(
            '../', '<!--', '-->', '<', '>',
            "'", '"', '&', '$', '#',
            '{', '}', '[', ']', '=',
            ';', '?', '%20', '%22',
            '%3c',      // <
            '%253c',    // <
            '%3e',      // >
            '%0e',      // >
            '%28',      // (
            '%29',      // )
            '%2528',    // (
            '%26',      // &
            '%24',      // $
            '%3f',      // ?
            '%3b',      // ;
            '%3d'       // =
        );
    
        if ( ! $relative_path)
        {
            $bad[] = './';
            $bad[] = '/';
        }
    
        $str = remove_invisible_characters($str, FALSE);
        return stripslashes(str_replace($bad, '', $str));
    }
    
    function remove_invisible_characters($str, $url_encoded = TRUE)
    {
        $non_displayables = array();
    
        // every control character except newline (dec 10),
        // carriage return (dec 13) and horizontal tab (dec 09)
        if ($url_encoded)
        {
            $non_displayables[] = '/%0[0-8bcef]/';  // url encoded 00-08, 11, 12, 14, 15
            $non_displayables[] = '/%1[0-9a-f]/';   // url encoded 16-31
        }
    
        $non_displayables[] = '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/S';   // 00-08, 11, 12, 14-31, 127
    
        do
        {
            $str = preg_replace($non_displayables, '', $str, -1, $count);
        }
        while ($count);
    
        return $str;
    }
    
    <?php
    
    namespace COil\Bundle\COilCoreBundle\Component\HttpKernel\Util;
    
    use Symfony\Component\HttpKernel\Util\Filesystem as BaseFilesystem;
    
    /**
     * Extends the Symfony filesystem object.
     */
    class Filesystem extends BaseFilesystem
    {
        /**
         * Make a filename safe to use in any function. (Accents, spaces, special chars...)
         * The iconv function must be activated.
         *
         * @param string  $fileName       The filename to sanitize (with or without extension)
         * @param string  $defaultIfEmpty The default string returned for a non valid filename (only special chars or separators)
         * @param string  $separator      The default separator
         * @param boolean $lowerCase      Tells if the string must converted to lower case
         *
         * @author COil <https://github.com/COil>
         * @see    http://stackoverflow.com/questions/2668854/sanitizing-strings-to-make-them-url-and-filename-safe
         *
         * @return string
         */
        public function sanitizeFilename($fileName, $defaultIfEmpty = 'default', $separator = '_', $lowerCase = true)
        {
        // Gather file informations and store its extension
        $fileInfos = pathinfo($fileName);
        $fileExt   = array_key_exists('extension', $fileInfos) ? '.'. strtolower($fileInfos['extension']) : '';
    
        // Removes accents
        $fileName = @iconv('UTF-8', 'us-ascii//TRANSLIT', $fileInfos['filename']);
    
        // Removes all characters that are not separators, letters, numbers, dots or whitespaces
        $fileName = preg_replace("/[^ a-zA-Z". preg_quote($separator). "\d\.\s]/", '', $lowerCase ? strtolower($fileName) : $fileName);
    
        // Replaces all successive separators into a single one
        $fileName = preg_replace('!['. preg_quote($separator).'\s]+!u', $separator, $fileName);
    
        // Trim beginning and ending separators
        $fileName = trim($fileName, $separator);
    
        // If empty use the default string
        if (empty($fileName)) {
            $fileName = $defaultIfEmpty;
        }
    
        return $fileName. $fileExt;
        }
    }
    
    <?php
    
    namespace COil\Bundle\COilCoreBundle\Tests\Unit\Helper;
    
    use COil\Bundle\COilCoreBundle\Component\HttpKernel\Util\Filesystem;
    
    /**
     * Test the Filesystem custom class.
     */
    class FilesystemTest extends \PHPUnit_Framework_TestCase
    {
        /**
         * test sanitizeFilename()
         */
        public function testFilesystem()
        {
        $fs = new Filesystem();
    
        $this->assertEquals('logo_orange.gif', $fs->sanitizeFilename('--logö  _  __   ___   ora@@ñ--~gé--.gif'), '::sanitizeFilename() handles complex filename with specials chars');
        $this->assertEquals('coilstack', $fs->sanitizeFilename('cOiLsTaCk'), '::sanitizeFilename() converts all characters to lower case');
        $this->assertEquals('cOiLsTaCk', $fs->sanitizeFilename('cOiLsTaCk', 'default', '_', false), '::sanitizeFilename() lower case can be desactivated, passing false as the 4th argument');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename('coil stack'), '::sanitizeFilename() convert a white space to a separator');
        $this->assertEquals('coil-stack', $fs->sanitizeFilename('coil stack', 'default', '-'), '::sanitizeFilename() can use a different separator as the 3rd argument');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename('coil          stack'), '::sanitizeFilename() removes successive white spaces to a single separator');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename('       coil stack'), '::sanitizeFilename() removes spaces at the beginning of the string');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename('coil   stack         '), '::sanitizeFilename() removes spaces at the end of the string');
        $this->assertEquals('coilstack', $fs->sanitizeFilename('coil,,,,,,stack'), '::sanitizeFilename() removes non-ASCII characters');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename('coil_stack  '), '::sanitizeFilename() keeps separators');
        $this->assertEquals('coil_stack', $fs->sanitizeFilename(' coil________stack'), '::sanitizeFilename() converts successive separators into a single one');
        $this->assertEquals('coil_stack.gif', $fs->sanitizeFilename('cOil Stack.GiF'), '::sanitizeFilename() lower case filename and extension');
        $this->assertEquals('copy_of_coil.stack.exe', $fs->sanitizeFilename('Copy of coil.stack.exe'), '::sanitizeFilename() keeps dots before the extension');
        $this->assertEquals('default.doc', $fs->sanitizeFilename('____________.doc'), '::sanitizeFilename() returns a default file name if filename only contains special chars');
        $this->assertEquals('default.docx', $fs->sanitizeFilename('     ___ -  --_     __%%%%__¨¨¨***____      .docx'), '::sanitizeFilename() returns a default file name if filename only contains special chars');
        $this->assertEquals('logo_edition_1314352521.jpg', $fs->sanitizeFilename('logo_edition_1314352521.jpg'), '::sanitizeFilename() returns the filename untouched if it does not need to be modified');
        $userId = rand(1, 10);
        $this->assertEquals('user_doc_'. $userId. '.doc', $fs->sanitizeFilename('亐亐亐亐亐.doc', 'user_doc_'. $userId), '::sanitizeFilename() returns the default string (the 2nd argument) if it can\'t be sanitized');
        }
    }
    
    All tests pass:
    
    phpunit -c app/ src/COil/Bundle/COilCoreBundle/Tests/Unit/Helper/FilesystemTest.php
    PHPUnit 3.6.10 by Sebastian Bergmann.
    
    Configuration read from /var/www/strangebuzz.com/app/phpunit.xml.dist
    
    .
    
    Time: 0 seconds, Memory: 5.75Mb
    
    OK (1 test, 17 assertions)
    
    replaceAccentedChars
    
    str2url
    
    function replaceAccentedChars($str)
    {
        $patterns = array(
            /* Lowercase */
            '/[\x{0105}\x{00E0}\x{00E1}\x{00E2}\x{00E3}\x{00E4}\x{00E5}]/u',
            '/[\x{00E7}\x{010D}\x{0107}]/u',
            '/[\x{010F}]/u',
            '/[\x{00E8}\x{00E9}\x{00EA}\x{00EB}\x{011B}\x{0119}]/u',
            '/[\x{00EC}\x{00ED}\x{00EE}\x{00EF}]/u',
            '/[\x{0142}\x{013E}\x{013A}]/u',
            '/[\x{00F1}\x{0148}]/u',
            '/[\x{00F2}\x{00F3}\x{00F4}\x{00F5}\x{00F6}\x{00F8}]/u',
            '/[\x{0159}\x{0155}]/u',
            '/[\x{015B}\x{0161}]/u',
            '/[\x{00DF}]/u',
            '/[\x{0165}]/u',
            '/[\x{00F9}\x{00FA}\x{00FB}\x{00FC}\x{016F}]/u',
            '/[\x{00FD}\x{00FF}]/u',
            '/[\x{017C}\x{017A}\x{017E}]/u',
            '/[\x{00E6}]/u',
            '/[\x{0153}]/u',
    
            /* Uppercase */
            '/[\x{0104}\x{00C0}\x{00C1}\x{00C2}\x{00C3}\x{00C4}\x{00C5}]/u',
            '/[\x{00C7}\x{010C}\x{0106}]/u',
            '/[\x{010E}]/u',
            '/[\x{00C8}\x{00C9}\x{00CA}\x{00CB}\x{011A}\x{0118}]/u',
            '/[\x{0141}\x{013D}\x{0139}]/u',
            '/[\x{00D1}\x{0147}]/u',
            '/[\x{00D3}]/u',
            '/[\x{0158}\x{0154}]/u',
            '/[\x{015A}\x{0160}]/u',
            '/[\x{0164}]/u',
            '/[\x{00D9}\x{00DA}\x{00DB}\x{00DC}\x{016E}]/u',
            '/[\x{017B}\x{0179}\x{017D}]/u',
            '/[\x{00C6}]/u',
            '/[\x{0152}]/u');
    
        $replacements = array(
                'a', 'c', 'd', 'e', 'i', 'l', 'n', 'o', 'r', 's', 'ss', 't', 'u', 'y', 'z', 'ae', 'oe',
                'A', 'C', 'D', 'E', 'L', 'N', 'O', 'R', 'S', 'T', 'U', 'Z', 'AE', 'OE'
            );
    
        return preg_replace($patterns, $replacements, $str);
    }
    
    function str2url($str)
    {
        if (function_exists('mb_strtolower'))
            $str = mb_strtolower($str, 'utf-8');
    
        $str = trim($str);
        if (!function_exists('mb_strtolower'))
            $str = replaceAccentedChars($str);
    
        // Remove all non-whitelist chars.
        // The hyphen is escaped: PCRE2 (PHP 7.3+) rejects "]-\pL" as an invalid range.
        $str = preg_replace('/[^a-zA-Z0-9\s\'\:\/\[\]\-\pL]/u', '', $str);
        $str = preg_replace('/[\s\'\:\/\[\]-]+/', ' ', $str);
        $str = str_replace(array(' ', '/'), '-', $str);
    
        // If it was not possible to lowercase the string with mb_strtolower, we do it after the transformations.
        // This way we lose fewer special chars.
        if (!function_exists('mb_strtolower'))
            $str = strtolower($str);
    
        return $str;
    }
    
    public static function makeSafe($file)
    {
        // Remove any trailing dots, as those aren't ever valid file names.
        $file = rtrim($file, '.');
    
        $regex = array('#(\.){2,}#', '#[^A-Za-z0-9\.\_\- ]#', '#^\.#');
    
        return trim(preg_replace($regex, '', $file));
    }
    
    public function getFriendlyURL($string) {
        setlocale(LC_CTYPE, 'en_US.UTF8');
        $string = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $string);
        $string = preg_replace('~[^\-\pL\pN\s]+~u', '-', $string);
        $string = str_replace(' ', '-', $string);
        $string = trim($string, "-");
        $string = strtolower($string);
        return $string;
    } 
    
        function sanitize($string,$force_lowercase=true) {
        //Clean up titles for filenames
        $clean = strip_tags($string);
        $clean = strtr($clean, array('Š' => 'S','Ž' => 'Z','š' => 's','ž' => 'z','Ÿ' => 'Y','À' => 'A','Á' => 'A','Â' => 'A','Ã' => 'A','Ä' => 'A','Å' => 'A','Ç' => 'C','È' => 'E','É' => 'E','Ê' => 'E','Ë' => 'E','Ì' => 'I','Í' => 'I','Î' => 'I','Ï' => 'I','Ñ' => 'N','Ò' => 'O','Ó' => 'O','Ô' => 'O','Õ' => 'O','Ö' => 'O','Ø' => 'O','Ù' => 'U','Ú' => 'U','Û' => 'U','Ü' => 'U','Ý' => 'Y','à' => 'a','á' => 'a','â' => 'a','ã' => 'a','ä' => 'a','å' => 'a','ç' => 'c','è' => 'e','é' => 'e','ê' => 'e','ë' => 'e','ì' => 'i','í' => 'i','î' => 'i','ï' => 'i','ñ' => 'n','ò' => 'o','ó' => 'o','ô' => 'o','õ' => 'o','ö' => 'o','ø' => 'o','ù' => 'u','ú' => 'u','û' => 'u','ü' => 'u','ý' => 'y','ÿ' => 'y'));
        $clean = strtr($clean, array('Þ' => 'TH', 'þ' => 'th', 'Ð' => 'DH', 'ð' => 'dh', 'ß' => 'ss', 'Œ' => 'OE', 'œ' => 'oe', 'Æ' => 'AE', 'æ' => 'ae', 'µ' => 'u','—' => '-'));
        // "\w-\." is rejected as an invalid range by PCRE2 (PHP 7.3+), so the stray hyphen is dropped.
        $clean = str_replace("--", "-", preg_replace("/[^a-z0-9-]/i", "", preg_replace(array('/\s/', '/[^\w\.\-]/'), array('-', ''), $clean)));
    
        return ($force_lowercase) ?
            (function_exists('mb_strtolower')) ?
                mb_strtolower($clean, 'UTF-8') :
                strtolower($clean) :
            $clean;
    }
    
    <?php
    
    echo URLify::filter (' J\'étudie le français ');
    // "jetudie-le-francais"
    
    echo URLify::filter ('Lo siento, no hablo español.');
    // "lo-siento-no-hablo-espanol"
    
    ?>
    
    <?php
    
    echo URLify::filter ('фото.jpg', 60, "", true);
    // "foto.jpg"
    
    ?>