
PHP: How can I detect whether a given URL is the current URL?

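I need to detect whether a supplied URL matches the URL that is currently being viewed. Note that all of the following URLs are valid and semantically equivalent: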

https://www.example.com/path/to/page/index.php?parameter=value
https://www.example.com/path/to/page/index.php
https://www.example.com/path/to/page/
https://www.example.com/path/to/page
http://www.example.com/path/to/page
//www.example.com/path/to/page
//www/path/to/page
../../../path/to/page
../../to/page
../page
./
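The final function must return true if the given URL points to the current page, and false otherwise. I do not have a list of expected URLs; this is for a client who simply wants a link to be disabled when it points to the current page. Note that I want to ignore parameters, because they do not indicate the current page on this site. I have got as far as the following regex: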

/^((https?:)?\/\/www(\.example\.com)\/path\/to\/page\/?(index.php)?(\?.+=.*(\&.+=.*)*)?)|(\.\/)$/i
where https?, www, \.example\.com, \/path\/to\/page, and index.php are detected dynamically via $_SERVER["PHP_SELF"] and assembled into the regex. However, it does not match relative URLs such as ../../to/page.

Edit: I have made a bit more progress with the regex:
Now I just need PHP to build the regex dynamically for any given page.

You do not actually need a regex to strip the query parameters; strtok can do it. To check the output for a list of URLs:

$url_list = <<<URL
https://www.example.com/path/to/page/index.php?parameter=value
https://www.example.com/path/to/page/index.php
...
./?parameter=value
./
URL;

$urls = explode("\n", $url_list);
foreach ($urls as $url) {
    // trim() drops stray whitespace around each line of the heredoc
    $url = strtok(trim($url), '?'); // remove everything after ?
    echo $url."\n";
}
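
As a variation on the loop above, parse_url can return just the path component, which drops the scheme, host, query string and fragment in one call. This is only a minimal sketch; the helper name pathOnly is made up for illustration and is not part of the answer.

```php
<?php
// Hypothetical helper (not from the answer above): returns only the path
// component of a URL, without scheme, host, query string or fragment.
function pathOnly(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    // parse_url() gives null when there is no path and false on badly malformed input.
    return is_string($path) ? $path : '';
}

echo pathOnly('https://www.example.com/path/to/page/index.php?parameter=value#top'), "\n";
echo pathOnly('https://www.example.com/path/to/page/'), "\n";
```

Unlike strtok, this also discards the host, so absolute and root-relative links end up in the same shape before they are compared.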

You can use this approach:

function checkURL($me, $s) {
   $dir = dirname($me) . '/';
   // you may need to refine this
   // preg_replace() is used because preg_filter() returns null when no pattern matches
   $s = preg_replace(array('~^//~', '~/$~', '~\?.*$~', '~\.\./~'),
                     array('', '', '', $dir), $s);
   // parse resulting URL
   $url = parse_url($s);
   // match parsed URL's path with self
   return isset($url['path']) && $url['path'] === $me;
}

// your page's URL with stripped out .php
$me = str_replace('.php', '', $_SERVER['PHP_SELF']);

// assume this is the URL you are matching against
$s = '../page/';

// compare $me with $s
$ret = checkURL($me, $s);

var_dump($ret);
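
The "you may need to refine this" step could also be done without regex substitution, by resolving the link against the current path the way a browser would and then comparing normalised paths. The following is a sketch under assumptions, not the answerer's method: the function names are hypothetical, only paths are compared (scheme and host are ignored), and a trailing index.<ext> is treated as implied.

```php
<?php
// Collapse "." and ".." segments of an absolute path, in the spirit of
// RFC 3986 section 5.2.4. Empty segments are dropped too, so doubled and
// trailing slashes disappear.
function removeDotSegments(string $path): string
{
    $out = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($out);              // step up one directory (no-op at the root)
        } elseif ($segment !== '' && $segment !== '.') {
            $out[] = $segment;
        }
    }
    return '/' . implode('/', $out);
}

// Hypothetical normaliser: treats "/dir/index.<ext>", "/dir/" and "/dir" as the same page.
function normalisePath(string $path): string
{
    $path = preg_replace('~/index\.[^/]+$~i', '', removeDotSegments($path));
    return $path === '' ? '/' : $path;
}

// $currentPath is assumed to be an absolute path such as $_SERVER['PHP_SELF'].
function pointsToCurrentPage(string $link, string $currentPath): bool
{
    $parts = parse_url($link);
    if ($parts === false) {
        return false;                     // seriously malformed URL
    }
    $linkPath = $parts['path'] ?? '';
    if ($linkPath === '') {
        // A link without a path stays on this page, unless it names a host (site root).
        $linkPath = isset($parts['host']) ? '/' : $currentPath;
    } elseif ($linkPath[0] !== '/') {
        // Relative reference: resolve against the directory of the current path.
        $slash = strrpos($currentPath, '/');
        $base = $slash === false ? '/' : substr($currentPath, 0, $slash + 1);
        $linkPath = $base . $linkPath;
    }
    return normalisePath($linkPath) === normalisePath($currentPath);
}

var_dump(pointsToCurrentPage('../../to/page', '/path/to/page/index.php'));
var_dump(pointsToCurrentPage('../other', '/path/to/page/index.php'));
```

Relative links are resolved against the directory of the current path, so the result for ./ and ../page depends on whether the current path ends in a file name or a slash. Because the host is ignored, a link to a different site with the same path would also count as the current page; a host comparison would have to be added if that matters.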

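I have been working on this over the past few days rather than just sitting and waiting for an answer. I have come up with something that works on my test platform; what does everyone else think? It feels a bit bloated, but it also feels bulletproof.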

Debug echoes are included in case you want to print anything.

global $debug;$debug = false; // toggle debug echoes and var_dumps


/**
 * Returns a boolean indicating whether the given URL is the current one.
 * 
 * @param $otherURL the other URL, as a string. Can be any URL, relative or canonical. Invalid URLs will not match.
 * 
 * @return true iff the given URL points to the same place as the current one
 */
function isCurrentURL($otherURL)
{global $debug;
    if($debug)echo"<!--\r\nisCurrentURL($otherURL)\r\n{\r\n";

    // BEGIN Parse other URL
    $otherParts = parse_url($otherURL);
    $otherHost = $otherParts["host"] ?? null; // if the host is set and is not null, use it. Else, use null.
    $otherDomain = explode(".", (string) $otherHost);
    $otherSubdomain = array_shift($otherDomain); // subdom only
    $otherDomain = implode(".", $otherDomain); // domain only
    $otherFilepath = $otherParts["path"] ?? ""; // empty string keeps preg_match() happy later on
    $otherProtocol = $otherParts["scheme"] ?? null;
    // END Parse other URL

    // BEGIN Get current URL
    #if($debug){echo '$_SERVER == '; var_dump($_SERVER);}
    $thisProtocol = $_SERVER["HTTP_X_FORWARDED_PROTO"]; // http or https
    $thisHost = $_SERVER["HTTP_HOST"]; // subdom or subdom.domain.tld
    $thisDomain = explode(".", $thisHost);
    $thisSubdomain = array_shift($thisDomain); // subdom only
    $thisDomain = implode(".", $thisDomain); // domain only
    if ($thisDomain == "")
        $thisDomain = $otherDomain;
    $thisFilepath = $_SERVER["PHP_SELF"]; // /path/to/file.php
    $thisURL = "$thisProtocol://$thisHost$thisFilepath";
    // END Get current URL

    // unlikely, but possible. Might as well check. Done here because $thisURL must be built first.
    if ($thisURL == $otherURL)
        return true;

    if($debug)echo"Current URL is $thisURL ($thisProtocol, $thisSubdomain, $thisDomain, $thisFilepath).\r\n";
    if($debug)echo"Other URL is $otherURL ($otherProtocol, $otherHost, $otherFilepath).\r\n";

    $thisDomainRegexed = isset($thisDomain) && $thisDomain != null && $thisDomain != "" ? "(\." . str_replace(".","\.",$thisDomain) . ")?" : ""; // prepare domain for insertion into regex
    //                                                                                                      v this makes the last slash before index.php optional
    $regex = "/^(($thisProtocol:)?\/\/$thisSubdomain$thisDomainRegexed)?" . preg_replace('/index\\\..+$/i','?(index\..+)?', str_replace(array(".", "/"), array("\.", "\/"), $thisFilepath)) . '$/i';

    if($debug)echo "\r\nregex is $regex\r\nComparing regex against $otherURL";
    if (preg_match($regex, $otherURL))
    {
        if($debug)echo"\r\n\tIt's a match! Returning true...\r\n}\r\n-->";
        return true;
    }
    else
    {
        if($debug)echo"\r\n\tOther URL is NOT a fully-qualified URL in this subdomain. Checking if it is relative...";
        if($otherURL == $thisFilepath) // somewhat likely
        {
            if($debug)echo"\r\n\t\tOther URL and this filepath are an exact match! Returning true...\r\n}\r\n-->";
            return true;
        }
        else
        {
            if($debug)echo"\r\n\t\tFilepath is not an exact match. Testing against regex...";
            $regex = regexFilepath($thisFilepath);
            if($debug)echo"\r\n\t\tNew Regex is $regex";
            if($debug)echo"\r\n\t\tComparing regex against $otherFilepath...";
            if (preg_match($regex, $otherFilepath))
            {
                if($debug)echo"\r\n\t\t\tIt's a match! Returning true...\r\n}\r\n-->";
                return true;
            }
        }
    }
    if($debug)echo"\r\nI tried my hardest, but couldn't match $otherURL to $thisURL. Returning false...\r\n}\r\n-->";
    return false;
}

/**
 * Uses the given filepath to create a regex that will match it in any of its relative representations.
 * 
 * @param $path the filepath to be converted
 * 
 * @return a regex that matches a all relative forms of the given filepath
 */
function regexFilepath($path)
{global $debug;
    if($debug)echo"\r\nregexFilepath($path)\r\n{\r\n";

    $filepathArray = explode("/", $path);
    if (count($filepathArray) == 0)
        throw new Exception("given parameter not a filepath: $path");
    if ($filepathArray[0] == "") // this can happen if the path starts with a "/"
        array_shift($filepathArray); // strip the first element off the array
    $isIndex = preg_match("/^index\..+$/i", end($filepathArray));
    $filename = array_pop($filepathArray);

    if($debug){var_dump($filepathArray);}

    $ret = '';
    foreach($filepathArray as $i)
        $ret = "(\.\.\/$ret$i\/)?"; // make a pseudo-recursive relative filepath
    if($debug)echo "\r\n$ret";
    $ret = preg_replace('/\)\?$/', '?)', $ret); // remove the last '?' and add one before the last '\/'
    if($debug)echo "\r\n$ret";
    $ret = '/^' . ($ret == '' ? '\.\/' : "((\.\/)|$ret)") . ($isIndex ? '(index\..+)?' : str_replace('.', '\.', $filename)) . '$/i'; // if this filepath leads to an index.php (etc.), then that filename is implied and irrelevant.

    if($debug)echo"\r\n}\r\n";
    return $ret; // without this the caller receives null instead of the regex
}

This seems to match everything I need it to match, and nothing I don't.

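First of all, there is no way to predict the complete list of valid URLs that will result in the current page being displayed, because you cannot predict (or control) external links that may lead back to the page. What if someone uses TinyURL or bit.ly? A regex will not cut the mustard.

If what you need is to make sure a link does not lead to the same page, then you need to test it. Here is the basic concept: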

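  • Every page gets a unique ID. Call it a serial number. It should be persistent. The serial number should be embedded at a predictable place in the page (though it may be invisible).

  • When a page is built, PHP has to walk through all of that page's links, visit each one, and determine whether the link resolves to a page whose serial number matches the serial number of the calling page.

  • If the serial numbers do not match, display the link as a link. Otherwise, display something else.

  • Obviously, this would be an arduous, resource-intensive process for page production. You really do not want to solve your problem this way.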
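
The comparison step of the serial-number idea can be sketched as follows. Everything here is an assumption made for illustration: the meta tag name page-serial, the function names, and the premise that the target page's HTML has already been fetched by some other means.

```php
<?php
// Hypothetical convention: every page carries
// <meta name="page-serial" content="..."> in its head.
function extractSerial(string $html): ?string
{
    if (preg_match('~<meta\s+name="page-serial"\s+content="([^"]+)"~i', $html, $m)) {
        return $m[1];
    }
    return null; // no serial number found in this markup
}

// $targetHtml is the markup fetched from the link target (fetching is left out here).
function isSamePage(string $currentSerial, string $targetHtml): bool
{
    return extractSerial($targetHtml) === $currentSerial;
}

$html = '<html><head><meta name="page-serial" content="a1b2c3"></head></html>';
var_dump(isSamePage('a1b2c3', $html));
```

The expensive part, requesting every linked page while rendering, is deliberately not shown; that is the cost the last point in the list warns about.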

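With your "end goal" comment in mind, I suspect your best approach is an approximation. Here are a few strategies.

The first option is also the simplest. If you are building a content management system that normally creates links in one format, just support that format. Wikipedia's ...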
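
A minimal sketch of that first option, assuming the content management system only ever emits root-relative links such as /path/to/page. The function name and the chosen link format are hypothetical, not taken from the answer.

```php
<?php
// Assumption: the CMS only creates root-relative links ("/path/to/page").
// Every other form is simply treated as "not the current page".
function isCurrentCmsLink(string $link, string $currentPath): bool
{
    if ($link === '' || $link[0] !== '/' || substr($link, 0, 2) === '//') {
        return false; // relative, absolute or protocol-relative: unsupported format
    }
    $linkPath = strtok($link, '?#'); // drop the query string and fragment
    // Ignore a trailing slash on either side.
    return rtrim($linkPath, '/') === rtrim($currentPath, '/');
}

var_dump(isCurrentCmsLink('/path/to/page/?parameter=value', '/path/to/page'));
var_dump(isCurrentCmsLink('/path/to/other', '/path/to/page'));
```

Supporting a single format keeps the check to a string comparison, at the price of never disabling hand-written links in other forms.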