PHP: trim www (if present) and remove the path

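I have a list of domain names such as:

array(
'http://example.co.uk/foo/bar',
'http://www.example.com/foo/bar',
'http://example.net/foo/bar')

and so on.

I am using

parse_url($url, PHP_URL_HOST);

to strip everything except the domain name. It partly works, but it keeps the www part when it is present. How can I remove "www" if it exists? I tried removing it explicitly from the domains in the array, but after parsing it comes back as www.example.com.

So I would like to return: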

www.example.com/foo/bar > example
www.example.co.uk/foo/bar > example
example.com/foo/bar > example
example.net/foo/bar > example

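If you only want to strip "www", you can use str_replace, with strpos to check whether the string contains "www":

$url = ""; // put the URL or host name here
if (strpos($url, 'www.') !== false) {
    // Note: str_replace removes every occurrence of "www.", not only a leading one.
    $url = str_replace("www.", "", $url);
}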

Edit: to strip almost everything from the URL (including the domain extension and www, if present), you can do the following:

$result = preg_split('/(?=\.[^.]+$)/', "example.com/foo/bar")[0];
if (strpos($result,'www') !== false) {
    $result = str_replace("www.", "", $result);
}
var_dump($result);
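
Because str_replace removes "www." wherever it occurs, an anchored replacement may be safer when only a leading www should go. The following is a minimal sketch, not part of the original answer; the helper name strip_leading_www is hypothetical, and it assumes the input is the host returned by parse_url.

```php
<?php

// Hypothetical helper (not from the original answer): removes "www." only
// when it is the first label of the host name, ignoring case.
function strip_leading_www($host)
{
    return preg_replace('/^www\./i', '', $host);
}

$urls = array(
    'http://example.co.uk/foo/bar',
    'http://www.example.com/foo/bar',
    'http://example.net/foo/bar',
);

foreach ($urls as $url) {
    // parse_url drops the scheme and path, leaving only the host.
    $host = parse_url($url, PHP_URL_HOST);
    echo strip_leading_www($host), "\n";
}
// Prints:
// example.co.uk
// example.com
// example.net
```

This keeps the domain extension; a host such as awwwards.com is left untouched because the pattern is anchored at the start of the string.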

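The function below is not a general-purpose way to get the domain name or the domain part of an FQDN. Instead, it returns the first label (reading left to right) if that label is not www, and the second label if it is, which is what the question asks for.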

<?php

function get_domain_from_host($host)
{
    $parts = explode('.', $host);
    $domain = strpos($host, 'www.') === 0
        ? next($parts)
        : current($parts);

    return $domain;
}

function test()
{
    $urls_wanted = array(
        'http://www.example.com/foo/bar' => 'example',
        'http://www.example.co.uk/foo/bar' => 'example',
        'http://example.com/foo/bar' => 'example',
        'http://example.net/foo/bar' => 'example'
    );

    foreach($urls_wanted as $url => $wanted) {
        $host   = parse_url($url, PHP_URL_HOST);
        $domain = get_domain_from_host($host);
        print assert($wanted == $domain);
    }
}

test(); // Outputs: 1111
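
To apply the same idea to the whole array from the question, something like the following could work. It is a sketch under assumptions: the function name first_label_without_www is hypothetical, and it compares the first label exactly with "www" rather than checking a string prefix.

```php
<?php

// Hypothetical variant (not from the original answer): returns the second
// label when the first one is exactly "www", otherwise the first label.
function first_label_without_www($host)
{
    $parts = explode('.', strtolower($host));
    if ($parts[0] === 'www' && count($parts) > 1) {
        return $parts[1];
    }
    return $parts[0];
}

$urls = array(
    'http://example.co.uk/foo/bar',
    'http://www.example.com/foo/bar',
    'http://example.net/foo/bar',
);

// Map every URL to its site name in one pass.
$names = array_map(function ($url) {
    return first_label_without_www(parse_url($url, PHP_URL_HOST));
}, $urls);

print_r($names); // every element is "example"
```

A host such as www.com still yields com here, so this shares the limitation of any approach that treats a leading www as a prefix to discard.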

You can match with the regex

~(?:https?://)?(?:www\.)?([^\./]+)~i

Limitations:

Note that it will incorrectly parse the valid domain www.com and return com instead of www. It only misparses domains whose name part is www (www.net, www.co.uk, etc.).

Autopsy:

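- `~` is our delimiter: the regex engine knows that the next time it sees this character, only modifiers follow
- `(?:https?://)?` is an optional non-capturing group:
  - `?:` means it is a non-capturing group (without it, we would have to use `return $match[3]`)
  - `http` is the literal string `http`
  - `s?` is the character `s` matched 0 to 1 times (optional)
  - `://` is the literal string `://`
  - `(..)?` means the whole group is matched 0 to 1 times (optional)
- `(?:www\.)?` is an optional non-capturing group:
  - `?:` means it is a non-capturing group (without it, we would have to use `return $match[3]`)
  - `www\.` is the literal string `www.`; the dot has to be escaped with a backslash because an unescaped dot has a special meaning in a regex (any character)
  - `(..)?` means the whole group is matched 0 to 1 times (optional)
- `([^\./]+)` is the capturing group:
  - `[^\./]+` matches any character that is not `.` or `/`, 1 to infinite times
- `~i` is our closing delimiter followed by the modifier `i`, which makes the whole regex case insensitive (so we also match `HTTPS` and `WwW`)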

Function:

<?php

function getSiteName($url) {
    if (preg_match('~(?:https?://)?(?:www\.)?([^\./]+)~i', $url, $match)) {
        return $match[1];
    }

    throw new \Exception(sprintf('Could not match URL "%s"', $url));
}

$siteName = getSiteName('http://www.example.com/foo/bar');
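
As a quick check, the function can be run against the inputs from the question plus the www.com case from the limitations. This is only a usage sketch: getSiteName is repeated from the answer so the snippet runs on its own, and the $inputs array is an assumption made for illustration.

```php
<?php

// Same function as in the answer, repeated so this snippet is self-contained.
function getSiteName($url) {
    if (preg_match('~(?:https?://)?(?:www\.)?([^\./]+)~i', $url, $match)) {
        return $match[1];
    }

    throw new \Exception(sprintf('Could not match URL "%s"', $url));
}

// Illustrative inputs: the four cases from the question and the known
// problem case from the limitations.
$inputs = array(
    'www.example.com/foo/bar',
    'www.example.co.uk/foo/bar',
    'example.com/foo/bar',
    'http://example.net/foo/bar',
    'www.com',
);

foreach ($inputs as $input) {
    printf("%s > %s\n", $input, getSiteName($input));
}
// The first four lines end in "example"; the last line is "www.com > com".
```

Because the scheme group is optional, the function accepts both full URLs and bare hosts, so calling parse_url first is not required with this approach.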