PHP: How do I append to all URLs in a string?


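How should I append something to the end of every URL in an HTML string that is about to be sent as an email? I want to add Google Analytics campaign tracking like this:
?utm_source=email&utm_medium=email&utm_campaign=product_notify

99% of the pages will not end in ".html", and some URLs may already end with something like ?sr=1.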

You can use the following snippet to append your Google Analytics GET parameters to the existing parameters of the current script's URI:

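function getQuery() {
    $url = parse_url($_SERVER['REQUEST_URI']);
    // 'query' is absent when the request URI has no query string.
    $query = isset($url['query']) ? $url['query'] . '&' : '';
    return $query . 'utm_source=email&utm_medium=email&utm_campaign=product_notify';
}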
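The snippet above only looks at the current request URI, whereas the question is about URLs taken from a string. As a rough sketch of the same idea applied to a single URL string, something like the following might work. The helper name `appendQueryParams` is hypothetical and not part of the original answer; it picks `?` or `&` depending on whether a query string already exists and keeps any `#fragment` at the end.

```php
<?php
// Hypothetical helper (not from the original answer): append parameters to one URL string.
function appendQueryParams($url, array $params) {
    // Split off the fragment so the parameters land before '#...'.
    $fragment = '';
    $hashPos = strpos($url, '#');
    if ($hashPos !== false) {
        $fragment = substr($url, $hashPos);
        $url = substr($url, 0, $hashPos);
    }
    if (strpos($url, '?') === false) {
        // No query string yet, so start one.
        $separator = '?';
    } else {
        // Avoid doubling the separator when the URL already ends in '?' or '&'.
        $last = substr($url, -1);
        $separator = ($last === '?' || $last === '&') ? '' : '&';
    }
    // Pass '&' explicitly so the result does not depend on arg_separator.output.
    return $url . $separator . http_build_query($params, '', '&') . $fragment;
}

$params = array(
    'utm_source'   => 'email',
    'utm_medium'   => 'email',
    'utm_campaign' => 'product_notify',
);
echo appendQueryParams('http://example.com/page?sr=1', $params), "\n";
```

The example URL is made up; a URL without a query string would get `?` instead of `&` before the first parameter.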

Hmm... you could do something like this:

function AppendCampaignToString($string) {
    // Capture the start of the tag, the href value, and the rest of the tag.
    $regex = '#(<a href=")([^"]*)("[^>]*?>)#i';
    return preg_replace_callback($regex, '_AppendCampaignToString', $string);
}
function _AppendCampaignToString($match) {
    $url = $match[2];
    // Start a query string if there is none, otherwise extend the existing one.
    $url .= (strpos($url, '?') === false) ? '?' : '&';
    $url .= 'utm_source=email&utm_medium=email&utm_campaign=product_notify';
    return $match[1].$url.$match[3];
}

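The solution I built and tested last night:

I only match links that do not already have a query parameter starting with "utm_", while still including links where "utm_" appears as part of the path (before the query parameters) or as a substring of another parameter name (such as "xutm_").

To do this I combined positive and negative regex lookahead assertions.

I also allow the tag to have other attributes before and after href.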

$pattern = '/<a[^>]*href="(?=(((?!(\?|&)utm_).)*?>))[^"]*/i';
It matches every link whose tag does not contain "?utm_" or "&utm_" after href.
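To make the lookahead behaviour easier to see, here is a small sketch that runs a pattern in the spirit of the one described against a few sample links. The pattern is an assumed reconstruction using non-capturing groups, not necessarily character-for-character what the author used, and the sample links are invented for illustration.

```php
<?php
// Assumed pattern: match '<a ... href="URL' only when the rest of the tag
// (up to the first '>') contains neither '?utm_' nor '&utm_'.
$pattern = '/<a\s[^>]*href="(?=(?:(?!(?:\?|&)utm_).)*?>)[^"]*/i';

$links = array(
    '<a href="http://example.com/">plain</a>',
    '<a class="x" href="http://example.com/?sr=1">has query</a>',
    '<a href="http://example.com/?utm_source=old">already tagged</a>',
    '<a href="http://example.com/utm_page?xutm_id=1">utm_ in path and in another name</a>',
);

foreach ($links as $link) {
    // preg_match() returns 1 when the link would be rewritten, 0 otherwise.
    echo preg_match($pattern, $link) ? 'match:    ' : 'no match: ', $link, "\n";
}
```

Only the third link is skipped, because its href already carries a parameter introduced by `?utm_`.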

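Then I use a class-based callback so that I can pass in the query parameters to append (as extra arguments for the callback):

class LinkParameters {
    private $parameters;
    function __construct($params) {
        $this->parameters = $params;
    }
    function callback($matches) {
        // Use '&' if the matched href already has a query string, '?' otherwise.
        return $matches[0] . (preg_match('/\?[^"]/', $matches[0]) ? '&' : '?') . http_build_query($this->parameters);
    }
}
Prepare the query parameters to add to the links:

$params_to_add = array(
    'utm_source' => 'newsletter-sep13',
    'utm_medium' => 'email',
    'utm_campaign' => 'product-X'
);
$callback_helper = new LinkParameters($params_to_add);
Finally, I apply preg_replace_callback like this: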

$html = preg_replace_callback($pattern, array($callback_helper, 'callback'), $html);
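On PHP 5.3 and later, a closure with `use` can carry the parameters instead of a helper class. The following is a sketch of that swapped-in technique, not the author's code: the pattern is an assumed reconstruction of the one described above, and the sample HTML and parameter values are made up.

```php
<?php
// Assumed pattern: skip tags whose href already contains '?utm_' or '&utm_'.
$pattern = '/<a\s[^>]*href="(?=(?:(?!(?:\?|&)utm_).)*?>)[^"]*/i';
$params = array('utm_source' => 'newsletter', 'utm_medium' => 'email');

// The closure captures $params, so no helper object is needed.
$tag = function ($matches) use ($params) {
    // A '?' followed by at least one character means a query string exists.
    $separator = preg_match('/\?[^"]/', $matches[0]) ? '&' : '?';
    return $matches[0] . $separator . http_build_query($params, '', '&');
};

$html = '<a href="http://example.com/a">A</a> '
      . '<a href="http://example.com/b?sr=1">B</a> '
      . '<a href="http://example.com/c?utm_source=x">C</a>';

$result = preg_replace_callback($pattern, $tag, $html);
echo $result, "\n";
```

Links A and B receive the tracking parameters, while link C is left untouched because it is already tagged.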

An update to @ircmaxell's answer: the regex now matches even when there are attributes before href, and the code is simplified.

/**
 * @param string $body
 * @param string $campaign
 * @param string $medium
 * @return mixed
 */
protected function add_analytics_tracking_to_urls($body, $campaign, $medium = 'email') {
    return preg_replace_callback('#(<a.*?href=")([^"]*)("[^>]*?>)#i', function($match) use ($campaign, $medium) {
        $url = $match[2];
        if (strpos($url, '?') === false) {
            $url .= '?';
        } else {
            $url .= '&';
        }
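        // utm_source and utm_medium are both taken from $medium, as in the question's example.
        $url .= 'utm_source=' . urlencode($medium) . '&utm_medium=' . urlencode($medium) . '&utm_campaign=' . urlencode($campaign);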
        return $match[1] . $url . $match[3];
    }, $body);
}
Here is my solution. It is a simple problem with a fairly involved solution, but it works for all URL types:

$campaign = (object)['utm_source' => 'email', 'utm_medium' => 'email', 'utm_campaign' => 'abc'];
$host = 'www.me.com';

$html = preg_replace_callback(
        '#(<a.*?href=["\']?)(?<href>https?://[^\s"\']+)(["\']?.*?>.*?</a>)#si', function ($matches) use ($campaign, $host) {
    $url = parse_url($matches['href']);
    // if (isset($url['host']) && $url['host'] !== $host) return $matches[0];
    parse_str(isset($url['query']) ? $url['query'] : '', $query);
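    // isset() guards avoid "undefined property" warnings for keys such as
    // utm_term that are not set on $campaign; array_filter() drops the nulls.
    $query = array_merge(
        $query, array_filter(
                  [
                      'utm_source' => isset($campaign->utm_source) ? $campaign->utm_source : null,
                      'utm_medium' => isset($campaign->utm_medium) ? $campaign->utm_medium : null,
                      'utm_term' => isset($campaign->utm_term) ? $campaign->utm_term : null,
                      'utm_content' => isset($campaign->utm_content) ? $campaign->utm_content : null,
                      'utm_campaign' => isset($campaign->utm_campaign) ? $campaign->utm_campaign : null,
                  ]
              )
    );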
    return $matches[1] . // anchor part before url
    (isset($url['scheme']) ? $url['scheme'] . '://' : '') .
    (isset($url['user']) ? $url['user'] : '') .
    (isset($url['pass']) ? (isset($url['user']) ? ':' : '') . $url['pass'] : '') .
    (isset($url['user']) || isset($url['pass']) ? '@' : '').
    (isset($url['host']) ? $url['host'] : '') .
    (isset($url['port']) ? ':' . $url['port'] : '') .
    (isset($url['path']) ? $url['path'] : '') .
    '?' . http_build_query($query) .
    (isset($url['fragment']) ? '#' . $url['fragment'] : '') .
    $matches[3]; // anchor part after URL
}, $html
);
The result looks like this:

<a href="http://www.me.com?utm_source=email&utm_medium=email&utm_campaign=abc">Lorem</a>
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc">ipsum</a>
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc#section-2">dolor</a>
<a href="http://www.me.com/path-to-somewhere/file.php?utm_source=email&utm_medium=email&utm_campaign=abc">sit</a>
<a href="http://www.me.com/?utm_source=email&utm_medium=email&utm_campaign=abc">amet</a>
<a href="http://www.me.com/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc">consectetur</a>
<a href="http://www.me.com/?foo=bar&bar=foo&utm_source=email&utm_medium=email&utm_campaign=abc">consectetur</a>
<a href="http://www.NOTME.com?utm_source=email&utm_medium=email&utm_campaign=abc">existing utm params</a>
<a href="http://user:password@www.me.com/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a>
<a href="http://user:@www.me.com/?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a>
<a href="http://user@www.me.com?foo=bar&utm_source=email&utm_medium=email&utm_campaign=abc#section-3">elit</a>


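As you may have noticed, my code applies to every link in the HTML (not just me.com). If you want to filter by host name, uncomment the line right after parse_url().

Comments:

But that is not what I'm looking for. The URLs I want to append to are inside an HTML string.
Hmm, if utm_source is already given in the URL, should the new value be appended (PHP cannot handle that in the $_GET array) or override it, or the other way around?
Saved me a lot of time, very nice.
Very nice. This is the way to handle the problem.
Not a complete answer, but I was looking for something similar.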
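On the question of what happens when a URL already carries utm_source: once the query string has been parsed into an array, the way the arrays are combined decides which value survives. This is a minimal sketch with made-up values; it only illustrates the two standard PHP behaviours, not a recommendation from the answers above.

```php
<?php
// Existing query string of a link, parsed into an array.
parse_str('foo=bar&utm_source=existing', $existing);
$tracking = array('utm_source' => 'email', 'utm_medium' => 'email');

// array_merge(): for duplicate string keys the later array wins,
// so the tracking value replaces the one already in the URL.
$trackingWins = array_merge($existing, $tracking);

// The + operator keeps the left-hand value for duplicate keys,
// so the value already in the URL is preserved.
$existingWins = $existing + $tracking;

echo http_build_query($trackingWins, '', '&'), "\n"; // foo=bar&utm_source=email&utm_medium=email
echo http_build_query($existingWins, '', '&'), "\n"; // foo=bar&utm_source=existing&utm_medium=email
```

Either way the rebuilt query string contains each parameter name once, which avoids the duplicate-key ambiguity mentioned in the comment.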