How to generate URL slugs in Perl?

perl seo url-rewriting

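Web frameworks such as Rails and Django have built-in support for "slugs", which are used to generate readable and SEO-friendly URLs.

A slug string typically contains only the characters a-z, 0-9 and -, and can therefore be written without URL-escaping (think "foo%20bar").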

I'm looking for a Perl slug function that, given any valid Unicode string, returns a slug representation (a-z, 0-9 and -).

A super trivial slug function would be something along the lines of:

$input = lc($input);
$input =~ s/[^a-z0-9-]//g;

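However, this implementation does not handle internationalization and accents (I'd want ë to become e). One way around this would be to enumerate all the special cases, but that would not be very elegant. I'm looking for something more thought out and general.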

My question:

What is the most general/practical way to generate Django/Rails-type slugs in Perl? This is how I solved the same problem in Java.

Adding to the beginning of your chain looks like it would do what you need.

Are you looking for something like this?

The code currently used in Django translates (roughly) to the following Perl code:

use Unicode::Normalize;

sub slugify($) {
    my ($input) = @_;

    $input = NFKD($input);         # Normalize (decompose) the Unicode string
    $input =~ tr/\000-\177//cd;    # Strip non-ASCII characters (>127)
    $input =~ s/[^\w\s-]//g;       # Remove all characters that are not word characters (includes _), spaces, or hyphens
    $input =~ s/^\s+|\s+$//g;      # Trim whitespace from both ends
    $input = lc($input);
    $input =~ s/[-\s]+/-/g;        # Replace all occurrences of spaces and hyphens with a single hyphen

    return $input;
}

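Since you also want to change accented characters to unaccented ones, calling unidecode (defined in Text::Unidecode) before stripping the non-ASCII characters seems to be your best bet.

In that case, the function could look like this: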

use Unicode::Normalize;
use Text::Unidecode;

sub slugify_unidecode($) {
    my ($input) = @_;

    $input = NFC($input);          # Normalize (recompose) the Unicode string
    $input = unidecode($input);    # Convert non-ASCII characters to closest equivalents
    $input =~ s/[^\w\s-]//g;       # Remove all characters that are not word characters (includes _), spaces, or hyphens
    $input =~ s/^\s+|\s+$//g;      # Trim whitespace from both ends
    $input = lc($input);
    $input =~ s/[-\s]+/-/g;        # Replace all occurrences of spaces and hyphens with a single hyphen

    return $input;
}

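The former works well for strings that are primarily ASCII, but falls short when the entire string consists of non-ASCII characters, since they all get stripped out, leaving an empty string.

Sample output: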

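string        | slugify       | slugify_unidecode
-------------------------------------------------
hello world     hello-world     hello-world
北亰                            bei-jing
liberté         liberte         liberte

Note how 北亰 gets slugified to nothing with the Django-inspired implementation. Note also the difference the normalization form makes: with NFKD, the "é" in liberté is decomposed into "e" plus a combining accent, so stripping the non-ASCII part leaves "liberte". Had the string been normalized with NFC and then stripped without unidecode, the recomposed "é" would be removed entirely, leaving "libert". Both columns assume the input is a decoded character string; if raw UTF-8 bytes are passed instead, slugify returns "liberta".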
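When reproducing the sample output, it is worth checking whether the input has been decoded into characters. The following is a minimal sketch, not part of the original answer: `slugify_nfkd` is a hypothetical name for a standalone copy of the Django-style steps, and it uses only the core modules Unicode::Normalize and Encode.

```perl
use strict;
use warnings;
use utf8;                              # string literals in this file are character strings
use Encode qw(encode);
use Unicode::Normalize qw(NFKD);

# Hypothetical standalone copy of the Django-style steps, so this sketch runs on its own.
sub slugify_nfkd {
    my ($input) = @_;
    $input = NFKD($input);             # decompose: "é" becomes "e" + combining accent
    $input =~ tr/\000-\177//cd;        # delete everything outside ASCII
    $input =~ s/[^\w\s-]//g;           # keep word characters, whitespace and hyphens
    $input =~ s/^\s+|\s+$//g;          # trim both ends
    $input = lc $input;
    $input =~ s/[-\s]+/-/g;            # collapse runs of spaces/hyphens into one hyphen
    return $input;
}

my $chars = "liberté";                 # decoded character string
my $bytes = encode('UTF-8', $chars);   # the same text as raw UTF-8 octets

print slugify_nfkd($chars), "\n";          # liberte
print slugify_nfkd($bytes), "\n";          # liberta (octets misread as Latin-1 characters)
print slugify_nfkd("hello world"), "\n";   # hello-world
```

If the second result shows up in practice, decoding the input first (for example with Encode's `decode('UTF-8', ...)`) is probably the missing step.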

Used for making slugs in the blogging software Movable Type/Melody.

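The most straightforward solution is to use something that already does what you need. It is a small amount of code that nicely gives you a slugify function.

It relies on stripping the accents from characters.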

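Use the same approach as in Java.

brian: Yes, the operation I didn't know how to translate was "String normalized = Normalizer.normalize(nowhitespace, Form.NFD);".

Unicode::Normalize solves that. See Cameron's answer.

Tested with some Chinese, and this seems to be what I need.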
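For readers coming from the Java line quoted in the comment, a rough Perl counterpart of the NFD step might look like the sketch below. The function name `slugify_nfd` and the choice to drop combining marks with `\p{Mn}` are assumptions made for illustration; only `NFD` from the core module Unicode::Normalize is taken from the discussion.

```perl
use strict;
use warnings;
use utf8;                              # string literals in this file are character strings
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decompose, drop combining marks, then keep only a-z, 0-9 and -.
sub slugify_nfd {
    my ($input) = @_;
    $input = NFD($input);              # canonical decomposition, like Form.NFD in Java
    $input =~ s/\p{Mn}+//g;            # remove nonspacing (combining) marks
    $input = lc $input;
    $input =~ s/[^a-z0-9]+/-/g;        # every other run of characters becomes one hyphen
    $input =~ s/^-+|-+$//g;            # trim leading and trailing hyphens
    return $input;
}

print slugify_nfd("Crème Brûlée"), "\n";   # creme-brulee
```

Like the NFKD version, this approach yields an empty string for input such as 北亰, so a transliteration step (for example Text::Unidecode) would still be needed for non-Latin scripts.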