Regex: How do I remove all special characters except spaces and dashes from a Python string?

regex python-2.7

I want to strip all special characters from a Python string except dashes and spaces.

Is this correct?

import re
my_string = "Web's GReat thing-ok"
pattern = re.compile(r'[^A-Za-z0-9 -]')
new_string = pattern.sub('', my_string)
new_string
>> 'Webs GReat thing-ok'
# then make it lowercase and replace spaces with underscores
# new_string = new_string.lower().replace(" ", "_")
# new_string
# >> 'webs_great_thing-ok'

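As shown, once the other special characters are removed I ultimately want to replace the spaces with underscores, but I thought I would do it in stages. Is there a Pythonic way to do it all in one go?

For context, I am using this input as a MongoDB collection name, so the constraint on the final string is: alphanumerics plus dashes and underscores are allowed.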

You are actually trying to "slugify" the string.

If you don't mind using a third-party (and Python 2 specific) library, you can use slugify (pip install slugify).

You can also implement it yourself. All of slugify's code is:

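import re
import unicodedata

def slugify(string):
    # Python 2 only: `string` must be a unicode object, e.g. u"Web's GReat thing-ok"
    # 1. NFKD-normalize, then drop every character that is not ASCII
    ascii_text = unicodedata.normalize('NFKD', string).encode('ascii', 'ignore')
    # 2. Remove everything except word characters, whitespace and dashes
    cleaned = re.sub(r'[^\w\s-]', '', ascii_text).strip().lower()
    # 3. Collapse runs of whitespace and dashes into a single dash
    return re.sub(r'[-\s]+', '-', unicode(cleaned))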

Note that this is Python 2 specific.
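For readers on Python 3, the function above could be ported roughly as follows. This is a minimal sketch rather than the library's own code: the name slugify_py3 is hypothetical, and it assumes the input is already a str.

```python
import re
import unicodedata


def slugify_py3(text):
    """Python 3 sketch of the slugify function above (hypothetical name)."""
    # Decompose accented characters, then drop whatever is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove everything except word characters, whitespace and dashes.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text).strip().lower()
    # Collapse runs of whitespace and dashes into a single dash.
    return re.sub(r"[-\s]+", "-", cleaned)


print(slugify_py3("Web's GReat thing-ok"))  # webs-great-thing-ok
```

The only real differences from the Python 2 version are the explicit decode back to str and the absence of the unicode() call.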

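Coming back to your example, you can make it a one-liner; whether it is Pythonic enough is for you to decide. The range can be shortened to A-z in place of A-Za-z, but be aware that A-z also matches the six ASCII characters that sit between Z and a ([, \, ], ^, _ and `), so the two ranges are not equivalent.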
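To make the difference between the two ranges concrete, here is a small sketch. The sample string is made up for illustration, and the chained expression at the end is only one way such a one-liner might look (it reuses the question's own steps), not necessarily the expression the answer had in mind.

```python
import re

sample = "Web's [GReat]_thing-ok^"  # hypothetical input with brackets, underscore, caret

# The explicit ranges remove the apostrophe, brackets, underscore and caret.
strict = re.sub(r"[^A-Za-z0-9 -]", "", sample)

# A-z spans ASCII codes 65..122, which also covers [ \ ] ^ _ ` (codes 91..96),
# so those characters survive; only the apostrophe is removed.
loose = re.sub(r"[^A-z0-9 -]", "", sample)

print(strict)  # Webs GReatthing-ok
print(loose)   # Webs [GReat]_thing-ok^

# The question's two stages chained into a single expression.
one_liner = re.sub(r"[^A-Za-z0-9 -]", "", "Web's GReat thing-ok").lower().replace(" ", "_")
print(one_liner)  # webs_great_thing-ok
```

For a MongoDB collection name the extra underscore is harmless, but the brackets, backslash, caret and backtick that A-z lets through are probably not wanted.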

Update: there appears to be a more robust, Python 3 compatible "slugify" library.

A one-liner, as requested:

>>> import re, unicodedata
>>> value = "Web's GReat thing-ok"
>>> re.sub(r'\s+', '_', re.sub(r'[^\w\s-]', '', unicodedata.normalize('NFKD', unicode(value)).encode('ascii', 'ignore').decode('ascii')).strip().lower())
u'webs_great_thing-ok'
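The session above relies on Python 2's unicode(). As a rough sketch for Python 3, the same steps can be unpacked into a small helper and the result checked against the constraint stated in the question. The name to_collection_name is hypothetical, and the check assumes lowercase output is acceptable.

```python
import re
import unicodedata


def to_collection_name(value):
    """Hypothetical helper: Python 3 rendering of the one-liner above."""
    # Strip accents and any other non-ASCII characters.
    ascii_text = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    # Keep only word characters, whitespace and dashes, then trim and lowercase.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text).strip().lower()
    # Replace each run of whitespace with a single underscore.
    return re.sub(r"\s+", "_", cleaned)


name = to_collection_name("Web's GReat thing-ok")
print(name)  # webs_great_thing-ok

# Constraint from the question: only letters, digits, dashes and underscores.
assert re.fullmatch(r"[a-z0-9_-]+", name)
```

Splitting the expression into named steps costs a few lines but makes each stage easy to test on its own.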
