Python: converting a string with special characters into one without them

python regex python-2.7

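I am using Python 2.7. If I have a string assigned to the name variable like this

name = "Test with-name and_underscore"

how can I convert it into the following string, which I can assign back to name?

name = "TestWithNameAndUnderscore"

Is a regular expression the way to go, or does Python have a built-in function for this?

So what I am looking for is: when a string contains underscores, dashes, spaces or any other special characters, it gets converted to the same thing without the underscore/dash/space/special character, and each word should start with a capital letter, e.g. "test name-this_here" becomes "TestNameThisHere".

If there are no spaces or special characters, do nothing. So if the string is "Helloworld", skip it and move on.

The reason I am doing this is that I am writing something for AWS using Python boto, and the resource calls have naming restrictions: names cannot contain non-alphanumeric characters.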

>>> import re
>>> name = "Test with-name and_underscore"
>>> print(''.join(x.capitalize() for x in re.compile(r'[^a-zA-Z0-9]').split(name)))
TestWithNameAndUnderscore

If you need to, you can also strip leading digits. Here is a more robust example that does so and also makes sure the resulting string is not empty:

>>> import re
>>> def fix_id(s, split=re.compile('[^a-zA-Z0-9]+|^[0-9]+').split):
...     result = ''.join(x.capitalize() for x in split(s))
...     if not result:
...         raise ValueError('Invalid ID (empty after edits)')
...     return result
... 
>>> fix_id("Test with-name and_underscore")
'TestWithNameAndUnderscore'
>>> fix_id("123 Test 456 with-name and_underscore 789")
'Test456WithNameAndUnderscore789'
>>> fix_id("Thisshouldbeunmolested")
'Thisshouldbeunmolested'
>>> fix_id('123')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in fix_id
ValueError: Invalid ID (empty after edits)

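Note that neither of these guarantees that identifiers are unique: for example, "Mary Sue" and "mary sue" map to the same identifier. If you need them to map to different identifiers, you can add a cache dictionary that maps the original strings and appends a suffix where necessary.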
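
The cache-dictionary idea could be sketched roughly as follows. This is only an illustration, not part of the original answer: the UniqueIds class and its get method are hypothetical names, and the numeric suffix scheme is an assumption.

```python
import re

# Same pattern as fix_id above: split on non-alphanumeric runs and leading digits.
_split = re.compile(r'[^a-zA-Z0-9]+|^[0-9]+').split


def fix_id(s):
    """CamelCase the alphanumeric runs of s, as in the answer above."""
    result = ''.join(x.capitalize() for x in _split(s))
    if not result:
        raise ValueError('Invalid ID (empty after edits)')
    return result


class UniqueIds(object):
    """Hypothetical helper: remembers which identifier each source string got."""

    def __init__(self):
        self._by_source = {}  # original string -> identifier handed out
        self._taken = set()   # identifiers already in use

    def get(self, s):
        # The same source string always gets the same identifier back.
        if s in self._by_source:
            return self._by_source[s]
        base = fix_id(s)
        candidate, n = base, 1
        # On a collision, append 2, 3, ... until the identifier is free.
        while candidate in self._taken:
            n += 1
            candidate = '%s%d' % (base, n)
        self._taken.add(candidate)
        self._by_source[s] = candidate
        return candidate
```

With this sketch, the first string that cleans up to a given identifier keeps it unchanged, and later colliding strings get a numeric suffix, so the result depends on the order in which strings are seen.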

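This can be done without regular expressions, using Python's isalnum() string method:

name = "Test with-name and_underscore"
# Keep only the alphanumeric characters (this strips, but does not capitalize)
new_name = ''.join(ch for ch in name if ch.isalnum())

Of course, if you insist on a regular expression, you can replace isalnum() with a suitable regex function.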
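
Since the question also asks for each word to start with a capital letter, the isalnum() idea can be extended along these lines. This is a minimal sketch; camel_case is a hypothetical name, not something from the original answer. Note that isalnum() may also accept non-ASCII letters and digits, which might matter for the AWS restriction.

```python
def camel_case(name):
    # Nothing to do if the string is already purely alphanumeric.
    if name.isalnum():
        return name
    # Turn every non-alphanumeric character into a space ...
    spaced = ''.join(ch if ch.isalnum() else ' ' for ch in name)
    # ... then upper-case only the first character of each word and join them.
    return ''.join(word[0].upper() + word[1:] for word in spaced.split())


print(camel_case("Test with-name and_underscore"))
```

Unlike capitalize(), this leaves the remaining characters of each word untouched, so a string such as "Helloworld" passes through unchanged.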

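I know a crude way to do it:

# str.replace returns a new string, so the result has to be assigned back
name = name.replace('_', ' ').replace('-', ' ')
name = name.title().replace(' ', '')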

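A possibly more compact re-based approach is the following:

import re
string = '123 this is a test_of the-sub method 33'
varString = re.sub('_?-? ?', '', string)

It should return

>>> re.sub('_?-? ?', '', string)
'123thisisatestofthesubmethod33'

If you are trying to use the result as a variable name, you may run into some problems, such as it being too long (for PEP 8) or containing other stray characters like !?$% and so on; the isalnum() approach in the other answer may help with those. I would be careful about trusting the value of a string to become a variable name, and would wrap some constraints around it to avoid any kind of overflow.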
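
One way to "wrap some constraints" around the cleaned value might look like the sketch below. The checked_name function and the MAX_LEN value are hypothetical and chosen only for illustration; the real length limit depends on the target API and is not stated in this thread.

```python
import re

MAX_LEN = 64  # assumed limit for illustration only; check the real limit of the target API

# A name is accepted only if it starts with a letter and is purely alphanumeric.
_VALID = re.compile(r'^[A-Za-z][A-Za-z0-9]*$')


def checked_name(raw, max_len=MAX_LEN):
    """Hypothetical wrapper: clean raw, then reject anything that still looks unsafe."""
    # Drop every character that is not an ASCII letter or digit.
    cleaned = re.sub(r'[^A-Za-z0-9]', '', raw)
    if not _VALID.match(cleaned):
        raise ValueError('name must start with a letter and be alphanumeric: %r' % raw)
    if len(cleaned) > max_len:
        raise ValueError('name longer than %d characters: %r' % (max_len, raw))
    return cleaned
```

Raising an error rather than silently truncating or renaming is a design choice here: the caller finds out early that an input cannot be turned into a usable name.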

I think this might not work... see below:
>>> name = "d-12312340_work"
>>> print(''.join(x.capitalize() for x in re.compile(r'[^a-zA-Z]').split(name)))
DWork

@maxscalf I added 0-9. However, it will be a problem if the string starts with 0-9. If you expect that to happen, more code is needed.