How to convert a Python string (Python, String, Escaping)

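How do I convert this string

'\\n    this is a docstring for\\n    the main function.\\n    a,\\n    b,\\n    c\\n    '

into

'\n    this is a docstring for\n    the main function.\n    a,\n    b,\n    c\n    '

Keep in mind that I also want to do this for "\t" and all other escape characters. The code for the opposite direction is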

def fix_string(s):
    r""" takes the string and replaces any `\n` with `\\n` so that the read file will be recognized """
    # escape chars = \t , \b , \n , \r , \f , \' , \" , \\
    new_s = ''
    for i in s:
        if i == '\t':
            new_s += '\\t'
        elif i == '\b':
            new_s += '\\b'
        elif i == '\n':
            new_s += '\\n'
        elif i == '\r':
            new_s += '\\r'
        elif i == '\f':
            new_s += '\\f'
        elif i == '\'':
            new_s += "\\'"
        elif i == '\"':
            new_s += '\\"'
        elif i == '\\':
            # the comment above lists \\ as well, so escape the backslash itself
            new_s += '\\\\'
        else:
            new_s += i
    return new_s

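Might I need to look at the actual numeric values of the characters and check the next one, for example when I find a ('\', 92) character followed by an ('n', 110)?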
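For reference, a minimal sketch of what that check would be looking at. The variable names `sample` and `chars` are made up for illustration; the point is only that the escaped form is two separate characters with the code points mentioned above.

```python
# Hypothetical sample: the escaped form of a newline, as it appears in the input string.
sample = '\\n'

# Pair every character with its numeric value.
chars = [(ch, ord(ch)) for ch in sample]
print(chars)  # [('\\', 92), ('n', 110)]
```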

Don't reinvent the wheel here. Python has your back. Besides, handling escape syntax correctly is harder than it looks.

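The correct way to handle this

In Python 2, use the str-to-str string_escape codec by calling string.decode('string_escape'). This interprets any string escape sequence that Python recognizes for you, including \n and \t.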

In Python 3, you have to use codecs.decode() together with the unicode_escape codec:

codecs.decode(string, 'unicode_escape')

because there is no str.decode() method and this is not a str -> bytes conversion.
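A short Python 3 demo of the codec approach, as a sketch rather than the answer's original demo (which did not survive on this page). The names `escaped` and `unescaped` are illustrative. One caveat to keep in mind: unicode_escape treats its input as Latin-1, so text containing non-ASCII characters may need extra care.

```python
import codecs

# The question's input: each line break is stored as a backslash followed by "n".
escaped = '\\n    this is a docstring for\\n    the main function.\\n    a,\\n    b,\\n    c\\n    '

# Let the codec interpret every escape sequence in one pass.
unescaped = codecs.decode(escaped, 'unicode_escape')
print(unescaped)
```

The same call also handles \t and the other escapes, so no per-character table is needed.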

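Why a straightforward str.replace() won't cut it

You could try to do this yourself with str.replace(), but then you also need to implement proper escape parsing. Take '\\\\n' for example (written as a Python literal: two backslashes followed by n). This is '\\n', escaped, so the correct result is a single backslash followed by n. If you naively apply str.replace() in sequence, you end up with '\n' or '\\\n' instead: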

>>> '\\\\n'.decode('string_escape')
'\\n'
>>> '\\\\n'.replace('\\n', '\n').replace('\\\\', '\\')
'\\\n'
>>> '\\\\n'.replace('\\\\', '\\').replace('\\n', '\n')
'\n'

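The pair of backslashes should be replaced by just one backslash character, leaving the n uninterpreted. But the replace options either end up replacing the trailing backslash together with the n with a newline character, or they first replace the two backslashes with one and then replace that backslash and the n with a newline. Either way, you end up with the wrong output.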
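The same comparison can be reproduced in Python 3, where str.decode() is not available. This is a sketch with illustrative variable names, using codecs.decode() as the reference result.

```python
import codecs

# Three characters: backslash, backslash, n (an escaped backslash followed by a plain n).
tricky = '\\\\n'

# Order 1: newlines first. The second backslash and the n are wrongly merged into a newline.
newline_first = tricky.replace('\\n', '\n').replace('\\\\', '\\')

# Order 2: backslashes first. The collapsed backslash then pairs up with the n.
backslash_first = tricky.replace('\\\\', '\\').replace('\\n', '\n')

# Reference: the codec parses left to right and gets it right.
decoded = codecs.decode(tricky, 'unicode_escape')

print(repr(newline_first), repr(backslash_first), repr(decoded))
```

Neither ordering of the replace calls matches the codec's result, because each call scans the whole string without knowing which backslashes an earlier escape already consumed.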

The slow way: doing this manually

You'll have to process the characters one by one, pulling in more characters as needed:

_map = {
    '\\\\': '\\',
    "\\'": "'",
    '\\"': '"',
    '\\a': '\a',
    '\\b': '\b',
    '\\f': '\f',
    '\\n': '\n',
    '\\r': '\r',
    '\\t': '\t',
}

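def unescape_string(s):
    output = []
    i = 0
    while i < len(s):
        c = s[i]
        i += 1
        if c != '\\' or i >= len(s):
            # ordinary character, or a lone backslash at the very end
            output.append(c)
            continue
        c += s[i]  # pull in the character after the backslash
        i += 1
        if c in _map:
            output.append(_map[c])
            continue
        if c == '\\x' and i + 1 < len(s):  # hex escape: two hex digits follow
            point = int(s[i] + s[i + 1], 16)
            i += 2
            output.append(chr(point))
            continue
        if c[1] in '01234567':  # octal escape: up to three octal digits
            while len(c) < 4 and i < len(s) and s[i] in '01234567':
                c += s[i]
                i += 1
            point = int(c[1:], 8)
            output.append(chr(point))
            continue
        # unrecognized escape: keep it unchanged instead of dropping it
        output.append(c)
    return ''.join(output)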

Be aware that this is far slower than using the built-in codec.

The simplest solution to this is just to use a str.replace() call. The code and its output follow the comments below.

Do you have the two strings in backwards order? Consider using a built-in str method instead.

No, I was just giving an example of how I would do it in the reverse direction.

Does your string actually contain the three characters \\n? Or is it shown in some escaped form?

@hughdbrown, that throws an error.

Not my downvote, but it is probably because the argument to string.decode() was wrong. I'd say it is also over-engineered, and it keeps the asker from learning fundamentals like "string replace".

@EugeneK: How is this over-engineered? Codecs exist for exactly this purpose.

@EugeneK: That's like saying that using a dictionary is over-engineered when the user should really learn how to build a hash table.

@EugeneK: Here, I have added the correct manual way too. It does not use str.replace(), however.

Is that the actual output? Because it makes your solution look like it isn't working. (They should be on separate lines.)

s = '\\n    this is a docstring for\\n    the main function.\\n    a,\\n    b,\\n    c\\n    '
s1 = s.replace('\\n', '\n')
s1

Output

'\n    this is a docstring for\n    the main function.\n    a,\n    b,\n    c\n    '

def convert_text(text):
    # replaces only the \n and \t escapes; other escape sequences are left as they are
    return text.replace("\\n", "\n").replace("\\t", "\t")


text = '\\n    this is a docstring for\\n    the main function.\\n    a,\\n    b,\\n    c\\n    '
print(convert_text(text))

Output:

    this is a docstring for
    the main function.
    a,
    b,
    c