How to extract substrings in Python using regular expressions

python regex python-3.x


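I have a string,

This is title [[this is translated title]]

and I need to extract the two substrings "This is title" and "this is translated title".

I tried using a regular expression, but could not get it to work.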

def translate(value):
    # Values are passed in the form of
    # "This is text [[This is translated text]]"
    import re
    regex = r"(.+)(\[\[.*\]\])"
    match = re.match(regex, value)
    # Return text
    first = match.group(1)

    # Return translated text
    second = match.group(2).lstrip("[[").rstrip("]]")

    return first, second

But this fails when the string is plain text with no bracketed part, such as "simple plain text".
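
To make the failure concrete, here is a minimal sketch that applies the pattern from the question to both kinds of input. The helper name `try_match` is made up for illustration; the point is only that `re.match` returns `None` when the input has no `[[...]]` part, so any later call to `.group()` on that result cannot work.

```python
import re

# Same pattern as in the question: it requires a literal "[[...]]" part.
PATTERN = r"(.+)(\[\[.*\]\])"

def try_match(value):
    """Return the match object, or None when the pattern does not apply."""
    return re.match(PATTERN, value)

bracketed = try_match("This is text [[This is translated text]]")
plain = try_match("simple plain text")

print(bracketed.groups())  # the text and the bracketed part, brackets included
print(plain)               # None, so plain.group(1) would raise AttributeError
```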

I found a simple way to do it without using regex:

def trns(value):
    first, second =  value.rstrip("]]").split("[[")
    return first, second
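
One caveat with the snippet above is that the two-name unpacking raises `ValueError` when the input contains no `[[`, and `rstrip("]]")` strips a set of characters, not a suffix. A possible variant, sketched here under the assumption that plain text should come back with `None` as the translation, uses `str.partition` instead. The name `split_title` is hypothetical and not from the original post.

```python
def split_title(value):
    """Split "text [[translated]]" into (text, translated).

    Returns (stripped value, None) when there is no "[[" in the input.
    split_title is an illustrative name, not from the original post.
    """
    text, sep, rest = value.partition("[[")
    if not sep:
        # No opening brackets at all: treat the whole input as the text.
        return value.strip(), None
    # Drop the closing "]]" only if it is really there as a suffix.
    if rest.endswith("]]"):
        rest = rest[:-2]
    return text.strip(), rest.strip()
```

For example, `split_title("This is title [[this is translated title]]")` gives the two parts without surrounding whitespace, while `split_title("simple plain text")` gives the text and `None`.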

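You have to use the regex

r'((\w.*)\[\[(\w.*)\]\]|(\w.*))'

which captures "This is title" in group(2) and "this is translated title" in group(3); group(1) is the outer group and holds the whole match. So your code should be

def translate(value):
    # value = "This is text [[This is translated text]]"
    import re
    regex = r'((\w.*)\[\[(\w.*)\]\]|(\w.*))'
    match = re.match(regex, value)
    # Keep the non-empty groups, dropping the one equal to the whole input
    result = [x for x in match.groups() if x and x != value]
    return result if result else value

This returns what you expect: a list with the two parts when the brackets are present, and the original string otherwise.

To test the regex, you can use an online regex tester.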
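
As an alternative sketch (not the answerer's pattern), the bracketed part can be made optional with a non-capturing group, so a single pattern covers both inputs and no filtering of groups is needed. The names `PATTERN` and `translate_optional` are assumptions for illustration, as is the choice to return `None` for a missing translation.

```python
import re

# Lazy title, then an optional "[[...]]" part, anchored at the end of the string.
PATTERN = re.compile(r"(.*?)\s*(?:\[\[(.*)\]\])?\s*$")

def translate_optional(value):
    """Return (title, translated); translated is None without "[[...]]"."""
    match = PATTERN.match(value)
    if match is None:
        # Can happen for input with embedded newlines, which "." does not match.
        return value, None
    return match.group(1), match.group(2)
```

With this version, `translate_optional("This is text [[This is translated text]]")` yields both parts without the space before the brackets, and `translate_optional("simple plain text")` yields the text paired with `None`.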

What you did seems to work. What is the problem?

I think it fails when value = "This is only text".