Python regex: characters between certain characters

python regex

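Edit: I should add that the strings in test are supposed to contain every possible character (i.e. .*+$§€/ etc.). That is why I think a regexp should help best.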

I am using a regex to find all characters between certain characters ([" and "]). My example looks like this:

test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

The expected output should look like this:

['this is a text and its supposed to contain every possible char.', 'another one after a newline.', 'and another one even with newlines in it.']

My code, including the regex, looks like this:

import re
my_list = re.findall(r'(?<=\[").*(?="\])*[^ ,\n]', test)
print(my_list)

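So there are two problems:

1) It does not remove the "] at the end of the text, as I wanted it to do with (?="\]).

2) It does not capture the third text in brackets, presumably because of the newlines. So far I have not been able to capture those: when I tried *\n, it returned an empty string.

I would greatly appreciate any help or hints on this. Thank you in advance.

By the way, I am using Python 3.6 on Anaconda Spyder with the latest regex (2018).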

Edit 2: one modification to the test:

test = """[
    "this is a text and its supposed to contain every possible char."
    ], 
    [
    "another one after a newline."
    ], 

    [
    "and another one even with
    newlines

    in it."
    ]"""

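Once again I am struggling to remove the newlines from it. I guess the whitespace could be removed with \s, so a regexp like this might solve it: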

my_list = re.findall(r'(?<=\[\S\s\")[\w\W]*(?=\"\S\s\])', test)
print (my_list)

You can try this, mate.

(?<=\[\")[\w\s.]+(?=\"\])
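The character class [\w\s.] only admits word characters, whitespace and dots, so it would likely stop at symbols such as *, +, $, § or €. Below is a minimal sketch of a more permissive variant, assuming a non-greedy capture group combined with the re.DOTALL flag is acceptable. The helper name extract_bracketed is made up for illustration and is not part of the answer. It also tolerates whitespace between the brackets and the quotes, as in the second edit of the question.

```python
import re

# Hypothetical helper, not taken from the original answer.
# re.DOTALL lets "." match newlines; ".*?" stops at the first closing "] it can.
PATTERN = re.compile(r'\[\s*"(.*?)"\s*\]', re.DOTALL)


def extract_bracketed(text):
    """Return the text between [" and "], with whitespace runs collapsed."""
    matches = PATTERN.findall(text)
    # str.split() without arguments splits on any whitespace run, newlines included.
    return [' '.join(m.split()) for m in matches]


test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

print(extract_bracketed(test))
```

For the sample test string this should yield the three strings listed in the question. Collapsing whitespace with split() and join() is a design choice; if the original line breaks matter, the list comprehension can simply return the raw matches.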

If you would also accept a non-regex solution, you could try:
result = []
for l in eval(' '.join(test.split())):
    result.extend(l)

print(result)
#  ['this is a text and its supposed to contain every possible char.', 'another one after a newline.', 'and another one even with newlines in it.']
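Note that eval executes arbitrary Python, so it is risky on input you do not control. A hedged variant of the same idea, assuming the input really is valid Python list-literal syntax once the whitespace is normalised, could use ast.literal_eval from the standard library instead. The function name flatten_literals is hypothetical.

```python
import ast


def flatten_literals(text):
    """Parse comma-separated list literals and flatten them into one list."""
    # Collapse newlines and indentation so each quoted string fits on one line.
    normalised = ' '.join(text.split())
    # literal_eval accepts only literals (strings, lists, tuples, ...); no code runs.
    parsed = ast.literal_eval(normalised)
    # A single "[...]" parses to a list; several of them parse to a tuple of lists.
    groups = [parsed] if isinstance(parsed, list) else parsed
    result = []
    for group in groups:
        result.extend(group)
    return result


test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

print(flatten_literals(test))
```

Like the eval version, this sketch relies on the text being well-formed Python literals; a string that contains an unescaped double quote or a backslash would either fail to parse or be altered.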

I would go with:
test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

for i in test.replace('\n', '').replace('    ', ' ').split(','):
    print(i.lstrip(r' ["').rstrip(r'"]'))

This results in the following being printed to the screen:
this is a text and its supposed to contain every possible char.
another one after a newline.
and another one even with newlines in it.

If you want a list of these exact strings, we can modify it to:
newList = []
for i in test.replace('\n', '').replace('    ', ' ').split(','):
  newList.append(i.lstrip(r' ["').rstrip(r'"]'))

"The supposed output should look like this": so besides matching, you also want to remove the newlines in the output? Looks like you need a .sub or something similar.

Thanks for your answer. Yes, I basically want to include all characters. I thought "." would cover every character except newlines, but I guess it simply stops when it reaches a newline. By the way, is there a way to get rid of these \n characters in the output, and of the extra spaces in between (...with \n newlines...)? I know how to replace them with a for loop, but if it can be done in the regexp, I would like to know.

@MikeTwain Yes, you can. Check my updated answer. If it helps, please select it as the correct answer :p

Thanks. The output still contains these \n characters, though. Is there no way to remove them as well? By the way, one last modification, if I may: I edited it into the question as a change to the test.

I want to match everything between [" and "]. By the way, I just edited the question.

@MikeTwain yes mate you can. result.replace(/(?:\n+\s{2,})/, "") Using this regex you will get your desired output.

Interesting approach, I had not considered it so far. It seems to work just fine for my example, though. Thank you!
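The result.replace(/.../, ...) call quoted in the comments is written in JavaScript-style syntax. In Python the closest counterpart is re.sub. Here is a small sketch, assuming that each newline together with its surrounding indentation should become a single space. The pattern \s*\n\s* is an assumption chosen for this example, not the commenter's exact regex.

```python
import re

test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

# Step 1: grab everything between [" and "]; re.DOTALL lets "." cross newlines,
# and the non-greedy ".*?" stops at the first closing "].
matches = re.findall(r'(?<=\[").*?(?="\])', test, flags=re.DOTALL)

# Step 2: replace each newline, plus any whitespace around it, with one space.
cleaned = [re.sub(r'\s*\n\s*', ' ', m) for m in matches]

print(cleaned)
```

Keeping the extraction and the clean-up as two separate steps is deliberate: a single findall pattern can select text, but it cannot rewrite the characters inside a match.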