Python regular expression: characters between certain characters


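Edit: I should add that the strings in the test are supposed to contain every possible character (i.e. .*+$§€/ and so on). That is why I think a regexp should be the most helpful.

I am using a regular expression to find all characters between certain characters ([" and "]). My example looks like this: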

test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""
The supposed output should look like this:

['this is a text and its supposed to contain every possible char.', 'another one after a newline.', 'and another one even with newlines in it.']
My code (including the regex) looks like this:

import re
my_list = re.findall(r'(?<=\[").*(?="\])*[^ ,\n]', test)
print (my_list)
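So there are two problems:

1) It does not remove the "] at the end of each text, as I expected the lookahead (?="\]) to do.

2) It does not capture the third bracketed text, presumably because of the newlines. So far I have not been able to capture those: when I tried *\n it just returned an empty string.

I would really appreciate any help or hints on this problem. Thanks in advance.

By the way, I am using Python 3.6 in Anaconda Spyder with the latest regex (2018).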

Edit 2: one modification to the test:

test = """[
    "this is a text and its supposed to contain every possible char."
    ], 
    [
    "another one after a newline."
    ], 

    [
    "and another one even with
    newlines

    in it."
    ]"""
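Once again I am struggling to get rid of the newlines here. I figured the whitespace could be handled with \s, so a regexp like this should solve it: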

my_list = re.findall(r'(?<=\[\S\s\")[\w\W]*(?=\"\S\s\])', test)
print (my_list)

You can try this, mate.

(?<=\[\")[\w\s.]+(?=\"\])
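The character class [\w\s.] only covers word characters, whitespace and dots, so texts containing characters such as $, € or / would not match. A minimal sketch of a more permissive variant follows, assuming the texts never contain the closing sequence "] themselves: a non-greedy .*? combined with re.DOTALL, so that . also matches newlines. The names PATTERN and extract_texts are made up for this illustration.

```python
import re

# Non-greedy capture between [" and "], tolerating whitespace between
# the bracket and the quote (as in the "Edit 2" layout of the question).
PATTERN = re.compile(r'\[\s*"(.*?)"\s*\]', re.DOTALL)


def extract_texts(source):
    """Return the bracketed texts with runs of whitespace collapsed."""
    # str.split() without arguments splits on any whitespace, including
    # newlines, so joining with a single space normalises each match.
    return [" ".join(match.split()) for match in PATTERN.findall(source)]


test = """["this is a text and its supposed to contain every possible char."],
    ["another one after a newline."],

    ["and another one even with
    newlines

    in it."]"""

test_edit2 = """[
    "chars like .*+$§€/ are fine."
    ],
    [
    "and another one even with
    newlines

    in it."
    ]"""

print(extract_texts(test))
print(extract_texts(test_edit2))
```

Using a capture group instead of lookarounds keeps the pattern short, and re.findall then returns only the captured text. Collapsing whitespace in plain Python afterwards is a design choice of this sketch; it also shortens any intentional double spaces inside a text.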

If you would also accept a non-regex solution, you could try:

result = []
for l in eval(' '.join(test.split())):
    result.extend(l)

print(result)
#  ['this is a text and its supposed to contain every possible char.', 'another one after a newline.', 'and another one even with newlines in it.']
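Since eval executes arbitrary Python code, it is risky whenever the input is not fully trusted. A hedged alternative, assuming the text really is a comma-separated sequence of Python list literals as in the example: ast.literal_eval from the standard library accepts literals only. The helper name flatten_literal is invented for this sketch.

```python
import ast


def flatten_literal(source):
    """Parse comma-separated list literals and flatten them into one list."""
    # Collapsing whitespace first removes the raw newlines inside the
    # double-quoted strings, which would otherwise be a syntax error.
    collapsed = " ".join(source.split())
    parsed = ast.literal_eval(collapsed)  # a tuple of lists for this input
    result = []
    for item in parsed:
        result.extend(item)
    return result


test = """["first text."],
    ["second one
    with a newline."]"""

print(flatten_literal(test))
```

Two caveats apply to this sketch: backslashes inside a text are interpreted as Python escape sequences, and an input holding a single bracketed list parses to a list rather than a tuple, so it would need an extra check before flattening.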

Here is what I would suggest:

test = """["this is a text and its supposed to contain every possible char."], 
    ["another one after a newline."], 

    ["and another one even with
    newlines

    in it."]"""

for i in test.replace('\n', '').replace('    ', ' ').split(','):
    print(i.lstrip(r' ["').rstrip(r'"]'))
This prints the following to the screen:

this is a text and its supposed to contain every possible char.
another one after a newline.
and another one even with newlines in it.
If you want a list of these exact strings, it can be modified to:

newList = []
for i in test.replace('\n', '').replace('    ', ' ').split(','):
  newList.append(i.lstrip(r' ["').rstrip(r'"]'))

"The supposed output should look like this": so besides matching, you also want to remove the newlines from the output? It looks like you need a .sub or something similar.

Thanks for your answer. Yes, I basically want to include all characters. I thought "." would cover every character except newlines, but I guess it simply stops when it reaches a newline. By the way, is there a way to get rid of these \n characters in the output now? And the extra spaces in between (... with \n newlines ...)? I know how to replace them with a for loop, but if it can be done within the regexp, I would like to know.

@MikeTwain Yes, you can. Check my updated answer. If it helps, select it as the right answer :p

Thanks. The output still contains these \n characters, though. Is there no way to remove them as well? By the way, one last modification, if you don't mind: I edited it into the question as a modification to the test.

I want to match everything between [" and "]. By the way, I just edited the question.

@MikeTwain Yes mate, you can. result.replace(/(?:\n+\s{2,})/, "") With this regex you will get the desired output.

Interesting approach, I had not considered it so far. But for my example it seems to work just fine. Thank you!
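The result.replace(/.../, "") call quoted in the comments is JavaScript syntax. In Python the closest counterpart is re.sub; a minimal sketch follows, meant to be applied to each match after extraction. The helper name collapse_newlines is an assumption for illustration, and the replacement is a single space rather than an empty string so that words on adjacent lines do not run together.

```python
import re


def collapse_newlines(text):
    """Replace each newline run plus following indentation with one space."""
    return re.sub(r"\n+\s{2,}", " ", text)


matched = "and another one even with\n    newlines\n\n    in it."
print(collapse_newlines(matched))
```

A newline followed by fewer than two whitespace characters is left untouched by this pattern; re.sub(r"\s+", " ", text) would be a broader alternative if every whitespace run should be collapsed.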