Python: re.split regex after \\. and os.linesep, allowing for whitespace between the period and the line separator


python regex


I want to split Test after each period that ends a line, whether or not the line has trailing whitespace.

Test="""
This works.
but this does not because of the whitespace after the period. 

This line not sepearated.
test. don't split here!  
or here
or here
"""

cmds = re.split("(?<=\\.)%s[\t]*" % os.linesep, Test, flags=re.M)

My attempt above only works when there is no whitespace between the period and the line break.


If it is also acceptable to strip the trailing whitespace from lines that end with it, you can use the following:
re.split(r'(?<=\.)[ \t]*%s' % os.linesep, Test, flags=re.M)
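As a rough illustration of how that pattern behaves, here is a minimal sketch. It assumes the text uses plain \n line endings, so \n is written in place of os.linesep, and the sample string and the names text and parts are made up for the example.

```python
import re

# Sample text invented for this sketch; the second line has trailing spaces.
text = "one.\ntwo.  \nthree\nfour. not here\nfive.\n"

# Split after a period that is followed only by spaces/tabs and a newline.
# Assumption: "\n" line endings, used here instead of os.linesep.
parts = re.split(r'(?<=\.)[ \t]*\n', text)
print(parts)
```

The last element of parts is an empty string produced by the split at the final line break; filter it out if it is not wanted.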

Edit

\A and ^ are zero-width assertions: they match the position at the very start of the string.
import re

print([m.start() for m in re.finditer(r'\A', 'a\nb\n\nc\n')])
# prints [0]
print([m.start() for m in re.finditer('^', 'a\nb\n\nc\n')])
# prints [0] too

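If the re.MULTILINE flag is specified in the definition of the regex, the meaning of ^ is extended to also cover the positions right after each newline \n.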
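These additional positions are the ones matched by a lookbehind such as (?<=\n).

That works! Extra credit: can you quickly say why it works, and why \s*?^\s* matters?

I am not sure I understood the question correctly. I imagine you want to split on a dot only when the dot is at the end of a line, that is, when there is nothing but whitespace between the dot and the line separator. That is why I used ^ with re.M, which means "start of line". So \s*?^ leads the regex engine up to the start of the next line: we are then sure the dot came just before the end of a line. Then \s* consumes any whitespace that follows. But I am not happy with my pattern, because ^ combined with re.M denotes the position after a \n character. Some operating systems, however, do not use \n as the newline but only \r. It is best to know where the text comes from. For example, if you read a text file in non-binary mode, Python's universal newlines support converts all newlines to \n.

I am parsing SPSS syntax code on Windows in order to error-check each command (each one ends with a dot). I can read it in any way: with open(init_syntax, 'rb') as f: Thanks for the explanation, I am studying it!

In fact, my regex pattern is a bit unusual because I was bothered by the presence of os.linesep, which gives the impression that the line separator can change from one data set to another, or from one OS to another. That said, my regex pattern is dumb, because it assumes that the line separator always contains a \n. What needs to be known is this: will the lines always be separated by a pattern that contains \n? The line separator could be \r, \n, \r\n, or some other exotic pattern. The most common case is that there is always a \n. Is that your case?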
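To make the line-separator concern concrete, here is a minimal sketch of a split that does not depend on os.linesep. It is not the pattern from the answer: the helper name split_commands and the explicit (?:\r\n|\r|\n) alternation are assumptions made for this illustration, covering only the three common line endings mentioned above.

```python
import re

# Hypothetical helper, not part of the original thread.
# Splits after a period that is followed only by spaces/tabs and a line ending,
# where the line ending may be "\r\n", "\r" or "\n".
LINE_END_AFTER_DOT = re.compile(r'(?<=\.)[ \t]*(?:\r\n|\r|\n)')


def split_commands(text):
    """Return the pieces of text that end with a period at the end of a line."""
    return LINE_END_AFTER_DOT.split(text)


# Mixed line endings, invented for the example.
sample = "a.\r\nb.  \rc\nd.\n"
print(split_commands(sample))
```

If the file is opened in text mode, universal newlines already turn every line ending into \n, and the simpler \n-only pattern should be enough.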
import re

print([m.start() for m in re.finditer('(?<=\n)', 'a\nb\n\nc\n')])
# prints [2, 4, 5, 7]

print([m.start() for m in re.finditer('^', 'a\nb\n\nc\n', re.M)])
print([m.start() for m in re.finditer(r'\A|(?<=\n)', 'a\nb\n\nc\n')])
# they print [0, 2, 4, 5, 7]

print([m.span() for m in re.finditer('^', 'a\nb\n\nc\n', re.M)])
# prints [(0, 0), (2, 2), (4, 4), (5, 5), (7, 7)]

re.compile(r'(?<=\.)\s*?^\s*|\s*\Z|\A\s*', re.M)

re.compile(r'(?<=\.)\s*?^\s*'
           '|'
           r'\s*\Z'
           '|'
           r'\A\s*',
           re.M)
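As a usage sketch for the compiled pattern, the following applies it to a shortened version of the test text. The sample string and the names pattern, text and commands are made up for this example, and the text is assumed to use \n line endings. Empty pieces produced by the \A\s* and \s*\Z alternatives are filtered out.

```python
import re

pattern = re.compile(r'(?<=\.)\s*?^\s*|\s*\Z|\A\s*', re.M)

# Shortened sample in the spirit of the question's Test string (assumed "\n" endings).
text = ("\nThis works.\n"
        "So does this. \n"
        "\n"
        "Not separated\n"
        "test. don't split here!  \n"
        "or here\n")

# Drop the empty strings that come from the matches at the start and end of the text.
commands = [piece for piece in pattern.split(text) if piece]
for command in commands:
    print(repr(command))
```

Only the periods that are followed by nothing but whitespace up to the line break act as split points; the period in "test. don't split here!" is left alone.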