Python: Regex split on spaces (but keep elements in [] unsplit) and add "" in the array as a newline marker


python regex


Basically, imagine I have a string like this:

"Hello world
I am Lucas [help me]
Hi"

I want the result of re.split() to be:

['Hello', 'world', '', 'I', 'am', 'Lucas', '[help me]', '', 'Hi']

So far I have tried using

re.split(r'\s+(?=[^()]*(?:\[|\<|$))', stringToSplit)

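You can use a matching approach instead of splitting: with re.findall, extract every substring inside square brackets (the pattern below also covers (...) and <...>), or each chunk of non-whitespace characters, or the empty position before a newline, using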
\[[^[]*]|\([^)]*\)|<[^>]*>|\S+|(?=\n)
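As a minimal sketch of that matching approach applied to the string from the question, the pattern can be passed to re.findall. The helper name tokenize and the constant TOKEN_RX are hypothetical, chosen only for illustration.

```python
import re

# Pattern from the answer: a bracketed group ([...], (...) or <...>),
# a run of non-whitespace characters, or the empty position before a newline.
TOKEN_RX = re.compile(r"\[[^[]*]|\([^)]*\)|<[^>]*>|\S+|(?=\n)")


def tokenize(text):
    """Hypothetical helper: return tokens, with '' marking each newline."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return TOKEN_RX.findall(text)


text = "Hello world\nI am Lucas [help me]\nHi"
tokens = tokenize(text)
print(tokens)
# ['Hello', 'world', '', 'I', 'am', 'Lucas', '[help me]', '', 'Hi']
```

The empty strings come from the (?=\n) alternative, which matches a zero-width position, so each newline shows up as '' without consuming any text.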

With the third-party regex module, which supports the (*SKIP)(*FAIL) verbs, you can:
import regex as re

string = """Hello world
I am Lucas [help me]
Hi"""

rx = re.compile(r'\[[^][]*\](*SKIP)(*FAIL)|(\s+)')

parts = rx.split(string)
print(parts)
# ['Hello', ' ', 'world', '\n', 'I', ' ', 'am', ' ', 'Lucas', ' ', '[help me]', '\n', 'Hi']

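The (*SKIP)(*FAIL) pattern matches any unwanted construct, such as [...], and then makes that match fail, so the engine skips past it. Only whitespace outside the brackets is left for (\s+) to match, and because it is captured, the separators are kept in the result.

There seem to be several questions here... Which part are you stuck on first? The splitting?

What if I want [help me] (I am stuck) to be one element rather than two?

@LuftWoofe See this example of how to extend the pattern: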
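import re
s = "Hello world\nI am Lucas [help me] (help me 2) <help me 3>\nHi"  # sample input inferred from the output below
print(re.findall(r"\[[^[]*]|\([^)]*\)|<[^>]*>|\S+|(?=\n)", s))
# => ['Hello', 'world', '', 'I', 'am', 'Lucas', '[help me]', '(help me 2)', '<help me 3>', '', 'Hi']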
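The regex.split call in this thread keeps the whitespace separators in its result because of the capturing group. A minimal sketch of one way to turn that list into the shape the question asks for follows. The helper name to_question_format is hypothetical, and the parts list is hard-coded from the printed output in this thread so that the sketch runs without the third-party module.

```python
# Result of rx.split(string) as printed in this thread (hard-coded here).
parts = ['Hello', ' ', 'world', '\n', 'I', ' ', 'am', ' ', 'Lucas', ' ',
         '[help me]', '\n', 'Hi']


def to_question_format(parts):
    """Hypothetical helper: drop plain spaces, turn newline separators into ''."""
    result = []
    for part in parts:
        if part and not part.isspace():
            result.append(part)   # a real token, keep as-is
        elif '\n' in part:
            result.append('')     # separator containing a newline -> ''
        # any other separator (plain spaces) or empty piece is dropped
    return result


cleaned = to_question_format(parts)
print(cleaned)
# ['Hello', 'world', '', 'I', 'am', 'Lucas', '[help me]', '', 'Hi']
```

This keeps the split-based approach but needs a second pass, whereas the findall pattern produces the same list in one step.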
