Python: create two lists from a string, excluding and including the strings between parentheses
Suppose we have a string such as:
s = u'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'
I want to create the following lists:
including = ['hahaha', 'hehehe']
excluding = ['apple banana lemmon (', ') dog cat whale (', ') red blue black']
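The first list is straightforward with a regular expression:
including = re.findall(r'\((.*?)\)', s)
But I can't get anything similar for the other list. Can you help me? Thanks in advance.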
excluding = re.split('|'.join(including), s)
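This works for the simple case where you know the included text contains no special characters or regex syntax.
If you are not sure that will be the case: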
re.split('|'.join(map(re.escape, including)), s)
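To make the difference concrete, here is a minimal sketch. The input string `text` is a made-up example (it is not from the question), chosen so that the parenthesised parts contain regex metacharacters:

```python
import re

# Hypothetical input whose parenthesised text contains regex metacharacters.
text = 'total (a+b) units (c.d) end'
including = re.findall(r'\((.*?)\)', text)   # ['a+b', 'c.d']

# Unescaped: 'a+b' is read as "one or more 'a', then 'b'", which never occurs
# here, so only the 'c.d' alternative finds a match.
unescaped = re.split('|'.join(including), text)

# Escaped: every item is matched literally.
escaped = re.split('|'.join(map(re.escape, including)), text)

print(unescaped)  # ['total (a+b) units (', ') end']
print(escaped)    # ['total (', ') units (', ') end']
```

Note that in both variants the split happens wherever the extracted text occurs, including occurrences outside the parentheses.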
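re.escape escapes the special regex characters that would otherwise make re.split misbehave.

You can also use a positive lookbehind and a positive lookahead to split on the text between the parentheses: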
>>> re.split(r'(?<=\().*?(?=\))', s)
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
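As a rough sketch, the two regex calls can be wrapped in one helper that returns both lists. The name `split_parens` is hypothetical, and the sketch assumes the parentheses are balanced and not nested:

```python
import re

def split_parens(text):
    """Return (including, excluding) for text with non-nested parentheses."""
    including = re.findall(r'\((.*?)\)', text)
    # Split on whatever sits between '(' and ')', keeping the parentheses
    # themselves attached to the surrounding pieces.
    excluding = re.split(r'(?<=\().*?(?=\))', text)
    return including, excluding

s = 'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'
including, excluding = split_parens(s)
print(including)  # ['hahaha', 'hehehe']
print(excluding)  # ['apple banana lemmon (', ') dog cat whale (', ') red blue black']
```

Because the lookarounds do not consume the parentheses, the pattern does not depend on how many parenthesised groups the string contains.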
Using a regular expression, re.split(r'(?)
Note the empty string
Without a regular expression
Same idea, but without overwriting s
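Split the string using the including list?
re.split("|".join(including), s)
It is better to use map(re.escape, including); otherwise, if you have something like (haha\d+haha) in the string, the regex will interpret \d+ as one or more digits rather than as the literal text \d+.
That is true, but I don't think it applies to the scenario the asker has in mind (I think), because he seems to be extracting parenthesised information from real sentences. Then again, I may be wrong.
Q&A should be useful to more than just the original asker, so people with a similar problem may need to call re.escape.
Does he have including beforehand?
Yes, it comes from the simple regex including = re.findall('\((.*?)\)', s)
This is a better and more concise answer than mine and should be the accepted one.
A small clarification, if I may: your regex assumes a known number of brackets. Wouldn't it be better to use something like a small parser to split the string into the two lists?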
a = re.findall(r'\)?[^()]*\(?', s)
excluded = a[::2]
included = a[1::2]
print(included, excluded, sep='\n')
['hahaha', 'hehehe', '']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
a = re.findall(r'\)?[^()]*\(?', s)
excluded = [*filter(bool, a[::2])]
included = [*filter(bool, a[1::2])]
print(included, excluded, sep='\n')
['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
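To see where the trailing empty string comes from, it may help to look at the raw result of re.findall before slicing. The sketch below uses the same pattern; the second input, 'a () b', is a made-up example showing one case where the even/odd alternation no longer holds:

```python
import re

s = 'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'

# A raw string keeps \) and \( from being treated as string escapes.
pieces = re.findall(r'\)?[^()]*\(?', s)
print(pieces)
# ['apple banana lemmon (', 'hahaha', ') dog cat whale (', 'hehehe',
#  ') red blue black', '']

excluded = pieces[::2]    # even positions: text outside the parentheses
included = pieces[1::2]   # odd positions: text inside, plus the trailing ''

# Caveat: an empty pair '()' yields no "inside" piece, so the alternation shifts.
shifted = re.findall(r'\)?[^()]*\(?', 'a () b')
print(shifted)  # ['a (', ') b', '']
```

The final '' is an empty match at the end of the string, which is why filter(bool, ...) is applied afterwards. The slicing relies on every pair of parentheses having non-empty content.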
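from itertools import cycle

def f(s):
    # Alternate between searching for '(' and ')'.
    c = cycle('()')
    # Keep '(' at the end of the current chunk; leave ')' for the next chunk.
    a = {'(': 1, ')': 0}
    while s:
        d = next(c)
        i = s.find(d)
        if i > -1:
            j = a[d]
            yield d, s[:i + j]
            s = s[i + j:]
        else:
            # No more delimiters: emit the remainder and stop.
            yield d, s
            break

included = []
excluded = []
for k, v in f(s):
    if k == '(':
        excluded.append(v)
    else:
        included.append(v)
print(included, excluded, sep='\n')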
['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
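from itertools import cycle

def f(s):
    # Alternate between searching for '(' and ')'.
    c = cycle('()')
    # Keep '(' at the end of the current chunk; leave ')' for the next chunk.
    a = {'(': 1, ')': 0}
    # j is the current start offset, so s itself is never reassigned.
    j = 0
    while True:
        d = next(c)
        i = s.find(d, j)
        if i > -1:
            k = a[d]
            yield d, s[j:i + k]
            j = i + k
        else:
            # No more delimiters: emit the remainder and stop.
            yield d, s[j:]
            break

included = []
excluded = []
for k, v in f(s):
    if k == '(':
        excluded.append(v)
    else:
        included.append(v)
print(included, excluded, sep='\n')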
['hahaha', 'hehehe']
['apple banana lemmon (', ') dog cat whale (', ') red blue black']
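If only the two lists are needed, the generator and the collecting loop can be folded into a single function. This is a sketch of the same alternating-search idea, not a different algorithm; the name `split_by_parens` is hypothetical, and an unclosed '(' is treated the same way as in the generator above (the remainder is emitted as the last piece):

```python
from itertools import cycle

def split_by_parens(text):
    """Collect (included, excluded) by alternately searching for '(' and ')'."""
    included, excluded = [], []
    keep = {'(': 1, ')': 0}  # '(' stays with the outside chunk, ')' starts the next one
    pos = 0
    for delim in cycle('()'):
        i = text.find(delim, pos)
        # With no delimiter left, the chunk runs to the end of the string.
        end = len(text) if i == -1 else i + keep[delim]
        if delim == '(':
            excluded.append(text[pos:end])
        else:
            included.append(text[pos:end])
        if i == -1:
            break
        pos = end
    return included, excluded

s = 'apple banana lemmon (hahaha) dog cat whale (hehehe) red blue black'
included, excluded = split_by_parens(s)
print(included, excluded, sep='\n')
```

A string without any parentheses ends up entirely in the excluded list, and the number of parenthesised groups does not need to be known in advance.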