Python: create a list of lists based on a condition
I have a string like the following:
result = """The following table provides the details.
acquired, by major class:
(US$ in millions) Customer relationships 15year $265
There is another line without space here.
Another table starts here:
(USS in millions) 2018 2017
Income (loss) from continuing operations $298 $129"""
I need to put every line that contains a run of 3 or more spaces into a list. Here is what I have tried so far:
import re

lines = result.splitlines()
table_list = []
for line in lines:
    # keep only lines containing a run of 3 or more spaces
    if re.search(r' {3,}', line):
        table_list.append(line)
Resulting output of the code above:
['(US$ in millions) Customer relationships 15year $265','(USS in millions) 2018 2017','Income (loss) from continuing operations $298 $129']
Expected output:
[['(US$ in millions) Customer relationships 15year $265'],['(USS in millions) 2018 2017','Income (loss) from continuing operations $298 $129']]
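Further explanation of the output condition: the expected output should be a list of lists. While iterating over the lines, consecutive lines that contain 3 or more spaces between two words should all go into the same sublist of the main list. A line that does not contain 3 or more spaces between two words breaks the chain. The next line that does contain 3 or more spaces between two words then starts a new sublist in the main list.
Use itertools.groupby and re.findall: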
import re
from itertools import groupby

def has_spaces(str_):
    # True if the line contains a run of 3 or more whitespace characters
    return bool(re.findall(r"\s{3,}", str_))

[list(g) for k, g in groupby(result.splitlines(), key=has_spaces) if k]
Output:
[['(US$ in millions) Customer relationships 15year $265'],
['(USS in millions) 2018 2017',
'Income (loss) from continuing operations $298 $129']]
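For comparison, the same chaining rule can also be written as a plain loop without groupby. This is only a sketch: the function name `group_consecutive` and the sample text are hypothetical, and the runs of spaces in the sample are written out explicitly because the exact whitespace of the original string is not known.

```python
import re

# Hypothetical sample data: runs of 3 spaces are spelled out explicitly,
# since the whitespace of the string in the question is an assumption here.
sample = "\n".join([
    "The following table provides the details.",
    "(US$ in millions)   Customer relationships   15year   $265",
    "There is another line without space here.",
    "(US$ in millions)   2018   2017",
    "Income (loss) from continuing operations   $298   $129",
])

def group_consecutive(lines, pattern=r" {3,}"):
    """Collect consecutive lines matching `pattern` into sublists."""
    groups = []
    current = []
    for line in lines:
        if re.search(pattern, line):
            # matching line: extend the current chain
            current.append(line)
        elif current:
            # non-matching line: close the chain if one is open
            groups.append(current)
            current = []
    if current:
        # close a chain that runs to the last line
        groups.append(current)
    return groups

tables = group_consecutive(sample.splitlines())
```

With this sample, `tables` holds two sublists: the first with one line and the second with two. The default pattern matches literal spaces only, whereas `\s{3,}` in the groupby version also matches tabs and other whitespace.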
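The resulting output is not a list of lists. The expected output is a list of lists. I am having trouble creating a list of lists using a condition.
Your expected output has two strings in the second nested list. Is that intentional? What is the logic behind it?
Please do not downvote this question. If more information is needed, I am happy to share it.
@Najeem Yes, that is intended, because those two strings are consecutive lines that each contain 3 or more spaces between two words.
I think it would help if you replaced \s and \n with actual spaces and newlines. Scrolling horizontally is not easy, and your text cannot be used to test solutions.
Also, what do you mean by "sentence"? Your full text has no periods.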