Python: capture groups with a regex, then split the captured groups into separate list items

I have read the lines of a file into a list:

l =  ['W –-Transportation',
     'W23.F5-International_waterways                      W25.2-Airlines',
     'W23.F8-Rivers                                       W25.4-Bus_lines',
     'W23.H-Pipelines                                       W25.6-Railroads',
     'W23.H2-Oil_pipelines                                W25.8-Shipping_lines',
     'W23.H4-Natural_gas_pipelines                        W27-Transportation_safety',
     'W23.H6-Water_pipelines                              W27.2-Traffic_safety',
     'W23.K-Transportation_system_design                    W29-Navigation',
     'W23.M-Transportation_system_construction              W32-Transportation_research',
     'W23.M2-Transportation_facility_construction         W32.2-Transportation_surveys',
     'W23.M4-Transportation_system_maintenance            W34-Transportation_education',
     'W23.M4.2-Road_maintenance                        W36-Transportation_policy',
     'W23.M6-Transportation_system_repair                 W38-Transportation_planning',
     'W23.M6.2-Vehicle_repair                          W40-Transportation_aspects',
     'W25-Transportation_industry']
Now, for each line, I want to capture the two groups, for example W23.F5-International_waterways and W25.2-Airlines, and split them into two list entries.

My expected result is:

l =  ['W –-Transportation','W23.F5-International_waterways','W25.2-Airlines','W23.F8-Rivers','W25.4-Bus_lines','W23.H-Pipelines','W25.6-Railroads','W23.H2-Oil_pipelines','W25.8-Shipping_lines', .....,'W25-Transportation_industry']

The regex for the capture groups should be ([a-z])\s*?([a-z]), but how do I split the captured groups into new list items?

Perhaps a simple split on two consecutive spaces ("  ") would be enough here:

import re

l =  ['W –-Transportation',
     'W23.F5-International_waterways                      W25.2-Airlines',
     'W23.F8-Rivers                                       W25.4-Bus_lines',
     'W23.H-Pipelines                                       W25.6-Railroads',
     'W23.H2-Oil_pipelines                                W25.8-Shipping_lines',
     'W23.H4-Natural_gas_pipelines                        W27-Transportation_safety',
     'W23.H6-Water_pipelines                              W27.2-Traffic_safety',
     'W23.K-Transportation_system_design                    W29-Navigation',
     'W23.M-Transportation_system_construction              W32-Transportation_research',
     'W23.M2-Transportation_facility_construction         W32.2-Transportation_surveys',
     'W23.M4-Transportation_system_maintenance            W34-Transportation_education',
     'W23.M4.2-Road_maintenance                        W36-Transportation_policy',
     'W23.M6-Transportation_system_repair                 W38-Transportation_planning',
     'W23.M6.2-Vehicle_repair                          W40-Transportation_aspects',
     'W25-Transportation_industry']

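k = []
for i in l:
    # Split on two consecutive spaces; longer runs of spaces leave empty or space-padded pieces
    new_string = i.split("  ")
    for j in new_string:
        # Skip the empty pieces and trim any leftover padding
        if j != '':
            k.append(j.strip())

print(k)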
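Output:

['W –-Transportation', 'W23.F5-International_waterways', 'W25.2-Airlines', 'W23.F8-Rivers', 'W25.4-Bus_lines', 'W23.H-Pipelines', 'W25.6-Railroads', 'W23.H2-Oil_pipelines', 'W25.8-Shipping_lines', 'W23.H4-Natural_gas_pipelines', 'W27-Transportation_safety', 'W23.H6-Water_pipelines', 'W27.2-Traffic_safety', 'W23.K-Transportation_system_design', 'W29-Navigation', 'W23.M-Transportation_system_construction', 'W32-Transportation_research', 'W23.M2-Transportation_facility_construction', 'W32.2-Transportation_surveys', 'W23.M4-Transportation_system_maintenance', 'W34-Transportation_education', 'W23.M4.2-Road_maintenance', 'W36-Transportation_policy', 'W23.M6-Transportation_system_repair', 'W38-Transportation_planning', 'W23.M6.2-Vehicle_repair', 'W40-Transportation_aspects', 'W25-Transportation_industry']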
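
The same idea can probably be written more compactly with re.split, which treats any run of two or more whitespace characters as one separator, so no empty pieces need to be filtered out. This is only a sketch on a shortened sample of the list; the names lines and items are made up for the illustration.

```python
import re

# Shortened sample of the list from the question
lines = [
    'W –-Transportation',
    'W23.F5-International_waterways                      W25.2-Airlines',
    'W23.M4.2-Road_maintenance                        W36-Transportation_policy',
    'W25-Transportation_industry',
]

# Split each line on runs of two or more whitespace characters, so the
# single space inside 'W –-Transportation' is left alone.
items = [part for line in lines for part in re.split(r'\s{2,}', line.strip())]

print(items)
```

A line that contains no run of two or more whitespace characters comes back from re.split as a one-element list, so single-column lines pass through unchanged.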
Please post your expected result.

What about items like 'W–-Transportation' and 'W25-Transportation_industry'? In general, you only need to split each item on \s+, since each value consists solely of non-whitespace characters.

Added the expected result.

The regex shown won't capture these strings. Is there a reason to use a regex and groups for this? Just calling .split() on each line would probably be much simpler.
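
To make the points raised in the comments concrete, here is a small sketch. It shows what the pattern from the question actually matches, what a bare .split() does to the first item of the list (which contains a single space), and one possible capture-group version. The pattern two_columns and the helper split_line are hypothetical names introduced only for this illustration, and the pattern assumes the two columns are always separated by at least two whitespace characters.

```python
import re

line = 'W23.F5-International_waterways                      W25.2-Airlines'

# The pattern from the question only matches two neighbouring lowercase letters.
m = re.search(r'([a-z])\s*?([a-z])', line)
print(m.groups())  # ('n', 't')

# A bare .split() breaks on every whitespace run, including the single
# space inside the first item of the list.
print('W –-Transportation'.split())  # ['W', '–-Transportation']

# Hypothetical capture-group pattern: two columns separated by 2+ whitespace.
two_columns = re.compile(r'^(.*?\S)\s{2,}(\S.*)$')

def split_line(text):
    """Return both columns if the line has two, else the line itself."""
    match = two_columns.match(text.strip())
    return list(match.groups()) if match else [text.strip()]

result = []
for text in [line, 'W –-Transportation']:
    # extend() adds each captured group as its own list item
    result.extend(split_line(text))

print(result)
```

Using list.extend with match.groups() is what turns the captured groups into separate list entries; for this data, though, splitting on runs of whitespace remains the simpler route.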