Python: parsing recurring +; entries in one line


python regex


This is the line to parse:

001000000+3 12091992+2 0200+3

This is what I tried:

Z = re.compile('(?P<stop_id>\d{9}) (?P<time_displacement>([-|+]\d{0,4})*)', flags=re.UNICODE)

m = Z.search('001000000 +3 12091992 +2 0200 +3')
if m:
    yield {
           'stop_id': m.group('stop_id')
          }
    if m.group('time_displacement'):
        _suffix=_suffix + 1
        yield {
               'time_displacement' + str(_suffix): m.group('time_displacement')
              }

But I need:

[{'stop_id': '001000000'}, {'time_displacement1': '+3'},{'time_displacement2': '+2'},{'time_displacement1': '+3'}]

(?P<stop_id>\d{9})|(?P<time_displacement>(?:[-+]\d{0,4}))

Try this. See demo.
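As a rough illustration of how an alternation with named groups like the one above might be used, the sketch below walks over the matches with `re.finditer` and checks `m.lastgroup` to see which alternative matched. The function name `parse_line` and the numbering scheme are assumptions made for this example, not part of the original answer.

```python
import re

# Named-group alternation: each match is either a stop id or a displacement.
PATTERN = re.compile(r'(?P<stop_id>\d{9})|(?P<time_displacement>[-+]\d{0,4})')

def parse_line(line):
    """Yield one single-key dict per match, numbering the displacements."""
    suffix = 0
    for m in PATTERN.finditer(line):
        # lastgroup holds the name of the group that matched.
        if m.lastgroup == 'stop_id':
            yield {'stop_id': m.group('stop_id')}
        else:
            suffix += 1
            yield {'time_displacement%d' % suffix: m.group('time_displacement')}

print(list(parse_line('001000000 +3 12091992 +2 0200 +3')))
# [{'stop_id': '001000000'}, {'time_displacement1': '+3'},
#  {'time_displacement2': '+2'}, {'time_displacement3': '+3'}]
```

Because the generator lives inside a function, the `yield` statements are valid, and the caller decides whether to collect the items into a list.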

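Why are you using yield? The code you posted is presumably part of a generator function. Please consider posting an actual runnable snippet…
Why do you want a list of single-element dicts, rather than putting everything into a single dict?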
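For comparison, the single-dict layout raised in the comment could look something like the sketch below. The function name `parse_to_dict` and the key `time_displacements` are hypothetical choices for this example; the question itself asks for a list of single-key dicts.

```python
import re

def parse_to_dict(line):
    """Collect the stop id and all displacements into one dict."""
    record = {}
    stop = re.search(r'\d{9}', line)
    if stop:
        record['stop_id'] = stop.group()
    # Keep the displacements in order as a list instead of numbered keys.
    record['time_displacements'] = re.findall(r'[-+]\d{0,4}', line)
    return record

print(parse_to_dict('001000000 +3 12091992 +2 0200 +3'))
# {'stop_id': '001000000', 'time_displacements': ['+3', '+2', '+3']}
```

A list keeps the order of the displacements without having to build key names such as `time_displacement1` by hand.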

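From your code and data sample I can't be entirely sure what should and shouldn't match, but hopefully this does what you need... or comes fairly close. :)
Anyway, you can do it like this: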
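import re

fields = ('stop_id', 'time_displacement')
# Group 1: a nine-digit stop id; group 2: a signed displacement.
# (Inside [...] a '|' is a literal pipe, not alternation, so it is omitted.)
pat = re.compile(r'(\d{9})|([-+]\d{0,4})')

data = '001000000 +3 12091992 +2 0200 +3'
# findall returns (group1, group2) tuples; the group that did not match is ''.
found = pat.findall(data)

result = []
suffix = 1
for p1, p2 in found:
    if p2 == '':
        result.append({fields[0]: p1})
    elif p1 == '':
        result.append({fields[1] + str(suffix): p2})
        suffix += 1

print(result)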

Output
[{'stop_id': '001000000'}, {'time_displacement1': '+3'}, {'time_displacement2': '+2'}, {'time_displacement3': '+3'}]
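One detail in the question's pattern is worth a quick check: in the character class `[-|+]`, the `|` is a literal pipe character and not an alternation operator. The small comparison below uses a made-up input string to show the difference; it is only an illustration, not part of the original answer.

```python
import re

sample = '|5 +3'  # hypothetical input containing a stray pipe

# With '|' inside the class, a pipe counts as a valid "sign".
loose = re.findall(r'[-|+]\d{0,4}', sample)
# Without it, only '-' and '+' are accepted.
strict = re.findall(r'[-+]\d{0,4}', sample)

print(loose)   # ['|5', '+3']
print(strict)  # ['+3']
```

For the sample line in the question both classes give the same matches, since the data contains no pipe characters.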

This code cannot produce that output (in particular the 'time_displacement' keys).