Python: getting two separate lists after parsing output


I want to get two separate lists from the following output:

>>> import re
>>> a = """
... ===================================================================
... IO Statistics
... Interval: 2.000 secs
... Column #0: COUNT(frame.time)frame.time
...                 |   Column #0
... Time            |          COUNT
... 000.000-002.000              1921
... 002.000-004.000              2000
... 004.000-006.000              1999
... 006.000-008.000              1999
... 008.000-010.000              1995
... 010.000-012.000              1997
... 012.000-014.000              1999
... 014.000-016.000              2001
... 016.000-018.000              2004
... 018.000-020.000              1995
... 020.000-022.000              1997
... 022.000-024.000              2007
... 024.000-026.000              2003
... 026.000-028.000              1998
... 028.000-030.000              1995
... 030.000-032.000              1994
... 032.000-034.000              2001
... 034.000-036.000              2008
... 036.000-038.000              1996
... 038.000-040.000              1996
... 040.000-042.000                95
... ===================================================================
... """
>>> print(re.findall(r'\s*(?P<first>\d+\.\d+)\-\d+\.\d+\s*(?P<id>\d+)\s*',a))
[('000.000', '1921'), ('002.000', '2000'), ('004.000', '1999'), ('006.000', '1999'), ('008.000', '1995'), ('010.000', '1997'), ('012.000', '1999'), ('014.000', '2001'), ('016.000', '2004'), ('018.000', '1995'), ('020.000', '1997'), ('022.000', '2007'), ('024.000', '2003'), ('026.000', '1998'), ('028.000', '1995'), ('030.000', '1994'), ('032.000', '2001'), ('034.000', '2008'), ('036.000', '1996'), ('038.000', '1996'), ('040.000', '95')]

The session above shows my current code together with its output. It would be very helpful if someone could suggest a way to get the desired output.

Use zip(*..) to turn the output into two separate lists:

lst1, lst2 = zip(*re.findall(r'\s*(?P<first>\d+\.\d+)\-\d+\.\d+\s*(?P<id>\d+)\s*',a))
Demo:

>>> list(zip(*re.findall(r'\s*(?P<first>\d+\.\d+)\-\d+\.\d+\s*(?P<id>\d+)\s*',a)))
[('000.000', '002.000', '004.000', '006.000', '008.000', '010.000', '012.000', '014.000', '016.000', '018.000', '020.000', '022.000', '024.000', '026.000', '028.000', '030.000', '032.000', '034.000', '036.000', '038.000', '040.000'), ('1921', '2000', '1999', '1999', '1995', '1997', '1999', '2001', '2004', '1995', '1997', '2007', '2003', '1998', '1995', '1994', '2001', '2008', '1996', '1996', '95')]
>>> lst1, lst2 = zip(*re.findall(r'\s*(?P<first>\d+\.\d+)\-\d+\.\d+\s*(?P<id>\d+)\s*',a))
>>> [format(float(i), '.0f') for i in lst1]
['0', '2', '4', '6', '8', '10', '12', '14', '16', '18', '20', '22', '24', '26', '28', '30', '32', '34', '36', '38', '40']
>>> lst2
('1921', '2000', '1999', '1999', '1995', '1997', '1999', '2001', '2004', '1995', '1997', '2007', '2003', '1998', '1995', '1994', '2001', '2008', '1996', '1996', '95')

Yes, it worked! I changed the regex to capture only the leading digits (before the dot): lst1, lst2 = zip(*re.findall(r'\s*(?P<first>\d+)\.\d+\-\d+\.\d+\s*(?P<id>\d+)\s*',a)). Thanks, Martijn Pieters! But lst2 is not a list; is it possible to get a list?
lst2 is definitely a list. See my demo.
A list is written with "[]", like ['0', '2', '4', '6', '40'], whereas a tuple is written with "()", like ('1921', '2000', '1999', ...). lst2 is a tuple, so I cannot compute its average; for lst1 I can.
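
On the tuple-versus-list question raised in the comments: zip(*...) yields tuples, and the captured values are still strings, which is probably why averaging fails. Below is a minimal sketch of one way to get real lists of numbers, assuming Python 3 (so that / is true division). The names sample, pattern, starts and counts are illustrative only, and the sample is a shortened, made-up excerpt in the same layout as the table in the question.

```python
import re

# Shortened sample in the same layout as the IO Statistics table above.
sample = """
Time            |          COUNT
000.000-002.000              1921
002.000-004.000              2000
040.000-042.000                95
"""

# Same pattern as in the answer: capture the interval start and the count.
pattern = r'\s*(?P<first>\d+\.\d+)\-\d+\.\d+\s*(?P<id>\d+)\s*'
pairs = re.findall(pattern, sample)

# zip(*pairs) transposes the list of (start, count) pairs into two tuples
# of strings; the list comprehensions convert them into lists of numbers.
starts_raw, counts_raw = zip(*pairs)
starts = [float(s) for s in starts_raw]
counts = [int(c) for c in counts_raw]

# With numeric lists, the average is straightforward.
average = sum(counts) / len(counts)

print(starts)
print(counts)
print(average)
```

If the pattern matches nothing, the two-name unpacking of zip(*pairs) raises a ValueError, so real input may need a check that pairs is non-empty first.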