Python: Is there a better way to collect unique index values in pandas?
I have some data that looks like this:
>>> print totals.sample(4)
                                                 start            end  \
time                    region_type
2016-01-24 02:17:10.238 STACK GUARD        79940452352    79940665344
2016-01-23 20:14:17.043 MALLOC metadata    64688259072    64688996352
2016-01-22 23:20:53.752 IOKit              47857778688    47861174272
2016-01-23 08:17:06.561 __DATA           3711964667904  3711979212800

                                            vsize     rsdnt     dirty      swap
time                    region_type
2016-01-24 02:17:10.238 STACK GUARD        212992         0         0         0
2016-01-23 20:14:17.043 MALLOC metadata    737280     81920     81920      8192
2016-01-22 23:20:53.752 IOKit             3395584     24576     24576   3371008
2016-01-23 08:17:06.561 __DATA           14544896   4907008    618496   4780032
I want to know the region types of any rows where dirty + swap is greater than 1e7.
This works, but it seems rather verbose:
>>> print totals[(totals.dirty + totals.swap) > 1e7].groupby(level='region_type').\
apply(lambda x: 'lol').index.tolist()
['MALLOC_NANO', 'MALLOC_SMALL']
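Is there a better way?
I would have expected the following to work, but it returns every region type in the dataset rather than only the ones in my selection: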
totals[(totals.dirty + totals.swap) > 1e7].index.levels[1].tolist()
Use index.get_level_values (which returns the values actually in use) instead of index.levels (which returns every value the index knows about).
For example:
In [243]: mask = totals['dirty']+totals['swap'] > 1e3; mask
Out[243]:
time region_type
2016-01-24 02:17:10.238 STACK GUARD False
2016-01-23 20:14:17.043 MALLOC metadata True
2016-01-22 23:20:53.752 IOKit True
2016-01-23 08:17:06.561 __DATA True
dtype: bool
In [244]: result = mask.loc[mask]; result
Out[244]:
time region_type
2016-01-23 20:14:17.043 MALLOC metadata True
2016-01-22 23:20:53.752 IOKit True
2016-01-23 08:17:06.561 __DATA True
dtype: bool
In [245]: result.index.get_level_values('region_type').unique()
Out[245]: array(['MALLOC metadata', 'IOKit', '__DATA'], dtype=object)
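To make the difference between the two attributes concrete, here is a minimal self-contained sketch. The small `totals` frame below is a hypothetical stand-in built from the four sample rows shown in the question (only the dirty and swap columns, with the timestamps kept as plain strings), and the 1e6 threshold is chosen only so that the filter keeps some of those rows and drops others.

```python
import pandas as pd

# Hypothetical miniature of `totals`, built from the four sample rows above.
index = pd.MultiIndex.from_tuples(
    [
        ("2016-01-24 02:17:10.238", "STACK GUARD"),
        ("2016-01-23 20:14:17.043", "MALLOC metadata"),
        ("2016-01-22 23:20:53.752", "IOKit"),
        ("2016-01-23 08:17:06.561", "__DATA"),
    ],
    names=["time", "region_type"],
)
totals = pd.DataFrame(
    {"dirty": [0, 81920, 24576, 618496], "swap": [0, 8192, 3371008, 4780032]},
    index=index,
)

# Illustrative threshold (not the 1e7 from the question) so some rows survive.
selected = totals[(totals["dirty"] + totals["swap"]) > 1e6]

# .levels keeps every label the original index knew about, even after filtering.
all_known = selected.index.levels[1].tolist()

# get_level_values returns one label per remaining row; unique() deduplicates.
in_use = selected.index.get_level_values("region_type").unique().tolist()

print(all_known)
print(in_use)
```

If you want index.levels itself to reflect only the selected rows, calling selected.index.remove_unused_levels() first should also work on reasonably recent pandas versions.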