Python: Finding groups of adjacent True values in a pandas Series

python pandas

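I have a Series of True and False values and need to find all groups of True. That is, I need the start index and end index of every run of adjacent True values.

The code below gives the expected result, but it is very slow, inefficient, and clumsy: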

import pandas as pd

def groups(ser):
    g = []

    flag = False
    start = None
    for idx, s in ser.items():
        if flag and not s:
            g.append((start, idx-1))
            flag = False
        elif not flag and s:
            start = idx
            flag = True
    if flag:
        g.append((start, idx))
    return g

if __name__ == "__main__":
    ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)
    print(ser)

    g = groups(ser)
    print("\ngroups of True:")
    for start, end in g:
        print("from {} until {}".format(start, end))
    pass

The output is:

0      True
1      True
2     False
3     False
4      True
5     False
6     False
7      True
8      True
9      True
10     True
11    False
12     True
13    False
14     True

groups of True:
from 0 until 1
from 4 until 4
from 7 until 10
from 12 until 12
from 14 until 14

There are similar questions, but none of them looks for the start/end indices of the groups.

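A common way to identify consecutive blocks is to take the cumsum of the negated series. Here s is the boolean Series (ser in the question). For example: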

for _,x in s[s].groupby((1-s).cumsum()):
    print(f'from {x.index[0]} to {x.index[-1]}')

Output:

from 0 to 1
from 4 to 4
from 7 to 10
from 12 to 12
from 14 to 14
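
If a list of (start, end) tuples is wanted instead of printed lines, the same idea can be wrapped in a function. This is only a sketch: the helper name true_runs is made up for illustration, and it uses ~ser in place of 1 - s, which should be equivalent for a boolean Series.

```python
import pandas as pd


def true_runs(ser):
    """Return (start, end) index labels for each run of adjacent True values."""
    # The running count of False values stays constant inside a run of True
    # values and grows at every False, so it can serve as a run label.
    labels = (~ser).cumsum()
    # A Series whose values are its own index labels, so that "first" and
    # "last" report index labels rather than True/False.
    idx = ser.index.to_series()
    runs = idx[ser].groupby(labels[ser]).agg(["first", "last"])
    return list(runs.itertuples(index=False, name=None))


ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)
runs = true_runs(ser)
print(runs)  # [(0, 1), (4, 4), (7, 10), (12, 12), (14, 14)]
```

Unlike the loop in the question, this does not compute idx-1, so it should also cope with a non-integer index, as long as the Series contains at least one True.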

You can use itertools:

In [478]: from operator import itemgetter
     ...: from itertools import groupby

In [489]: a = ser[ser].index.tolist() # Create a list of indexes having `True` in `ser` 

In [498]: for k, g in groupby(enumerate(a), lambda ix : ix[0] - ix[1]):
     ...:     l = list(map(itemgetter(1), g))
     ...:     print(f'from {l[0]} to {l[-1]}')
     ...: 
from 0 to 1
from 4 to 4
from 7 to 10
from 12 to 12
from 14 to 14
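
The grouping key ix[0] - ix[1] is the position in the list minus the index label. Inside a run of consecutive labels both grow by one at each step, so the difference stays constant, and it changes as soon as there is a gap. Here is the same trick as a plain function, as a rough sketch: the name consecutive_runs is made up, and it assumes a sorted integer index.

```python
from itertools import groupby

import pandas as pd


def consecutive_runs(labels):
    """Split a sorted list of integers into (first, last) pairs of consecutive runs."""
    runs = []
    # position - value is constant within a run of consecutive integers,
    # so groupby starts a new group whenever a gap appears.
    for _, grp in groupby(enumerate(labels), key=lambda pv: pv[0] - pv[1]):
        values = [value for _, value in grp]
        runs.append((values[0], values[-1]))
    return runs


ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)
true_labels = ser[ser].index.tolist()  # [0, 1, 4, 7, 8, 9, 10, 12, 14]
result = consecutive_runs(true_labels)
print(result)  # [(0, 1), (4, 4), (7, 10), (12, 12), (14, 14)]
```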

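Nice! That seems to work. I'm still trying to understand how it works, though. s[s] returns only the True values (9 items in my case). You then group it by the cumulative sum of the negation of the original series, which has 15 elements. How does groupby work when the series being grouped has 9 elements and the series passed to it has 15?

pandas aligns the index of that series with the Series/DataFrame being grouped. If you pass a NumPy array to groupby instead, the lengths must be equal.

Thanks, it clearly works, so I'll accept it. Edit: got it now :-)

By the way, good question, +1.

Thanks, +1, but the other answer is slightly faster!

Happy to help.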
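
The index alignment described in the comments can be checked with a small experiment. This is a sketch using the ser from the question; the variable names are made up. The 15-element Series key and an equal-length NumPy key, selected by hand, should give the same groups.

```python
import pandas as pd

ser = pd.Series([1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1], dtype=bool)

true_only = ser[ser]      # 9 elements, index labels 0, 1, 4, 7, ...
labels = (~ser).cumsum()  # 15 elements, full index 0..14

# Series key: matched to true_only by index label; labels that do not
# occur in true_only are simply not used.
by_series = true_only.groupby(labels).size()

# NumPy key: there is no index to align on, so the 9 matching positions
# are picked by hand to get a key of equal length.
by_array = true_only.groupby(labels.to_numpy()[ser.to_numpy()]).size()

print(by_series.tolist())  # [2, 1, 4, 1, 1]
print(by_array.tolist())   # [2, 1, 4, 1, 1]
```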