Python pandas: convert a DataFrame into a list of DataFrames by adjacent rows


python pandas


Convert this DataFrame:

    A   B   C
0   '1'   '2'   '3'
1   '1'   '4'   '5'
2   '2'   '6'   '7'
3   '2'   '8'   '9'
4   '1'   '0'   '1'
5   '2'   '2'   '3'

into this list of DataFrames:

    A   B   C
0   '1'   '2'   '3'
1   '1'   '4'   '5'

    A   B   C
2   '2'   '6'   '7'
3   '2'   '8'   '9'

    A   B   C
4   '1'   '0'   '1'

    A   B   C
5   '2'   '2'   '3'

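so that every run of adjacent rows sharing the same value of A ends up in its own DataFrame. I tried various combinations of groupby and drop_duplicates, but they do not take the contiguity of the rows into account. And diff does not like strings.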

Data:

from io import StringIO

import pandas as pd

data = StringIO("""
A   B   C
1   2   3
1   4   5
2   6   7
2   8   9
1   0   1
2   2   3
""")

Group by the cumulative sum of the positions where the difference is not equal to 0:

df = pd.read_table(data, sep=r'\s+')

# category codes turn A into integers, so diff() also works when A holds strings
for _, group in df.groupby(df.A.astype('category').cat.codes.diff().ne(0).cumsum()):
    print(group)

   A  B  C
0  1  2  3
1  1  4  5
   A  B  C
2  2  6  7
3  2  8  9
   A  B  C
4  1  0  1
   A  B  C
5  2  2  3
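To make the grouping key less opaque, here is a minimal sketch that builds it step by step. The frame is rebuilt inline so the snippet runs on its own, and the names codes, changed, key and frames are only illustrative.

```python
import pandas as pd

# Same data as above, rebuilt inline so the snippet is self-contained.
df = pd.DataFrame({"A": [1, 1, 2, 2, 1, 2],
                   "B": [2, 4, 6, 8, 0, 2],
                   "C": [3, 5, 7, 9, 1, 3]})

# Step 1: map each distinct value of A to an integer code.
codes = df["A"].astype("category").cat.codes

# Step 2: flag rows whose code differs from the previous row.
# The first row compares against NaN, so it is always flagged.
changed = codes.diff().ne(0)

# Step 3: a running count of the flags gives one label per contiguous block.
key = changed.cumsum()

# Step 4: collect the groups into a list instead of printing them.
frames = [group for _, group in df.groupby(key)]
```

Here key labels the rows 1, 1, 2, 2, 3, 4, so the two later runs of A == 1 and A == 2 are kept apart from the earlier ones.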

Use numpy.flatnonzero to find the positions where the difference is non-zero, and numpy.split to split the DataFrame:

import numpy as np

a = df.A.values
# a row that differs from its predecessor starts a new block
lodf = np.split(df, np.flatnonzero(a[:-1] != a[1:]) + 1)

print(*lodf, sep='\n\n')

   A  B  C
0  1  2  3
1  1  4  5

   A  B  C
2  2  6  7
3  2  8  9

   A  B  C
4  1  0  1

   A  B  C
5  2  2  3
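Depending on the pandas version, passing a DataFrame straight to numpy.split may emit a deprecation warning. That is an assumption about your environment, not something stated in the answer. A minimal sketch of the same idea that only uses numpy for the cut points and slices with iloc instead (cuts and bounds are illustrative names):

```python
import numpy as np
import pandas as pd

# Same data as above, rebuilt inline so the snippet is self-contained.
df = pd.DataFrame({"A": [1, 1, 2, 2, 1, 2],
                   "B": [2, 4, 6, 8, 0, 2],
                   "C": [3, 5, 7, 9, 1, 3]})

a = df["A"].to_numpy()

# Positions where a value differs from its predecessor, i.e. block starts.
cuts = np.flatnonzero(a[:-1] != a[1:]) + 1

# Add the start and end of the frame, then slice between consecutive bounds.
bounds = [0, *cuts, len(df)]
lodf = [df.iloc[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

print(*lodf, sep="\n\n")
```

Because the comparison a[:-1] != a[1:] is elementwise inequality rather than subtraction, this variant should work for string columns as well.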

[y for x, y in df.groupby(df.A.diff().ne(0).cumsum())]

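What about string variables? If you have strings that can be coerced to numbers, I would do that. But you are right, with strings it is different.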
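For string columns, one option that avoids diff altogether is to compare each row with the previous one via shift. This is a sketch rather than part of the answers above; the frame below uses string values like the ones in the question, and key and frames are illustrative names.

```python
import pandas as pd

# String-valued version of the data from the question.
df = pd.DataFrame({"A": ["1", "1", "2", "2", "1", "2"],
                   "B": ["2", "4", "6", "8", "0", "2"],
                   "C": ["3", "5", "7", "9", "1", "3"]})

# shift() moves A down by one row; ne() flags rows that differ from the
# previous row (the first row compares against a missing value, so it is
# flagged), and cumsum() turns the flags into block labels.
key = df["A"].ne(df["A"].shift()).cumsum()

frames = [group for _, group in df.groupby(key)]
```

No numeric conversion is needed here, since inequality is defined for strings while subtraction is not.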