Python: How to group by an ndarray? (python, pandas, numpy)


I have a DataFrame (just an example).

I want to group by the column vector and then by the column gp. How can I do that?

from dfply import *
D >>\
    groupby(X.vector, X.gp) >>\
    summarize(b=X.sq.sum())
This results in:

TypeError: unhashable type: 'numpy.ndarray'
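
For context, group keys have to be hashable, and NumPy arrays are not. The following is a minimal sketch of that behaviour outside of pandas or dfply; the array values are made up for illustration and are not the asker's data.

```python
import numpy as np

vec = np.arange(3)  # hypothetical example array, not from the question

# Hashing an ndarray raises the same TypeError that the grouping call reports.
try:
    hash(vec)
    array_hashable = True
except TypeError as exc:
    array_hashable = False
    print(exc)

# A tuple built from the same values is hashable and can serve as a key.
key = tuple(vec)
groups = {key: "first group"}
print(groups[key])
```

Converting each array to a tuple before grouping sidesteps the error, which is the approach the answers take.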


I think you need to convert the column vector to tuples first in pandas:

print(D['sq'].groupby([D['vector'].apply(tuple), D['gp']]).sum().reset_index())
                                     vector  gp   sq
0            (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)   0    0
1          (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)   1    1
2        (4, 5, 6, 7, 8, 9, 10, 11, 12, 13)   0   20
3      (6, 7, 8, 9, 10, 11, 12, 13, 14, 15)   1   34
4    (8, 9, 10, 11, 12, 13, 14, 15, 16, 17)   0  100
5  (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)   1  130
Another solution is to convert the column first:

D['vector'] = D['vector'].apply(tuple)
print(D.groupby(['vector','gp'])['sq'].sum().reset_index())
                                     vector  gp   sq
0            (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)   0    0
1          (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)   1    1
2        (4, 5, 6, 7, 8, 9, 10, 11, 12, 13)   0   20
3      (6, 7, 8, 9, 10, 11, 12, 13, 14, 15)   1   34
4    (8, 9, 10, 11, 12, 13, 14, 15, 16, 17)   0  100
5  (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)   1  130
And if necessary, convert back to arrays at the end:

D['vector'] = D['vector'].apply(tuple)
df = D.groupby(['vector','gp'])['sq'].sum().reset_index()
df['vector'] = df['vector'].apply(np.array)
print (df)
                                     vector  gp   sq
0            [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   0    0
1          [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   1    1
2        [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]   0   20
3      [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]   1   34
4    [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]   0  100
5  [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]   1  130

print (type(df['vector'].iat[0]))
<class 'numpy.ndarray'>
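
The steps above can be put together as a self-contained sketch. The frame below is hypothetical (the asker's example data is not reproduced on this page), and the names tmp and out are chosen here for illustration. It assumes you would rather leave the arrays in D untouched, so the tuple conversion is done on a copy.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 0 and 2 share a vector and gp, as do rows 1 and 3.
D = pd.DataFrame({
    "vector": [np.arange(3), np.arange(3, 6), np.arange(3), np.arange(3, 6)],
    "gp": [0, 1, 0, 1],
    "sq": [1, 2, 3, 4],
})

# Convert arrays to hashable tuples on a copy, so D itself is not modified.
tmp = D.assign(vector=D["vector"].apply(tuple))

# Group by the tuple column and gp, then sum sq within each group.
out = tmp.groupby(["vector", "gp"])["sq"].sum().reset_index()

# Convert the keys back to arrays if downstream code expects ndarrays.
out["vector"] = out["vector"].apply(np.array)

print(out["sq"].tolist())  # [4, 6]
```

Using assign rather than overwriting D['vector'] is a design choice for this sketch only; the answer's in-place conversion works equally well when the original arrays are no longer needed.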

A slightly odd alternative:

D.groupby([D.vector.apply(str), D.gp]).sq.sum().reset_index()
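
One caveat with string keys, sketched below with made-up arrays: NumPy abbreviates the printed form of large arrays (more than 1000 elements under the default print options), so two different arrays can produce the same string and would land in the same group. Tuples do not have this problem. The tobytes() key is an extra option not mentioned in the answer, and it ignores shape and dtype.

```python
import numpy as np

# Two hypothetical long arrays that differ only in one middle element.
a = np.arange(2000)
b = a.copy()
b[1000] = -1

# Above the print threshold the middle is replaced by "...", so the
# differing element never reaches the string representation.
same_str = str(a) == str(b)
print(same_str)  # True, although the arrays differ

# Tuple keys compare every element, so the arrays stay distinct.
same_tuple = tuple(a) == tuple(b)
print(same_tuple)  # False

# Raw bytes are also hashable and distinguish the two arrays.
same_bytes = a.tobytes() == b.tobytes()
print(same_bytes)  # False
```

For short vectors like the ones in the question the string key works, but tuples are the safer default.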

A list is not hashable; a tuple is. We want to group by a tuple-ized version of the vector column. I would use a list comprehension:

D.groupby([[tuple(x) for x in D.vector], 'gp']).sq.sum()

                                          gp
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)            0       0
(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)          1       1
(4, 5, 6, 7, 8, 9, 10, 11, 12, 13)        0      20
(6, 7, 8, 9, 10, 11, 12, 13, 14, 15)      1      34
(8, 9, 10, 11, 12, 13, 14, 15, 16, 17)    0     100
(10, 11, 12, 13, 14, 15, 16, 17, 18, 19)  1     130
Name: sq, dtype: int64

To get it back to the original form, here is one of many ways:

d1 = D.groupby([[tuple(x) for x in D.vector], 'gp']).sq.sum()
d1.reset_index('gp').rename(index=list).rename_axis('vector').reset_index()

                                     vector  gp   sq
0            [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   0    0
1          [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   1    1
2        [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]   0   20
3      [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]   1   34
4    [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]   0  100
5  [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]   1  130
