Python: How to store pandas groupby() groups in a single variable under different indices


Suppose I have a data frame df with three columns:

df=
id  date       value
A  02-04-2000  3
A  03-04-2000  8
B  04-04-2000  12
B  02-04-2000  7
C  03-04-2000  5
C  04-04-2000  2
I want to group the data by the df['id'] column and store the result in a variable new. new should hold the values so that new[1] returns the rows for id=A without the id column, new[2] returns the rows for id=B, and so on.

Example output:

new[1]=
date       value
02-04-2000  3
03-04-2000  8

new[2]=
date        value
04-04-2000  12
02-04-2000  7
All of the solutions below remove the id column with drop.

If indexing by 0, 1, … is acceptable, the output can be a list of DataFrames:

new = [g.drop('id', axis=1) for _, g in df.groupby('id')]
print (new[0])
         date  value
0  02-04-2000      3
1  03-04-2000      8
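
For anyone who wants to run the list approach end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption based on the sample table in the question, and the name new_reset is only illustrative. The reset_index step is optional; it renumbers the rows of each piece from 0 instead of keeping the original row labels.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (dates kept as strings).
df = pd.DataFrame({
    "id": ["A", "A", "B", "B", "C", "C"],
    "date": ["02-04-2000", "03-04-2000", "04-04-2000",
             "02-04-2000", "03-04-2000", "04-04-2000"],
    "value": [3, 8, 12, 7, 5, 2],
})

# One DataFrame per id, in sorted key order (A, B, C), with the id column dropped.
new = [g.drop(columns="id") for _, g in df.groupby("id")]

# Optional: give every piece a fresh 0-based index.
new_reset = [g.reset_index(drop=True) for g in new]

print(new[1])        # rows for id=B, original row labels 2 and 3
print(new_reset[1])  # same rows, labels 0 and 1
```

Note that a list is 0-based, so the rows for id=A are at new[0], not new[1].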
If the output should be a dictionary of DataFrames, the following creates consecutive groups, with keys starting at 1:

new = {k: g.drop('id', axis=1) 
                       for k, g in  df.groupby(df['id'].ne(df['id'].shift()).cumsum())}
print (new[1])
         date  value
0  02-04-2000      3
1  03-04-2000      8
# alternatively, key the dictionary by the id values themselves
new1 = {k: g.drop('id', axis=1) for k, g in  df.groupby('id')}
print (new1['A'])
         date  value
0  02-04-2000      3
1  03-04-2000      8
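
Since the question asks for new[1], new[2], … rather than new['A'], one possible variation is to number the groups with enumerate. This is only a sketch, not part of the original answer: the sample data is rebuilt from the question, and sort=False is an assumption made so that groups are numbered in order of first appearance rather than in sorted key order.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "id": ["A", "A", "B", "B", "C", "C"],
    "date": ["02-04-2000", "03-04-2000", "04-04-2000",
             "02-04-2000", "03-04-2000", "04-04-2000"],
    "value": [3, 8, 12, 7, 5, 2],
})

# Number the groups 1, 2, 3, ... in order of first appearance of each id.
new = {
    i: g.drop(columns="id")
    for i, (_, g) in enumerate(df.groupby("id", sort=False), start=1)
}

print(new[1])  # rows for id=A
print(new[2])  # rows for id=B
```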
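A similar solution without consecutive groups, where all rows that share an id end up in the same group. I try to explain it on different data: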

 print (df)

  id        date  value
0  A  02-04-2000      3
1  A  03-04-2000      8
2  B  04-04-2000     12
3  A  02-04-2000      7
4  A  03-04-2000      5
5  C  04-04-2000      2
    
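import pandas as pd

# factorize numbers the distinct ids 0, 1, 2, ...; adding 1 makes the keys start at 1
new = {k: g.drop('id', axis=1) 
                       for k, g in  df.groupby(pd.factorize(df['id'])[0]+1)}


# all A rows form the first group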
print (new[1])
         date  value
0  02-04-2000      3
1  03-04-2000      8
3  02-04-2000      7
4  03-04-2000      5


# all C rows form the third group
print (new[3])
         date  value
5  04-04-2000      2
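
To see why the factorize-based keys behave this way, it can help to look at the intermediate values. The sketch below rebuilds the second sample frame (an assumption based on the printed table) and unpacks the result of pd.factorize into illustrative names codes and uniques.

```python
import pandas as pd

# Second sample frame, rebuilt from the printed table above.
df = pd.DataFrame({
    "id": ["A", "A", "B", "A", "A", "C"],
    "date": ["02-04-2000", "03-04-2000", "04-04-2000",
             "02-04-2000", "03-04-2000", "04-04-2000"],
    "value": [3, 8, 12, 7, 5, 2],
})

# codes: one integer per row, assigned in order of first appearance.
# uniques: the distinct id values in that same order.
codes, uniques = pd.factorize(df["id"])
print(codes)    # integer code per row
print(uniques)  # distinct ids

# Shift the codes so that the dictionary keys start at 1 instead of 0.
new = {k: g.drop(columns="id") for k, g in df.groupby(codes + 1)}

print(new[1])  # all four A rows, regardless of where they occur
```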
Grouping by consecutive groups:

print (df)

  id        date  value
0  A  02-04-2000      3 <- 1group
1  A  03-04-2000      8 <- 1group
2  B  04-04-2000     12 <- 2group
3  A  02-04-2000      7 <- 3group
4  A  03-04-2000      5 <- 3group
5  C  04-04-2000      2 <- 4group
    

new = {k: g.drop('id', axis=1) 
                       for k, g in  df.groupby(df['id'].ne(df['id'].shift()).cumsum())}

#first group   
print (new[1])
         date  value
0  02-04-2000      3
1  03-04-2000      8

#third group (the second run of A rows)
print (new[3])
         date  value
3  02-04-2000      7
4  03-04-2000      5
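
The grouping key df['id'].ne(df['id'].shift()).cumsum() can be read step by step. The following sketch, on a frame rebuilt from the printed table, uses the illustrative names changed and run_id for the intermediate results.

```python
import pandas as pd

# Sample frame with a repeated, non-adjacent id (A appears in two separate runs).
df = pd.DataFrame({
    "id": ["A", "A", "B", "A", "A", "C"],
    "date": ["02-04-2000", "03-04-2000", "04-04-2000",
             "02-04-2000", "03-04-2000", "04-04-2000"],
    "value": [3, 8, 12, 7, 5, 2],
})

# Step 1: compare each id with the one in the previous row.
# The first row is compared with NaN, so it always counts as a change.
changed = df["id"].ne(df["id"].shift())

# Step 2: a running sum of the changes gives each consecutive run its own number.
run_id = changed.cumsum()
print(run_id.tolist())

# Step 3: group by the run number and drop the id column.
new = {k: g.drop(columns="id") for k, g in df.groupby(run_id)}

print(new[3])  # second run of A rows (row labels 3 and 4)
print(new[4])  # the single C row
```

With this key, the two separate runs of A stay apart (keys 1 and 3), which is the difference from grouping on the id values directly.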