Python: converting a groupby into a single row with new columns
I want to collapse each group of a groupby into a single row, spreading the values of the group's n_clicks column across new columns and padding with -99 when the group does not have enough rows. Given this input, grouped by session_id:
user_id session_id timestamp step impressions n_clicks
0 004A07DM0IDW 1d688ec168932 1541555799 7 2059240 5.0
1 004A07DM0IDW 1d688ec168932 1541555799 7 2033381 3.0
2 004A07DM0IDW 1d688ec168932 1541555799 7 1724779 4.0
3 004A07DM0IDW 1d688ec168932 1541555799 7 127131 2.0
4 004A07DM0IDW 1d688ec168932 1541555799 7 399441 1.0
5 004A07DM0IDW 1d688ec168932 1541555799 7 103357 3.0
6 004A07DM0IDW 1d688ec168932 1541555799 7 127132 3.0
7 004A07DM0IDW 1d688ec168932 1541555799 7 1167004 1.0
8 004A07DM0IDW 1d688ec168932 1541555799 7 4491766 4.0
9 004A07DM0IDW 1d688ec168932 1541555799 7 2249874 5.0
10 00Y1Z24X8084 26b6d294d66e7 1541651823 3 4476010 4.0
11 00Y1Z24X8084 26b6d294d66e7 1541651823 3 3843244 5.0
I would like to produce this:
user_id session_id timestamp step count_0 count_1 count_2 count... count_24
0 004A07DM0IDW 1d688ec168932 1541555799 7 5.0 3.0 4.0 2.0 -99
1 00Y1Z24X8084 26b6d294d66e7 1541651823 3 4.0 5.0 -99 -99 -99
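Notice that user_id, session_id, timestamp and step are always the same for every row within a group. The impressions, however, differ. For each row (at most 25 rows per group), the value in the n_clicks column maps to a count_x column; if the group does not have enough rows, the remaining columns take the value -99.
Because the first group has 10 rows, columns count_10 through count_24 will be -99. For the second group, columns count_2 through count_24 will be -99.

Use: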
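cols = ['user_id', 'session_id', 'timestamp', 'step']
# number the rows within each group: 0, 1, 2, ...
df['g'] = df.groupby(cols).cumcount()
# move the per-group counter into the columns, one column per position
df = (df.set_index(cols + ['g'])['n_clicks']
        .unstack(fill_value=-99)
        # guarantee columns 0..24, padding absent positions with -99
        .reindex(range(25), fill_value=-99, axis=1)
        .add_prefix('count_')
        .reset_index()
        .rename_axis(None, axis=1))
print(df)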
user_id session_id timestamp step count_0 count_1 count_2 \
0 004A07DM0IDW 1d688ec168932 1541555799 7 5.0 3.0 4.0
1 00Y1Z24X8084 26b6d294d66e7 1541651823 3 4.0 5.0 -99.0
count_3 count_4 count_5 ... count_15 count_16 count_17 count_18 \
0 2.0 1.0 3.0 ... -99 -99 -99 -99
1 -99.0 -99.0 -99.0 ... -99 -99 -99 -99
count_19 count_20 count_21 count_22 count_23 count_24
0 -99 -99 -99 -99 -99 -99
1 -99 -99 -99 -99 -99 -99
[2 rows x 29 columns]
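
The printed output above suggests that the columns produced by unstack are floats (-99.0) while the columns added afterwards hold the integer -99. If a single dtype is wanted, the count columns can be cast in one step. The following is a minimal sketch on a hypothetical miniature of the result (the frame `wide` and its four count columns are made up for illustration); it assumes every click count is a whole number and that no NaN values remain.

```python
import pandas as pd

# Hypothetical miniature of the wide result: float columns next to
# an integer padding column, mirroring the mixed output shown above.
wide = pd.DataFrame({
    "session_id": ["1d688ec168932", "26b6d294d66e7"],
    "count_0": [5.0, 4.0],
    "count_1": [3.0, 5.0],
    "count_2": [4.0, -99.0],
    "count_3": [-99, -99],
})

# Select every count_* column and cast them all to one integer dtype.
count_cols = [c for c in wide.columns if c.startswith("count_")]
uniform = wide.astype({c: "int64" for c in count_cols})

print(uniform.dtypes)
```

Casting to float instead ("float64") would be the safer choice if the click counts can be fractional.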
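Explanation:
GroupBy.cumcount creates a counter within each group.
set_index followed by unstack reshapes that counter into columns.
reindex with range(25) adds the missing columns.
add_prefix renames the columns to count_0 ... count_24.
reset_index turns the index levels back into ordinary columns.
rename_axis removes the leftover name of the columns axis.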
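
To try the approach end to end, here is a self-contained sketch that rebuilds a shortened version of the question's input (three rows for the first session, two for the second) and wraps the same chain in a helper. The function name `widen_clicks` and its parameters are hypothetical, introduced only for this illustration; the pandas calls are the ones used in the answer.

```python
import pandas as pd

# Shortened sample of the question's data: 3 rows and 2 rows per session.
df = pd.DataFrame({
    "user_id": ["004A07DM0IDW"] * 3 + ["00Y1Z24X8084"] * 2,
    "session_id": ["1d688ec168932"] * 3 + ["26b6d294d66e7"] * 2,
    "timestamp": [1541555799] * 3 + [1541651823] * 2,
    "step": [7] * 3 + [3] * 2,
    "impressions": [2059240, 2033381, 1724779, 4476010, 3843244],
    "n_clicks": [5.0, 3.0, 4.0, 4.0, 5.0],
})

cols = ["user_id", "session_id", "timestamp", "step"]


def widen_clicks(frame, keys, n_slots=25, fill=-99):
    """Return one row per group with n_clicks spread over count_0..count_{n_slots-1}."""
    # Position of each row inside its group: 0, 1, 2, ...
    position = frame.groupby(keys).cumcount()
    return (frame.assign(g=position)
                 .set_index(keys + ["g"])["n_clicks"]
                 .unstack(fill_value=fill)                        # positions become columns
                 .reindex(columns=range(n_slots), fill_value=fill)  # pad up to n_slots columns
                 .add_prefix("count_")
                 .reset_index()
                 .rename_axis(None, axis=1))


wide = widen_clicks(df, cols)
print(wide.shape)
print(wide[cols + ["count_0", "count_1", "count_2", "count_3"]])
```

Note that reindexing to range(n_slots) also drops any position at or beyond n_slots, so a group with more than 25 rows would be truncated rather than raise an error.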