Transforming/reshaping data in Python (pandas)

I have a dataset like this:

Agent ID    Month   values
101         Jan-17  2
101         Feb-17  4
101         Mar-17  3
101         Apr-17  8
101         May-17  12
101         Jun-17  3
101         Dec-17  1
102         Jan-17  2
102         Feb-17  3
102         Mar-17  7
102         Apr-17  3
102         May-17  2
102         Jun-17  11
102         Sep-17  2
102         Oct-17  2
102         Nov-17  1
102         Dec-17  4
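
For anyone who wants to reproduce this, the sample above can be built as a DataFrame roughly as follows. The column name `AgentID` (without the space) is an assumption, chosen to match the printed output further down.

```python
import pandas as pd

# Sample data from the question; 'AgentID' (no space) is an assumed column name
months_101 = ['Jan-17', 'Feb-17', 'Mar-17', 'Apr-17', 'May-17', 'Jun-17', 'Dec-17']
months_102 = ['Jan-17', 'Feb-17', 'Mar-17', 'Apr-17', 'May-17', 'Jun-17',
              'Sep-17', 'Oct-17', 'Nov-17', 'Dec-17']

df = pd.DataFrame({
    'AgentID': [101] * len(months_101) + [102] * len(months_102),
    'Month': months_101 + months_102,
    'values': [2, 4, 3, 8, 12, 3, 1] + [2, 3, 7, 3, 2, 11, 2, 2, 1, 4],
})
print(df.shape)  # 17 rows, 3 columns
```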

I want to reshape it into this:

Agent ID    Month   values  Jan-17  Feb-17  Mar-17  Apr-17  May-17  Jun-17  Sep-17  Oct-17  Nov-17  Dec-17
101 Jan-17  2   2   4   3   8   12  3   0   0   0   1
101 Feb-17  4   2   4   3   8   12  3   0   0   0   1
101 Mar-17  3   2   4   3   8   12  3   0   0   0   1
101 Apr-17  8   2   4   3   8   12  3   0   0   0   1
101 May-17  12  2   4   3   8   12  3   0   0   0   1
101 Jun-17  3   2   4   3   8   12  3   0   0   0   1
101 Dec-17  1   2   4   3   8   12  3   0   0   0   1
102 Jan-17  2   2   3   7   3   2   11  2   2   1   4
102 Feb-17  3   2   3   7   3   2   11  2   2   1   4
102 Mar-17  7   2   3   7   3   2   11  2   2   1   4
102 Apr-17  3   2   3   7   3   2   11  2   2   1   4
102 May-17  2   2   3   7   3   2   11  2   2   1   4
102 Jun-17  11  2   3   7   3   2   11  2   2   1   4
102 Sep-17  2   2   3   7   3   2   11  2   2   1   4
102 Oct-17  2   2   3   7   3   2   11  2   2   1   4
102 Nov-17  1   2   3   7   3   2   11  2   2   1   4
102 Dec-17  4   2   3   7   3   2   11  2   2   1   4

I think this is a pivot first, then a merge:

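import pandas as pd

# Convert 'Jan-17' style labels to sortable 'YYYY-MM' strings
df.Month = pd.to_datetime(df.Month, format='%b-%y').dt.strftime('%Y-%m')
# One row per agent, one column per month; months with no data become 0
# (keyword arguments, since newer pandas no longer accepts positional ones here)
s = df.pivot(index='AgentID', columns='Month', values='values').fillna(0).reset_index()
# Attach each agent's wide row to every one of its original rows
df = df.merge(s, on='AgentID')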
df
Out[876]: 
    AgentID    Month  values   ...     2017-10  2017-11  2017-12
0       101  2017-01       2   ...         0.0      0.0      1.0
1       101  2017-02       4   ...         0.0      0.0      1.0
2       101  2017-03       3   ...         0.0      0.0      1.0
3       101  2017-04       8   ...         0.0      0.0      1.0
4       101  2017-05      12   ...         0.0      0.0      1.0
5       101  2017-06       3   ...         0.0      0.0      1.0
6       101  2017-12       1   ...         0.0      0.0      1.0
7       102  2017-01       2   ...         2.0      1.0      4.0
8       102  2017-02       3   ...         2.0      1.0      4.0
9       102  2017-03       7   ...         2.0      1.0      4.0
10      102  2017-04       3   ...         2.0      1.0      4.0
11      102  2017-05       2   ...         2.0      1.0      4.0
12      102  2017-06      11   ...         2.0      1.0      4.0
13      102  2017-09       2   ...         2.0      1.0      4.0
14      102  2017-10       2   ...         2.0      1.0      4.0
15      102  2017-11       1   ...         2.0      1.0      4.0
16      102  2017-12       4   ...         2.0      1.0      4.0
[17 rows x 13 columns]

More info: the intermediate pivoted frame s

s
Out[878]: 
Month  AgentID  2017-01  2017-02   ...     2017-10  2017-11  2017-12
0          101      2.0      4.0   ...         0.0      0.0      1.0
1          102      2.0      3.0   ...         2.0      1.0      4.0
[2 rows x 11 columns]
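
As a self-contained sketch of the same pivot-then-merge idea: the data below is a shortened, assumed sample (not the full table), and the names `wide` and `result` are only illustrative. It also casts the month columns back to integers, which the answer above does not do.

```python
import pandas as pd

# Shortened, assumed sample: agent 101 has no Sep-17 row
df = pd.DataFrame({
    'AgentID': [101, 101, 102, 102, 102],
    'Month': ['Jan-17', 'Dec-17', 'Jan-17', 'Sep-17', 'Dec-17'],
    'values': [2, 1, 2, 2, 4],
})

# 'YYYY-MM' strings sort chronologically, so pivoted columns come out in order
df['Month'] = pd.to_datetime(df['Month'], format='%b-%y').dt.strftime('%Y-%m')

# Wide table: one row per agent, one column per month, gaps filled with 0
wide = (df.pivot(index='AgentID', columns='Month', values='values')
          .fillna(0)
          .astype(int)
          .reset_index())
wide.columns.name = None  # drop the leftover 'Month' label on the column axis

# Repeat each agent's wide row on every original row of that agent
result = df.merge(wide, on='AgentID', how='left')
print(result)
```

Merging on `AgentID` alone is what repeats the per-agent totals across rows; the `Month` and `values` columns of the original frame are left untouched.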

You can also do this with pd.crosstab, followed by a groupby with apply that runs ffill and bfill within each group.


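I used the line from WenYoBen's answer to convert df.Month to a datetime-based format, so that the month order is kept correctly as the OP requires:
import pandas as pd

df.Month = pd.to_datetime(df.Month, format='%b-%y').dt.strftime('%Y-%m')
# One row per original record, one column per month (mostly NaN at this point)
df1 = pd.crosstab(index=[df.AgentID, df.Month, df['values']], columns=df.Month, values=df['values'], aggfunc='first')
# Spread each agent's values over all of its rows, then zero-fill absent months;
# group_keys=False keeps AgentID from being added to the index a second time
df1 = df1.groupby(level=0, group_keys=False).apply(lambda x: x.ffill().bfill()).fillna(0).reset_index()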


Out[2103]:
Month  AgentID    Month  values  2017-01  2017-02  2017-03  2017-04  2017-05  \
0          101  2017-01       2      2.0      4.0      3.0      8.0     12.0
1          101  2017-02       4      2.0      4.0      3.0      8.0     12.0
2          101  2017-03       3      2.0      4.0      3.0      8.0     12.0
3          101  2017-04       8      2.0      4.0      3.0      8.0     12.0
4          101  2017-05      12      2.0      4.0      3.0      8.0     12.0
5          101  2017-06       3      2.0      4.0      3.0      8.0     12.0
6          101  2017-12       1      2.0      4.0      3.0      8.0     12.0
7          102  2017-01       2      2.0      3.0      7.0      3.0      2.0
8          102  2017-02       3      2.0      3.0      7.0      3.0      2.0
9          102  2017-03       7      2.0      3.0      7.0      3.0      2.0
10         102  2017-04       3      2.0      3.0      7.0      3.0      2.0
11         102  2017-05       2      2.0      3.0      7.0      3.0      2.0
12         102  2017-06      11      2.0      3.0      7.0      3.0      2.0
13         102  2017-09       2      2.0      3.0      7.0      3.0      2.0
14         102  2017-10       2      2.0      3.0      7.0      3.0      2.0
15         102  2017-11       1      2.0      3.0      7.0      3.0      2.0
16         102  2017-12       4      2.0      3.0      7.0      3.0      2.0

Month  2017-06  2017-09  2017-10  2017-11  2017-12
0          3.0      0.0      0.0      0.0      1.0
1          3.0      0.0      0.0      0.0      1.0
2          3.0      0.0      0.0      0.0      1.0
3          3.0      0.0      0.0      0.0      1.0
4          3.0      0.0      0.0      0.0      1.0
5          3.0      0.0      0.0      0.0      1.0
6          3.0      0.0      0.0      0.0      1.0
7         11.0      2.0      2.0      1.0      4.0
8         11.0      2.0      2.0      1.0      4.0
9         11.0      2.0      2.0      1.0      4.0
10        11.0      2.0      2.0      1.0      4.0
11        11.0      2.0      2.0      1.0      4.0
12        11.0      2.0      2.0      1.0      4.0
13        11.0      2.0      2.0      1.0      4.0
14        11.0      2.0      2.0      1.0      4.0
15        11.0      2.0      2.0      1.0      4.0
16        11.0      2.0      2.0      1.0      4.0
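
To see why the ffill/bfill step is needed, here is a minimal sketch on a shortened, assumed sample. The crosstab with a three-level row index leaves each row with a single non-NaN cell (its own month), and filling within each agent copies those cells to the agent's other rows. The column-axis label `MonthCol` is a hypothetical name used only to avoid having two axes both called `Month`, and the built-in group-wise `ffill()`/`bfill()` methods stand in for the `apply` with a lambda.

```python
import pandas as pd

# Shortened, assumed sample: agent 101 has no Sep-17 row
df = pd.DataFrame({
    'AgentID': [101, 101, 102, 102, 102],
    'Month': ['Jan-17', 'Dec-17', 'Jan-17', 'Sep-17', 'Dec-17'],
    'values': [2, 1, 2, 2, 4],
})
df['Month'] = pd.to_datetime(df['Month'], format='%b-%y').dt.strftime('%Y-%m')

# Sparse table: each row holds only the value for its own month
sparse = pd.crosstab(index=[df['AgentID'], df['Month'], df['values']],
                     columns=df['Month'],
                     values=df['values'],
                     aggfunc='first',
                     colnames=['MonthCol'])  # hypothetical label for the column axis

# Fill down, then up, within each agent so every row sees all of that agent's months
filled = sparse.groupby(level='AgentID').ffill()
filled = filled.groupby(level='AgentID').bfill()

# Months an agent never had are still NaN; treat them as 0
out = filled.fillna(0).astype(int).reset_index()
print(out)
```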

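I'm not following the logic.

@QuangHoang The original poster wants the leftmost output column to be AgentID and all the other output columns to be months. In the input, each month is a row.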