Python groupby: select a condition from another column to create a new column


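I am trying to capture, in a new column ("2nd_visit_date"), the date on which each "user" has visit_num == 2.

The code is below (including the new column I want to create).

So that I get: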

user    date    visit_num   2nd_visit_date
 1   1995-09-01     1        1995-09-02
 1   1995-09-02     2        1995-09-02
 2   1995-10-03     1        1995-10-04
 2   1995-10-04     2        1995-10-04
 2   1995-10-05     3        1995-10-04
 3   1995-11-07     1        1995-11-08
 3   1995-11-08     2        1995-11-08
 3   1995-11-09     3        1995-11-08
 3   1995-11-10     4        1995-11-08
 3   1995-11-15     5        1995-11-08
 4   1995-12-18     1        1995-12-20
 4   1995-12-20     2        1995-12-20
I tried the following code, but it does not work:

df["2nd_visit_date"] = df.groupby("user")["date"].transform(df['visit_num']==2)

Any help would be greatly appreciated. Thanks.

Assuming this is your original df:

   user    date    visit_num
0   1   1995-09-01  1
1   1   1995-09-02  2
2   2   1995-10-03  1
3   2   1995-10-04  2
4   2   1995-10-05  3
5   3   1995-11-07  1
6   3   1995-11-08  2
7   3   1995-11-09  3
8   3   1995-11-10  4
9   3   1995-11-15  5
10  4   1995-12-18  1
11  4   1995-12-20  2
You can first create a DataFrame containing only the second visits (and rename its date column):
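The snippet for this step did not survive on this page, so the following is only a sketch of one way to do it, not necessarily the answerer's exact code. It rebuilds the example frame from the question, filters the rows where visit_num == 2, and renames date to 2nd_visit_date so the later merge does not produce clashing column names. The name df_2 is taken from the merge call in this answer; parsing date with pd.to_datetime is an assumption about the column's type.

```python
import pandas as pd

# Rebuild the example frame shown above (dates parsed as datetimes: an assumption).
df = pd.DataFrame({
    "user": [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4],
    "date": pd.to_datetime([
        "1995-09-01", "1995-09-02", "1995-10-03", "1995-10-04", "1995-10-05",
        "1995-11-07", "1995-11-08", "1995-11-09", "1995-11-10", "1995-11-15",
        "1995-12-18", "1995-12-20",
    ]),
    "visit_num": [1, 2, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2],
})

# Keep one row per user: the row of the second visit.
# Rename date so it becomes the new column after merging on user.
df_2 = (
    df.loc[df["visit_num"] == 2, ["user", "date"]]
      .rename(columns={"date": "2nd_visit_date"})
)
```

With how='left' in the merge, a user who never made a second visit would keep their rows and simply get a missing value (NaT) in 2nd_visit_date.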

and then merge it with the original df:

pd.merge(df, df_2, on='user', how='left')

    user    date    visit_num   2nd_visit_date
0   1   1995-09-01      1         1995-09-02
1   1   1995-09-02      2         1995-09-02
2   2   1995-10-03      1         1995-10-04
3   2   1995-10-04      2         1995-10-04
4   2   1995-10-05      3         1995-10-04
5   3   1995-11-07      1         1995-11-08
6   3   1995-11-08      2         1995-11-08
7   3   1995-11-09      3         1995-11-08
8   3   1995-11-10      4         1995-11-08
9   3   1995-11-15      5         1995-11-08
10  4   1995-12-18      1         1995-12-20
11  4   1995-12-20      2         1995-12-20
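
As a side note, the attempt in the question fails because transform expects a function (or the name of one), not a boolean Series. If you would rather stay with groupby than merge, one possible alternative is sketched below: blank out every date that is not a second visit, then broadcast the first non-null date within each user group. This is an assumption-laden sketch rather than part of the answer above; it assumes date is a datetime column and that each user has at most one row with visit_num == 2.

```python
import pandas as pd

# Same example frame as in the question (dates parsed as datetimes: an assumption).
df = pd.DataFrame({
    "user": [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4],
    "date": pd.to_datetime([
        "1995-09-01", "1995-09-02", "1995-10-03", "1995-10-04", "1995-10-05",
        "1995-11-07", "1995-11-08", "1995-11-09", "1995-11-10", "1995-11-15",
        "1995-12-18", "1995-12-20",
    ]),
    "visit_num": [1, 2, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2],
})

# Keep the date only on second-visit rows; every other row becomes NaT.
second_visit = df["date"].where(df["visit_num"] == 2)

# "first" returns the first non-null value per group, and transform
# broadcasts it back to every row of that user.
df["2nd_visit_date"] = second_visit.groupby(df["user"]).transform("first")
```

Users without a second visit end up with NaT in the new column, which matches what the left merge would give.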
