Python: change the date in a column based on the actual data in other columns

Tags: python, pandas, dataframe, datetime

I have the following dataframe:

   account_id  contract_id  date_activated  2021-12-01 00:00:00  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00
0           1            A      2020-12-04                200.0                200.0                200.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-03-09                  0.0                  0.0                300.0                300.0                300.0                300.0                300.0
2           1            C      2021-04-25                  0.0                  0.0                  0.0                  0.0                100.0                100.0                100.0
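I would like to change the date_activated column whenever its month and year differ from the month in which the first payment appears in the monthly payment columns (column 4 onward). The modified date_activated takes the header of that column, i.e. it becomes the earliest payment date, which is always the first day of a month. Values of date_activated whose month and year already match the earliest payment month should be left unchanged.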

I would like the output to look like this:

   account_id  contract_id  date_activated  2021-12-01 00:00:00  2021-01-01 00:00:00  2021-02-01 00:00:00  2021-03-01 00:00:00  2021-04-01 00:00:00  2021-05-01 00:00:00  2021-06-01 00:00:00
0           1            A      2021-12-01                200.0                200.0                200.0                  0.0                  0.0                  0.0                  0.0
1           1            B      2021-02-01                  0.0                  0.0                300.0                300.0                300.0                300.0                300.0
2           1            C      2021-04-25                  0.0                  0.0                  0.0                  0.0                100.0                100.0                100.0
Here is the dictionary of the dataframe:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2020-12-04 00:00:00'),
  1: Timestamp('2021-03-09 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}
Here is the dictionary of the desired output:

{'account_id': {0: 1, 1: 1, 2: 1},
 'contract_id': {0: 'A', 1: 'B', 2: 'C'},
 'date_activated': {0: Timestamp('2021-12-01 00:00:00'),
  1: Timestamp('2021-02-01 00:00:00'),
  2: Timestamp('2021-04-25 00:00:00')},
 datetime.datetime(2021, 12, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 1, 1, 0, 0): {0: 200.0, 1: 0.0, 2: 0.0},
 datetime.datetime(2021, 2, 1, 0, 0): {0: 200.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 3, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 0.0},
 datetime.datetime(2021, 4, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 5, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0},
 datetime.datetime(2021, 6, 1, 0, 0): {0: 0.0, 1: 300.0, 2: 100.0}}
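After replacing 0 with np.nan in the datetime columns, you can use first_valid_index to find, for each row, the first column label that holds a non-null value.

Then use Series.where to replace the values for which the condition is False.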

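import numpy as np
import pandas as pd
# For each row, the label of the first payment column with a non-zero value
idx = df.iloc[:, 3:].replace(0, np.nan).T.apply(pd.Series.first_valid_index)
m = (df['date_activated'].dt.year == idx.dt.year) & (df['date_activated'].dt.month == idx.dt.month)
df['date_activated'] = df['date_activated'].where(m, idx)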
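
To try the approach end to end, here is a self-contained sketch that rebuilds the sample frame from the dictionary in the question and applies the same steps. The helper names `months`, `payments`, `pay`, `act` and `same_month` are illustrative only, and the `pd.to_datetime` wrapper around `idx` is an added precaution (an assumption, not part of the answer above) so that the `.dt` accessor is available however pandas infers the result of `apply`.

```python
import datetime

import numpy as np
import pandas as pd

# Rebuild the sample frame from the dictionary given in the question.
months = [(2021, 12), (2021, 1), (2021, 2), (2021, 3), (2021, 4), (2021, 5), (2021, 6)]
payments = {
    "A": [200.0, 200.0, 200.0, 0.0, 0.0, 0.0, 0.0],
    "B": [0.0, 0.0, 300.0, 300.0, 300.0, 300.0, 300.0],
    "C": [0.0, 0.0, 0.0, 0.0, 100.0, 100.0, 100.0],
}
df = pd.DataFrame({
    "account_id": [1, 1, 1],
    "contract_id": ["A", "B", "C"],
    "date_activated": pd.to_datetime(["2020-12-04", "2021-03-09", "2021-04-25"]),
})
for i, (year, month) in enumerate(months):
    df[datetime.datetime(year, month, 1)] = [payments[c][i] for c in df["contract_id"]]

# Treat zero payments as missing, then take the first non-missing column label per row.
pay = df.iloc[:, 3:].replace(0, np.nan)
idx = pd.to_datetime(pay.T.apply(pd.Series.first_valid_index))  # to_datetime: added precaution

# Keep date_activated where year and month already match; otherwise use idx.
act = df["date_activated"]
same_month = (act.dt.year == idx.dt.year) & (act.dt.month == idx.dt.month)
df["date_activated"] = act.where(same_month, idx)

print(df[["contract_id", "date_activated"]])
```

With the sample data, contracts A and B receive the header of their first non-zero payment column, while C keeps its original date because April 2021 already matches.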

Why check only the year and the month for equality?

@MrFuppes The OP states: "I only need to retain those that are the same in month and year."
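
The year-and-month test discussed in these comments can also be written as a single comparison of monthly periods. This is a small sketch of that variant, not the answerer's code; the Series names `activated` and `first_payment` are hypothetical and simply hold the sample dates from the question.

```python
import pandas as pd

# Hypothetical inputs: activation dates and the first payment month per contract.
activated = pd.Series(pd.to_datetime(["2020-12-04", "2021-03-09", "2021-04-25"]))
first_payment = pd.Series(pd.to_datetime(["2021-12-01", "2021-02-01", "2021-04-01"]))

# Two dates fall in the same calendar month exactly when their monthly periods match.
same_month = activated.dt.to_period("M") == first_payment.dt.to_period("M")

# Keep the activation date where the months match; otherwise use the first payment month.
result = activated.where(same_month, first_payment)
print(result)
```

Comparing periods checks year and month together, so the day of the month is ignored, which is the behaviour the OP asked for.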