Python: merge DataFrames and replace values
I have two DataFrames with the same date and client_id columns but different amounts. I am trying to build another DataFrame that takes the amount values from dfA and keeps the 0 from dfB wherever dfA has no matching row.
dfA:
client_id date amount
0 1 2020-07-11 100
1 1 2020-07-10 90
2 1 2020-07-09 80
3 1 2020-07-12 70
3 1 2020-07-01 86
dfB:
client_id date amount
0 1 2020-07-11 0
1 1 2020-07-10 0
2 1 2020-07-09 0
3 1 2020-07-07 0
4 1 2020-07-06 0
5 1 2020-07-05 0
5 1 2020-07-04 0
3 1 2020-07-03 0
4 1 2020-07-02 0
5 1 2020-07-01 0
I want to get:
dfResult:
client_id date amount
0 1 2020-07-11 100
1 1 2020-07-10 90
2 1 2020-07-09 80
3 1 2020-07-07 70
4 1 2020-07-06 0
5 1 2020-07-05 0
5 1 2020-07-04 0
3 1 2020-07-03 0
4 1 2020-07-02 0
5 1 2020-07-01 86
You can concat the DataFrames together, sort by amount, and then drop the duplicates:
dfResult = (
    pd.concat([dfA, dfB])
    .sort_values(by='amount', ascending=False)
    .drop_duplicates(subset=['client_id', 'date'], keep='first')
    .sort_values(by=['client_id', 'date'], ascending=[True, False])
    .reset_index(drop=True)
)
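A minimal, self-contained sketch of the concat approach, using a trimmed copy of the question's data with dates kept as plain strings (an assumption made for brevity). It relies on every dfA amount being larger than the 0 placeholder in dfB, and it also keeps dfA rows whose date is missing from dfB (2020-07-12 here), which may or may not be what you want.

```python
import pandas as pd

# Trimmed sample data modelled on the question; dates are strings here.
dfA = pd.DataFrame({
    "client_id": [1, 1, 1, 1],
    "date": ["2020-07-11", "2020-07-10", "2020-07-12", "2020-07-01"],
    "amount": [100, 90, 70, 86],
})
dfB = pd.DataFrame({
    "client_id": [1, 1, 1, 1],
    "date": ["2020-07-11", "2020-07-10", "2020-07-07", "2020-07-01"],
    "amount": [0, 0, 0, 0],
})

dfResult = (
    pd.concat([dfA, dfB])
    # Largest amounts first, so dfA rows precede the zero rows of dfB.
    .sort_values(by="amount", ascending=False)
    # Keep only the first (largest) row per client_id/date pair.
    .drop_duplicates(subset=["client_id", "date"], keep="first")
    # Restore a readable order: client ascending, date descending.
    .sort_values(by=["client_id", "date"], ascending=[True, False])
    .reset_index(drop=True)
)
print(dfResult)
```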
Try this:
(
dfB.date.map(
dfA.set_index('date')['amount'].to_dict()
).fillna(0.0)
)
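A minimal sketch of the same lookup keyed on both client_id and date rather than on date alone. It assumes each (client_id, date) pair occurs at most once in dfA; the sample frames and the names lookup and keys are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with two clients sharing a date.
dfA = pd.DataFrame({
    "client_id": [1, 2],
    "date": ["2020-07-11", "2020-07-11"],
    "amount": [100, 55],
})
dfB = pd.DataFrame({
    "client_id": [1, 1, 2],
    "date": ["2020-07-11", "2020-07-10", "2020-07-11"],
    "amount": [0, 0, 0],
})

# dfA amounts indexed by the (client_id, date) pair.
lookup = dfA.set_index(["client_id", "date"])["amount"]

# The same pair index built from dfB; pairs missing from dfA come back as NaN.
keys = pd.MultiIndex.from_frame(dfB[["client_id", "date"]])
dfResult = dfB.assign(amount=lookup.reindex(keys).fillna(0).to_numpy())
print(dfResult)
```

The amount column comes back as float because of the intermediate NaN values.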
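Note that mapping on date alone ignores client_id, which (I guess) is rather important.
Comment: Hello, sorry, I get "TypeError: Cannot compare type 'Timestamp' with type 'date'". This is how dfB is built:
date_range = pd.date_range(date_begin.date(), date_end.date())
data = pd.DataFrame([], columns=['client_id', 'date', 'amount'])
data['date'] = date_range
Or: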
(
dfB.merge(
dfA, on=['client_id', 'date'], suffixes=("_x", ""), how='left'
).fillna(0.0).drop(columns=["amount_x"])
)
client_id date amount
0 1 2020-07-11 100.0
1 1 2020-07-10 90.0
2 1 2020-07-09 80.0
3 1 2020-07-07 0.0
4 1 2020-07-06 0.0
5 1 2020-07-05 0.0
5 1 2020-07-04 0.0
3 1 2020-07-03 0.0
4 1 2020-07-02 0.0
5 1 2020-07-01 86.0
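The TypeError reported in the comment suggests the two date columns hold different types. A hedged sketch of one way around it: convert both columns with pd.to_datetime before merging, then cast the filled amounts back to integers. It assumes dfA stores datetime.date objects while dfB comes from pd.date_range; the sample values and the names left and right are illustrative only.

```python
import datetime as dt

import pandas as pd

# Assumption: dfA holds plain datetime.date objects.
dfA = pd.DataFrame({
    "client_id": [1, 1],
    "date": [dt.date(2020, 7, 11), dt.date(2020, 7, 9)],
    "amount": [100, 80],
})
# Assumption: dfB is built from pd.date_range, so it holds Timestamps.
dfB = pd.DataFrame({
    "client_id": [1, 1, 1],
    "date": pd.date_range("2020-07-09", "2020-07-11"),
    "amount": [0, 0, 0],
})

# Bring both date columns to datetime64 so the merge keys are comparable.
left = dfB.assign(date=pd.to_datetime(dfB["date"])).drop(columns="amount")
right = dfA.assign(date=pd.to_datetime(dfA["date"]))

dfResult = left.merge(right, on=["client_id", "date"], how="left")
# Rows without a match get NaN; fill with 0 and restore the integer dtype.
dfResult["amount"] = dfResult["amount"].fillna(0).astype(int)
print(dfResult)
```

Dropping dfB's amount column before the merge avoids the suffix handling and the later drop of amount_x.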