Help me join these DataFrames in Python
Here is my scenario. Suppose I have these two datasets:
import pandas as pd

dic = {'firstname': ['John', 'John', 'John', 'John', 'John', 'Susan', 'Susan',
                     'Susan', 'Susan', 'Susan', 'Mike', 'Mike', 'Mike', 'Mike',
                     'Mike'],
       'lastname': ['Smith', 'Smith', 'Smith', 'Smith', 'Smith', 'Wilson',
                    'Wilson', 'Wilson', 'Wilson', 'Wilson', 'Jones', 'Jones',
                    'Jones', 'Jones', 'Jones'],
       'company': ['KFC', 'BK', 'KFC', 'KFC', 'KFC', 'BK', 'BK', 'WND', 'WND',
                   'WND', 'TB', 'CHP', 'TB', 'CHP', 'TB'],
       'paid': [200, 300, 250, 100, 900, 650, 430, 218, 946, 789, 305, 750, 140, 860, 310]}
df1 = pd.DataFrame(dic)
print(df1)
and a second DataFrame, df2.
Output 1 (df1) is:
    firstname lastname company  paid
0        John    Smith     KFC   200
1        John    Smith      BK   300
2        John    Smith     KFC   250
3        John    Smith     KFC   100
4        John    Smith     KFC   900
5       Susan   Wilson      BK   650
6       Susan   Wilson      BK   430
7       Susan   Wilson     WND   218
8       Susan   Wilson     WND   946
9       Susan   Wilson     WND   789
10       Mike    Jones      TB   305
11       Mike    Jones     CHP   750
12       Mike    Jones      TB   140
13       Mike    Jones     CHP   860
14       Mike    Jones      TB   310
Output 2 (df2) is:
   firstname lastname company  paid
0       John    Smith     KFC  1450
1       John    Smith      BK   300
2      Susan   Wilson      BK  1080
3      Susan   Wilson     WND  1953
4       Mike    Jones      TB   755
5       Mike    Jones     CHP  1610
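What I want to do is add df2's paid column to every matching row of the detailed view in df1.
I assume there is a merge function that can help, but I need some help writing the code.
So my ideal output is: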
    firstname lastname company  paid  sum_paid
0        John    Smith     KFC   200      1450
1        John    Smith      BK   300       300
2        John    Smith     KFC   250      1450
3        John    Smith     KFC   100      1450
4        John    Smith     KFC   900      1450
5       Susan   Wilson      BK   650      1080
6       Susan   Wilson      BK   430      1080
7       Susan   Wilson     WND   218      1953
8       Susan   Wilson     WND   946      1953
9       Susan   Wilson     WND   789      1953
10       Mike    Jones      TB   305       755
11       Mike    Jones     CHP   750      1610
12       Mike    Jones      TB   140       755
13       Mike    Jones     CHP   860      1610
14       Mike    Jones      TB   310       755
Just do this:
df = df1.merge(df2, on=['firstname', 'lastname', 'company']).rename(columns={'paid_y': 'sum_paid', 'paid_x': 'paid'})
print(df)
    firstname lastname company  paid  sum_paid
0        John    Smith     KFC   200      1450
1        John    Smith     KFC   250      1450
2        John    Smith     KFC   100      1450
3        John    Smith     KFC   900      1450
4        John    Smith      BK   300       300
5       Susan   Wilson      BK   650      1080
6       Susan   Wilson      BK   430      1080
7       Susan   Wilson     WND   218      1953
8       Susan   Wilson     WND   946      1953
9       Susan   Wilson     WND   789      1953
10       Mike    Jones      TB   305       755
11       Mike    Jones      TB   140       755
12       Mike    Jones      TB   310       755
13       Mike    Jones     CHP   750      1610
14       Mike    Jones     CHP   860      1610
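For reference, here is a minimal self-contained sketch of the same merge. The question does not show how df2 was built, so this assumes it is the per-group total of paid and rebuilds it with groupby. The five-row sample (Susan's rows only), the keys variable, how='left', and the suffixes argument are illustrative choices, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question (Susan Wilson only, to keep it short).
df1 = pd.DataFrame({
    'firstname': ['Susan', 'Susan', 'Susan', 'Susan', 'Susan'],
    'lastname': ['Wilson', 'Wilson', 'Wilson', 'Wilson', 'Wilson'],
    'company': ['BK', 'BK', 'WND', 'WND', 'WND'],
    'paid': [650, 430, 218, 946, 789],
})

keys = ['firstname', 'lastname', 'company']

# Assumption: df2 has one row per key combination holding the summed 'paid'.
df2 = df1.groupby(keys, as_index=False, sort=False)['paid'].sum()

# A left merge keeps every df1 row in its original order. The suffixes leave
# df1's 'paid' unchanged and rename df2's overlapping column to 'paid_sum'.
merged = df1.merge(df2, on=keys, how='left', suffixes=('', '_sum'))
merged = merged.rename(columns={'paid_sum': 'sum_paid'})
print(merged)
```

Setting the suffixes explicitly avoids relying on the default paid_x and paid_y names, and the left join may be preferable when the row order of df1 should be kept.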
Have you tried merge?
Your sum_paid column looks like the result of df1.groupby(['firstname','lastname','company'])['paid'].transform('sum'), so df2 is not needed.
@YOBEN_S, after trying merge again, I found the answer. Not sure why it didn't work the first time.
Does this answer your question?
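The groupby/transform idea from the comments can be sketched as follows. This is a minimal example on five rows copied from the question (Mike's rows only); the keys variable is just a name chosen for illustration.

```python
import pandas as pd

# Sample rows copied from the question (Mike Jones only, to keep it short).
df1 = pd.DataFrame({
    'firstname': ['Mike', 'Mike', 'Mike', 'Mike', 'Mike'],
    'lastname': ['Jones', 'Jones', 'Jones', 'Jones', 'Jones'],
    'company': ['TB', 'CHP', 'TB', 'CHP', 'TB'],
    'paid': [305, 750, 140, 860, 310],
})

keys = ['firstname', 'lastname', 'company']

# transform('sum') returns a Series with the same index as df1, where each
# row holds the total 'paid' of its group, so it can be assigned directly.
df1['sum_paid'] = df1.groupby(keys)['paid'].transform('sum')
print(df1)
```

Because the result is aligned to the original index, no second DataFrame and no merge are needed, and the row order of df1 stays as it was.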