Help me join these DataFrames in Python

Here is my scenario. Suppose I have these two datasets -

import pandas as pd

dic = {'firstname':['John','John','John','John','John','Susan','Susan',
                    'Susan','Susan','Susan','Mike','Mike','Mike','Mike',
                    'Mike'],
       'lastname':['Smith','Smith','Smith','Smith','Smith','Wilson',
                   'Wilson','Wilson','Wilson','Wilson','Jones','Jones',
                   'Jones','Jones','Jones'],
       'company':['KFC','BK','KFC','KFC','KFC','BK','BK','WND','WND',
                  'WND','TB','CHP','TB','CHP','TB'],
       'paid':[200,300,250,100,900,650,430,218,946,789,305,750,140,860,310]}
df1 = pd.DataFrame(dic)
print(df1)

Output 1 is -

   firstname lastname company  paid
0       John    Smith     KFC   200
1       John    Smith      BK   300
2       John    Smith     KFC   250
3       John    Smith     KFC   100
4       John    Smith     KFC   900
5      Susan   Wilson      BK   650
6      Susan   Wilson      BK   430
7      Susan   Wilson     WND   218
8      Susan   Wilson     WND   946
9      Susan   Wilson     WND   789
10      Mike    Jones      TB   305
11      Mike    Jones     CHP   750
12      Mike    Jones      TB   140
13      Mike    Jones     CHP   860
14      Mike    Jones      TB   310

Output 2 (df2) is -

  firstname lastname company  paid
0      John    Smith     KFC  1450
1      John    Smith      BK   300
2     Susan   Wilson      BK  1080
3     Susan   Wilson     WND  1953
4      Mike    Jones      TB   755
5      Mike    Jones     CHP  1610
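
The code that builds df2 is not shown above. As a minimal sketch, assuming df2 is simply the total of paid for each (firstname, lastname, company) combination in df1, a frame matching Output 2 could be produced like this (the `keys` name is only for illustration):

```python
import pandas as pd

# Same data as df1 in the question, written compactly.
dic = {'firstname': ['John'] * 5 + ['Susan'] * 5 + ['Mike'] * 5,
       'lastname': ['Smith'] * 5 + ['Wilson'] * 5 + ['Jones'] * 5,
       'company': ['KFC', 'BK', 'KFC', 'KFC', 'KFC', 'BK', 'BK', 'WND', 'WND',
                   'WND', 'TB', 'CHP', 'TB', 'CHP', 'TB'],
       'paid': [200, 300, 250, 100, 900, 650, 430, 218, 946, 789, 305, 750, 140, 860, 310]}
df1 = pd.DataFrame(dic)

# Assumption: df2 is the per-group sum of `paid`.
keys = ['firstname', 'lastname', 'company']

# sort=False keeps the groups in order of first appearance in df1;
# as_index=False returns the keys as ordinary columns.
df2 = df1.groupby(keys, sort=False, as_index=False)['paid'].sum()
print(df2)
```
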
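
What I want to do is add the paid column from df2 to every matching row of the detailed view in df1.

I assume there is a merge function that can help me, but I need some help writing the code.

So my ideal output is -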

   firstname lastname company  paid sum_paid
0       John    Smith     KFC   200     1450
1       John    Smith      BK   300      300
2       John    Smith     KFC   250     1450
3       John    Smith     KFC   100     1450
4       John    Smith     KFC   900     1450
5      Susan   Wilson      BK   650     1080
6      Susan   Wilson      BK   430     1080
7      Susan   Wilson     WND   218     1953
8      Susan   Wilson     WND   946     1953
9      Susan   Wilson     WND   789     1953
10      Mike    Jones      TB   305      755
11      Mike    Jones     CHP   750     1610
12      Mike    Jones      TB   140      755
13      Mike    Jones     CHP   860     1610
14      Mike    Jones      TB   310      755

Just do this:

df = df1.merge(df2, on=['firstname', 'lastname', 'company']).rename(columns={'paid_y': 'sum_paid', 'paid_x': 'paid'})
print(df)

   firstname lastname company    paid  sum_paid
0       John    Smith     KFC     200    1450
1       John    Smith     KFC     250    1450
2       John    Smith     KFC     100    1450
3       John    Smith     KFC     900    1450
4       John    Smith      BK     300     300
5      Susan   Wilson      BK     650    1080
6      Susan   Wilson      BK     430    1080
7      Susan   Wilson     WND     218    1953
8      Susan   Wilson     WND     946    1953
9      Susan   Wilson     WND     789    1953
10      Mike    Jones      TB     305     755
11      Mike    Jones      TB     140     755
12      Mike    Jones      TB     310     755
13      Mike    Jones     CHP     750    1610
14      Mike    Jones     CHP     860    1610
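
If the rows should stay in df1's original order, as in the ideal output in the question, one option is a left merge. The sketch below is a variation on the answer, not the answer's own code: it assumes df2 holds the per-group totals, renames its paid column before merging so that no `_x`/`_y` suffixes appear, and uses `validate` to check that each key occurs at most once in df2.

```python
import pandas as pd

# Same data as df1 in the question, written compactly.
dic = {'firstname': ['John'] * 5 + ['Susan'] * 5 + ['Mike'] * 5,
       'lastname': ['Smith'] * 5 + ['Wilson'] * 5 + ['Jones'] * 5,
       'company': ['KFC', 'BK', 'KFC', 'KFC', 'KFC', 'BK', 'BK', 'WND', 'WND',
                   'WND', 'TB', 'CHP', 'TB', 'CHP', 'TB'],
       'paid': [200, 300, 250, 100, 900, 650, 430, 218, 946, 789, 305, 750, 140, 860, 310]}
df1 = pd.DataFrame(dic)

keys = ['firstname', 'lastname', 'company']

# Assumption: df2 is the per-group total shown as Output 2.
df2 = df1.groupby(keys, sort=False, as_index=False)['paid'].sum()

# Rename first so the two `paid` columns do not collide,
# then left-merge so every df1 row is kept in its original position.
totals = df2.rename(columns={'paid': 'sum_paid'})
df = df1.merge(totals, on=keys, how='left', validate='many_to_one')
print(df)
```
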

Did you try merge?

Your sum_paid column looks like the result of df1.groupby(['firstname','lastname','company'])['paid'].transform('sum'). No need for df2.

@YOBEN_S, after trying merge again, I found the answer. Not sure why it didn't work the first time.

Does this answer your question?
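
The groupby/transform suggestion from the comments can be sketched as follows. It works on df1 alone: `transform('sum')` returns one value per original row, so the result can be assigned straight back as a new column without building or merging df2.

```python
import pandas as pd

# Same data as df1 in the question, written compactly.
dic = {'firstname': ['John'] * 5 + ['Susan'] * 5 + ['Mike'] * 5,
       'lastname': ['Smith'] * 5 + ['Wilson'] * 5 + ['Jones'] * 5,
       'company': ['KFC', 'BK', 'KFC', 'KFC', 'KFC', 'BK', 'BK', 'WND', 'WND',
                   'WND', 'TB', 'CHP', 'TB', 'CHP', 'TB'],
       'paid': [200, 300, 250, 100, 900, 650, 430, 218, 946, 789, 305, 750, 140, 860, 310]}
df1 = pd.DataFrame(dic)

# Broadcast each group's total back onto every row of that group.
# The result is aligned to df1's index, so row order is unchanged.
df1['sum_paid'] = (
    df1.groupby(['firstname', 'lastname', 'company'])['paid'].transform('sum')
)
print(df1)
```
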