Python: Combining totals for old and new jobs

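I have a df containing expense and invoice values. Some jobs actually have two job numbers: an old one and a new one. For jobs that have both an old and a new job number, I need to sum the expenses and invoice values so that the result is a single row. Then I need to drop the rows that hold the new job entries.

I have an Excel file that lists the total expenses and invoices for each job. In my code, this is crcy: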

Job#    Expenses    Invoice
1          5            2
2         10            27
3         15            33
10        60            4
20        57            21
12         9            36
22        11            18
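I also have a second Excel file that lists the old job numbers alongside their new job numbers. In my code, this is jobs.

I'm not sure which pandas operations to use here, so I don't know what to try. Any suggestions would be much appreciated.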

import pandas as pd

# Pull in excel data
crcy = pd.read_excel('crcy1.xlsx')
jobs = pd.read_excel('jobs.xlsx')

# Merge on job#
df3 = crcy.merge(jobs, on='Job#', how='outer')
# Drop rows where new job # is pulled in but 
# has no Expense/Invoice entries.
df3 = df3.dropna(thresh=3)

print(df3)
Actual result:

Job#  Expenses  Invoice  New Job#
1       5.0      2.0       0.0
2      10.0     27.0       0.0
3      15.0     33.0       0.0
10     60.0      4.0      20.0
20     57.0     21.0       0.0
12      9.0     36.0      22.0
22     11.0     18.0       0.0
Expected result:

Job#  Expenses  Invoice  New Job#
1       5.0      2.0       0.0
2      10.0     27.0       0.0
3      15.0     33.0       0.0
10    117.0     25.0      20.0
12     20.0     54.0      22.0
Try this:

# Rename the columns for easier reference
jobs.columns = ['Old Job#', 'New Job#']

# For each job, find if it has an old Job#
crcy = crcy.merge(jobs, left_on='Job#', right_on='New Job#', how='left')

# The Job# that goes into the report is the Old Job#, if it has that
crcy['Report Job#'] = crcy['Old Job#'].combine_first(crcy['Job#'])

crcy.groupby('Report Job#').agg({
    'Expenses': 'sum',
    'Invoice': 'sum',
    'Old Job#': 'first'
})
Result:

             Expenses  Invoice  Old Job#
Report Job#                             
1.0                 5        2       NaN
2.0                10       27       NaN
3.0                15       33       NaN
10.0              117       25      10.0
12.0               20       54      12.0
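
For anyone who wants to run this without the Excel files, here is a minimal sketch of the same merge / combine_first / groupby steps using inline data. The crcy values come from the question; the contents of jobs are an assumption (only the pairs 10 -> 20 and 12 -> 22 are implied by the output shown), and the names merged and report are just illustrative.

```python
import pandas as pd

# Inline stand-in for crcy1.xlsx (values copied from the question).
crcy = pd.DataFrame({
    "Job#": [1, 2, 3, 10, 20, 12, 22],
    "Expenses": [5, 10, 15, 60, 57, 9, 11],
    "Invoice": [2, 27, 33, 4, 21, 36, 18],
})

# Assumption: jobs.xlsx holds one row per old/new pair and nothing else.
jobs = pd.DataFrame({"Old Job#": [10, 12], "New Job#": [20, 22]})

# Attach the old job number to every row whose Job# is a new job number.
merged = crcy.merge(jobs, left_on="Job#", right_on="New Job#", how="left")

# Report under the old job number when there is one, else under Job# itself.
merged["Report Job#"] = merged["Old Job#"].combine_first(merged["Job#"])

# Collapse each old/new pair into a single row of totals.
report = merged.groupby("Report Job#").agg(
    {"Expenses": "sum", "Invoice": "sum", "Old Job#": "first"}
)
print(report)
```

Because the left merge introduces NaN for jobs without an old number, the Report Job# index comes out as floats (1.0, 2.0, ...).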

Thanks for the reply! I had to put crcy = in front of the groupby statement to get it to work.
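
The assignment is needed because groupby(...).agg(...) returns a new DataFrame and leaves crcy untouched. As a rough sketch of that point, here is a shorter variant that swaps the merge for a plain dictionary lookup and assigns the result back. The new_to_old dict is hypothetical (in practice it would be built from the jobs file), and as_index=False is used so Job# stays an ordinary column.

```python
import pandas as pd

# Inline stand-in for crcy1.xlsx (values copied from the question).
crcy = pd.DataFrame({
    "Job#": [1, 2, 3, 10, 20, 12, 22],
    "Expenses": [5, 10, 15, 60, 57, 9, 11],
    "Invoice": [2, 27, 33, 4, 21, 36, 18],
})

# Hypothetical mapping from new job number to old job number.
new_to_old = {20: 10, 22: 12}

# Relabel rows for new jobs with their old job number.
crcy["Job#"] = crcy["Job#"].replace(new_to_old)

# groupby does not modify crcy in place, so the result is assigned back.
crcy = crcy.groupby("Job#", as_index=False)[["Expenses", "Invoice"]].sum()
print(crcy)
```

This drops the Old Job# / New Job# columns entirely, so it only fits if the report does not need to show which jobs were merged.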