Python: combine totals for old and new jobs
I have a df containing expense and invoice values. Some jobs actually have two job numbers: an old one and a new one. For jobs that have both an old and a new job number, I need to sum the expenses and invoice values so that the result is a single row. Then I need to drop the rows that contain the new job entries.
I have an Excel file that lists the expense and invoice totals for each job. In my code, this is crcy:
Job# Expenses Invoice
1 5 2
2 10 27
3 15 33
10 60 4
20 57 21
12 9 36
22 11 18
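Then I have an Excel file that lists the old and new jobs:
I'm not sure which pandas operations to use here, so I don't know what to try. Any suggestions would be much appreciated.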
import pandas as pd
# Pull in excel data
crcy = pd.read_excel('crcy1.xlsx')
jobs = pd.read_excel('jobs.xlsx')
# Merge on job#
df3 = crcy.merge(jobs, on='Job#', how='outer')
# Drop rows where new job # is pulled in but
# has no Expense/Invoice entries.
df3 = df3.dropna(thresh=3)
print(df3)
Actual result:
Job# Expenses Invoice New Job#
1 5.0 2.0 0.0
2 10.0 27.0 0.0
3 15.0 33.0 0.0
10 60.0 4.0 20.0
20 57.0 21.0 0.0
12 9.0 36.0 22.0
22 11.0 18.0 0.0
Expected result:
Job# Expenses Invoice New Job#
1 5.0 2.0 0.0
2 10.0 27.0 0.0
3 15.0 33.0 0.0
10 117.0 25.0 20.0
12 20.0 54.0 22.0
Try this:
# Rename the columns for easier reference
jobs.columns = ['Old Job#', 'New Job#']
# For each job, find if it has an old Job#
crcy = crcy.merge(jobs, left_on='Job#', right_on='New Job#', how='left')
# The Job# that goes into the report is the Old Job#, if it has that
crcy['Report Job#'] = crcy['Old Job#'].combine_first(crcy['Job#'])
crcy.groupby('Report Job#').agg({
'Expenses': 'sum',
'Invoice': 'sum',
'Old Job#': 'first'
})
Result:
Expenses Invoice Old Job#
Report Job#
1.0 5 2 NaN
2.0 10 27 NaN
3.0 15 33 NaN
10.0 117 25 10.0
12.0 20 54 12.0
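For anyone who wants to run this without the Excel files, here is a minimal self-contained sketch of the same merge, combine_first, and groupby steps. The crcy values are copied from the question. The contents of jobs.xlsx are not shown, so the old/new pairs (10 -> 20, 12 -> 22) are an assumption inferred from the posted output, and the names `merged` and `summary` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question's crcy table.
crcy = pd.DataFrame({
    "Job#": [1, 2, 3, 10, 20, 12, 22],
    "Expenses": [5, 10, 15, 60, 57, 9, 11],
    "Invoice": [2, 27, 33, 4, 21, 36, 18],
})

# ASSUMPTION: these old/new pairs are inferred from the posted output;
# the real jobs.xlsx contents are not shown in the question.
jobs = pd.DataFrame({"Old Job#": [10, 12], "New Job#": [20, 22]})

# Attach the old job number to every row whose Job# is a new job number.
merged = crcy.merge(jobs, left_on="Job#", right_on="New Job#", how="left")

# Report under the old job number when there is one, else under Job# itself.
merged["Report Job#"] = (
    merged["Old Job#"].combine_first(merged["Job#"]).astype(int)
)

# Sum per reported job and keep the result in a new variable.
summary = (
    merged.groupby("Report Job#")[["Expenses", "Invoice"]]
    .sum()
    .reset_index()
)
print(summary)
```

With the assumed pairs, jobs 10 and 20 collapse into one row under job 10, and jobs 12 and 22 into one row under job 12, which matches the totals in the expected result.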
Thanks for the reply! In the groupby statement, I had to put crcy= in front of it to get it to work.
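The assignment is needed because groupby(...).agg(...) returns a new DataFrame and leaves the original untouched. A small sketch of that behaviour follows; the frame below is a hypothetical miniature used only for illustration, not the real data.

```python
import pandas as pd

# Hypothetical miniature of the merged frame, for illustration only.
df = pd.DataFrame({
    "Report Job#": [10, 10, 12],
    "Expenses": [60, 57, 9],
})

# Calling groupby/agg without assignment computes a result and discards it.
df.groupby("Report Job#").agg({"Expenses": "sum"})
rows_before = len(df)  # df still has its original three rows

# Assigning the return value keeps the aggregated frame.
totals = df.groupby("Report Job#").agg({"Expenses": "sum"})
print(totals)
```

In a notebook the unassigned call still displays its result, which is probably why the answer's snippet looks like it works on its own.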