Python 3.x: Creating a daily account log from an expense file in DataFrame format


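I have an expense file that I am trying to read in order to create a daily log from it. A small portion of the file is shown below, covering a few days in January 2015; the full file goes on for several years.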

Date,Checking_Debit,Checking_Addition,Savings_Debit,Savings_Addition
2015-01-07,342.1,0.0,0.0,0.0
2015-01-07,981.0,0.0,0.0,0.0
2015-01-07,3185.0,0.0,0.0,0.0
2015-01-05,55.0,0.0,0.0,0.0
2015-01-05,75.0,0.0,0.0,0.0
2015-01-03,287.0,0.0,0.0,0.0
2015-01-02,64.8,0.0,0.0,0.0
2015-01-02,75.0,0.0,0.0,75.0
2015-01-02,1280.0,0.0,0.0,0.0
2015-01-02,245.0,0.0,0.0,0.0
2015-01-01,45.0,0.0,0.0,0.0
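In my code, I start with the variables check_start and savings_start, which hold the starting values of the checking and savings accounts. I want to give the code a start date and an end date and have it iterate over each day, check whether there were any expenses on that day, subtract the checking and savings debits, and add the checking and savings additions. If there were no expenses on a given day, the accounts should keep the same values as on the previous day. I am also trying to restrict myself to Pandas DataFrames in the implementation. My code so far looks like this: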

import pandas as pd
from datetime import date
check_start = 8500.0
savings_start = 4000.0
start_date = date(2017, 1, 1)
end_date = date(2017, 1, 8)
df = pd.read_csv("file_name.csv", dtype={'Date': str, 'Checking_Debit': float, 
                                       'Checking_Addition': float, 
                                       'Savings_Debit': float, 
                                       'Savings_Addition': float})
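Using Python with the Pandas module, how can I walk from the start date to the end date, one day at a time, check whether there are expenses on each date, and then subtract them from checking and savings? In the end I should have an array holding the value of the checking account for each date, along with the value of the savings account on that day.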

The result should be written to another .csv file in the following format:

Date,Checking,Savings
2017-01-07,1865.1,3925.0
2017-01-06,6373.2,3925.0
2017-01-05,6373.2,3925.0
2017-01-04,6503.2,3925.0
2017-01-03,6503.2,3925.0
2017-01-02,6790.2,3925.0
2017-01-01,8455.0,4000.0

First, read the data you provided and have pandas parse the first column as dates:

import pandas as pd

df = pd.read_csv(r"dat.csv", parse_dates=[0],dtype={'Checking_Debit': float, 
                                                               'Checking_Addition': float, 
                                                               'Savings_Debit': float, 
                                                               'Savings_Addition': float})
Set the date as the index to make the data easier to work with:

df = df.set_index("Date")
Initialize all the variables for the loop:

check_start = 8500.0
savings_start = 4000.0
start_date = pd.to_datetime('2015/1/1')
end_date = pd.to_datetime('2015/1/8')
delta = pd.Timedelta('1 days') # time that needs to be added to start date
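
As an aside, the start date, end date, and one-day step above can also be expressed as a single pd.date_range call. This is only a minimal sketch of that idea, not part of the original answer; the name days is an assumption introduced for illustration.

```python
import pandas as pd

start_date = pd.to_datetime("2015/1/1")
end_date = pd.to_datetime("2015/1/8")

# Every calendar day from start_date to end_date, both endpoints included.
days = pd.date_range(start_date, end_date, freq="D")

for day in days:
    print(day.date())
```

Iterating over days visits each date exactly once, so no manual increment with a Timedelta is needed in that variant.
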
Now group the expense data by date, summing each column per day:

grp_df = df.groupby('Date').sum()
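
To make the effect of the grouping step concrete, here is a small sketch that runs it on the sample rows from the question. The inline string sample_csv and the io.StringIO wrapper are assumptions for illustration only; the answer itself reads the data from a file.

```python
import io
import pandas as pd

# Inline copy of the sample rows from the question (stand-in for the real file).
sample_csv = """Date,Checking_Debit,Checking_Addition,Savings_Debit,Savings_Addition
2015-01-07,342.1,0.0,0.0,0.0
2015-01-07,981.0,0.0,0.0,0.0
2015-01-07,3185.0,0.0,0.0,0.0
2015-01-05,55.0,0.0,0.0,0.0
2015-01-05,75.0,0.0,0.0,0.0
2015-01-03,287.0,0.0,0.0,0.0
2015-01-02,64.8,0.0,0.0,0.0
2015-01-02,75.0,0.0,0.0,75.0
2015-01-02,1280.0,0.0,0.0,0.0
2015-01-02,245.0,0.0,0.0,0.0
2015-01-01,45.0,0.0,0.0,0.0
"""

df = pd.read_csv(io.StringIO(sample_csv), parse_dates=["Date"]).set_index("Date")

# One row per calendar day; each column is summed over that day's transactions.
grp_df = df.groupby("Date").sum()
print(grp_df)
```

The eleven transaction rows collapse into five daily rows. Days with no transactions (January 4 and 6 in the sample) do not appear in grp_df at all, which is why the loop has to handle them separately.
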
Now we run a while loop to build the expense report for each day:

expense_report = []
while start_date <= end_date:
    if start_date in df.index:
        # Transactions exist for this day: apply the day's net change to each balance
        savings_start += (grp_df.loc[start_date, "Savings_Addition"] - grp_df.loc[start_date, "Savings_Debit"])
        check_start += (grp_df.loc[start_date, "Checking_Addition"] - grp_df.loc[start_date, "Checking_Debit"])
        expense_report.append([start_date, check_start, savings_start])
    else:
        # No transactions: carry the previous day's balances forward unchanged
        expense_report.append([start_date, check_start, savings_start])

    start_date += delta
You can save the result to a CSV file as follows:

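df_exp_rpt = pd.DataFrame(expense_report, columns=["Date", "Checking", "Savings"])  # rows collected in the loop
df_exp_rpt.to_csv("filename.csv", index=False)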
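
The same daily balances can also be computed without an explicit loop. The sketch below is one possible alternative, not the method used in the answer above: it reindexes the per-day totals onto a full calendar range so that days without transactions get a net change of 0.0, then takes a running sum. The names sample_csv, days, daily, and report are assumptions for illustration, and the data is an inline copy of the sample rows from the question.

```python
import io
import pandas as pd

# Inline copy of the sample rows from the question (stand-in for the real file).
sample_csv = """Date,Checking_Debit,Checking_Addition,Savings_Debit,Savings_Addition
2015-01-07,342.1,0.0,0.0,0.0
2015-01-07,981.0,0.0,0.0,0.0
2015-01-07,3185.0,0.0,0.0,0.0
2015-01-05,55.0,0.0,0.0,0.0
2015-01-05,75.0,0.0,0.0,0.0
2015-01-03,287.0,0.0,0.0,0.0
2015-01-02,64.8,0.0,0.0,0.0
2015-01-02,75.0,0.0,0.0,75.0
2015-01-02,1280.0,0.0,0.0,0.0
2015-01-02,245.0,0.0,0.0,0.0
2015-01-01,45.0,0.0,0.0,0.0
"""
df = pd.read_csv(io.StringIO(sample_csv), parse_dates=["Date"])

check_start = 8500.0
savings_start = 4000.0
days = pd.date_range("2015-01-01", "2015-01-08", freq="D")

# Daily totals on a complete calendar: missing days are filled with 0.0.
daily = df.groupby("Date").sum().reindex(days, fill_value=0.0)

# Running balance = starting value + cumulative net change.
report = pd.DataFrame(
    {
        "Checking": check_start
        + (daily["Checking_Addition"] - daily["Checking_Debit"]).cumsum(),
        "Savings": savings_start
        + (daily["Savings_Addition"] - daily["Savings_Debit"]).cumsum(),
    },
    index=days,
)
report.index.name = "Date"
print(report.round(2))
```

Calling report.to_csv(...) on the result would write it with a Date column, and report.sort_index(ascending=False) would give the newest-first order shown in the question.
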

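Note: the value in the Savings column is 4075, not 3925.0, because the Savings_Addition column in the original data has a value of 75.

Can you give an example of what your output should look like?

Sure, I just added the desired output.

What have you tried so far? The only code you posted doesn't actually demonstrate an attempt at this goal; it just loads a CSV and sets variables.

Thanks Sahil, this is a good start, but it runs into the same problem I have been having so far. The solution stops at January 4 and January 6 because there were no expenses on those days. However, looking at your solution, it seems that an else statement can be placed after the if start_date in df.index line, which should handle that problem.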