Python: How to update a 'Balance' column based on other column values in a DataFrame

I have the following DataFrame:

         Date    Status   Amount  Balance
0  06-10-2000   Deposit    40.00     40.0
1  09-12-2002  Withdraw  1000.00      NaN
2  27-06-2001   Deposit    47.00      NaN
3  07-12-2021  Withdraw   100.00      NaN
4  06-10-2022   Deposit   120.00      NaN
5  06-10-2000   Deposit    40.00      NaN
6  09-12-2024  Withdraw    50.00      NaN

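The goal is to update the balance depending on whether each row is a deposit or a withdrawal. The initial balance equals the opening amount, so I hard-coded it as 40.0.

Below is my code; somehow I am not getting the expected result.

Expected result: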

         Date    Status   Amount  Balance
0  06-10-2000   Deposit    40.00     40.0
1  09-12-2002  Withdraw  1000.00   -960.0
2  27-06-2001   Deposit    47.00   -913.0
3  07-12-2021  Withdraw   100.00  -1013.0
4  06-10-2022   Deposit   120.00   -893.0
5  06-10-2000   Deposit    40.00   -853.0
6  09-12-2024  Withdraw    50.00   -903.0

What am I doing wrong? My code is below:

import pandas as pd
with open(r"transactions.txt", "r") as Account:
    details = Account.read().split(",")
print("details of txt",details)

df=pd.DataFrame(details)

fg=df[0].str.extract('(?P<Date>.*) (?P<Status>.*) (?P<Amount>.*)')
print(fg)

fg['Amount'] = fg.Amount.str.replace('$','') #removing $ sign
#setting first row value of balance as 40, as equal to amount in 1st row
fg.loc[fg.index[0], 'Balance'] = 40.00 
print(fg)

for index, row in fg.iterrows():
    if index==0:
        continue
    if fg.loc[index,'Status']=='Deposit':
        print("reached here")
        fg.at[float(index),'Balance']=sum(fg.loc[float(index),'Amount'],fg.loc[float(index-1),'Balance'])
    elif fg.loc[index,'Status']=='withdraw':  
        fg.at[float(index),'Balance']=fg.loc[float(index),'Amount']-fg.loc[float(index-1),'Balance']

    print(fg)

IIUC, you can use np.where together with cumsum:

df['Balance'] = np.where(df['Status'].eq('Deposit'),df['Amount'], df['Amount'] * -1)

df['Balance'] = df['Balance'].cumsum()

         Date    Status  Amount  Balance
0  06-10-2000   Deposit    40.0     40.0
1  09-12-2002  Withdraw  1000.0   -960.0
2  27-06-2001   Deposit    47.0   -913.0
3  07-12-2021  Withdraw   100.0  -1013.0
4  06-10-2022   Deposit   120.0   -893.0
5  06-10-2000   Deposit    40.0   -853.0
6  09-12-2024  Withdraw    50.0   -903.0
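
To reproduce the output above end to end, here is a minimal self-contained sketch. It rebuilds the sample data by hand (the inline DataFrame is an assumption made for illustration; the question reads the data from transactions.txt) and then applies the same np.where and cumsum steps.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; building it inline is an assumption
# made so the example runs without transactions.txt.
df = pd.DataFrame({
    "Date": ["06-10-2000", "09-12-2002", "27-06-2001", "07-12-2021",
             "06-10-2022", "06-10-2000", "09-12-2024"],
    "Status": ["Deposit", "Withdraw", "Deposit", "Withdraw",
               "Deposit", "Deposit", "Withdraw"],
    "Amount": [40.0, 1000.0, 47.0, 100.0, 120.0, 40.0, 50.0],
})

# Deposits count as positive, every other status (withdrawals) as negative.
signed = np.where(df["Status"].eq("Deposit"), df["Amount"], -df["Amount"])

# The running total of the signed amounts is the balance after each row.
df["Balance"] = pd.Series(signed, index=df.index).cumsum()

print(df)
```

Because the first row is a deposit of 40.0, the running total starts at 40.0 on its own, so the initial balance does not need to be hard-coded.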

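I could not reproduce the solution. Do you mean I do not need to loop over the DataFrame at all? If not, how and where does your solution fit into my for loop and if section above? Please clarify.

@user10083444 In pandas we avoid loops whenever a vectorized solution can be applied across columns. I used your source data; you can delete everything after
fg['Amount']=fg.Amount.str.replace('$','')
but make sure Amount is numeric (an integer or a float). You can check with
print(df.dtypes)

Understood, @datanovel. Thank you for this elegant solution. I think I will stop using for loops on DataFrames from now on.

@user10083444 It can be confusing, because when you learn Python you learn to use if statements and for loops, but they are a poor fit for the pandas API, which is built for speed and performance. Do you know how np.where and cumsum work?

Yes, now I do. I googled it and read the SciPy documentation. I also found this excellent explanation.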
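
The remark about the dtype of Amount matters because str.extract returns strings, and arithmetic on strings will not give a numeric running balance. Below is a minimal sketch of the cleanup step. It assumes the raw values look like "$40.00"; the actual contents of transactions.txt are not shown in the question, so the sample strings here are hypothetical.

```python
import pandas as pd

# Hypothetical raw values; the real file contents are not shown in the question.
fg = pd.DataFrame({
    "Status": ["Deposit", "Withdraw", "Deposit"],
    "Amount": ["$40.00", "$1000.00", "$47.00"],
})

# regex=False treats "$" as a literal character rather than a regex anchor.
fg["Amount"] = fg["Amount"].str.replace("$", "", regex=False)

# Convert the cleaned strings to numbers so that arithmetic and cumsum
# operate on values rather than text.
fg["Amount"] = pd.to_numeric(fg["Amount"])

print(fg.dtypes)
```

Once Amount is numeric, the np.where and cumsum steps from the answer can be applied to fg directly.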