Python: How to update the 'Balance' column based on other column values in a DataFrame


python python-3.x pandas dataframe


I have the following DataFrame:

         Date    Status   Amount  Balance
0  06-10-2000   Deposit    40.00     40.0
1  09-12-2002  Withdraw  1000.00      NaN
2  27-06-2001   Deposit    47.00      NaN
3  07-12-2021  Withdraw   100.00      NaN
4  06-10-2022   Deposit   120.00      NaN
5  06-10-2000   Deposit    40.00      NaN
6  09-12-2024  Withdraw    50.00      NaN

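The goal is to update the balance depending on whether each row is a deposit or a withdrawal. The initial balance equals the first amount, so I hard-coded it as 40.0.

Below is my code; somehow I am not getting the expected result.

Expected result: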

         Date    Status   Amount  Balance
0  06-10-2000   Deposit    40.00     40.0
1  09-12-2002  Withdraw  1000.00   -960.0
2  27-06-2001   Deposit    47.00   -913.0
3  07-12-2021  Withdraw   100.00  -1013.0
4  06-10-2022   Deposit   120.00   -893.0
5  06-10-2000   Deposit    40.00   -853.0
6  09-12-2024  Withdraw    50.00   -903.0

What am I doing wrong? The code is as follows:

import pandas as pd
with open(r"transactions.txt", "r") as Account:
    details = Account.read().split(",")
print("details of txt",details)

df=pd.DataFrame(details)

fg=df[0].str.extract('(?P<Date>.*) (?P<Status>.*) (?P<Amount>.*)')
print(fg)

fg['Amount'] = fg.Amount.str.replace('$','') #removing $ sign
#setting first row value of balance as 40, as equal to amount in 1st row
fg.loc[fg.index[0], 'Balance'] = 40.00 
print(fg)

for index, row in fg.iterrows():
    if index==0:
        continue
    if fg.loc[index,'Status']=='Deposit':
        print("reached here")
        fg.at[float(index),'Balance']=sum(fg.loc[float(index),'Amount'],fg.loc[float(index-1),'Balance'])
    elif fg.loc[index,'Status']=='withdraw':  
        fg.at[float(index),'Balance']=fg.loc[float(index),'Amount']-fg.loc[float(index-1),'Balance']

    print(fg)

IIUC, you can use np.where and cumsum:

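import numpy as np
# Sign the amounts: deposits stay positive, withdrawals become negative
df['Balance'] = np.where(df['Status'].eq('Deposit'), df['Amount'], df['Amount'] * -1)

# The running total of the signed amounts is the balance
df['Balance'] = df['Balance'].cumsum()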

         Date    Status  Amount  Balance
0  06-10-2000   Deposit    40.0     40.0
1  09-12-2002  Withdraw  1000.0   -960.0
2  27-06-2001   Deposit    47.0   -913.0
3  07-12-2021  Withdraw   100.0  -1013.0
4  06-10-2022   Deposit   120.0   -893.0
5  06-10-2000   Deposit    40.0   -853.0
6  09-12-2024  Withdraw    50.0   -903.0
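
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It assumes the Amount column arrives as text with a "$" prefix (as the question's cleanup step suggests); the inline DataFrame is a stand-in for the parsed transactions.txt, which is not shown in the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the "$" prefix on Amount is an assumption
df = pd.DataFrame({
    "Date": ["06-10-2000", "09-12-2002", "27-06-2001", "07-12-2021",
             "06-10-2022", "06-10-2000", "09-12-2024"],
    "Status": ["Deposit", "Withdraw", "Deposit", "Withdraw",
               "Deposit", "Deposit", "Withdraw"],
    "Amount": ["$40.00", "$1000.00", "$47.00", "$100.00",
               "$120.00", "$40.00", "$50.00"],
})

# Remove the literal "$" (regex=False makes the intent explicit) and convert text to float
df["Amount"] = pd.to_numeric(df["Amount"].str.replace("$", "", regex=False))

# Deposits add to the balance, withdrawals subtract from it
signed = np.where(df["Status"].eq("Deposit"), df["Amount"], -df["Amount"])

# Running total of the signed amounts
df["Balance"] = signed.cumsum()

print(df.dtypes)
print(df)
```

The numeric conversion matters: np.where and cumsum only give a meaningful running total once Amount is a float column rather than strings.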

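I can't reproduce the solution. Do you mean I don't need to loop over the DataFrame at all? If not, how and where does your solution fit into my for loop and if branches above? Please clarify.

@user10083444 In pandas we avoid loops when a vectorized solution can be applied across columns. I used your source data; you can delete everything after fg['Amount'] = fg.Amount.str.replace('$', ''), but make sure Amount is a numeric type: print(df.dtypes)

Understood, @datanovel. Thanks for this elegant solution. I guess I won't be using for loops on DataFrames from now on.

@user10083444 It can be confusing, because when you learn Python you learn to use if statements and for loops, but they are a poor fit for the pandas API, which is built for speed and performance. Do you know how np.where and cumsum work?

Yes, now I do. I googled a bit and read the scipy documentation. I also found this excellent explanation.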
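
For comparison, the loop from the question can also be made to work, although the vectorized version is the better fit for pandas. Reading the original loop, several things appear to go wrong: the built-in sum() expects an iterable as its first argument, so sum(amount, balance) raises a TypeError; the check against 'withdraw' never matches the capitalized 'Withdraw' in the data; Amount is still a string at that point; and the withdrawal branch subtracts the balance from the amount instead of the other way round. A rough sketch of a corrected loop, assuming a default integer index and an Amount column that is already numeric (the small DataFrame is a hypothetical stand-in):

```python
import pandas as pd

# Hypothetical stand-in for the parsed transactions; Amount is already numeric here
fg = pd.DataFrame({
    "Status": ["Deposit", "Withdraw", "Deposit", "Withdraw"],
    "Amount": [40.0, 1000.0, 47.0, 100.0],
})

# Seed the first balance with the first amount, as in the question
fg["Balance"] = float("nan")
fg.loc[0, "Balance"] = fg.loc[0, "Amount"]

# Walk the remaining rows, carrying the previous balance forward
for i in range(1, len(fg)):
    previous = fg.loc[i - 1, "Balance"]
    amount = fg.loc[i, "Amount"]
    if fg.loc[i, "Status"] == "Deposit":
        fg.loc[i, "Balance"] = previous + amount
    elif fg.loc[i, "Status"] == "Withdraw":
        fg.loc[i, "Balance"] = previous - amount

print(fg)
```

This gives the same balances as np.where plus cumsum on these rows, but it touches the DataFrame once per row, which is why the vectorized form is usually preferred.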