Python: How to derive a column based on a filtered DataFrame with conditions (python, pandas, numpy, dataframe)

The code below produces two data frames, dftopdata and dfbottomdata:

    import easygui as gui
    import pandas as pd

    # Select the file containing the FC and WF data (it is read as CSV below)
    filename = gui.fileopenbox(msg='Please choose the Excel workbook containing the bank data.')

    # Rows have a variable number of fields, so name the maximum of 12 columns up front
    colnames = [str(n) for n in range(1, 13)]
    dfdata = pd.read_csv(filename, names=colnames)

    # Rows with no value in column 12 form the top block; all other rows form the bottom block
    key = dfdata['12'].isnull()
    dftopdata = dfdata.loc[key]
    dfbottomdata = dfdata.loc[~key]

    # Drop columns, then rows, whose values are all NaN
    # (recent pandas versions reject how and thresh together, so only how is passed)
    dftopdata = dftopdata.dropna(axis=1, how='all').dropna(axis=0, how='all')

    # Top block: the row at position 1 holds the headers, data starts at position 2
    header = dftopdata.iloc[1]
    dftopdata = dftopdata[2:].rename(columns=header)

    # Bottom block: the row at position 0 holds the headers, data starts at position 1
    header = dfbottomdata.iloc[0]
    dfbottomdata = dfbottomdata[1:].rename(columns=header)

Here is a sample of the data in the data frame called top data:

    Routing        Currency  Account Number  Account Name  Opening Ledger  Credits Amt  Credits Num  Debits Amt  Debits Num  Closing Ledger
    123456789      USD       1111111112      A             717.57          100.00       1            100.72      3           716.85
    123456789      USD       1111111113      B             1,350.30        NaN          0            28.53       1           1,321.77
    123456789      USD       1111111114      C             26,570.34       320.52       1            42.17       1           26,848.69
    123456789      USD       1111111115      D             1,031.95        2,000.00     1            703.95      2           2,328.00
    123456789      USD       1111111116      E             1,000.00        600.00       2            72.03       2           1,527.97
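
Because the file is read with placeholder column names, the real header rows arrive as data, so the amount columns are most likely read as text, and values such as 1,350.30 carry thousands separators. Before any arithmetic they would need converting. A minimal sketch of one way to do that; the helper name `to_number` and the inline `sample` frame are illustrative assumptions, not part of the original code:

```python
import pandas as pd


def to_number(column):
    """Strip thousands separators and convert text to numbers (illustrative helper)."""
    cleaned = column.astype(str).str.replace(",", "", regex=False)
    # errors="coerce" turns anything that cannot be parsed into NaN
    return pd.to_numeric(cleaned, errors="coerce")


# Hypothetical text values copied from the top data sample
sample = pd.DataFrame({
    "Opening Ledger": ["717.57", "1,350.30", "26,570.34"],
    "Credits Amt": ["100.00", None, "320.52"],
})
for name in ["Opening Ledger", "Credits Amt"]:
    sample[name] = to_number(sample[name])
```

The same helper could be applied to `CR Amount` and `DB Amount` in the bottom data, assuming those columns are also read as text.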
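I would like to add a new column called Balance to the bottom data df, containing the running balance for each bank account.

The balance on the earliest transaction date for a given bank account in the bottom data df should equal that account's Opening Ledger value from the first data frame (top data), plus any credit or minus any debit on that row of the bottom data df.

The balance for each subsequent transaction on a given bank account should equal the balance from the previous transaction, plus any credit or minus any debit on that row of the bottom data df.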
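
One way to read this rule is as a recurrence. This is a sketch that assumes credits are added and debits subtracted; the symbols are introduced here for illustration only: $O_a$ is the Opening Ledger of account $a$, $C_{a,t}$ and $D_{a,t}$ are the credit and debit amounts on that account's $t$-th transaction (0 when missing), and $B_{a,t}$ is the resulting balance.

```latex
B_{a,1} = O_a + C_{a,1} - D_{a,1}, \qquad
B_{a,t} = B_{a,t-1} + C_{a,t} - D_{a,t} \quad (t \ge 2)
```

For account 1111111112 in the sample data, this gives 717.57 - 28.69 = 688.88 for the first transaction and 688.88 + 100 = 788.88 for the second.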

Here is a sample of the data in the data frame called bottom data:

    Date        Routing        Currency  Account Number  Account Name  BAI Type            BAI Code  CR Amount  DB Amount  Serial Num  Ref Num   Description
    12/10/2019  123456789      USD       1111111112      A             Miscellaneous Fees  7         NaN        28.69      NaN         69650977  MTHLY ANALYSIS CHARGE
    12/20/2019  123456789      USD       1111111112      A             Misc Credit         1         100        NaN        NaN         70069250  XFR TO DDA FR DDA 001111085716122019RF#1452300...
    12/24/2019  123456789      USD       1111111112      A             Misc Debit          4         NaN        69.08      NaN         70184768  ACCESSIBLEINSURA WEBPAYMENTPCOF PROPERTIES SERIES
    12/24/2019  123456789      USD       1111111112      A             Misc Debit          5         NaN        2.95       NaN         70184769  SEP INSURANC ACH WEBPAYMENTPCOF PROPERTIES SERIES
    12/10/2019  123456789      USD       1111111113      B             Miscellaneous Fees  6         NaN        28.53      NaN         69645166  MTHLY ANALYSIS CHARGE
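But I don't know what to do next.

I thought about creating a separate data frame for each bank account, but that seems inefficient.

Can anyone point me in the right direction?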

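Assuming dfbottomdata is sorted in ascending order so that each account's rows sit together in date order (that is, by Routing, Account Number, then Date), the following code should work:

    # Bring in the opening ledger value from dftopdata and use it as the starting Balance
    dfbottomdata = dfbottomdata.merge(
        dftopdata[['Routing', 'Account Number', 'Opening Ledger']],
        on=['Routing', 'Account Number'])
    dfbottomdata = dfbottomdata.rename(columns={'Opening Ledger': 'Balance'})

    # Replace NaN with 0 for the calculation (the amount columns must be numeric)
    dfbottomdata['CR Amount'] = dfbottomdata['CR Amount'].fillna(0)
    dfbottomdata['DB Amount'] = dfbottomdata['DB Amount'].fillna(0)

    # Handle the first row
    dfbottomdata.loc[0, 'Balance'] = dfbottomdata.loc[0, 'Balance'] + dfbottomdata.loc[0, 'CR Amount'] - dfbottomdata.loc[0, 'DB Amount']

    # Walk the remaining rows; carry the previous balance forward only when the
    # previous row has the same Routing and Account Number
    for i in range(1, len(dfbottomdata)):
        if (dfbottomdata.loc[i-1, 'Routing'] == dfbottomdata.loc[i, 'Routing']
                and dfbottomdata.loc[i-1, 'Account Number'] == dfbottomdata.loc[i, 'Account Number']):
            dfbottomdata.loc[i, 'Balance'] = dfbottomdata.loc[i-1, 'Balance'] + dfbottomdata.loc[i, 'CR Amount'] - dfbottomdata.loc[i, 'DB Amount']
        else:
            dfbottomdata.loc[i, 'Balance'] = dfbottomdata.loc[i, 'Balance'] + dfbottomdata.loc[i, 'CR Amount'] - dfbottomdata.loc[i, 'DB Amount']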
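
The same result could probably also be reached without an explicit loop. The sketch below is not the answer's method: it swaps the row-by-row iteration for `groupby(...).cumsum()`. The helper name `add_running_balance` and the two miniature frames are hypothetical, built from the sample rows in the question, and the code assumes the amount columns are already numeric and that each account's rows are already in date order.

```python
import pandas as pd

NAN = float("nan")

# Hypothetical miniature frames built from the sample rows in the question
dftopdata = pd.DataFrame({
    "Routing": ["123456789", "123456789"],
    "Account Number": ["1111111112", "1111111113"],
    "Opening Ledger": [717.57, 1350.30],
})
dfbottomdata = pd.DataFrame({
    "Date": ["12/10/2019", "12/20/2019", "12/24/2019", "12/24/2019", "12/10/2019"],
    "Routing": ["123456789"] * 5,
    "Account Number": ["1111111112"] * 4 + ["1111111113"],
    "CR Amount": [NAN, 100.0, NAN, NAN, NAN],
    "DB Amount": [28.69, NAN, 69.08, 2.95, 28.53],
})


def add_running_balance(bottom, top):
    """Return a copy of bottom with a Balance column (illustrative helper)."""
    keys = ["Routing", "Account Number"]
    # A left merge keeps every transaction row in its original order
    out = bottom.merge(top[keys + ["Opening Ledger"]], on=keys, how="left")
    # Net movement per row: credits add, debits subtract, missing amounts count as 0
    out["Net"] = out["CR Amount"].fillna(0) - out["DB Amount"].fillna(0)
    # Opening ledger plus the running total of net movements within each account
    out["Balance"] = (out["Opening Ledger"] + out.groupby(keys)["Net"].cumsum()).round(2)
    return out.drop(columns=["Opening Ledger", "Net"])


result = add_running_balance(dfbottomdata, dftopdata)
print(result[["Account Number", "CR Amount", "DB Amount", "Balance"]])
```

On these sample rows the Balance column comes out as 688.88, 788.88, 719.80, 716.85 and 1321.77, matching the expected output in the question. Because the running total is computed per group, a balance never carries over from one account to the next.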
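Please edit your sample data into the question as text rather than as an image, so that we can reproduce it. Can you provide the expected output data frame?

What exactly have you tried in order to solve this? You can't expect others to do everything for you, right?

I have tried to address all of your comments. Please let me know if I can clarify anything further!

Thank you so much for your answer! I am working through it now to see whether it does what I need. If it does, I will mark it as answered; if it doesn't do what I hoped, I will provide clarification.

It worked! The only thing I had to change in your answer was from closing to opening. Thanks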
Expected output from the question, showing the bottom data df with the new Balance column:

    Date        Routing        Currency  Account Number  Account Name  BAI Type            BAI Code  CR Amount  DB Amount  Serial Num  Ref Num   Description                                        Balance
    12/10/2019  123456789      USD       1111111112      A             Miscellaneous Fees  7         NaN        28.69      NaN         69650977  MTHLY ANALYSIS CHARGE                              688.88
    12/20/2019  123456789      USD       1111111112      A             Misc Credit         1         100        NaN        NaN         70069250  XFR TO DDA FR DDA 001111085716122019RF#1452300...  788.88
    12/24/2019  123456789      USD       1111111112      A             Misc Debit          4         NaN        69.08      NaN         70184768  ACCESSIBLEINSURA WEBPAYMENTPCOF PROPERTIES SERIES  719.80
    12/24/2019  123456789      USD       1111111112      A             Misc Debit          5         NaN        2.95       NaN         70184769  SEP INSURANC ACH WEBPAYMENTPCOF PROPERTIES SERIES  716.85
    12/10/2019  123456789      USD       1111111113      B             Miscellaneous Fees  6         NaN        28.53      NaN         69645166  MTHLY ANALYSIS CHARGE                              1321.77