Python and Pandas - Determining whether a bill is overdue


I have a dataframe containing time series and categorical data:

  ╔═════════════════════════════════════════════╗
  ║ Name       BillDate             Bill Status ║
  ╠═════════════════════════════════════════════╣
  ║ Company A  2015-07-22 15:51:00  Paid        ║
  ║ Company B  2015-01-31 12:01:00  Unpaid      ║
  ║ Company C  2016-01-12 00:00:00  Unpaid      ║
  ╚═════════════════════════════════════════════╝
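I am trying to add another column that tells me whether a bill is overdue, based on two factors. The first is whether the current date is BillDate + 180 days or later; the second is whether Bill Status is "Unpaid".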

I am not sure of the best way to do this. My idea was to do the following:

   from datetime import datetime, timedelta

   billpayperiod = timedelta(days = 180)
   currentdate = datetime.now()
   df['Bill Due Date'] = df['BillDate'].apply(lambda x: x + billpayperiod)
Then create some function to check:

 currentdate > Bill Due Date and Bill Status = Unpaid
 If True = Overdue
 If False = No Due
 If Bill Status = Paid, then Paid
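I would appreciate your thoughts on the following: 1. Does this approach make sense? 2. Help creating the function that performs the check.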

Since I am much more proficient in Excel, this is how I would do it there:

  Create the Bill Date + 180 column (name it DueDate)
  Set a cell = currentdate
  Create a new column: formula    IF(BillStatus="Paid","Paid",IF(AND(BillStatus="Unpaid",currentdate>DueDate),"Overdue","Not Overdue"))    
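
For comparison, the Excel formula above maps fairly directly onto pandas. The following is only a minimal sketch, assuming the three sample rows from the table and a fixed current date so the result is reproducible; the column name Overdue Status and the variable names are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime(
        ["2015-07-22 15:51:00", "2015-01-31 12:01:00", "2016-01-12 00:00:00"]
    ),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})

# Assumption: a fixed "current date"; in real use this would be
# pd.Timestamp.now().
current_date = pd.Timestamp("2016-01-20")

# Vectorised equivalent of the "Bill Date + 180" column.
df["DueDate"] = df["BillDate"] + pd.Timedelta(days=180)

# The two conditions from the nested IF, as boolean masks.
paid = df["Bill Status"] == "Paid"
overdue = (df["Bill Status"] == "Unpaid") & (current_date > df["DueDate"])

# np.select takes the first matching condition per row, else the default.
df["Overdue Status"] = np.select(
    [paid, overdue], ["Paid", "Overdue"], default="Not Overdue"
)
print(df[["Name", "DueDate", "Overdue Status"]])
```

The order of the conditions mirrors the nested IF: "Paid" is checked first, so a paid bill is never reported as overdue.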

IIUC this will do what you want:

In [21]:
df[(((df['BillDate'] - dt.datetime.now()).dt.days).abs() > 180) & (df['Bill Status'] == 'Unpaid')]

Out[21]:
        Name            BillDate Bill Status
1  Company B 2015-01-31 12:01:00      Unpaid
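Subtracting the current time from the column gives a Timedelta series; we can take its .dt.days and compare the absolute value: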

In [25]:
(df['BillDate'] - dt.datetime.now()).dt.days

Out[25]:
0   -182
1   -354
2     -8
Name: BillDate, dtype: int64

In [24]:
((df['BillDate'] - dt.datetime.now()).dt.days).abs()

Out[24]:
0    182
1    354
2      8
Name: BillDate, dtype: int64
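
One caveat with the absolute value: a bill dated more than 180 days in the future would also pass the test. A possible variation, sketched here with hypothetical rows and an assumed fixed current date, is to subtract in the other direction and compare the signed age instead:

```python
import pandas as pd

# Hypothetical rows: one old unpaid bill and one dated far in the future.
bills = pd.DataFrame({
    "BillDate": pd.to_datetime(["2015-01-31 12:01:00", "2017-01-01 00:00:00"]),
    "Bill Status": ["Unpaid", "Unpaid"],
})

# Assumption: a fixed "current date" so the example is reproducible.
now = pd.Timestamp("2016-01-20")

# Age of each bill in whole days; negative for dates in the future.
age_days = (now - bills["BillDate"]).dt.days

abs_mask = age_days.abs() > 180   # also flags the future-dated bill
signed_mask = age_days >= 180     # only bills at least 180 days old

print(pd.DataFrame({"age_days": age_days, "abs": abs_mask, "signed": signed_mask}))
```

If the data never contains future bill dates, both masks select the same rows, apart from the boundary case of exactly 180 days.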
EDIT

To set the new status, you can define a couple of masks and use np.where:

In [29]:
import pandas as pd
import numpy as np
import datetime as dt
overdue = (((df['BillDate'] - dt.datetime.now()).dt.days).abs() > 180) & (df['Bill Status'] == 'Unpaid')
paid = (df['Bill Status'] == 'Paid')
df['new status'] = np.where(paid, 'paid', np.where(overdue, 'overdue', 'no due'))
df

Out[29]:
        Name            BillDate Bill Status new status
0  Company A 2015-07-22 15:51:00        Paid       paid
1  Company B 2015-01-31 12:01:00      Unpaid    overdue
2  Company C 2016-01-12 00:00:00      Unpaid     no due

You can easily add a column in pandas with the following:

#create columns 'newStatus' and set default to No due
df['newStatus'] = 'No Due'
Then you can use .loc with the boolean indexing from the answer above to set it to specific values:

df.loc[indices,column] = value
For example:

#create indices for unpaid bills, and for bills that are due
iUnpaid = df['Bill Status']=='Unpaid'
iDue = (((df['BillDate'] - dt.datetime.now()).dt.days).abs() > 180)

#update corresponding values
df.loc[iUnpaid & iDue,'newStatus'] = 'Due'
df.loc[iUnpaid & ~iDue,'newStatus'] = 'No Due'
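
The .loc steps above could also be wrapped in a small function, which makes the check easier to test because the current date can be passed in. This is only a sketch: the helper name add_new_status and its parameters are hypothetical, and it assumes Paid bills should be labelled "Paid", as clarified in the comments.

```python
import pandas as pd

def add_new_status(df, now=None, max_age_days=180):
    """Return a copy of df with a 'newStatus' column (hypothetical helper)."""
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    out = df.copy()

    # Boolean masks, as in the answer above.
    is_unpaid = out["Bill Status"] == "Unpaid"
    is_paid = out["Bill Status"] == "Paid"
    is_old = (now - out["BillDate"]).dt.days >= max_age_days

    out["newStatus"] = "No Due"                     # default value
    out.loc[is_paid, "newStatus"] = "Paid"
    out.loc[is_unpaid & is_old, "newStatus"] = "Due"
    return out

# Usage with the sample rows from the question and an assumed fixed date.
sample = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime(
        ["2015-07-22 15:51:00", "2015-01-31 12:01:00", "2016-01-12 00:00:00"]
    ),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})
result = add_new_status(sample, now="2016-01-20")
print(result)
```

Working on a copy leaves the caller's frame unchanged; dropping the copy() call would update it in place instead.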

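Comments:

Show us the code that creates the dataframe.

I am actually pulling the data from Excel. Any suggestions or primers on how to create a dataframe like the one above?

Your second requirement doesn't make sense: you have a condition that sets the status to Overdue, otherwise No Due, but then you use the same condition again to say it is Paid???

Thanks for pointing that out. If Bill Status = Paid, then it should be Paid.

OK, updating my answer.

What I am trying to do is something like df[((df['BillDate'] - dt.datetime.now()).dt.days.abs() > 180) & (df['Bill Status'] == 'Unpaid')] that creates a string of "Paid", "Overdue", or "No Due" based on the factors shown. Thoughts?

Thanks for the edit. Apologies for my oversight, but I get an error when I try to run the code: name 'dt' is not defined. I imported pandas as pd, and datetime and timedelta from datetime. Is there anything else I need to import to make the code work?

Try
import datetime as dt

Thank you very much for your help!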
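
For the two questions raised in the comments (how to build such a frame, and the missing dt name), a minimal sketch might look like the following. The rows are copied from the question; the Excel file name in the comment is hypothetical.

```python
import datetime as dt  # this import is what makes dt.datetime.now() resolve
import pandas as pd

# Build the sample frame by hand; BillDate is converted to datetime64
# so that date arithmetic and the .dt accessor work.
df = pd.DataFrame({
    "Name": ["Company A", "Company B", "Company C"],
    "BillDate": pd.to_datetime(
        ["2015-07-22 15:51:00", "2015-01-31 12:01:00", "2016-01-12 00:00:00"]
    ),
    "Bill Status": ["Paid", "Unpaid", "Unpaid"],
})

# When loading from Excel instead, something along these lines should give
# the same dtypes (file name is a placeholder):
#   df = pd.read_excel("bills.xlsx", parse_dates=["BillDate"])

print(df.dtypes)

# Age of each bill in whole days, using the dt alias from the answer.
age_days = (dt.datetime.now() - df["BillDate"]).dt.days
print(age_days)
```

With from datetime import datetime, timedelta, as in the question, the equivalent call would be datetime.now() rather than dt.datetime.now().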