Python: Filtering a DataFrame by the difference between two date columns


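What is the most efficient way to filter a DataFrame based on the difference between two date columns?

For example, given the following DataFrame: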

   CADASTRO    RESPOSTA      EVAL 
0  2021-06-01  2021-06-13    y
1  2021-06-01  2021-06-13    y
2  2021-06-01  2021-06-18    y
3  2021-06-01  2021-06-09    n
4  2021-06-01  2021-06-20    n
5  2021-06-01  2021-06-20    n
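How can I filter it so that it contains only the records where the difference between the RESPOSTA and CADASTRO columns is less than 15 days? I tried the following, without success: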

import datetime
filtered_df = df[(df.RESPOSTA - df.CADASTRO).days < 15]

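Access the number of days through the datetime accessor, dt. Subtracting one datetime column from another gives a Series of timedeltas, and a Series has no .days attribute of its own; the per-row values are reached through .dt.days.

import pandas as pd

# Ensure both columns are datetime so they can be subtracted
df['CADASTRO'] = pd.to_datetime(df['CADASTRO'])
df['RESPOSTA'] = pd.to_datetime(df['RESPOSTA'])
# Access the day count through dt.days and keep rows under 15 days
filtered_df = df[(df.RESPOSTA - df.CADASTRO).dt.days < 15]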
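
To make the difference between the failing attempt and the dt accessor concrete, here is a minimal, self-contained sketch built on the sample data from the question. The names diff, day_counts and series_has_days are introduced here purely for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "CADASTRO": ["2021-06-01"] * 6,
    "RESPOSTA": ["2021-06-13", "2021-06-13", "2021-06-18",
                 "2021-06-09", "2021-06-20", "2021-06-20"],
    "EVAL": ["y", "y", "y", "n", "n", "n"],
})
df["CADASTRO"] = pd.to_datetime(df["CADASTRO"])
df["RESPOSTA"] = pd.to_datetime(df["RESPOSTA"])

# Subtracting two datetime columns gives a Series of timedeltas
diff = df["RESPOSTA"] - df["CADASTRO"]

# The attempt from the question: the Series itself has no .days attribute
try:
    diff.days
    series_has_days = True
except AttributeError:
    series_has_days = False

# The .dt accessor exposes the whole-day count of each row as an integer
day_counts = diff.dt.days
filtered_df = df[day_counts < 15]
print(filtered_df)
```

With this data the differences are 12, 12, 17, 8, 19 and 19 days, so only rows 0, 1 and 3 remain.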


Use timedelta to compare the difference in days:

from datetime import timedelta

import pandas as pd

df = pd.DataFrame({
    'CADASTRO': ['2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01', '2021-06-01'],
    'RESPOSTA': ['2021-06-13', '2021-06-13', '2021-06-18', '2021-06-09', '2021-06-20', '2021-06-20'],
    'EVAL': ['y', 'y', 'y', 'n', 'n', 'n']
})

# Convert both columns to datetime so they can be subtracted
df['CADASTRO'] = pd.to_datetime(df['CADASTRO'])
df['RESPOSTA'] = pd.to_datetime(df['RESPOSTA'])
# Helper column holding the timedelta between the two dates
df['temp'] = df['RESPOSTA'] - df['CADASTRO']
# Flag each row: 0 if the difference is under 15 days, 1 otherwise
df['temp'] = df['temp'].apply(lambda x: 0 if x < timedelta(days=15) else 1)
# Drop the rows flagged 1 (15 days or more), then drop the helper column
filtered_df = df.drop(df[df['temp'] == 1].index).drop(columns=['temp'])
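
Since the question asks about efficiency, it may be worth noting that the helper column and the row-by-row apply are not strictly needed: a timedelta Series can be compared against a single pd.Timedelta in one vectorized step. The following is a sketch of that variant, not part of the original answer; the name mask is introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "CADASTRO": ["2021-06-01"] * 6,
    "RESPOSTA": ["2021-06-13", "2021-06-13", "2021-06-18",
                 "2021-06-09", "2021-06-20", "2021-06-20"],
    "EVAL": ["y", "y", "y", "n", "n", "n"],
})
df["CADASTRO"] = pd.to_datetime(df["CADASTRO"])
df["RESPOSTA"] = pd.to_datetime(df["RESPOSTA"])

# Boolean mask: True where the gap between the dates is under 15 days
mask = (df["RESPOSTA"] - df["CADASTRO"]) < pd.Timedelta(days=15)

# Boolean indexing keeps the matching rows; no helper column is created
filtered_df = df.loc[mask]
print(filtered_df)
```

Because no temporary column is added, the result keeps exactly the original three columns.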




filtered_df=df.loc[(df.RESPOSTA-df.CADASTRO).days<15,:] ?
Thanks, but that does not work. See Henry's answer below; it is exactly what I was looking for.