Python: how to get only 7 days of data starting from a given date?
I have a dataset like the one below. I dropped most of the columns and kept only the ones needed for this operation:
shortId Created_date pid1 pid2 Game_Play_Date
abc 01-05-19 abc def 01-05-19
abc 01-05-19 abc pqr 01-05-19
abc 01-05-19 xyz abc 02-05-19
abc 01-05-19 qwe abc 03-05-19
abc 01-05-19 pqr abc 04-05-19
xyz 02-05-19 def xyz 02-05-19
xyz 02-05-19 pqr xyz 07-05-19
xyz 02-05-19 xyz pqr 07-05-19
xyz 02-05-19 xyz abc 15-05-19
xyz 02-05-19 xyz def 21-05-19
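I need the data that falls within 7 days of the creation date. So if an ID was created on 2019-05-01, I need its rows up to 2019-05-07, where the date of each row is given in the Game_Play_Date column, and likewise for every other ID.
I tried splitting the data into 30-day chunks, but that was confusing and not ideal.
The ideal result looks like this (each shortId keeps only 7 days of data, based on Game_Play_Date):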
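First convert both columns to datetimes, then subtract Created_date from Game_Play_Date and compare the difference in days against 7 (or compare the difference against a Timedelta), and filter the rows with boolean indexing. The expected result and the working code are listed below.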
Created_date shortId pid1 pid2 Game_Play_Date
01-05-19 abc abc def 01-05-19
01-05-19 abc abc pqr 01-05-19
01-05-19 abc xyz abc 02-05-19
01-05-19 abc qwe abc 03-05-19
01-05-19 abc pqr abc 04-05-19
02-05-19 xyz def xyz 02-05-19
02-05-19 xyz pqr xyz 07-05-19
02-05-19 xyz xyz pqr 07-05-19
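import pandas as pd

# Parse both date columns as day-first dates (DD-MM-YY)
cols = ['Created_date', 'Game_Play_Date']
df[cols] = df[cols].apply(pd.to_datetime, dayfirst=True)
# Keep only rows played at most 7 days after the creation date
df = df[df['Game_Play_Date'].sub(df['Created_date']).dt.days <= 7]
# Note: <= 7 also keeps a play date exactly 7 days after creation; use <= 6 if the creation day counts as day 1
# Alternative: compare the difference against a Timedelta
# df = df[df['Game_Play_Date'].sub(df['Created_date']) <= pd.Timedelta('7 days')]
print(df)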
shortId Created_date pid1 pid2 Game_Play_Date
0 abc 2019-05-01 abc def 2019-05-01
1 abc 2019-05-01 abc pqr 2019-05-01
2 abc 2019-05-01 xyz abc 2019-05-02
3 abc 2019-05-01 qwe abc 2019-05-03
4 abc 2019-05-01 pqr abc 2019-05-04
5 xyz 2019-05-02 def xyz 2019-05-02
6 xyz 2019-05-02 pqr xyz 2019-05-07
7 xyz 2019-05-02 xyz pqr 2019-05-07
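For anyone who wants to reproduce this without the original file, here is a minimal self-contained sketch of the same idea. The sample rows are a subset copied from the question; `within_days` is a hypothetical helper name, and the lower bound (excluding play dates earlier than the creation date) is an assumption that the answer above does not make. The explicit `format` string is also a choice made here to avoid day/month ambiguity.

```python
import pandas as pd

# Sample rows copied from the question's table (subset of rows and columns).
df = pd.DataFrame({
    "shortId": ["abc", "abc", "xyz", "xyz", "xyz"],
    "Created_date": ["01-05-19", "01-05-19", "02-05-19", "02-05-19", "02-05-19"],
    "Game_Play_Date": ["01-05-19", "04-05-19", "07-05-19", "15-05-19", "21-05-19"],
})

# An explicit format avoids any day/month ambiguity in DD-MM-YY strings.
for col in ["Created_date", "Game_Play_Date"]:
    df[col] = pd.to_datetime(df[col], format="%d-%m-%y")


def within_days(frame, days=7):
    """Hypothetical helper: keep rows played 0..days days after creation."""
    elapsed = frame["Game_Play_Date"] - frame["Created_date"]
    # Assumption: rows with a play date before the creation date are dropped.
    mask = (elapsed >= pd.Timedelta(0)) & (elapsed <= pd.Timedelta(days=days))
    return frame[mask]


first_week = within_days(df)
print(first_week)
```

Because the comparison is done row by row against each row's own Created_date, no groupby on shortId is needed. Passing `days=6` would give the stricter window in which the creation day counts as day 1.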