Python: finding missing timestamps in a DataFrame
I have the following dataset in a DataFrame:
Time_stamp x y
'2012-01-01 00:00:00' 8.97 1310.03
'2012-01-01 00:10:00' 9.91 1684.52
'2012-01-01 00:40:00' 9.64 1532.05
'2012-01-01 00:50:00' 11.84 1997.87
'2012-01-01 00:60:00' 11.69 2135.76
'2012-01-01 01:00:00' 12.14 2149.54
'2012-01-01 01:10:00' 13.43 2056.35
'2012-01-01 01:20:00' 9.88 1633.45
'2012-01-01 01:30:00' 9.01 1315.85
'2012-01-01 01:50:00' 8.33 1141.84
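As you can see, the data is recorded every 10 minutes. However, some timestamps and their corresponding values are missing, for example '2012-01-01 00:20:00' and '2012-01-01 00:30:00'. I want to find such missing timestamps and fill their corresponding values with nan, like this: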
timestamp x y
'2012-01-01 00:20:00' nan nan
'2012-01-01 00:30:00' nan nan
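Any ideas on how to do this efficiently, without too many lines of code?

First convert the values to datetimes with to_datetime. The minute value 60 in 2012-01-01 00:60:00 is invalid, so with errors='coerce' it is replaced by NaT. Remove the bad NaT rows with dropna, then create a DatetimeIndex with set_index and add the missing datetimes with asfreq: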
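import pandas as pd
# Strip the literal quotes, then parse; unparseable strings such as 00:60:00 become NaT
df['Time_stamp'] = pd.to_datetime(df['Time_stamp'].str.strip("'"), errors='coerce')
# Drop the NaT rows, index by timestamp, and conform to a regular 10-minute frequency
df = df.dropna(subset=['Time_stamp']).set_index('Time_stamp').asfreq('10min')
print(df)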
x y
Time_stamp
2012-01-01 00:00:00 8.97 1310.03
2012-01-01 00:10:00 9.91 1684.52
2012-01-01 00:20:00 NaN NaN
2012-01-01 00:30:00 NaN NaN
2012-01-01 00:40:00 9.64 1532.05
2012-01-01 00:50:00 11.84 1997.87
2012-01-01 01:00:00 12.14 2149.54
2012-01-01 01:10:00 13.43 2056.35
2012-01-01 01:20:00 9.88 1633.45
2012-01-01 01:30:00 9.01 1315.85
2012-01-01 01:40:00 NaN NaN
2012-01-01 01:50:00 8.33 1141.84
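
If you only need a list of the missing timestamps rather than the filled-in frame, one possible approach is to compare the timestamps you have against a complete 10-minute range. The sketch below is not part of the answer above. It rebuilds a small sample from the question's first rows, and the names raw, parsed, present, expected and missing are made up for illustration. It assumes the first and last valid timestamps mark the ends of the expected range.

```python
import pandas as pd

# Illustrative sample taken from the first rows of the question (quotes kept on purpose).
raw = [
    "'2012-01-01 00:00:00'",
    "'2012-01-01 00:10:00'",
    "'2012-01-01 00:40:00'",
    "'2012-01-01 00:50:00'",
    "'2012-01-01 00:60:00'",  # invalid minute value, becomes NaT below
    "'2012-01-01 01:00:00'",
]
df = pd.DataFrame({"Time_stamp": raw, "x": [8.97, 9.91, 9.64, 11.84, 11.69, 12.14]})

# Parse the strings; anything unparseable is coerced to NaT and then dropped.
parsed = pd.to_datetime(df["Time_stamp"].str.strip("'"), errors="coerce")
present = pd.DatetimeIndex(parsed.dropna())

# Assumption: the expected grid runs from the first to the last valid timestamp.
expected = pd.date_range(present.min(), present.max(), freq="10min")

# Timestamps that are on the expected grid but absent from the data.
missing = expected.difference(present)
print(missing)
```

With the frame produced by asfreq, a similar result should be available by selecting the index entries whose rows are entirely NaN. That variant would also flag rows that were present but empty in the original data.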