Python: How to read all the elements in a list and retrieve the corresponding values from a dataframe
I have a dataframe df that looks like this:
loc end_time ts file
TPHD 2019-06-03 16:45:30 43619.4375 trial.csv
TPCL 2019-06-03 16:30:00 43619.5520 trial.csv
TPHD 2019-06-03 16:15:30 43619.6774 trial.csv
TPBL 2019-06-03 16:15:30 43619.4479 trial.csv
TPBL 2019-06-03 14:43:45 43619.6982 mgrflash.csv
TPCL 2019-06-03 13:15:00 43619.4375 mgrflash.csv
TPCL 2019-06-03 11:15:30 43619.6875 mgrflash.csv
TPCL 2019-06-03 10:45:00 43619.6137 trial.csv
TPBL 2019-06-03 10:30:00 43619.6774 mgrflash.csv
TPHD 2019-06-03 10:30:00 43619.4690 mgrflash.csv
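Goal: for each location and file, I want to find the time difference between end_time and a specified time (such as datetime.datetime.now()). To do this, I tried the following: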
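# Assumed import; the original post only shows dt.now().
from datetime import datetime as dt

# Keep the first row for each (loc, file) pair.
df_test = df.drop_duplicates(['loc', 'file'])
location = ['TPCL', 'TPBL', 'TPHD']
now_dt = dt.now()
for i in location:
    if i in df_test['loc'].tolist():
        t_update_loc = df_test.loc[df_test['loc'] == i, 'end_time']
        d = []  # re-created on every pass of the outer loop
        for j in t_update_loc.tolist():
            diff = now_dt - j
            d.append(diff)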
The code above gives incorrect results. In fact, it only picks up the TPHD values. Its output is:
_libs.tslibs.timedeltas.Timedelta 1 140 days 02:55:06.056170
_libs.tslibs.timedeltas.Timedelta 1 140 days 09:10:36.056170
Ideally, I would like the dataframe to look like this:
loc time_stamp
TPCL 2019-06-03 16:30:00
TPBL 2019-06-03 16:15:30
TPHD 2019-06-03 16:45:30
How can I get the dataframe above?
If I understand correctly, you can simply use:
df['diff'] = datetime.datetime.now() - df.end_time
assuming end_time is a datetime object.
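If end_time was loaded as plain strings (for example from a CSV file), the subtraction will not behave as intended until the column is converted. Here is a minimal sketch of that conversion, using three sample rows from the question. The fixed reference time `now` is an assumption made only so the result is reproducible; real code would use datetime.datetime.now().

```python
import datetime

import pandas as pd

# Three sample rows from the question; end_time starts out as plain strings.
df = pd.DataFrame({
    "loc": ["TPHD", "TPCL", "TPBL"],
    "end_time": [
        "2019-06-03 16:45:30",
        "2019-06-03 16:30:00",
        "2019-06-03 16:15:30",
    ],
    "file": ["trial.csv", "trial.csv", "trial.csv"],
})

# Convert the strings to datetime values so subtraction yields Timedelta values.
df["end_time"] = pd.to_datetime(df["end_time"])

# Fixed reference time (an assumption, for reproducibility only).
now = datetime.datetime(2019, 6, 3, 17, 0, 0)

# Vectorised subtraction: one Timedelta per row, no explicit loop needed.
df["diff"] = now - df["end_time"]
print(df)
```

Because the subtraction is applied to the whole column at once, no loop over a list of locations is needed for this step.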
import datetime
df = df.drop_duplicates(['loc']).assign(time_stamp=lambda x: datetime.datetime.now() - x['end_time'])
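To make the drop_duplicates approach concrete, here is a rough sketch run against the ten rows from the question. It assumes the rows should be sorted newest-first so that the kept row is the latest end_time for each key, and it uses a fixed reference time `now` (an assumption, for reproducibility) instead of datetime.datetime.now(). The names `latest_per_loc` and `latest_per_loc_file` are illustrative only.

```python
import datetime
import io

import pandas as pd

raw = """loc,end_time,ts,file
TPHD,2019-06-03 16:45:30,43619.4375,trial.csv
TPCL,2019-06-03 16:30:00,43619.5520,trial.csv
TPHD,2019-06-03 16:15:30,43619.6774,trial.csv
TPBL,2019-06-03 16:15:30,43619.4479,trial.csv
TPBL,2019-06-03 14:43:45,43619.6982,mgrflash.csv
TPCL,2019-06-03 13:15:00,43619.4375,mgrflash.csv
TPCL,2019-06-03 11:15:30,43619.6875,mgrflash.csv
TPCL,2019-06-03 10:45:00,43619.6137,trial.csv
TPBL,2019-06-03 10:30:00,43619.6774,mgrflash.csv
TPHD,2019-06-03 10:30:00,43619.4690,mgrflash.csv
"""
df = pd.read_csv(io.StringIO(raw), parse_dates=["end_time"])

# Fixed reference time (an assumption, for reproducibility only).
now = datetime.datetime(2019, 6, 3, 17, 0, 0)

# Newest rows first, so drop_duplicates keeps the latest row per key.
newest_first = df.sort_values("end_time", ascending=False)

# One row per location: matches the three-row output asked for in the question.
latest_per_loc = (
    newest_first.drop_duplicates(["loc"])
    .assign(diff=lambda x: now - x["end_time"])
    [["loc", "end_time", "diff"]]
)

# One row per (location, file) pair: matches the stated goal instead.
latest_per_loc_file = (
    newest_first.drop_duplicates(["loc", "file"])
    .assign(diff=lambda x: now - x["end_time"])
    [["loc", "file", "end_time", "diff"]]
)

print(latest_per_loc)
print(latest_per_loc_file)
```

Which of the two results is wanted depends on whether the key is the location alone or the location together with the file name.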
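No problem. But how do I get the dataframe above? In particular, when running the for i in location loop, why isn't the location list read correctly?
Why do you need a for loop? In your expected output, why is there only one row per location? I thought you wanted one per location and file name? Please explain further how the output df is obtained.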
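On the question of why only the TPHD values appear: one likely explanation, assuming d = [] sits inside the outer loop as the order of the lines suggests, is that the list is re-created for every location, so only the last location's values survive. A rough sketch of a loop that keeps every location's results follows. The dictionary name `diffs` and the fixed reference time `now_dt` are assumptions for illustration, and the six rows are what drop_duplicates(['loc', 'file']) would leave from the sample data.

```python
import pandas as pd

# Rows remaining after df.drop_duplicates(['loc', 'file']) on the sample data.
df_test = pd.DataFrame({
    "loc": ["TPHD", "TPCL", "TPBL", "TPBL", "TPCL", "TPHD"],
    "end_time": pd.to_datetime([
        "2019-06-03 16:45:30",
        "2019-06-03 16:30:00",
        "2019-06-03 16:15:30",
        "2019-06-03 14:43:45",
        "2019-06-03 13:15:00",
        "2019-06-03 10:30:00",
    ]),
    "file": ["trial.csv", "trial.csv", "trial.csv",
             "mgrflash.csv", "mgrflash.csv", "mgrflash.csv"],
})

location = ["TPCL", "TPBL", "TPHD"]

# Fixed reference time (an assumption, for reproducibility only).
now_dt = pd.Timestamp("2019-06-03 17:00:00")

# Created once, outside the loop, so results from earlier locations are kept.
diffs = {}
for name in location:
    end_times = df_test.loc[df_test["loc"] == name, "end_time"]
    # One Timedelta per remaining row for this location.
    diffs[name] = [now_dt - t for t in end_times]

for name, values in diffs.items():
    print(name, values)
```

The loop is shown only to explain the behaviour; the vectorised subtraction on the whole column gives the same differences without iterating.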