Python: How to read all elements in a list and retrieve the corresponding values from a dataframe


I have a dataframe df that looks like this:

loc     end_time            ts          file
TPHD    2019-06-03 16:45:30 43619.4375  trial.csv
TPCL    2019-06-03 16:30:00 43619.5520  trial.csv
TPHD    2019-06-03 16:15:30 43619.6774  trial.csv
TPBL    2019-06-03 16:15:30 43619.4479  trial.csv
TPBL    2019-06-03 14:43:45 43619.6982  mgrflash.csv
TPCL    2019-06-03 13:15:00 43619.4375  mgrflash.csv
TPCL    2019-06-03 11:15:30 43619.6875  mgrflash.csv
TPCL    2019-06-03 10:45:00 43619.6137  trial.csv
TPBL    2019-06-03 10:30:00 43619.6774  mgrflash.csv
TPHD    2019-06-03 10:30:00 43619.4690  mgrflash.csv
Goal: for each location and file, I want to find the time difference between end_time and a specified time (such as datetime.datetime.now()). To do this, I tried the following:

from datetime import datetime as dt  # assumed import; the question does not show it
df_test = df.drop_duplicates(['loc','file'])
location = ['TPCL','TPBL','TPHD']
now_dt = dt.now()
for i in location:
    if i in df_test['loc'].tolist():
        t_update_loc = df_test.loc[df_test['loc']==i,'end_time']
        d = []
        for j in t_update_loc.tolist():
            diff = now_dt - j
            d.append(diff)
The code above gives incorrect results. In fact, it only picks up the TPHD values. Its output is:

_libs.tslibs.timedeltas.Timedelta  1     140 days 02:55:06.056170
_libs.tslibs.timedeltas.Timedelta  1     140 days 09:10:36.056170
Ideally, I would like the dataframe to look like this:

loc    time_stamp
TPCL   2019-06-03 16:30:00
TPBL   2019-06-03 16:15:30
TPHD   2019-06-03 16:45:30

How can I get the dataframe above?

If I understand correctly, you can simply use:

df['diff'] = datetime.datetime.now() - df.end_time

assuming end_time holds datetime objects.
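The one-liner above can be tried on a few rows from the question. This is only a minimal sketch: it assumes end_time may arrive as strings (hence the pd.to_datetime call), and it uses a fixed, hypothetical reference time ref_time instead of the current time so that the result is reproducible.

```python
import pandas as pd

# Three sample rows copied from the question's dataframe
df = pd.DataFrame({
    "loc": ["TPHD", "TPCL", "TPBL"],
    "end_time": ["2019-06-03 16:45:30", "2019-06-03 16:30:00", "2019-06-03 16:15:30"],
    "file": ["trial.csv", "trial.csv", "trial.csv"],
})

# Convert to datetime64 in case the column was read as plain strings
df["end_time"] = pd.to_datetime(df["end_time"])

# Hypothetical fixed reference time; swap in pd.Timestamp.now() for real use
ref_time = pd.Timestamp("2019-06-03 17:00:00")

# Vectorized subtraction: one Timedelta per row, no loop needed
df["diff"] = ref_time - df["end_time"]
print(df)
```

Because the subtraction is applied to the whole column at once, every row gets its own difference regardless of location.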

import datetime

df = df.drop_duplicates(['loc']).assign(time_stamp=lambda x: datetime.datetime.now() - x['end_time'])
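As a rough illustration of the drop_duplicates approach, the sketch below runs it on the ten rows from the question. Several things here are assumptions rather than part of the answer: the explicit sort (so that "first row per loc" reliably means "latest end_time"), the fixed reference time ref_time, and the column names time_stamp and diff in the result.

```python
import pandas as pd

# The ten rows from the question (ts column omitted for brevity)
rows = [
    ("TPHD", "2019-06-03 16:45:30", "trial.csv"),
    ("TPCL", "2019-06-03 16:30:00", "trial.csv"),
    ("TPHD", "2019-06-03 16:15:30", "trial.csv"),
    ("TPBL", "2019-06-03 16:15:30", "trial.csv"),
    ("TPBL", "2019-06-03 14:43:45", "mgrflash.csv"),
    ("TPCL", "2019-06-03 13:15:00", "mgrflash.csv"),
    ("TPCL", "2019-06-03 11:15:30", "mgrflash.csv"),
    ("TPCL", "2019-06-03 10:45:00", "trial.csv"),
    ("TPBL", "2019-06-03 10:30:00", "mgrflash.csv"),
    ("TPHD", "2019-06-03 10:30:00", "mgrflash.csv"),
]
df = pd.DataFrame(rows, columns=["loc", "end_time", "file"])
df["end_time"] = pd.to_datetime(df["end_time"])

# Hypothetical fixed reference time; use pd.Timestamp.now() in practice
ref_time = pd.Timestamp("2019-06-03 17:00:00")

latest = (
    df.sort_values("end_time", ascending=False)   # newest rows first
      .drop_duplicates("loc")                     # keep the newest row per location
      .assign(diff=lambda x: ref_time - x["end_time"])
      .rename(columns={"end_time": "time_stamp"})
      .loc[:, ["loc", "time_stamp", "diff"]]
      .reset_index(drop=True)
)
print(latest)
```

This yields one row per location with the same time stamps as the desired output in the question, plus the difference to the reference time.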

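No problem. But how do I get the dataframe above? In particular, when running the for i in location loop, why is the location list not read correctly?

Why do you need a for loop? In your expected output, why is there only one row per location? I thought you wanted one per location and file name? Please explain further how to obtain the output df.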
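On the loop itself, one likely explanation is that d = [] sits inside the for loop, so the list is reset for every location and only the values of the last one (TPHD) survive. That matches the posted output: its two values differ by 6:15:30, which is exactly the gap between the two TPHD rows left after drop_duplicates. The sketch below keeps the loop but stores results per location in a dictionary created once; the names diffs and ref_time are hypothetical, and a fixed reference time is assumed for reproducibility.

```python
import pandas as pd

# The ten rows from the question (ts column omitted for brevity)
rows = [
    ("TPHD", "2019-06-03 16:45:30", "trial.csv"),
    ("TPCL", "2019-06-03 16:30:00", "trial.csv"),
    ("TPHD", "2019-06-03 16:15:30", "trial.csv"),
    ("TPBL", "2019-06-03 16:15:30", "trial.csv"),
    ("TPBL", "2019-06-03 14:43:45", "mgrflash.csv"),
    ("TPCL", "2019-06-03 13:15:00", "mgrflash.csv"),
    ("TPCL", "2019-06-03 11:15:30", "mgrflash.csv"),
    ("TPCL", "2019-06-03 10:45:00", "trial.csv"),
    ("TPBL", "2019-06-03 10:30:00", "mgrflash.csv"),
    ("TPHD", "2019-06-03 10:30:00", "mgrflash.csv"),
]
df = pd.DataFrame(rows, columns=["loc", "end_time", "file"])
df["end_time"] = pd.to_datetime(df["end_time"])

df_test = df.drop_duplicates(["loc", "file"])
location = ["TPCL", "TPBL", "TPHD"]

# Hypothetical fixed reference time; use pd.Timestamp.now() in practice
ref_time = pd.Timestamp("2019-06-03 17:00:00")

diffs = {}  # created once, outside the loop, so nothing is overwritten
for name in location:
    end_times = df_test.loc[df_test["loc"] == name, "end_time"]
    if not end_times.empty:
        # One Timedelta per remaining (loc, file) row for this location
        diffs[name] = [ref_time - t for t in end_times]

for name, values in diffs.items():
    print(name, values)
```

With this change all three locations appear in the result, each with one difference per file.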