Python: How to find the time difference between the first row and the row where a condition is met?

python pandas

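I need to find the time difference (in minutes) between the start and the moment col1 exceeds the value 20.

For the following data, the answer should be 72 minutes (from 20:00:19 to 21:12:00).

df: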

date_time            col1
2018-03-04 20:00:19  9
2018-03-04 21:10:00  13
2018-03-04 21:12:00  21
2018-03-04 21:15:00  25

How can I do this? This is my current code snippet:

df.index = pd.to_datetime(df['date_time'])
start = df.index[0]
row_id = df.index[df['col1'] > 20]
time_val = start - df.index[row_id]

A one-liner:

ans = pd.to_datetime(df.groupby(df.col1>20).first().date_time).diff().dt.total_seconds()/60

You can try this:

for index, row in df.iterrows():
    if row['col1'] > 20:
        # elapsed time from the first row to the first row where col1 > 20
        total_seconds = int((row['date_time'] - df['date_time'].iloc[0]).total_seconds())
        minutes, remainder = divmod(total_seconds, 60)
        print('{} mins'.format(minutes))
        break

After converting the columns to the required types:

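df.date_time = pd.to_datetime(df.date_time)
df.col1 = pd.to_numeric(df.col1)
# label of the first row where col1 > 20 (idxmax on a boolean mask returns the first True)
first_idx = (df.col1 > 20).idxmax()
# total_seconds() also counts whole days, unlike .seconds
diff = (df.loc[first_idx, 'date_time'] - df.date_time.iloc[0]).total_seconds() / 60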

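Assuming 'date_time' is of dtype datetime. We can use diff to get a Timedelta and cumsum to get the cumulative Timedelta. Then we can use idxmax on df.col1.gt(20) to find the first row where the condition holds.

Timedelta has a total_seconds method, whose result can be divided by 60: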

df.date_time.diff().fillna(0).cumsum()[df.col1.gt(20).idxmax()].total_seconds() / 60

71.68333333333334

Or you can divide by another Timedelta:

df.date_time.diff().fillna(0).cumsum()[df.col1.gt(20).idxmax()] / pd.Timedelta(1, unit='m')

71.68333333333334
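For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch of the diff / cumsum / idxmax steps described above. The frame is rebuilt from the question's sample data. The names elapsed, first_over and minutes are only illustrative, and pd.Timedelta(0) is used as the fill value (instead of the integer 0) so that the fill matches the column's timedelta dtype.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2018-03-04 20:00:19",
        "2018-03-04 21:10:00",
        "2018-03-04 21:12:00",
        "2018-03-04 21:15:00",
    ]),
    "col1": [9, 13, 21, 25],
})

# Time elapsed since the first row; diff() leaves NaT in the first
# position, which is filled with a zero-length Timedelta.
elapsed = df["date_time"].diff().fillna(pd.Timedelta(0)).cumsum()

# Label of the first row where col1 > 20.
first_over = df["col1"].gt(20).idxmax()

# Dividing one Timedelta by another yields a plain float (minutes here).
minutes = elapsed.loc[first_over] / pd.Timedelta(minutes=1)
print(round(minutes, 2))
```

The exact difference is 1 h 11 min 41 s, so the result is a little under the 72 minutes quoted in the question.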

IIUC, I am using ptp:

df.loc[df.col1.le(20).shift().cumprod().ne(0),'date_time'].ptp()
Out[1232]: Timedelta('0 days 01:11:41')
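Series.ptp may not exist in your pandas version (to my knowledge it was dropped from recent releases). As noted in the comments, ptp is just max minus min, so the same result can be sketched as below. This is a substitute technique, not the answer's exact mask: it selects the rows with a label slice up to the first row where col1 > 20, and it assumes the sample data from the question. The names first_over, window and span are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2018-03-04 20:00:19",
        "2018-03-04 21:10:00",
        "2018-03-04 21:12:00",
        "2018-03-04 21:15:00",
    ]),
    "col1": [9, 13, 21, 25],
})

# Label of the first row where col1 > 20.
first_over = df["col1"].gt(20).idxmax()

# .loc label slicing is inclusive, so this keeps rows 0 through first_over.
window = df.loc[:first_over, "date_time"]

# Peak-to-peak range written out as max - min.
span = window.max() - window.min()
print(span)  # 0 days 01:11:41
```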

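Can you print the expected output?

Real question: why idxmax()? Isn't that the largest timestamp over 20 (rather than the smallest, which is what we want)?

@Yuca idxmax returns the first occurrence of the maximum, which in a boolean array is the first True.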
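A tiny illustration of that point, using made-up boolean Series (mask and none_true are hypothetical names, not from the thread):

```python
import pandas as pd

# True is the maximum of a boolean Series, and idxmax() returns the
# label of its first occurrence.
mask = pd.Series([False, False, True, True])
first_true = mask.idxmax()
print(first_true)  # 2

# Caveat: when nothing is True, idxmax() still returns the first label,
# so it is worth checking .any() before trusting the result.
none_true = pd.Series([False, False, False])
print(none_true.idxmax(), none_true.any())  # 0 False
```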

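This approach blew my mind, very cool, thanks for sharing.

@ALollz It's just a short version of max - min :-)

I chose your answer because, for me, a detailed solution is preferable to a one-line one. Thanks.

@Tatik Yes, sometimes being able to understand the answer is better! Thanks.

Nice solution, Yuca. I chose another solution, though, because the single-line approach doesn't suit my case (poor readability).

Wise choice, happy coding.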
