Python: Generate a MultiIndex DataFrame from a dictionary with values of different lengths


I have the following dictionary:

dic = {'T1': ["2013-11-12 17:35:00", "2013-11-12 17:36:00", "2013-11-12 17:37:00", "2013-11-12 17:38:00",
              "2013-11-12 17:40:00", "2013-11-12 17:41:00", "2013-11-12 17:42:00"],
       'T2': ["2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00", "2013-11-13 16:33:00",
              "2013-11-13 16:34:00"]}
I want to generate the following MultiIndex DataFrame from it:

                      T1                                            T2
         Start                   Stop                   Start                Stop
   2013-11-12 17:35:00  2013-11-12 17:38:00     2013-11-12 12:15:00  2013-11-12 12:16:00
   2013-11-12 17:40:00  2013-11-12 17:42:00     2013-11-13 16:32:00  2013-11-13 16:34:00
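The DataFrame describes the times at which events on sensor T1 or T2 start and stop. If the time difference between two consecutive timestamps is no more than 1 minute, I assume the same event is continuing; if the difference is greater than 1 minute, a new event has started.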


Thanks a lot for your help :)

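You can compute the differences between consecutive timestamps and form a mask that is True wherever the difference is not exactly 1 minute: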

df['mask'] = (df[key].diff() / np.timedelta64(1, 'm')) != 1
Then take the cumulative sum of the mask to determine which rows belong to which group:

df['group'] = df['mask'].cumsum()
This produces results like the following:

                   T2   mask  group
0 2013-11-12 12:15:00   True      1
1 2013-11-12 12:16:00  False      1
2 2013-11-13 16:32:00   True      2
3 2013-11-13 16:33:00  False      2
4 2013-11-13 16:34:00  False      2

                   T1  mask  group
0 2013-11-12 17:38:00  True      1
1 2013-11-12 17:40:00  True      2
2 2013-11-12 17:42:00  True      3
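
The snippets above operate on a DataFrame df that holds the parsed timestamps of one sensor under the column key, but they do not show how df is built. A minimal sketch of that setup, assuming each sensor is processed separately (the lists differ in length, so they cannot be passed to pd.DataFrame together) and using T2 as the example key:

```python
import numpy as np
import pandas as pd

dic = {
    'T1': ["2013-11-12 17:35:00", "2013-11-12 17:36:00", "2013-11-12 17:37:00",
           "2013-11-12 17:38:00", "2013-11-12 17:40:00", "2013-11-12 17:41:00",
           "2013-11-12 17:42:00"],
    'T2': ["2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
           "2013-11-13 16:33:00", "2013-11-13 16:34:00"],
}

key = 'T2'  # assumption: one sensor at a time

# Parse the strings into timestamps and put them in a one-column frame.
df = pd.DataFrame({key: pd.to_datetime(dic[key])})

# diff() is NaT for the first row; dividing gives NaN, and NaN != 1 is True,
# so the first row always opens a new group.
df['mask'] = (df[key].diff() / np.timedelta64(1, 'm')) != 1

# Each True increments the running total, giving every event its own number.
df['group'] = df['mask'].cumsum()

print(df)
```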
Now group by the group column and find the first and last timestamp of each group:

result[key] = df.groupby(['group'])[key].agg(['first', 'last'])

which yields

                       T1                                      T2                    
                    Start                Stop               Start                Stop
group                                                                                
1     2013-11-12 17:35:00 2013-11-12 17:38:00 2013-11-12 12:15:00 2013-11-12 12:16:00
2     2013-11-12 17:40:00 2013-11-12 17:42:00 2013-11-13 16:32:00 2013-11-13 16:34:00
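
Putting the steps together, the whole pipeline might look like the sketch below. Collecting the per-sensor results in a dictionary (here called frames), renaming the first/last columns to Start/Stop, and joining them with pd.concat(..., axis=1) are assumptions about how the final table is assembled; the answer itself only shows the assignment to result[key].

```python
import numpy as np
import pandas as pd

dic = {
    'T1': ["2013-11-12 17:35:00", "2013-11-12 17:36:00", "2013-11-12 17:37:00",
           "2013-11-12 17:38:00", "2013-11-12 17:40:00", "2013-11-12 17:41:00",
           "2013-11-12 17:42:00"],
    'T2': ["2013-11-12 12:15:00", "2013-11-12 12:16:00", "2013-11-13 16:32:00",
           "2013-11-13 16:33:00", "2013-11-13 16:34:00"],
}

frames = {}  # assumption: one Start/Stop frame per sensor
for key, values in dic.items():
    df = pd.DataFrame({key: pd.to_datetime(values)})
    # True wherever the gap to the previous timestamp is not exactly 1 minute.
    df['mask'] = (df[key].diff() / np.timedelta64(1, 'm')) != 1
    df['group'] = df['mask'].cumsum()
    # First and last timestamp of every group are the event's start and stop.
    spans = df.groupby('group')[key].agg(['first', 'last'])
    spans.columns = ['Start', 'Stop']
    frames[key] = spans

# The dictionary keys become the top level of the column MultiIndex.
result = pd.concat(frames, axis=1)
print(result)
```

If the sensors have different numbers of events, pd.concat aligns the frames on the group index and fills the missing cells with NaT.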

What if dic = {'T1': ["2013-11-12 17:38:00", "2013-11-12 17:40:00", "2013-11-12 17:42:00"]}?

@unutbu We would have 3 events in total, so it should be: Start: 2013-11-12 17:38:00 ---- Stop: 2013-11-12 17:38:00, Start: 2013-11-12 17:40:00 ---- Stop: 2013-11-12 17:40:00, Start: 2013-11-12 17:42:00 ---- Stop: 2013-11-12 17:42:00
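
The case from the comments, where every timestamp is more than 1 minute away from its neighbours, can be checked with a small sketch. Note that this is a variation, not the answer's code: find_events is a hypothetical helper, and it compares the gap against a threshold (greater than max_gap) as the question words it, rather than testing for a difference of exactly 1 minute. It also assumes the timestamps are already in ascending order.

```python
import pandas as pd

def find_events(timestamps, max_gap=pd.Timedelta(minutes=1)):
    """Hypothetical helper: collapse sorted timestamps into Start/Stop events."""
    s = pd.Series(pd.to_datetime(timestamps))
    # A new event starts wherever the gap to the previous timestamp exceeds
    # max_gap. The first diff is NaT, and NaT > max_gap evaluates to False.
    new_event = s.diff() > max_gap
    # Running count of event starts; +1 so that numbering begins at 1.
    group = (new_event.cumsum() + 1).rename('group')
    out = s.groupby(group).agg(['first', 'last'])
    out.columns = ['Start', 'Stop']
    return out

events = find_events(["2013-11-12 17:38:00",
                      "2013-11-12 17:40:00",
                      "2013-11-12 17:42:00"])
print(events)
```

With this input each timestamp forms its own event, so Start and Stop coincide in all three rows.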