Python: find the number of time slots in a datetime column of a dataframe

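My question is related to my previous question, but it is different, so I created a new post.

I want to find the number of 10-minute time slots in the datetime column of a pandas dataframe, grouped by "id1".

My table: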

 id1       date_time               adress       a_size        
 reom      2005-8-20 21:51:10      75157.5413   ceifwekd    
 reom      2005-8-20 22:51:10      3571.37946   ceifwekd    
 reom      2005-8-20 11:21:01      3571.37946   tnohcve     
 reom      2005-8-20 11:31:05      97439.219    tnohcve     
 penr      2005-8-20 17:07:16      97439.219    ceifwekd    
 penr      2005-8-20 19:10:37      7391.6258    ceifwekd    
 ....
I need:

 id1       date_time               adress       a_size        10mins_num_by_id1
 reom      2005-8-20 21:51:10      75157.5413   ceifwekd    7
 reom      2005-8-20 21:56:10      3571.37946   ceifwekd    7
 reom      2005-8-20 22:21:01      3571.37946   tnohcve     7
 reom      2005-8-20 22:51:11      97439.219    tnohcve     7
 penr      2005-8-20 17:07:16      97439.219    ceifwekd    2
 penr      2005-8-20 17:17:37      7391.6258    ceifwekd    2
 ....
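For reom, I get 7 because the span from 21:51:10 to 22:51:11 covers 7 ten-minute time slots, grouped by "id1".

For penr, I get 2 because the span from 17:07:16 to 17:17:37 covers 2 ten-minute time slots, grouped by "id1".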

My code:

 df['10_min'] = df.groupby(['id1']).apply(lambda x: x['date_time'].dt.floor('10Min').count())
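But I get NaN in the new column.

Thanks.

Take the difference between the max and min datetimes per group with transform, then round the resulting timedeltas up with dt.ceil and convert them to 10Min slots: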

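import pandas as pd

# Parse the strings into datetime values
df['date_time'] = pd.to_datetime(df['date_time'])

# Per-group span (max - min), rounded up to a multiple of 10 minutes,
# then expressed as a whole number of 10-minute slots
df['new'] = (df.groupby('id1')['date_time']
               .transform(lambda x: x.max() - x.min())
               .dt.ceil('10Min')
               .dt.total_seconds()
               .div(600)
               .astype(int))
print(df)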

    id1           date_time       adress    a_size  new
0  reom 2005-08-20 21:51:10  75157.54130  ceifwekd    7
1  reom 2005-08-20 22:51:10   3571.37946  ceifwekd    7
2  reom 2005-08-20 22:21:01   3571.37946   tnohcve    7
3  reom 2005-08-20 22:51:11  97439.21900   tnohcve    7
4  penr 2005-08-20 17:07:16  97439.21900  ceifwekd    2
5  penr 2005-08-20 17:17:37   7391.62580  ceifwekd    2
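
To make the difference from the attempt in the question concrete, here is a minimal self-contained sketch. It assumes the rows of the expected-output table in the question (dates zero-padded, unrelated columns left out), and it swaps the lambda for transform('max') - transform('min'), which should give the same per-row span. The column name attempt and the variables grouped, per_group and span are made up for illustration.

```python
import pandas as pd

# Rows taken from the expected-output table in the question
# (assumption: only the two columns needed here are kept).
df = pd.DataFrame({
    'id1': ['reom', 'reom', 'reom', 'reom', 'penr', 'penr'],
    'date_time': pd.to_datetime([
        '2005-08-20 21:51:10', '2005-08-20 21:56:10',
        '2005-08-20 22:21:01', '2005-08-20 22:51:11',
        '2005-08-20 17:07:16', '2005-08-20 17:17:37',
    ]),
})

grouped = df.groupby('id1')['date_time']

# apply() returns one value per group, indexed by 'penr' / 'reom'.
per_group = grouped.apply(lambda s: s.dt.floor('10min').count())

# transform() broadcasts each group's result back onto that group's rows,
# so the result shares the row index 0..5 of df.
span = grouped.transform('max') - grouped.transform('min')

# Assigning per_group aligns on the index; the labels 'penr' / 'reom' do not
# match the row labels 0..5, so the whole column becomes NaN.
df['attempt'] = per_group

# The row-aligned span can be assigned directly after converting it to slots.
df['new'] = span.dt.ceil('10min').dt.total_seconds().div(600).astype(int)

print(df)
```

Note that the attempt also counts rows per group rather than measuring the span, so even a correctly aligned version would not give 7 for reom.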

We can use groupby together with transform to get max - min, then divide by 10 minutes. Finally, we use numpy.ceil to round up:

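import numpy as np
import pandas as pd

# Per-group span divided by a 10-minute Timedelta gives a float number of slots;
# np.ceil rounds it up to whole slots
df['10mins_num_by_id1'] = np.ceil(
    df.groupby('id1')['date_time'].transform(lambda x: x.max() - x.min())
    / pd.Timedelta('10 minutes')
)

print(df)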
    id1           date_time       adress    a_size  10mins_num_by_id1
0  reom 2005-08-20 21:51:10  75157.54130  ceifwekd                7.0
1  reom 2005-08-20 22:56:10   3571.37946  ceifwekd                7.0
2  reom 2005-08-20 22:21:01   3571.37946   tnohcve                7.0
3  reom 2005-08-20 22:51:11  97439.21900   tnohcve                7.0
4  penr 2005-08-20 17:07:16  97439.21900  ceifwekd                2.0
5  penr 2005-08-20 17:17:37   7391.62580  ceifwekd                2.0
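
The column above comes out as floats (7.0, 2.0) because np.ceil keeps the float dtype. If integers are wanted, a cast can be added at the end. The sketch below is one way to do it, again assuming the rows of the expected-output table in the question; the variables grouped, span and slots are illustrative names, not part of the answer.

```python
import numpy as np
import pandas as pd

# Rows taken from the expected-output table in the question
# (assumption: only the two columns needed here are kept).
df = pd.DataFrame({
    'id1': ['reom', 'reom', 'reom', 'reom', 'penr', 'penr'],
    'date_time': pd.to_datetime([
        '2005-08-20 21:51:10', '2005-08-20 21:56:10',
        '2005-08-20 22:21:01', '2005-08-20 22:51:11',
        '2005-08-20 17:07:16', '2005-08-20 17:17:37',
    ]),
})

grouped = df.groupby('id1')['date_time']
span = grouped.transform('max') - grouped.transform('min')

# Timedelta / Timedelta is a plain float ratio:
# 3601 s / 600 s for reom and 621 s / 600 s for penr.
slots = span / pd.Timedelta(minutes=10)

# np.ceil rounds up but stays float; the cast turns the column into integers.
df['10mins_num_by_id1'] = np.ceil(slots).astype(int)

print(df)
```

With this definition a span that is an exact multiple of 10 minutes is not rounded up further, and a group with a single row has a span of zero and therefore 0 slots.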