Finding the time at which a value occurs in Python (pandas)

python pandas

I have a DataFrame loaded from a CSV file.

I would like to know at which time between 23:00 and 23:50 the column "First" is most likely to become 0.

                      Date First Second
0      2019-01-09 22:59:00     0     20
1      2019-01-09 23:04:00    14     32
2      2019-01-09 23:10:00     9     27
3      2019-01-09 23:11:00     7     27
4      2019-01-09 23:12:00     7     26
5      2019-01-09 23:13:00     7     26
6      2019-01-09 23:14:00     7     25
7      2019-01-09 23:15:00     6     25
8      2019-01-09 23:16:00     5     23
9      2019-01-09 23:17:00     4     22
10     2019-01-09 23:18:00     3     22
...                    ...   ...    ...
134761 2019-05-05 21:20:00    18     36
134762 2019-05-05 21:21:00    16     35
134763 2019-05-05 21:22:00    15     34
134764 2019-05-05 21:23:00    14     33

I use this code to select the desired time window:

heure = df.set_index('Date').between_time('23:00:00','23:50:00')

But I can't work out how to extract the time.

If you have any suggestions :)

Thanks

Robin

How about using the dt accessor? Updated with an end-to-end example for your use case:

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        'date': [
            pd.to_datetime('2019-01-09 22:59:00'),
            pd.to_datetime('2019-01-09 23:00:00'),
            pd.to_datetime('2019-01-09 23:49:59'),
            pd.to_datetime('2019-01-09 23:50:00'),
            pd.to_datetime('2019-01-09 23:51:00'),
        ],
        'value': [0, 0, 5, 6, 1]
    }        
)

# A mask to split the dataset into two groups, based on the time.

df['in_range'] = np.where((df['date'].dt.hour == 23) & (df['date'].dt.minute < 50), 'In Range', 'Out of Range')

# A column that tests the condition you mentioned

df['condition'] = df['value'] == 0

# Group and get the average, which is the likelihood that value == 0, per group.

print(df.groupby('in_range')['condition'].mean())
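The grouping above compares the whole window against everything outside it. To find which time inside the window is most likely to show a 0, the same idea can be applied per clock time. The following is a minimal sketch, not part of the original answer: the sample values are made up, and it assumes the column names `Date` and `First` from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-01-09 23:04:00",
        "2019-01-09 23:10:00",
        "2019-01-10 23:04:00",
        "2019-01-10 23:10:00",
        "2019-01-11 23:04:00",
        "2019-01-11 23:10:00",
    ]),
    "First": [0, 9, 0, 0, 3, 7],
})

# Keep only rows whose clock time falls inside the window.
window = df.set_index("Date").between_time("23:00:00", "23:50:00")

# True where First is 0; the mean of a boolean Series is the share of True values.
is_zero = window["First"].eq(0)

# Group by clock time (ignoring the date) to get the share of zeros per time.
likelihood = is_zero.groupby(window.index.time).mean()

# The clock time with the highest share of zeros.
best_time = likelihood.idxmax()

print(likelihood)
print(best_time)
```

Here `likelihood` holds, for each clock time, the fraction of observations where `First` was 0, so `idxmax()` picks the time with the highest fraction.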

Filter based on time, then find the most common time at which First is 0:

try:
    # Most frequent clock time at which First == 0 inside the window
    most_common = (df.set_index('Date').between_time('23:00:00','23:50:00').reset_index()
       .loc[lambda x: x.First == 0].Date.dt.time.value_counts().index[0])
    print(most_common)
except IndexError:
    print('No matches')

This yields a datetime.time, or, for the sample data, prints 'No matches' because there are no 0s between the specified times.
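The snippet above counts every row where First equals 0. If "becomes 0" is read as a change from a non-zero value to 0, one possible variation is to compare each row with the previous one. This is a sketch under that assumption, with made-up data; it is not the original answerer's method.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-01-09 23:16:00", "2019-01-09 23:17:00", "2019-01-09 23:18:00",
        "2019-01-10 23:16:00", "2019-01-10 23:17:00", "2019-01-10 23:18:00",
    ]),
    "First": [2, 0, 0, 1, 0, 0],
})

df = df.sort_values("Date")

# Value of First on the previous row (NaN for the very first row).
prev = df["First"].shift(1)

# A row "becomes 0" when it is 0 and the previous row exists and was non-zero.
became_zero = df["First"].eq(0) & prev.ne(0) & prev.notna()

# Restrict the transitions to the 23:00-23:50 window.
hits = df.loc[became_zero].set_index("Date").between_time("23:00:00", "23:50:00")

# Count transitions per clock time; value_counts sorts by frequency, descending.
counts = pd.Series(hits.index.time).value_counts()
most_common = counts.index[0] if not counts.empty else None

print(counts)
print(most_common)
```

Checking `counts.empty` avoids the IndexError that indexing an empty result would raise.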

You should first convert the "Date" column to the datetime type; then you can use the dt accessor mentioned by @smj to apply boolean indexing:

import pandas as pd

df = pd.read_csv('./sample.csv')
df['Date'] = pd.to_datetime(df['Date'])
# print() is a function in Python 3
print(df[(df['Date'].dt.hour == 23) & (df['Date'].dt.minute < 50)])
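One detail worth checking is the upper boundary: the `minute < 50` mask leaves out 23:50:00, while `between_time` includes both endpoints by default. The sketch below illustrates the difference; the CSV content is a made-up stand-in for `sample.csv`, read from an in-memory string so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for ./sample.csv.
csv_text = """Date,First,Second
2019-01-09 22:59:00,0,20
2019-01-09 23:04:00,14,32
2019-01-09 23:50:00,0,27
2019-01-09 23:51:00,3,27
"""

# parse_dates converts the Date column to datetime while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Boolean mask built with the dt accessor: excludes 23:50:00.
mask = (df["Date"].dt.hour == 23) & (df["Date"].dt.minute < 50)
by_mask = df[mask]

# between_time on a DatetimeIndex: includes both endpoints by default.
by_between = df.set_index("Date").between_time("23:00:00", "23:50:00")

print(len(by_mask), len(by_between))
```

Which boundary behaviour is wanted depends on whether 23:50:00 itself should count as part of the window.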
