Python: create new DataFrames based on hour ranges of a column

python pandas sorting


My df is:

  ordinal id latitude longitude timestamp               epoch       day_of_week
  1.0     38 44.9484  7.7728    2016-06-01 08:18:46.000 1.464769    Wednesday
  2.0     38 44.9503  7.7748    2016-06-01 08:28:05.000 1.464770    Wednesday
  3.0     38 44.9503  7.7748    2016-06-01 08:38:09.000 1.464770    Wednesday

I want to create new DataFrames df1, df2, df3, ... based on hour ranges. For example, for the range from 2016-06-01 08:00:00.000 to 2016-06-01 09:00:00.000 (from 8 to 9 o'clock), I want:

1.0     38 44.9484  7.7728    2016-06-01 08:18:46.000 1.464769    Wednesday
2.0     38 44.9503  7.7748    2016-06-01 08:28:05.000 1.464770    Wednesday

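I want to do this for all 24 hours. If possible, I would like to do it with code that can be applied to the whole column; otherwise I can do it one range at a time.

You haven't described why you want to generate hour-specific slices of the raw data. In general, creating a separate variable for each slice would be considered bad practice, or unpythonic.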

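I suggest using groupby to group the data by hour, so that you can loop over the slices. In the loop below, each slice is the DataFrame named group.

Here is a simple working example: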

import pandas as pd
import numpy as np

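n = 100  # number of sample rows
data_char = np.random.randint(0, 100, size=n)
# n evenly spaced timestamps covering one day
timestamp = pd.date_range(start='2018-04-24', end='2018-04-25', periods=n)

data = {'data_char': data_char, 'timestamp': timestamp}
df = pd.DataFrame.from_dict(data)

# Group rows by the hour of day of their timestamp;
# each group is a DataFrame holding that hour's rows
for hour, group in df.groupby(df['timestamp'].dt.hour):
    print(hour)
    print(group)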
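
If the per-hour slices really do need to be kept around, rather than only looped over, one possible approach is to collect the groups into a dict keyed by hour instead of creating df1, df2, df3 by hand. The sketch below is only an illustration: it rebuilds three rows from the question (a few columns only), and the names frames_by_hour, start, end and df_8_to_9 are made up for this example. It also shows a plain boolean mask for a single explicit range. Note that grouping on dt.hour puts rows from different days into the same group when they share an hour of day.

```python
import pandas as pd

# Sample rows copied from the question; only a few columns are kept for brevity.
df = pd.DataFrame({
    "ordinal": [1.0, 2.0, 3.0],
    "id": [38, 38, 38],
    "timestamp": pd.to_datetime([
        "2016-06-01 08:18:46",
        "2016-06-01 08:28:05",
        "2016-06-01 08:38:09",
    ]),
})

# One DataFrame per hour of day, keyed by the hour (0-23).
# Hours with no rows simply do not appear as keys.
frames_by_hour = {
    hour: group for hour, group in df.groupby(df["timestamp"].dt.hour)
}

# Alternative: select one explicit half-open range [start, end) with a boolean mask.
start = pd.Timestamp("2016-06-01 08:00:00")
end = pd.Timestamp("2016-06-01 09:00:00")
df_8_to_9 = df[(df["timestamp"] >= start) & (df["timestamp"] < end)]

print(frames_by_hour[8])
print(df_8_to_9)
```

With these three sample rows, both selections contain all three rows, because 08:38:09 also lies between 08:00 and 09:00.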


Why was the third row dropped?