Python: get the range for each day from timestamped data
I'm still learning Python, and this is a somewhat complicated problem. I have a pandas.DataFrame like this:
SAMPLE_TIME TempBottom TempTop TempOut State Bypass
0 2015-07-15 16:41:56 48.625 55.812 43.875 1 1
1 2015-07-15 16:42:55 48.750 55.812 43.875 1 1
2 2015-07-15 16:43:55 48.937 55.812 43.875 1 1
3 2015-07-15 16:44:56 49.125 55.812 43.812 1 1
4 2015-07-15 16:45:55 49.312 55.812 43.812 1 1
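It is a large dataset, with an entry for every minute over several weeks.
I am trying to get the range for each day, so basically ignore the time and split by day.
Edit: I forgot to mention that this was imported from a CSV using pd.read_csv(), which I think means SAMPLE_TIME is not a DatetimeIndex.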
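For what it's worth, the conversion can also happen at import time. The following is a minimal sketch, assuming the CSV has a SAMPLE_TIME column as in the table above; the inline CSV text (including its third row) is made up for illustration and stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for the real file; column names follow the question.
csv_text = """SAMPLE_TIME,TempBottom,TempTop,TempOut,State,Bypass
2015-07-15 16:41:56,48.625,55.812,43.875,1,1
2015-07-15 16:42:55,48.750,55.812,43.875,1,1
2015-07-16 09:00:00,49.312,55.812,43.812,1,1
"""

# Without parse_dates, SAMPLE_TIME is read as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))

# parse_dates converts the column and index_col makes it the index,
# so the resulting index is a DatetimeIndex.
df = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["SAMPLE_TIME"],
    index_col="SAMPLE_TIME",
)

print(type(df.index).__name__)
```

With a real file, the io.StringIO(...) argument would simply be replaced by the file path.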
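You can convert the column to datetime, make it the index, and group by day:
df['SAMPLE_TIME'] = pd.to_datetime(df['SAMPLE_TIME'])
df.set_index('SAMPLE_TIME', inplace=True)
# the original answer used pd.TimeGrouper('D'), which has since been removed from pandas in favor of pd.Grouper
df_by_days = df.groupby(pd.Grouper(freq='D')).agg(['min', 'max'])
Here min and max are only placeholders; you can apply various aggregation functions in the same way. If you provide some details on what you want to aggregate and how, I'm happy to add an example.
You can try: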
# convert SAMPLE_TIME from strings to datetime values
df['SAMPLE_TIME'] = pd.to_datetime(df['SAMPLE_TIME'])
print(df)
SAMPLE_TIME TempBottom TempTop TempOut State Bypass
0 2015-07-05 16:41:56 48.625 55.812 43.875 1 1
1 2015-07-05 16:42:55 48.750 55.812 43.875 1 1
2 2015-07-23 16:43:55 48.937 55.812 43.875 1 1
3 2015-07-23 16:44:56 49.125 55.812 43.812 1 1
4 2015-07-25 16:45:55 49.312 55.812 43.812 1 1
# use SAMPLE_TIME as the index, then group rows by day of the month
df = df.set_index('SAMPLE_TIME')
g1 = df.groupby(lambda x: x.day)
for d, g in g1:
    print(d)
    print(g)
5
TempBottom TempTop TempOut State Bypass
SAMPLE_TIME
2015-07-05 16:41:56 48.625 55.812 43.875 1 1
2015-07-05 16:42:55 48.750 55.812 43.875 1 1
23
TempBottom TempTop TempOut State Bypass
SAMPLE_TIME
2015-07-23 16:43:55 48.937 55.812 43.875 1 1
2015-07-23 16:44:56 49.125 55.812 43.812 1 1
25
TempBottom TempTop TempOut State Bypass
SAMPLE_TIME
2015-07-25 16:45:55 49.312 55.812 43.812 1 1
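One thing to watch with x.day is that it is the day of the month, so data spanning several weeks can merge the same day number from different months. Here is a small sketch of the difference, using made-up timestamps (not from the question) and grouping on the normalized index instead; treat it as one possible approach rather than the only one.

```python
import pandas as pd

# Hypothetical sample: the 5th of two different months, plus one other day.
df = pd.DataFrame(
    {"TempTop": [55.0, 56.0, 57.0, 58.0]},
    index=pd.to_datetime(
        [
            "2015-07-05 16:41:56",
            "2015-07-05 16:42:55",
            "2015-08-05 16:43:55",
            "2015-08-06 16:44:56",
        ]
    ),
)

# x.day is the day of the month, so 5 July and 5 August share the key 5.
by_day_of_month = df.groupby(lambda x: x.day).size()

# normalize() sets every timestamp to midnight, giving one key per calendar date.
by_date = df.groupby(df.index.normalize()).size()

print(by_day_of_month)
print(by_date)
```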
Or you can group by day and aggregate by sum:
df = df.set_index('SAMPLE_TIME')
g1 = df.groupby(lambda x: x.day).agg('sum')
print(g1)
TempBottom TempTop TempOut State Bypass
5 97.375 111.624 87.750 2 2
23 98.062 111.624 87.687 2 2
25 49.312 55.812 43.812 1 1
Or group by year, month, and day, and aggregate by sum:
df['SAMPLE_TIME'] = pd.to_datetime(df['SAMPLE_TIME'])
df = df.set_index('SAMPLE_TIME')
g1 = df.groupby([lambda x: x.year, lambda x: x.month, lambda x: x.day]).agg('sum')
print(g1)
TempBottom TempTop TempOut State Bypass
2015 7 5 97.375 111.624 87.750 2 2
       23 98.062 111.624 87.687 2 2
       25 49.312 55.812 43.812 1 1
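Since the question asks for a range per day rather than a sum, here is a rough sketch of that, assuming "range" means the daily minimum and maximum of each temperature column (the question does not spell this out). It reuses a few rows from the sample above and pd.Grouper(freq='D') for the daily split.

```python
import pandas as pd

# A few rows from the sample data above, as strings like a CSV import would give.
df = pd.DataFrame(
    {
        "SAMPLE_TIME": [
            "2015-07-05 16:41:56",
            "2015-07-05 16:42:55",
            "2015-07-23 16:43:55",
            "2015-07-23 16:44:56",
            "2015-07-25 16:45:55",
        ],
        "TempBottom": [48.625, 48.750, 48.937, 49.125, 49.312],
        "TempOut": [43.875, 43.875, 43.875, 43.812, 43.812],
    }
)
df["SAMPLE_TIME"] = pd.to_datetime(df["SAMPLE_TIME"])
df = df.set_index("SAMPLE_TIME")

# One group per calendar day; days without any rows come out as NaN.
daily = df.groupby(pd.Grouper(freq="D"))
lo = daily.min()
hi = daily.max()

# Spread between daily max and min, keeping only days that have data.
daily_range = (hi - lo).dropna(how="all")

print(daily_range)
```

If the first and last timestamp of each day is what is wanted instead, the same grouping could be applied to the timestamps themselves.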
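I get this error: TimeGrouper is not defined
Now I get a new error: TypeError: axis must be a DatetimeIndex, but got an instance of 'Index'
I'm guessing that is because the column was imported from the CSV as strings. I edited the post to convert the SAMPLE_TIME column to datetime first.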
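To make the point in these comments concrete: time-based grouping needs a DatetimeIndex, and a string column set as the index is only a plain Index. Below is a minimal sketch of a guard for that; ensure_datetime_index is a hypothetical helper name, not something from pandas, and the two rows are taken from the question's sample.

```python
import pandas as pd


def ensure_datetime_index(frame, column="SAMPLE_TIME"):
    """Return frame indexed by `column` parsed as datetimes.

    Hypothetical helper: the frame is returned unchanged if its index
    is already a DatetimeIndex.
    """
    if isinstance(frame.index, pd.DatetimeIndex):
        return frame
    out = frame.copy()
    out[column] = pd.to_datetime(out[column])
    return out.set_index(column)


# Strings as they would arrive from a CSV read without date parsing.
raw = pd.DataFrame(
    {
        "SAMPLE_TIME": ["2015-07-15 16:41:56", "2015-07-15 16:42:55"],
        "TempTop": [55.812, 55.812],
    }
)

# Using the string column directly as the index does not give a DatetimeIndex.
string_indexed = raw.set_index("SAMPLE_TIME")
print(isinstance(string_indexed.index, pd.DatetimeIndex))

# After conversion, grouping by day works.
fixed = ensure_datetime_index(raw)
daily_mean = fixed.groupby(pd.Grouper(freq="D")).mean()
print(daily_mean)
```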