Python: Creating a TimeGrouper on a DataFrame with the count aggregate function

python pandas

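I am using TimeGrouper on a DataFrame loaded from Excel and trying to pivot it with the date as the column headers and the time as the rows, aggregating a count on Y, which is "Barton LLC".

I tried resample, pivot and TimeGrouper, but got a series of errors.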

import pandas as pd
import numpy as np
df = pd.read_excel("data.xlsx")
ndf = df[df['Type'].eq('df')].pivot_table(columns= ['Y'],values='Y',
index=pd.Grouper(key='D',freq='H'),aggfunc='count',fill_value=0)

Result

         2014-01-01,2014-01-02,2014-02-07
 02:21    3,NaN,NaN              
 21:21    NaN,4,NaN
 04:34    NaN,NaN,2

You can split the datetime column into separate date and time columns and pivot on those.
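A minimal sketch of the split-and-pivot idea, assuming a datetime column named D as in the question. The sample rows are made up to stand in for data.xlsx, and the pivot_table arguments follow the call quoted in the comments rather than a confirmed original:

```python
import pandas as pd

# Hypothetical sample data standing in for data.xlsx; only the column
# names D and Y come from the question.
df = pd.DataFrame({
    "D": ["2014-01-01 02:21:51"] * 3
         + ["2014-01-02 21:21:01"] * 5
         + ["2014-02-07 04:34:50"] * 2,
    "Y": ["Barton LLC"] * 10,
})

# Parse the strings first so that the .dt accessor is available.
df["D"] = pd.to_datetime(df["D"])

# Split the datetime column into separate date and time columns.
df["date"] = df["D"].dt.date
df["time"] = df["D"].dt.time

# Count the rows for every (time, date) pair; pairs that never occur become 0.
ndf = pd.pivot_table(df, values="D", index="time", columns="date",
                     aggfunc="count", fill_value=0)
print(ndf)
```

With real data, the only change needed should be to replace the hand-built frame with the pd.read_excel call from the question.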

Note that you are missing one count for the datetime 2014-01-02 21:21:01 (the result in the question shows 4, but the data gives 5).

Use pd.crosstab together with Series.dt.strftime to convert the datetimes to custom strings:

df.D = pd.to_datetime(df.D)

ndf = pd.crosstab(df['D'].dt.strftime('%H:%M').rename('H'), df['D'].dt.strftime('%Y-%m-%d')) 
print (ndf)
D      2014-01-01  2014-01-02  2014-02-07
H                                        
02:21           3           0           0
04:34           0           0           2
21:21           0           5           0

What is the expected output?

Tried it, but got an error: ~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__ [...] return self._getitem_multilevel(key) 2687 else: -> 2688 return self._getitem_column(key) 2689 2690 def _getitem_column(self, key): [...] pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item() KeyError: 'D'

What about adding df.D = pd.to_datetime(df.D) at the start?

@yatu - How can I use only HH:MM and drop the seconds?

Change the second line to df['time'] = df['D'].dt.strftime('%H:%M')

Yes, it works fine. Can we use cumcount in the pivot like this? pivot_table(df, 'D', 'time', 'date', aggfunc='cumcount')

I got an error when trying to run it: ~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__ [...] 2687 else: -> 2688 return self._getitem_column(key) 2689 2690 def _getitem_column(self, key): pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

@J.RAM - What is print(df.columns)?

print(df.columns) gives Index(['X', 'Y', 'Z', 'D'], dtype='object')

@J.RAM - And print(df.dtypes)?

It worked after adding df.D = pd.to_datetime(df.D), df['date'] = df['D'].dt.date and df['time'] = df['D'].dt.time
To keep the full time, including seconds, as the row labels, use dt.time and dt.date instead:

ndf = pd.crosstab(df['D'].dt.time.rename('T'), df['D'].dt.date)
print (ndf)
D         2014-01-01  2014-01-02  2014-02-07
T                                           
02:21:51           3           0           0
04:34:50           0           0           2
21:21:01           0           5           0
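On the cumcount question raised in the comments: cumcount is a per-row running counter on a groupby rather than a reduction, so it is not an obvious fit for aggfunc. A sketch of one alternative, assuming the goal is a running count within each date and time group (the sample rows and the column name n are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only the column name D comes from the question.
df = pd.DataFrame({
    "D": pd.to_datetime(
        ["2014-01-01 02:21:51"] * 3
        + ["2014-01-02 21:21:01"] * 5
        + ["2014-02-07 04:34:50"] * 2
    )
})

# Build string keys for the date and the HH:MM time, as in the answers above.
df["date"] = df["D"].dt.strftime("%Y-%m-%d")
df["time"] = df["D"].dt.strftime("%H:%M")

# Number the rows inside each (date, time) group, starting at 0.
df["n"] = df.groupby(["date", "time"]).cumcount()
print(df)
```

The largest n in each group plus one equals the count shown in the crosstab output, so the two views stay consistent.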