Python: calculating values in the columns and index of a multi-index pivot
My multi-index pivot looks like this:
Date       2019-10-01 11:00  2019-10-01 12:00  2019-10-01 13:00  ...  2019-10-29 17:00
ID                       25                24                25  ...                24
H_name
Hospital1                12                15                16  ...                12
Hospital2                10                17                14  ...                12
Hospital3                15                20                12  ...                12
I would like to get:
Date       2019-10-01  2019-10-02  2019-10-03
ID              25.45       24.33       23.71
H_name
Hospital1         253         287         261
Hospital2         212         232         264
Hospital3         221         219         223
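The values for each "H_name" should be the sum over all hours of a day, and "ID" should be the mean over all hours of a day. Thanks for your help =)
My df before the pivot: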
H_name Date ID Value
0 Hospital1 2019-10-01 11:00 25 12
1 Hospital2 2019-10-01 11:00 25 10
2 Hospital3 2019-10-01 11:00 25 15
3 Hospital1 2019-10-01 12:00 24 15
4 Hospital2 2019-10-01 12:00 24 17
5 Hospital3 2019-10-01 12:00 24 20
.... .... ... ...
680 Hospital1 2019-10-30 15:00 20 11
681 Hospital2 2019-10-30 15:00 20 18
682 Hospital3 2019-10-30 15:00 20 17
If I understand correctly, you want to group the data by date (Value by np.sum and ID by np.mean) and then build a pivot table:
import numpy as np
import pandas as pd
h_name = ['Hospital1', 'Hospital2', 'Hospital3', 'Hospital1', 'Hospital2', 'Hospital3',
'Hospital1', 'Hospital2', 'Hospital3', 'Hospital1', 'Hospital2', 'Hospital3']
date = ['2019-10-01 11:00', '2019-10-01 11:00', '2019-10-01 11:00', '2019-10-01 12:00', '2019-10-01 12:00', '2019-10-01 12:00',
'2019-10-02 11:00', '2019-10-02 11:00', '2019-10-02 11:00', '2019-10-02 12:00', '2019-10-02 12:00', '2019-10-02 12:00']
ids = [25, 25, 25, 24, 24, 24,
23, 23, 23, 22, 22, 22]
value = [12, 10, 15, 15, 17, 20,
15, 16, 17, 14, 13, 22]
df = pd.DataFrame({'H_name': h_name, 'Date': date, 'ID': ids, 'Value': value})
df['Date'] = pd.to_datetime(df['Date'], utc=False)
print(df)
The data in df looks like this:
H_name Date ID Value
0 Hospital1 2019-10-01 11:00:00 25 12
1 Hospital2 2019-10-01 11:00:00 25 10
2 Hospital3 2019-10-01 11:00:00 25 15
3 Hospital1 2019-10-01 12:00:00 24 15
4 Hospital2 2019-10-01 12:00:00 24 17
5 Hospital3 2019-10-01 12:00:00 24 20
6 Hospital1 2019-10-02 11:00:00 23 15
7 Hospital2 2019-10-02 11:00:00 23 16
8 Hospital3 2019-10-02 11:00:00 23 17
9 Hospital1 2019-10-02 12:00:00 22 14
10 Hospital2 2019-10-02 12:00:00 22 13
11 Hospital3 2019-10-02 12:00:00 22 22
Then:
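One way to do this step, as a sketch rather than necessarily the exact code from the answer: derive a day column `Date_1`, replace `ID` with its daily mean, and pivot with `Value` summed. The variable name `result` and the use of `groupby(...).transform('mean')` are assumptions made for this illustration.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'H_name': ['Hospital1', 'Hospital2', 'Hospital3'] * 4,
    'Date': pd.to_datetime(
        ['2019-10-01 11:00'] * 3 + ['2019-10-01 12:00'] * 3
        + ['2019-10-02 11:00'] * 3 + ['2019-10-02 12:00'] * 3),
    'ID': [25] * 3 + [24] * 3 + [23] * 3 + [22] * 3,
    'Value': [12, 10, 15, 15, 17, 20, 15, 16, 17, 14, 13, 22],
})

# Keep only the calendar day of each timestamp.
df['Date_1'] = df['Date'].dt.date

# Replace each row's ID with the mean ID of its day.
df['ID'] = df.groupby('Date_1')['ID'].transform('mean')

# Sum Value per hospital and day; the daily mean ID becomes
# the second level of the column index.
result = df.pivot_table(index='H_name', columns=['Date_1', 'ID'],
                        values='Value', aggfunc='sum')
print(result)
```

Because `ID` is constant within a day after the transform, adding it to `columns` does not split any day into several columns; it only attaches the daily mean as a label.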
This prints:
Date_1     2019-10-01  2019-10-02
ID               24.5        22.5
H_name
Hospital1          27          29
Hospital2          27          29
Hospital3          35          39
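Thanks for your reply. That is exactly the kind of thing I meant. But do you know how to normalize the time series when my df['Date'] is a pandas._libs.tslibs.timestamps.Timestamp rather than a str?
@SimiWien I have updated my answer. Basically, you can do df['Date_1'] = df.Date.dt.date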
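To illustrate the suggestion in the comment above, here is a small sketch that assumes the column already holds pandas timestamps (dtype datetime64). `dt.date` is what the comment proposes; `dt.normalize()` is an alternative not mentioned in the thread and is shown only for comparison. The variable names are made up for this example.

```python
import pandas as pd

ts = pd.Series(pd.to_datetime(
    ['2019-10-01 11:00', '2019-10-01 12:00', '2019-10-02 11:00']))

# dt.date yields plain datetime.date objects (object dtype).
as_date = ts.dt.date

# dt.normalize() keeps the datetime64 dtype and sets the time to midnight.
as_midnight = ts.dt.normalize()

print(as_date.tolist())
print(as_midnight.tolist())
```

Either result can serve as a grouping key per day; `dt.normalize()` may be the more convenient choice when later steps still need datetime operations on the column.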