How do I pivot a table from a CSV horizontally in Python using a DataFrame?


I have data in this format:

MonthYear     HPI  Div  State_fips
1-1993     105.45    7           5
2-1993     105.58    7           5
3-1993     106.23    7           5
4-1993     106.63    7           5

The required pivot table is:

Stafips  1-1993  2-1993  3-1993  4-1993
5        105.45  105.58  106.23  106.63

(I'm brand new to pandas.)

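Use DataFrame.pivot, or set_index followed by unstack, when each State_fips/MonthYear pair occurs only once.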
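As a minimal sketch of the no-duplicates case, the snippet below rebuilds the question's four sample rows by hand (an assumption, since the real data comes from a CSV file) and reshapes them both ways. The variable names `wide` and `wide_alt` are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question; every (State_fips, MonthYear) pair is unique.
df = pd.DataFrame({
    "MonthYear": ["1-1993", "2-1993", "3-1993", "4-1993"],
    "HPI": [105.45, 105.58, 106.23, 106.63],
    "Div": [7, 7, 7, 7],
    "State_fips": [5, 5, 5, 5],
})

# One row per State_fips, one column per MonthYear, HPI as the cell values.
wide = df.pivot(index="State_fips", columns="MonthYear", values="HPI")

# The same reshaping done by building a two-level index and unstacking the inner level.
wide_alt = df.set_index(["State_fips", "MonthYear"])["HPI"].unstack()

print(wide)
```

Note that the MonthYear labels are strings, so the resulting columns are ordered as text rather than chronologically; with the four months shown here the two orders happen to agree.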

But if there are duplicate State_fips/MonthYear pairs, you need to aggregate with pivot_table or groupby; mean can be changed to sum, median, ...:

print (df)
  MonthYear     HPI  Div  State_fips
0    1-1993  105.45    7           5
1    2-1993  105.58    7           5
2    3-1993  106.23    7           5
3    4-1993  100.00    7           5 <-duplicates same 4-1993, 5
4    4-1993  200.00    7           5 <-duplicates same 4-1993, 5

df1 = df.pivot_table(index='State_fips', columns='MonthYear', values='HPI', aggfunc='mean')
print (df1)
MonthYear   1-1993  2-1993  3-1993  4-1993
State_fips                                
5           105.45  105.58  106.23   150.0 <- (100+200)/2 = 150

df1 = df.groupby(['State_fips', 'MonthYear'])['HPI'].mean().unstack()
print (df1)
MonthYear   1-1993  2-1993  3-1993  4-1993
State_fips                                
5           105.45  105.58  106.23   150.0 <- (100+200)/2 = 150
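To illustrate why aggregation is needed and how the aggregation function can be swapped, here is a small sketch using the five-row sample with the duplicated 4-1993 entry. The frame is built by hand and the names `pivot_failed`, `total` and `largest` are hypothetical; `sum` and `max` are just two possible choices.

```python
import pandas as pd

# Five sample rows, where (State_fips=5, MonthYear="4-1993") appears twice.
df = pd.DataFrame({
    "MonthYear": ["1-1993", "2-1993", "3-1993", "4-1993", "4-1993"],
    "HPI": [105.45, 105.58, 106.23, 100.00, 200.00],
    "Div": [7] * 5,
    "State_fips": [5] * 5,
})

# Plain pivot cannot decide which of the two 4-1993 values to keep, so it raises.
pivot_failed = False
try:
    df.pivot(index="State_fips", columns="MonthYear", values="HPI")
except ValueError as exc:
    pivot_failed = True
    print("pivot failed:", exc)

# pivot_table resolves duplicates with the chosen aggregation, here a sum.
total = df.pivot_table(index="State_fips", columns="MonthYear",
                       values="HPI", aggfunc="sum")

# The groupby route works the same way; here the largest value per pair is kept.
largest = df.groupby(["State_fips", "MonthYear"])["HPI"].max().unstack()

print(total)
print(largest)
```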
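
Finally, to turn State_fips from the index into a regular column and remove the MonthYear name from the columns axis:

df1 = df1.reset_index().rename_axis(None, axis=1)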
print (df1)
   State_fips  1-1993  2-1993  3-1993  4-1993
0           5  105.45  105.58  106.23   150.0
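Putting the steps together, the following is a sketch of the whole path from CSV to the wide table. It assumes a comma-separated file with the four column names from the question; the actual delimiter and file name are not given, so an in-memory string stands in for the file here.

```python
import io

import pandas as pd

# Stand-in for the real file (assumption): in practice, pass the file path to pd.read_csv.
csv_text = """MonthYear,HPI,Div,State_fips
1-1993,105.45,7,5
2-1993,105.58,7,5
3-1993,106.23,7,5
4-1993,106.63,7,5
"""
df = pd.read_csv(io.StringIO(csv_text))

wide = (
    # Aggregating with mean is harmless when pairs are unique and safe when they are not.
    df.pivot_table(index="State_fips", columns="MonthYear", values="HPI", aggfunc="mean")
      # Move State_fips out of the index into an ordinary column.
      .reset_index()
      # Drop the "MonthYear" label that pivot_table leaves on the columns axis.
      .rename_axis(None, axis=1)
)

print(wide)
```

If the result needs to go back to disk, `wide.to_csv(...)` with `index=False` would write it without the extra row-number column.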