Python: understanding MultiIndex
I have a sample dataset like this in a CSV file:
name team date score
John A 3/9/12 100
John B 3/9/12 99
Jane B 4/9/12 102
Peter A 9/9/12 103
Josie C 11/9/12 111
Rachel A 30/10/12 98
Kate B 31/10/12 103
David C 1/11/12 104
Running the following:
from pandas.io.parsers import read_csv
df = read_csv("data/Workbook1.csv", index_col=["team", "name"])
df
                 date  score
team name
A    John      3/9/12    100
B    John      3/9/12     99
     Jane      4/9/12    102
A    Peter     9/9/12    103
C    Josie    11/9/12    111
A    Rachel  30/10/12     98
B    Kate    31/10/12    103
C    David    1/11/12    104
How can I further collapse the first index level ("team") so that its values are not repeated? The result should look like this:
                 date  score
team name
A    John      3/9/12    100
     Peter     9/9/12    103
     Rachel  30/10/12     98
B    John      3/9/12     99
     Jane      4/9/12    102
     Kate    31/10/12    103
C    Josie    11/9/12    111
     David    1/11/12    104
I figured it out myself:
df = read_csv("data/Workbook1.csv")
df
     name team      date  score
0    John    A    3/9/12    100
1    John    B    3/9/12     99
2    Jane    B    4/9/12    102
3   Peter    A    9/9/12    103
4   Josie    C   11/9/12    111
5  Rachel    A  30/10/12     98
6    Kate    B  31/10/12    103
7   David    C   1/11/12    104
# pivot() requires keyword arguments in recent pandas versions
df2 = df.pivot(index='team', columns='name').stack()
df2
                 date  score
team name
A    John      3/9/12    100
     Peter     9/9/12    103
     Rachel  30/10/12     98
B    Jane      4/9/12    102
     John      3/9/12     99
     Kate    31/10/12    103
C    David    1/11/12    104
     Josie    11/9/12    111
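The grouping in the output above comes from the index ending up sorted: pandas only blanks out a repeated outer label when the same label sits directly above it. A minimal sketch of a shorter route to the same layout, assuming the sample rows are inlined through io.StringIO as a stand-in for data/Workbook1.csv and are whitespace-separated (the names raw and sorted_df are illustrative only):

```python
import io

import pandas as pd

# Assumption: the sample rows inlined as whitespace-separated text,
# standing in for data/Workbook1.csv.
raw = """name team date score
John A 3/9/12 100
John B 3/9/12 99
Jane B 4/9/12 102
Peter A 9/9/12 103
Josie C 11/9/12 111
Rachel A 30/10/12 98
Kate B 31/10/12 103
David C 1/11/12 104
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+", index_col=["team", "name"])

# Sorting the MultiIndex places equal team labels next to each other,
# so the printed frame shows each team only once.
sorted_df = df.sort_index()
print(sorted_df)
```

Sorting orders the names alphabetically within each team, which matches the row order produced by the pivot/stack approach rather than the original file order.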
As an alternative solution, if for any reason you want to keep the MultiIndex in the read_csv statement, using the same dataset:
import pandas as pd
df = pd.read_csv("Workbook1.csv", index_col=["team", "name"])
df.stack().unstack()
                  date  score
team name
A    John     3/9/2012    100
     Peter    9/9/2012    103
     Rachel   30/10/12     98
B    Jane     4/9/2012    102
     John     3/9/2012     99
     Kate     31/10/12    103
C    David   1/11/2012    104
     Josie   11/9/2012    111
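To see how the stack/unstack round trip relates to a plain sort, the two can be compared directly. This is only a sketch under the same assumptions as the data shown in the question: the rows are inlined through io.StringIO instead of being read from Workbook1.csv, and the names raw, roundtrip and by_sort are made up for illustration.

```python
import io

import pandas as pd

# Assumption: the sample rows inlined as whitespace-separated text,
# standing in for Workbook1.csv.
raw = """name team date score
John A 3/9/12 100
John B 3/9/12 99
Jane B 4/9/12 102
Peter A 9/9/12 103
Josie C 11/9/12 111
Rachel A 30/10/12 98
Kate B 31/10/12 103
David C 1/11/12 104
"""

df = pd.read_csv(io.StringIO(raw), sep=r"\s+", index_col=["team", "name"])

roundtrip = df.stack().unstack()  # reshape to a Series and back
by_sort = df.sort_index()         # plain sort, for comparison

# Both results list the rows in the same (team, name) order.
print(roundtrip.index.equals(by_sort.index))

# The round trip passes every value through a single Series, so the
# column dtypes may differ from the original frame; print both to check.
print(df.dtypes)
print(roundtrip.dtypes)
```

If the dtypes matter downstream, sort_index() may be the safer choice, since it reorders the rows without reshaping the values.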