How do I retrieve data from a pandas DataFrame with multi-level headers?
I have a CSV file in the following format:
Level1_head1 Level1_head2 Level1_head3
Level2_head1 Level2_head2 Level2_head3
ID
S0000001 someValue someValue someValue
S0000002 someValue someValue someValue
S0000003 someValue someValue someValue
S0000004 someValue someValue someValue
S0000005 someValue someValue someValue
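Note that the cell above ID is empty, and the cells to the right of ID are empty as well.
I loaded the data above into a pandas DataFrame object, df, and tried to extract the ID column from it: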
import pandas as pd

df = pd.read_csv("data.csv", header=[0, 1], index_col=0)
date_series = df[0:]
However, I get the whole DataFrame back instead of a single column. Printing the DataFrame shows:
Level2_head1 Level2_head2 Level2_head3
ID
S0000001 someValue someValue someValue
S0000002 someValue someValue someValue
S0000003 someValue someValue someValue
S0000004 someValue someValue someValue
S0000005 someValue someValue someValue
I also tried:
date_series = df['ID']
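and an attribute-style lookup of ID on df.
However, the former raises a KeyError: df cannot find a key with the value 'ID'. The latter fails with an error saying that df has no attribute 'ID'.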
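I'm completely confused now. How do I retrieve the first column, the one containing the IDs (ID)?

You can't use
date_series = df['ID']
because ID is the name of the index, not a column.
Instead, take the first column from the index and convert it to a Series with to_series: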
print(df)
Level1_head1 Level1_head2 Level1_head3
Level2_head1 Level2_head2 Level2_head3
ID
S0000001 someValue someValue someValue
S0000002 someValue someValue someValue
S0000003 someValue someValue someValue
S0000004 someValue someValue someValue
S0000005 someValue someValue someValue
print(df.index.name)
ID
print(df.index)
Index(['S0000001', 'S0000002', 'S0000003', 'S0000004', 'S0000005'], dtype='object', name='ID')
print(df.index.to_series())
ID
S0000001 S0000001
S0000002 S0000002
S0000003 S0000003
S0000004 S0000004
S0000005 S0000005
Name: ID, dtype: object
# if you need to reset the index
print(df.index.to_series().reset_index(drop=True))
0 S0000001
1 S0000002
2 S0000003
3 S0000004
4 S0000005
Name: ID, dtype: object
Or use the solution with pd.Series:
print(pd.Series(df.index))
0 S0000001
1 S0000002
2 S0000003
3 S0000004
4 S0000005
Name: ID, dtype: object
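If you want to try this end to end, here is a minimal sketch. The inline CSV text and its numeric values are made up for illustration (the question only shows someValue placeholders), and the variable names ids, id_list, col and has_id_column are my own. It assumes the file has the same layout as in the question: two header rows, then a row holding only the index name.

```python
import io

import pandas as pd

# Hypothetical sample data that mirrors the layout from the question:
# two header rows, then a row that holds only the index name "ID".
csv_text = """\
,Level1_head1,Level1_head2,Level1_head3
,Level2_head1,Level2_head2,Level2_head3
ID,,,
S0000001,1,2,3
S0000002,4,5,6
"""

# Same read_csv arguments as in the question.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)

# "ID" is the name of the index, so it is not among the columns.
has_id_column = "ID" in df.columns

# The IDs live in the index; convert it to a Series or a plain list.
ids = df.index.to_series()
id_list = df.index.tolist()

# A data column under a two-level header is addressed with a tuple key.
col = df[("Level1_head1", "Level2_head1")]

print(has_id_column)
print(ids)
print(col)
```

The tuple key is only needed for the data columns. The IDs never need it, because index_col=0 moved them out of the columns and into the index.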
If my answer was helpful, don't forget to upvote it. Thanks.