Python: using index values as category values in a DataFrame
I have the following DataFrame:
beat1 beat2 beat3 beat4 beat5 beat6 beat7
filename
M40_HC_503d.dat 0.7456 0.8574 0.7695 0.8698 0.8315 0.7908 0.8823
M30_HC_461d.dat 0.7672 0.6682 0.7452 0.6853 0.7488 0.6782 0.6648
M24_HC_459d.dat 0.6041 0.6439 0.5870 0.7452 0.6714 0.6684 0.6198
M48_HC_543d.dat 0.8949 0.8570 0.9338 1.0545 1.0681 1.0775 0.8425
M40_HC_506d.dat 0.7862 0.8917 0.9357 0.8250 0.8521 0.7146 0.7125
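I want to build another DataFrame in which the column names beat1 through beat7 become the index, and which has two columns. The first column should hold all the values from beat1 to beat7, and the second column should hold the filename each value came from. Something like this: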
values filename
ind
0 0.7456 M40_HC_503d.dat
1 0.8574 M40_HC_503d.dat
2 0.7695 M40_HC_503d.dat
3 0.8698 M40_HC_503d.dat
4 0.8315 M40_HC_503d.dat
5 0.7908 M40_HC_503d.dat
6 0.8823 M40_HC_503d.dat
7 0.7672 M30_HC_461d.dat
8 0.6682 M30_HC_461d.dat
9 0.7452 M30_HC_461d.dat
10 0.6853 M30_HC_461d.dat
11 0.7488 M30_HC_461d.dat
12 0.6782 M30_HC_461d.dat
13 0.6648 M30_HC_461d.dat
I have tried many approaches, including transposing, but none of them worked. Any ideas?
I think you need:
df = df.stack().reset_index(0, name='values').reset_index(drop=True)
print(df)
filename values
0 M40_HC_503d.dat 0.7456
1 M40_HC_503d.dat 0.8574
2 M40_HC_503d.dat 0.7695
3 M40_HC_503d.dat 0.8698
4 M40_HC_503d.dat 0.8315
5 M40_HC_503d.dat 0.7908
6 M40_HC_503d.dat 0.8823
7 M30_HC_461d.dat 0.7672
8 M30_HC_461d.dat 0.6682
9 M30_HC_461d.dat 0.7452
10 M30_HC_461d.dat 0.6853
...
...
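For anyone who wants to try the `stack` approach without the original files, here is a minimal self-contained sketch. It assumes a cut-down copy of the question's data (two files, three beats), and the name `long_df` is only illustrative. The last step reorders the columns, since the question asked for `values` first and `filename` second.

```python
import pandas as pd

# Cut-down sample copied from the question (two files, three beats).
df = pd.DataFrame(
    {
        "beat1": [0.7456, 0.7672],
        "beat2": [0.8574, 0.6682],
        "beat3": [0.7695, 0.7452],
    },
    index=pd.Index(["M40_HC_503d.dat", "M30_HC_461d.dat"], name="filename"),
)

# stack() moves the beat columns into an inner index level, giving a Series
# indexed by (filename, beat). reset_index(level=0, ...) turns the filename
# level into a column; the final reset_index(drop=True) discards the beat labels.
long_df = (
    df.stack()
      .reset_index(level=0, name="values")
      .reset_index(drop=True)
)

# Put the columns in the order requested in the question.
long_df = long_df[["values", "filename"]]
print(long_df)
```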
Or, if you need to change the index:
df = df.stack().reset_index(0, name='values')
df.index = df.index.str.extract(r'(\d+)', expand=False)
print(df)
filename values
1 M40_HC_503d.dat 0.7456
2 M40_HC_503d.dat 0.8574
3 M40_HC_503d.dat 0.7695
4 M40_HC_503d.dat 0.8698
5 M40_HC_503d.dat 0.8315
6 M40_HC_503d.dat 0.7908
7 M40_HC_503d.dat 0.8823
1 M30_HC_461d.dat 0.7672
2 M30_HC_461d.dat 0.6682
...
...
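One detail worth noting: `str.extract` returns strings, so the index labels end up as '1', '2', ... rather than integers. If a numeric index is wanted, a cast can be added. The sketch below assumes a cut-down copy of the question's data, and the name `out` is only illustrative.

```python
import pandas as pd

# Cut-down sample copied from the question (two files, two beats).
df = pd.DataFrame(
    {"beat1": [0.7456, 0.7672], "beat2": [0.8574, 0.6682]},
    index=pd.Index(["M40_HC_503d.dat", "M30_HC_461d.dat"], name="filename"),
)

# After stacking, the remaining index holds the labels "beat1", "beat2", ...
out = df.stack().reset_index(level=0, name="values")

# Pull the digits out of each label, then cast them from str to int.
out.index = out.index.str.extract(r"(\d+)", expand=False).astype(int)
print(out)
```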
This worked for me. Thanks.
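import numpy as np
import pandas as pd

# NumPy alternative: flatten the values row by row and repeat each filename
# once per beat column, then glue the two columns together.
v = df.values
i = df.index.values
pd.DataFrame(
    np.hstack([v.reshape(-1, 1), i.repeat(v.shape[1])[:, None]]),
    columns=['values', 'filename']
)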
values filename
0 0.7456 M40_HC_503d.dat
1 0.8574 M40_HC_503d.dat
2 0.7695 M40_HC_503d.dat
3 0.8698 M40_HC_503d.dat
4 0.8315 M40_HC_503d.dat
5 0.7908 M40_HC_503d.dat
6 0.8823 M40_HC_503d.dat
7 0.7672 M30_HC_461d.dat
8 0.6682 M30_HC_461d.dat
9 0.7452 M30_HC_461d.dat
...
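A caveat on the NumPy version: `np.hstack` has to fit floats and strings into a single array, so the `values` column typically comes back as `object` rather than `float64`. The sketch below is a variant, not the code from the answer above. It builds the two columns separately so each keeps its own dtype, and it assumes a cut-down copy of the question's data, with `result` as an illustrative name.

```python
import numpy as np
import pandas as pd

# Cut-down sample copied from the question (two files, three beats).
df = pd.DataFrame(
    {
        "beat1": [0.7456, 0.7672],
        "beat2": [0.8574, 0.6682],
        "beat3": [0.7695, 0.7452],
    },
    index=pd.Index(["M40_HC_503d.dat", "M30_HC_461d.dat"], name="filename"),
)

v = df.to_numpy()

result = pd.DataFrame(
    {
        # ravel() flattens row by row, so each file's beats stay together.
        "values": v.ravel(),
        # Repeat every filename once per beat column to line up with the values.
        "filename": np.repeat(df.index.to_numpy(), v.shape[1]),
    }
)
print(result)
print(result.dtypes)
```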