Python: split a DataFrame starting from a given column, but keep the first two columns
I have the following data frame:
import pandas as pd
from numpy.random import randn  # randn was used without being imported

df = pd.DataFrame({'Probe': ['a', 'b', 'c', 'd', 'e'],
                   'Gene': ['one', 'two', 'three', 'four', 'five'],
                   'X': randn(5), 'Y': randn(5)})
It looks like this:
In [20]: df
Out[20]:
    Gene Probe         X         Y
0    one     a  0.104504  1.089442
1    two     b  0.030071  0.696786
2  three     c  1.224704  1.077867
3   four     d -0.052333  0.034292
4   five     e -0.283872  0.602743
What I want to do is split this data frame on column X while keeping the first and second columns, yielding:
    Gene Probe         X
0    one     a  0.104504
1    two     b  0.030071
2  three     c  1.224704
3   four     d -0.052333
4   five     e -0.283872
and, likewise, a second frame with the columns Gene, Probe and Y.
I tried this, but it does not do what I expected:
for dfs in df.groupby(['Probe', 'Gene']):
    print(dfs)
What is the right way to do this?
This would be a start:
df_x = df.loc[:, ['Gene', 'Probe', 'X']]
df_y = df.loc[:, ['Gene', 'Probe', 'Y']]
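The two selections above can be generalized to any number of value columns. The following is only a minimal sketch: the helper name split_by_value_columns and its key_cols parameter are hypothetical, not part of pandas, and it assumes that every column outside key_cols should get its own frame.

```python
import numpy as np
import pandas as pd


def split_by_value_columns(df, key_cols):
    """Return {column name: sub-frame}; each sub-frame holds key_cols plus one value column."""
    key_cols = list(key_cols)
    # Every column that is not a key column is treated as a value column (assumption).
    value_cols = [c for c in df.columns if c not in key_cols]
    return {c: df.loc[:, key_cols + [c]] for c in value_cols}


rng = np.random.default_rng(0)  # seeded only so the example is repeatable
df = pd.DataFrame({
    'Gene': ['one', 'two', 'three', 'four', 'five'],
    'Probe': ['a', 'b', 'c', 'd', 'e'],
    'X': rng.standard_normal(5),
    'Y': rng.standard_normal(5),
})

frames = split_by_value_columns(df, ['Gene', 'Probe'])
print(frames['X'])
print(frames['Y'])
```

A dict keyed by column name avoids one variable per column, which may be convenient if there are more value columns than just X and Y.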
You can use difference to drop the columns you are not interested in, and then select the remaining columns:
In [9]:
X = df[df.columns.difference(['Y'])]
Y = df[df.columns.difference(['X'])]
print(X)
Y
    Gene Probe         X
0    one     a  1.231749
1    two     b  0.519425
2  three     c  0.849960
3   four     d -0.077796
4   five     e  1.224163
Out[9]:
    Gene Probe         Y
0    one     a  0.022695
1    two     b  0.500311
2  three     c -0.163624
3   four     d  0.411491
4   five     e  1.305214
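One caveat with difference: by default it returns the remaining labels sorted, so the column order of the result can differ from the original frame. If the original order matters, DataFrame.drop may be a safer choice. The sketch below uses a small made-up frame in which Probe deliberately comes before Gene to make the difference visible.

```python
import pandas as pd

# Small illustrative frame; Probe is placed before Gene on purpose.
df = pd.DataFrame({
    'Probe': ['a', 'b'],
    'Gene': ['one', 'two'],
    'X': [0.1, 0.2],
    'Y': [1.0, 2.0],
})

# Index.difference sorts the remaining labels by default.
by_difference = df[df.columns.difference(['Y'])]

# DataFrame.drop keeps the remaining columns in their original order.
by_drop = df.drop(columns=['Y'])

print(list(by_difference.columns))  # ['Gene', 'Probe', 'X']
print(list(by_drop.columns))        # ['Probe', 'Gene', 'X']
```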
So you want two data frames?