Python: split a data frame starting from a defined column, but keep the first two columns

python pandas

I have the following data frame:

import pandas as pd
from numpy.random import randn  # randn was used below without being imported

df = pd.DataFrame({'Probe' : ['a', 'b', 'c', 'd','e'],
                 'Gene' : ['one', 'two','three','four','five'],
                 'X' : randn(5), 'Y' : randn(5)})

It looks like this:

In [20]: df
Out[20]:
    Gene Probe         X         Y
0    one     a  0.104504  1.089442
1    two     b  0.030071  0.696786
2  three     c  1.224704  1.077867
3   four     d -0.052333  0.034292
4   five     e -0.283872  0.602743

What I want to do is split this data frame by column while keeping the first and second columns, yielding:

    Gene Probe         X
0    one     a  0.104504
1    two     b  0.030071
2  three     c  1.224704
3   four     d -0.052333
4   five     e -0.283872

and a second data frame with the columns Gene, Probe, and Y.

I tried this, but it did not do what I expected:

for dfs in df.groupby(['Probe','Gene']):
    print(dfs)

What is the right way to do this?

This would be a start:

df_x = df.loc[:, ['Gene', 'Probe', 'X']]
df_y = df.loc[:, ['Gene', 'Probe', 'Y']]
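If there are more value columns than just X and Y, the same selection can be generalized. The following is only a minimal sketch, assuming the first two columns are the ones to keep and every remaining column should get its own data frame. The names key_cols, value_cols and frames are illustrative and not from the question, and the sample frame is built with Gene first so that the column order matches the output shown in the question.

```python
import numpy as np
import pandas as pd

# Sample data shaped like the frame in the question (values are random).
df = pd.DataFrame({
    "Gene": ["one", "two", "three", "four", "five"],
    "Probe": ["a", "b", "c", "d", "e"],
    "X": np.random.randn(5),
    "Y": np.random.randn(5),
})

# Assumption: the first two columns are the identifying columns to keep.
key_cols = list(df.columns[:2])
value_cols = list(df.columns[2:])

# Build one data frame per value column, each carrying the key columns.
frames = {col: df[key_cols + [col]] for col in value_cols}

print(frames["X"])
print(frames["Y"])
```

Here frames["X"] plays the role of df_x and frames["Y"] the role of df_y, so adding another value column to df would need no further code.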

You can use difference to drop the columns you are not interested in, and then select the remaining columns:

In [9]:

X = df[df.columns.difference(['Y'])]
Y = df[df.columns.difference(['X'])]
print(X)
Y
    Gene Probe         X
0    one     a  1.231749
1    two     b  0.519425
2  three     c  0.849960
3   four     d -0.077796
4   five     e  1.224163
Out[9]:
    Gene Probe         Y
0    one     a  0.022695
1    two     b  0.500311
2  three     c -0.163624
3   four     d  0.411491
4   five     e  1.305214
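One thing to keep in mind with this approach: by default, difference returns the remaining labels in sorted order, which happens to match the Gene, Probe, X layout here. A small sketch of the alternatives, assuming pandas 0.24 or later for the sort parameter; the variable names sorted_cols, kept_order and dropped are illustrative only:

```python
import pandas as pd

# Small frame with the same columns as in the question, Probe listed first.
df = pd.DataFrame({
    "Probe": ["a", "b"],
    "Gene": ["one", "two"],
    "X": [0.1, 0.2],
    "Y": [1.0, 2.0],
})

# Default behaviour: the resulting labels are sorted.
sorted_cols = df.columns.difference(["Y"])

# sort=False keeps the labels in their original column order.
kept_order = df.columns.difference(["Y"], sort=False)

# DataFrame.drop also keeps the original order and returns a new frame.
dropped = df.drop(columns=["Y"])

print(list(sorted_cols))
print(list(kept_order))
print(list(dropped.columns))
```

If the original column order matters, sort=False or drop is probably the safer choice.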

So you want two data frames?
