Python: split a DataFrame starting from a given column, but keep the first two columns [python, pandas]

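I have the following DataFrame:

import numpy as np
import pandas as pd
# randn was used without an import in the original; np.random.randn is assumed here
df = pd.DataFrame({'Probe' : ['a', 'b', 'c', 'd','e'],
                 'Gene' : ['one', 'two','three','four','five'],
                 'X' : np.random.randn(5), 'Y' : np.random.randn(5)})
It looks like this: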

In [20]: df
Out[20]:
    Gene Probe         X         Y
0    one     a  0.104504  1.089442
1    two     b  0.030071  0.696786
2  three     c  1.224704  1.077867
3   four     d -0.052333  0.034292
4   five     e -0.283872  0.602743
What I want to do is split this DataFrame by column X while keeping the first and second columns, yielding:

    Gene Probe         X
0    one     a  0.104504
1    two     b  0.030071
2  three     c  1.224704
3   four     d -0.052333
4   five     e -0.283872

I tried this, but it does not do what I expected:

for dfs in df.groupby(['Probe','Gene']):
    print(dfs)
What is the right way to do it?

This would be a start:

df_x = df.loc[:, ['Gene', 'Probe', 'X']]
df_y = df.loc[:, ['Gene', 'Probe', 'Y']]
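
With more than two value columns, the same selection can be written as a loop instead of one line per column. The following is only a minimal sketch, assuming the first two columns are the ones to keep in every piece; the names `key_cols`, `value_cols` and `pieces`, as well as the fixed seed, are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
df = pd.DataFrame({'Gene': ['one', 'two', 'three', 'four', 'five'],
                   'Probe': ['a', 'b', 'c', 'd', 'e'],
                   'X': rng.standard_normal(5),
                   'Y': rng.standard_normal(5)})

key_cols = list(df.columns[:2])    # the two columns kept in every piece
value_cols = list(df.columns[2:])  # every column from 'X' onward

# Build one DataFrame per value column, each carrying the key columns along
pieces = {col: df.loc[:, key_cols + [col]] for col in value_cols}

print(pieces['X'])
```

`pieces['X']` and `pieces['Y']` then play the role of `df_x` and `df_y` above.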

You can use difference to drop the columns you are not interested in, then select the remaining columns:

In [9]:

X = df[df.columns.difference(['Y'])]
Y = df[df.columns.difference(['X'])]
print(X)
Y
    Gene Probe         X
0    one     a  1.231749
1    two     b  0.519425
2  three     c  0.849960
3   four     d -0.077796
4   five     e  1.224163
Out[9]:
    Gene Probe         Y
0    one     a  0.022695
1    two     b  0.500311
2  three     c -0.163624
3   four     d  0.411491
4   five     e  1.305214
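
One thing to be aware of with this approach: by default `Index.difference` returns the remaining labels in sorted order, which happens to match the column order shown here. If the original column order matters, `DataFrame.drop` is an alternative that preserves it. A small sketch, using a hypothetical frame whose columns are deliberately not in alphabetical order:

```python
import pandas as pd

# Hypothetical data; 'Probe' is placed before 'Gene' on purpose
df = pd.DataFrame({'Probe': ['a', 'b'],
                   'Gene': ['one', 'two'],
                   'X': [0.1, 0.2],
                   'Y': [1.0, 2.0]})

# difference() returns the labels sorted, so 'Gene' moves ahead of 'Probe'
by_difference = df[df.columns.difference(['Y'])]
print(list(by_difference.columns))  # ['Gene', 'Probe', 'X']

# drop() keeps the remaining columns in their original order
by_drop = df.drop(columns=['Y'])
print(list(by_drop.columns))        # ['Probe', 'Gene', 'X']
```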

So you want two DataFrames?