Python: concatenating DataFrames without dropping the row names


I have two DataFrames, df1 and df2. They are created with the following code:

import pandas as pd
df1 = pd.DataFrame([["Probe1", "Gene1", 3,11], 
                    ["Probe1", "Gene2", 6,10],
                    ["Probe2","Gene2", 13,18]], 
        columns=['probe', 'gene', 'Sample1', "Sample2"]).set_index(['probe', 'gene'])
df1.columns.names = ['Sample']
# Note that number of samples can be more than two


df2 = df1.copy()
df2[df2>0] = 1.00
They look like this:

In [74]: df1
Out[74]:
Sample        Sample1  Sample2
probe  gene
Probe1 Gene1        3       11
       Gene2        6       10
Probe2 Gene2       13       18

In [75]: df2
Out[75]:
Sample        Sample1  Sample2
probe  gene
Probe1 Gene1        1        1
       Gene2        1        1
Probe2 Gene2        1        1
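What I want to do is concatenate these two DataFrames side by side, so that the result can finally be written to a CSV file that looks like this: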

PROBE  GENE      SMPL1    SMPL2 PROBE  GENE      SMPL1    SMPL2
Probe1 Gene1        3       11  Probe1 Gene1      1        1
Probe1 Gene2        6       10  Probe1 Gene2      1        1
Probe2 Gene2       13       18  Probe2 Gene2      1        1
This is where I am stuck:
pd.concat(ndf,axis=1)


What is the right way to do it?

Resetting the index on both frames before concatenating should give you what you need:

pd.concat([df1.reset_index(),df2.reset_index()],axis=1)
Output:

Sample   probe   gene  Sample1  Sample2   probe   gene  Sample1  Sample2

0       Probe1  Gene1        3       11  Probe1  Gene1        1        1
1       Probe1  Gene2        6       10  Probe1  Gene2        1        1
2       Probe2  Gene2       13       18  Probe2  Gene2        1        1
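
Why this works can be sketched as follows: with axis=1, concat aligns rows on the index, so a shared probe/gene index appears only once, whereas reset_index first turns those index levels into ordinary columns that each frame carries along. The snippet below is a minimal sketch under that reading. The names side_by_side and csv_text are illustrative, and to_csv is called without a path so that the CSV text comes back as a string rather than being written to a file.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11],
     ["Probe1", "Gene2", 6, 10],
     ["Probe2", "Gene2", 13, 18]],
    columns=["probe", "gene", "Sample1", "Sample2"],
).set_index(["probe", "gene"])

df2 = df1.copy()
df2[df2 > 0] = 1

# reset_index() moves the probe/gene index levels back into ordinary columns,
# so each frame contributes its own probe and gene columns to the result.
side_by_side = pd.concat([df1.reset_index(), df2.reset_index()], axis=1)

# index=False leaves out the 0..n-1 row counter.
# Calling to_csv() without a path returns the CSV text as a string.
csv_text = side_by_side.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument of to_csv writes the same text to disk instead.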

Try this; I generalized it to 4 samples:

import pandas as pd
df1 = pd.DataFrame([["Probe1", "Gene1", 3,11,30,100], 
                   ["Probe1", "Gene2", 6,10,100,23],
                   ["Probe2","Gene2", 13,18,20,77]], 
        columns=['probe', 'gene', 'Sample1', "Sample2","Sample3","Sample4"]).set_index(['probe', 'gene'])
df1.columns.names = ['Sample']


df2 = df1.copy()
df2[df2>0] = 1.00
ndf = [df1,df2]
fdf = pd.concat(ndf,axis=1)
fdf.reset_index(inplace=True)

ins1 = df1.shape[1]+2
ins2 = ins1 + 1
print(ins1, ins2)
fdf.insert(ins1,'probe2',fdf['probe'])
fdf.insert(ins2,'gene2',fdf['gene'])
fdf
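This gives a frame in which copies of the probe and gene columns, named probe2 and gene2, sit in front of the second block of sample columns.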
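
The hard-coded 2 in the position arithmetic is the number of index levels (probe and gene). As a hedged variation on the same idea, the sketch below reads that number from df1.index.nlevels and loops over the index names, so it should also cope with a different number of index levels. The variable name pos is illustrative, and the "2" suffix on the copied column names is simply carried over from the code above.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11, 30, 100],
     ["Probe1", "Gene2", 6, 10, 100, 23],
     ["Probe2", "Gene2", 13, 18, 20, 77]],
    columns=["probe", "gene", "Sample1", "Sample2", "Sample3", "Sample4"],
).set_index(["probe", "gene"])

df2 = df1.copy()
df2[df2 > 0] = 1

# Concatenate on the shared index, then turn the index levels into columns.
fdf = pd.concat([df1, df2], axis=1).reset_index()

# First position after the index columns and df1's sample columns.
# index.nlevels replaces the hard-coded 2.
pos = df1.index.nlevels + df1.shape[1]

# Insert a copy of every index column in front of df2's block of samples.
for offset, level in enumerate(df1.index.names):
    fdf.insert(pos + offset, f"{level}2", fdf[level])

print(fdf.columns.tolist())
```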




Use join, then reset_index:

In [1422]: df1
Out[1422]: 
Sample        Sample1  Sample2
probe  gene                   
Probe1 Gene1        3       11
       Gene2        6       10
Probe2 Gene2       13       18

In [1423]: df2
Out[1423]: 
Sample        Sample1  Sample2
probe  gene                   
Probe1 Gene1        1        1
       Gene2        1        1
Probe2 Gene2        1        1
In [1424]: df1.join(df2, rsuffix='df2').reset_index()
Out[1424]: 
Sample   probe   gene  Sample1  Sample2  Sample1df2  Sample2df2
0       Probe1  Gene1        3       11           1           1
1       Probe1  Gene2        6       10           1           1
2       Probe2  Gene2       13       18           1           1
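
Note that join keeps a single probe/gene pair and only renames the overlapping sample columns. If both blocks should be labelled symmetrically, lsuffix can be passed alongside rsuffix. The following is a minimal sketch; the suffix strings _raw and _flag are arbitrary illustrative choices, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11],
     ["Probe1", "Gene2", 6, 10],
     ["Probe2", "Gene2", 13, 18]],
    columns=["probe", "gene", "Sample1", "Sample2"],
).set_index(["probe", "gene"])

df2 = df1.copy()
df2[df2 > 0] = 1

# join() aligns on the shared (probe, gene) index.
# lsuffix/rsuffix are appended to column names that occur in both frames.
joined = df1.join(df2, lsuffix="_raw", rsuffix="_flag").reset_index()

print(joined.columns.tolist())
```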


No, it doesn't! I can't reproduce your results.
@pdubois Whoa! Edited.