Python: concatenating DataFrames without dropping the row names
I have two DataFrames, df1 and df2. They were created with the following code:
import pandas as pd
df1 = pd.DataFrame([["Probe1", "Gene1", 3,11],
["Probe1", "Gene2", 6,10],
["Probe2","Gene2", 13,18]],
columns=['probe', 'gene', 'Sample1', "Sample2"]).set_index(['probe', 'gene'])
df1.columns.names = ['Sample']
# Note that number of samples can be more than two
df2 = df1.copy()
df2[df2>0] = 1.00
They look like this:
In [74]: df1
Out[74]:
Sample Sample1 Sample2
probe gene
Probe1 Gene1 3 11
Gene2 6 10
Probe2 Gene2 13 18
In [75]: df2
Out[75]:
Sample Sample1 Sample2
probe gene
Probe1 Gene1 1 1
Gene2 1 1
Probe2 Gene2 1 1
What I want to do is concatenate these two DataFrames so that they can eventually be written to a CSV file that looks like this:
PROBE GENE SMPL1 SMPL2 PROBE GENE SMPL1 SMPL2
Probe1 Gene1 3 11 Probe1 Gene1 1 1
Probe1 Gene2 6 10 Probe1 Gene2 1 1
Probe2 Gene2 13 18 Probe2 Gene2 1 1
I'm stuck with this: pd.concat(ndf,axis=1)
What is the right way to do it?

Resetting the index should do what you need:
pd.concat([df1.reset_index(),df2.reset_index()],axis=1)
Output:
Sample probe gene Sample1 Sample2 probe gene Sample1 Sample2
0 Probe1 Gene1 3 11 Probe1 Gene1 1 1
1 Probe1 Gene2 6 10 Probe1 Gene2 1 1
2 Probe2 Gene2 13 18 Probe2 Gene2 1 1
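
Since the goal in the question is a CSV file, here is a minimal sketch of the whole round trip. It makes two assumptions that are not in the original: df2 is built with `astype(int)` instead of the masking assignment (only so the flags stay integers), and the CSV is written to a string rather than to a file. Passing a path to `to_csv` would write it to disk instead.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11],
     ["Probe1", "Gene2", 6, 10],
     ["Probe2", "Gene2", 13, 18]],
    columns=["probe", "gene", "Sample1", "Sample2"],
).set_index(["probe", "gene"])

# Assumption: a 0/1 flag frame standing in for df2 from the question.
df2 = (df1 > 0).astype(int)

# reset_index() turns the (probe, gene) index back into ordinary columns,
# so each frame carries its own copy of the row names into the result.
combined = pd.concat([df1.reset_index(), df2.reset_index()], axis=1)

# index=False keeps the 0..n-1 row counter out of the CSV text.
csv_text = combined.to_csv(index=False)
print(csv_text)
```

The repeated probe and gene headers are allowed because pandas permits duplicate column names in a concatenated frame. Selecting `combined["probe"]` afterwards returns both columns, which is worth keeping in mind if the frame is processed further before writing.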

Try this; I generalized it to 4 samples:
import pandas as pd
df1 = pd.DataFrame([["Probe1", "Gene1", 3,11,30,100],
["Probe1", "Gene2", 6,10,100,23],
["Probe2","Gene2", 13,18,20,77]],
columns=['probe', 'gene', 'Sample1', "Sample2","Sample3","Sample4"]).set_index(['probe', 'gene'])
df1.columns.names = ['Sample']
df2 = df1.copy()
df2[df2>0] = 1.00
ndf = [df1,df2]
fdf = pd.concat(ndf,axis=1)
fdf.reset_index(inplace=True)
ins1 = df1.shape[1]+2
ins2 = ins1 + 1
print(ins1, ins2)
fdf.insert(ins1,'probe2',fdf['probe'])
fdf.insert(ins2,'gene2',fdf['gene'])
fdf
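This gives a frame in which probe2 and gene2 (copies of probe and gene) sit between the two blocks of sample columns.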
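
The hard-coded 2 in ins1 is the number of index levels (probe and gene). As a rough sketch, the same idea can be wrapped in a function that reads that number from the frame itself. The function name `concat_repeating_index` and its `suffix` parameter are hypothetical, not part of pandas, and the sketch assumes every index level has a name.

```python
import pandas as pd


def concat_repeating_index(left, right, suffix="2"):
    """Place `right` beside `left` and repeat the index columns before `right`."""
    n_levels = left.index.nlevels
    combined = pd.concat([left, right], axis=1).reset_index()
    # First column position that belongs to `right`:
    # the index columns come first, then all of `left`'s columns.
    pos = n_levels + left.shape[1]
    for offset, name in enumerate(left.index.names):
        # Copy each index column under a new name, e.g. probe -> probe2.
        combined.insert(pos + offset, f"{name}{suffix}", combined[name])
    return combined


df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11, 30, 100],
     ["Probe1", "Gene2", 6, 10, 100, 23],
     ["Probe2", "Gene2", 13, 18, 20, 77]],
    columns=["probe", "gene", "Sample1", "Sample2", "Sample3", "Sample4"],
).set_index(["probe", "gene"])
df2 = (df1 > 0).astype(int)  # assumption: integer 0/1 flags

fdf = concat_repeating_index(df1, df2)
print(list(fdf.columns))
```

Unlike the reset_index approach, the copied columns get distinct names (probe2, gene2), so they can still be selected unambiguously.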

Use join, then reset_index:
In [1422]: df1
Out[1422]:
Sample Sample1 Sample2
probe gene
Probe1 Gene1 3 11
Gene2 6 10
Probe2 Gene2 13 18
In [1423]: df2
Out[1423]:
Sample Sample1 Sample2
probe gene
Probe1 Gene1 1 1
Gene2 1 1
Probe2 Gene2 1 1
In [1424]: df1.join(df2, rsuffix='df2').reset_index()
Out[1424]:
Sample probe gene Sample1 Sample2 Sample1df2 Sample2df2
0 Probe1 Gene1 3 11 1 1
1 Probe1 Gene2 6 10 1 1
2 Probe2 Gene2 13 18 1 1
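
A short sketch of why the suffix matters, using stand-ins for df1 and df2: both frames share the column names Sample1 and Sample2, so join raises an error unless a suffix is supplied. The `_df2` suffix and the `astype(int)` construction of df2 are illustrative assumptions.

```python
import pandas as pd

df1 = pd.DataFrame(
    [["Probe1", "Gene1", 3, 11],
     ["Probe1", "Gene2", 6, 10],
     ["Probe2", "Gene2", 13, 18]],
    columns=["probe", "gene", "Sample1", "Sample2"],
).set_index(["probe", "gene"])
df2 = (df1 > 0).astype(int)  # assumption: integer 0/1 flags

# Without a suffix, the overlapping column names make join raise ValueError.
try:
    df1.join(df2)
    overlap_error = None
except ValueError as exc:
    overlap_error = str(exc)
print(overlap_error)

# rsuffix renames only the overlapping columns coming from df2.
joined = df1.join(df2, rsuffix="_df2").reset_index()
print(list(joined.columns))
```

Because join aligns the two frames on their shared (probe, gene) index, the row names appear only once here, whereas the concat approach with reset_index on both frames repeats them.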

No, it doesn't! I can't reproduce your result.
@pdubois Whoa! Edited.