Python: join three tables and order the resulting columns in a specific way
I have three dataframes that differ in only one column. They list the mean, standard deviation, and count, as shown below:
Table 1
Name Treatment Pool_mean
ATP 1Week 100
ATP 4Week 500
ATP 16Weeks 1500
GTP 4Week 1000
GTP 1Week 250
GTP 16Weeks 12000
Table 2
Name Treatment Pool_std
ATP 1Week 2
ATP 4Week 5
ATP 16Weeks 15
GTP 4Week 7
GTP 1Week 2
GTP 16Weeks 30
Table 3
Name Treatment Pool_count
ATP 1Week 3
ATP 4Week 5
ATP 16Weeks 4
GTP 4Week 5
GTP 1Week 3
GTP 16Weeks 4
I need a table like this:
1Week 1Week 1Week 4Weeks 4Weeks 4Weeks 16Weeks 16Weeks 16 Weeks
pool_mean pool_std pool_count pool_mean pool_std pool_count pool_mean pool_std pool_count
Name ATP 100 2 3 500 5 5 1500 15 4
Name GTP 250 2 3 1000 7 5 12000 30 4
I just don't know how to go about it. This is the code I have written so far:
import pandas as pd

df1 = pd.DataFrame(averages)
df2 = pd.DataFrame(stddev)
df3 = pd.DataFrame(count)
dfs = [df1, df2, df3]
dfs1 = pd.concat(dfs, axis=1).T.drop_duplicates().T
print(dfs1)
dfs1.to_csv('pool_merged.csv')
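But it just puts my columns into a very plain file, which is fine, but not what I need. I'm really lost at this point (I'm very new to this).
Any help would be greatly appreciated.

You can try the following solution, which uses set_index and unstack, then swaplevel, and finally sorts the column index with natsort: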
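import natsort as ns
import pandas as pd

dfs = [df1, df2, df3]
# Pivot Treatment into the columns of each frame, join the frames side by side,
# then make Treatment the outer column level.
out = (pd.concat([i.set_index(['Name', 'Treatment']).unstack() for i in dfs], axis=1)
         .swaplevel(axis=1))
# Put the Treatment level in natural order (1Week, 4Week, 16Weeks).
out = out.reindex(columns=ns.natsorted(out.columns.get_level_values(0).unique()), level=0)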
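Comment: In your dataframes you have both 4Week and 4Weeks, with an extra s. Is that intended?
Reply: No, that's just a typo on my part.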
print(out)
Treatment 1Week 4Week \
Pool_mean Pool_std Pool_count Pool_mean Pool_std Pool_count
Name
ATP 100 2 3 500 5 5
GTP 250 2 3 1000 7 5
Treatment 16Weeks
Pool_mean Pool_std Pool_count
Name
ATP 1500 15 4
GTP 12000 30 4
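For readers who do not have natsort installed, the same reshaping can be sketched with pandas and the standard library only. This is a minimal sketch, not the answerer's code: the sample data is copied from the three tables in the question (using the 4Week spelling), the helper `leading_number` and the variable names `wide`, `stats`, and `treatments` are hypothetical, and the numeric ordering assumes every Treatment label starts with an integer.

```python
import re

import pandas as pd

# Sample data copied from the three tables in the question.
keys = {
    "Name": ["ATP", "ATP", "ATP", "GTP", "GTP", "GTP"],
    "Treatment": ["1Week", "4Week", "16Weeks", "4Week", "1Week", "16Weeks"],
}
df1 = pd.DataFrame({**keys, "Pool_mean": [100, 500, 1500, 1000, 250, 12000]})
df2 = pd.DataFrame({**keys, "Pool_std": [2, 5, 15, 7, 2, 30]})
df3 = pd.DataFrame({**keys, "Pool_count": [3, 5, 4, 5, 3, 4]})


def leading_number(label):
    """Hypothetical helper: sort key that reads the leading integer of a label.

    Assumes every label starts with digits, e.g. '16Weeks' -> 16.
    """
    return int(re.match(r"\d+", label).group())


# Pivot Treatment into the columns of each frame, then join the frames side by side.
wide = pd.concat(
    [df.set_index(["Name", "Treatment"]).unstack() for df in (df1, df2, df3)],
    axis=1,
)
# Columns are currently (statistic, treatment); make treatment the outer level.
wide.columns = wide.columns.swaplevel(0, 1)

# Build the desired column order explicitly: treatments in numeric order,
# and within each treatment mean, std, count.
stats = ["Pool_mean", "Pool_std", "Pool_count"]
treatments = sorted(wide.columns.get_level_values(0).unique(), key=leading_number)
wide = wide.reindex(
    columns=pd.MultiIndex.from_product([treatments, stats], names=["Treatment", None])
)

print(wide)
```

Calling `wide.to_csv('pool_merged.csv')` afterwards writes both header rows, and the file can be read back with `pd.read_csv('pool_merged.csv', header=[0, 1], index_col=0)`.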