Sorting columns after a merge in Python (Python, Pandas)

python pandas

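I merged two DataFrames that have several overlapping columns. I would like the overlapping columns to sit side by side.

merge = df1.merge(df2, how='outer')

Output:

A,B,C,D,A_x,B_x,C_x,D_x

I would like the output to be:

A,A_x,B,B_x,C,C_x,D,D_x

I can do this explicitly, but I have a lot of columns and would like a "dynamic" solution.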

Use .sort_index(axis=1).
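As a minimal sketch of that suggestion, the snippet below builds a one-row frame with the column order from the question and sorts its column labels. The frame `merged` and its values are made up for illustration; only `DataFrame.sort_index(axis=1)` comes from the answer.

```python
import pandas as pd

# Hypothetical stand-in for the merged frame, using the question's column order.
merged = pd.DataFrame(
    [[0, 1, 2, 3, 4, 5, 6, 7]],
    columns=["A", "B", "C", "D", "A_x", "B_x", "C_x", "D_x"],
)

# axis=1 sorts the column labels instead of the row index.
# A new DataFrame is returned; `merged` itself is left unchanged.
ordered = merged.sort_index(axis=1)

print(list(ordered.columns))
```

This works here because "A" sorts before "A_x", which sorts before "B", so each suffixed column lands right after its base column.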

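While MaxU's answer here is probably the right one, after merging you can always use the following:

df[sorted(df.columns)]

The reason is that sorted() lets you pass a different key (for example, a lambda) if you need a custom order.
For example, with the default ordering:

import pandas as pd

cols = 'A,B,C,D,A_x,B_x,C_x,D_x'.split(',')
df = pd.DataFrame(columns=cols)
df.loc[0] = list(range(len(cols)))
df[sorted(df.columns)]

Returns:

   A  A_x  B  B_x  C  C_x  D  D_x
0  0    4  1    5  2    6  3    7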
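The custom key mentioned in this answer is not shown, so here is one possible sketch. It assumes the merged columns use the suffix "_x" and that some base names share a prefix (for example "A" and "AB"), a case where plain alphabetical sorting no longer keeps pairs together. The function `base_then_suffix` and the sample columns are hypothetical; it could equally be written as a lambda.

```python
import pandas as pd

SUFFIX = "_x"  # assumed suffix; adjust it to match your merge


def base_then_suffix(col):
    """Sort key: group by base name, with the unsuffixed column first."""
    has_suffix = col.endswith(SUFFIX)
    base = col[: -len(SUFFIX)] if has_suffix else col
    return (base, has_suffix)


# Hypothetical columns where plain sorting separates "A" from "A_x".
cols = ["A", "AB", "A_x", "AB_x"]
df = pd.DataFrame([[0, 1, 2, 3]], columns=cols)

plain = df[sorted(df.columns)]                        # A, AB, AB_x, A_x
keyed = df[sorted(df.columns, key=base_then_suffix)]  # A, A_x, AB, AB_x

print(list(plain.columns))
print(list(keyed.columns))
```

Plain sorting compares character by character, and "B" sorts before "_", which is why "AB" and "AB_x" slip in between "A" and "A_x" without the key.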

Another, possibly more general, way to reorder columns is:

cols_order = ['A', 'A_x', 'B', 'B_x', 'C', 'C_x', 'D', 'D_x']
merge = merge[cols_order]

This replaces the DataFrame with one whose columns follow the order given in cols_order.
import numpy as np
import pandas as pd

# Create initial random data.
np.random.seed(0)
df1 = pd.DataFrame(np.random.randn(5, 3), columns=list('ABx'))
df2 = pd.DataFrame(np.random.randn(5, 3), columns=list('ABy'))
df = df1.merge(df2, how='outer', suffixes=('', '_x'), left_index=True, right_index=True)

col_order = []
# Columns present in both frames. Index.intersection is used here because
# `df1.columns & df2.columns` no longer performs a set intersection in recent pandas.
common_columns = df1.columns.intersection(df2.columns)
for c in common_columns:
    col_order.append(c)
    col_order.append(c + '_x')

# Add non-common columns to right side of dataframe.
col_order.extend([c for c in df if c not in common_columns and not c.endswith('_x')])

>>> df[col_order]
          A       A_x         B       B_x         x         y
0  1.764052  0.333674  0.400157  1.494079  0.978738 -0.205158
1  2.240893  0.313068  1.867558 -0.854096 -0.977278 -2.552990
2  0.950088  0.653619 -0.151357  0.864436 -0.103219 -0.742165
3  0.410599  2.269755  0.144044 -1.454366  1.454274  0.045759
4  0.761038 -0.187184  0.121675  1.532779  0.443863  1.469359
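If the original frames are no longer at hand, a similar pairing can be derived from the merged frame alone. The helper below is a rough sketch under two assumptions: column labels are strings, and the right-hand columns carry a known suffix ("_x" here). The name `interleave_suffixed` is made up for this example and is not a pandas function. Unlike sorting, it keeps the existing left-to-right order of the base columns.

```python
import pandas as pd


def interleave_suffixed(df, suffix="_x"):
    """Return df with each suffixed column placed right after its base column."""
    columns = list(df.columns)
    order = []
    for col in columns:
        # Skip suffixed columns whose base exists; they are placed with the base.
        if col.endswith(suffix) and col[: -len(suffix)] in columns:
            continue
        order.append(col)
        if col + suffix in columns:
            order.append(col + suffix)
    return df[order]


# Hypothetical merged frame; note that "x" does not end with "_x".
merged = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6]],
    columns=["B", "A", "x", "B_x", "A_x", "y"],
)
result = interleave_suffixed(merged)

print(list(result.columns))
```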

Oh! Now I see: you want to apply it after the merge (as a second step), and then it works. +1

@MaxU Yes, and .tolist() isn't needed, so I removed it. I also made it clear that you do this afterwards.