Python: Unstacking a MultiIndex into a Single Row


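I'm comfortable with basic pandas, but I struggle with reshaping data and with MultiIndexes. I have a MultiIndex DataFrame that looks like this (it doesn't have to be a MultiIndex, but that seemed like the right approach):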

name     index  f1      f2      f3        calc1  calc2  calc3
fox      1      red     white   fur        0.21   1.67  -0.34
         2                                 0.76   2.20  -1.02
         3                                 0.01   1.12  -0.22
chicken  1      white   yellow  feathers   0.04   1.18  -2.01
         2                                 0.18   0.73  -1.21
grain    1      yellow  bag     corn       0.89   1.65  -1.03
         2                                 0.34   2.45  -0.45
         3                                 0.87   1.11  -0.97

Try set_index + unstack to reshape to wide format:

new_df = df.set_index(['name', 'index', 'f1', 'f2', 'f3']).unstack('index')
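A possible alternative, sketched here under the assumption that pandas 1.1 or later is available (earlier versions do not accept a list for the index argument): DataFrame.pivot should give the same wide frame as set_index + unstack. The id_cols name is introduced only for this illustration, and the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['fox', 'fox', 'fox', 'chicken', 'chicken', 'grain', 'grain', 'grain'],
    'index': [1, 2, 3, 1, 2, 1, 2, 3],
    'f1': ['red', 'red', 'red', 'white', 'white', 'yellow', 'yellow', 'yellow'],
    'f2': ['white', 'white', 'white', 'yellow', 'yellow', 'bag', 'bag', 'bag'],
    'f3': ['fur', 'fur', 'fur', 'feathers', 'feathers', 'corn', 'corn', 'corn'],
    'calc1': [0.21, 0.76, 0.01, 0.04, 0.18, 0.89, 0.34, 0.87],
    'calc2': [1.67, 2.2, 1.12, 1.18, 0.73, 1.65, 2.45, 1.11],
    'calc3': [-0.34, -1.02, -0.22, -2.01, -1.21, -1.03, -0.45, -0.97]
})

# Columns that identify one output row (name chosen for this sketch)
id_cols = ['name', 'f1', 'f2', 'f3']

# Approach from the answer: index on everything, then unstack 'index'
via_unstack = df.set_index(['name', 'index', 'f1', 'f2', 'f3']).unstack('index')

# Alternative: pivot on 'index'; the remaining columns (calc1..calc3)
# become the values, giving (calc, index) MultiIndex columns
via_pivot = df.pivot(index=id_cols, columns='index')

print(via_pivot)
```

Either way the columns are still a two-level MultiIndex at this point, so the sorting and flattening steps apply unchanged.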


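Sort the column MultiIndex with sort_index(axis=1, level=1), so that the columns are grouped by index value (1, 2, 3) rather than by calc name.

Then flatten the column MultiIndex with map + join, turning each (calc, index) pair into a single name such as calc1_1.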
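To make the sorting and flattening steps concrete, here is a minimal sketch on a cut-down frame (two names, two rows each, two calc columns, with values taken from the question). The variable names wide, before_sort, after_sort and flat are chosen for this illustration only.

```python
import pandas as pd

# Cut-down version of the question's data
df = pd.DataFrame({
    'name': ['fox', 'fox', 'chicken', 'chicken'],
    'index': [1, 2, 1, 2],
    'calc1': [0.21, 0.76, 0.04, 0.18],
    'calc2': [1.67, 2.20, 1.18, 0.73],
})

# Unstack: columns become (calc, index) pairs, grouped by calc name
wide = df.set_index(['name', 'index']).unstack('index')
before_sort = list(wide.columns)

# Sort on the second column level so pairs are grouped by index value
wide = wide.sort_index(axis=1, level=1)
after_sort = list(wide.columns)

# Flatten each (calc, index) tuple into one string such as 'calc1_1'
flat = wide.copy()
flat.columns = flat.columns.map(lambda s: '_'.join(map(str, s)))

print(before_sort)
print(after_sort)
print(list(flat.columns))
```

The map(str, s) part matters because the second level holds integers, and str.join only accepts strings.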

Resulting new_df:

      name      f1      f2        f3  calc1_1  calc2_1  calc3_1  calc1_2  calc2_2  calc3_2  calc1_3  calc2_3  calc3_3
0  chicken   white  yellow  feathers     0.04     1.18    -2.01     0.18     0.73    -1.21      NaN      NaN      NaN
1      fox     red   white       fur     0.21     1.67    -0.34     0.76     2.20    -1.02     0.01     1.12    -0.22
2    grain  yellow     bag      corn     0.89     1.65    -1.03     0.34     2.45    -0.45     0.87     1.11    -0.97

Full code:

import pandas as pd

df = pd.DataFrame({
    'name': ['fox', 'fox', 'fox', 'chicken', 'chicken', 'grain', 'grain',
             'grain'],
    'index': [1, 2, 3, 1, 2, 1, 2, 3],
    'f1': ['red', 'red', 'red', 'white', 'white', 'yellow', 'yellow', 'yellow'],
    'f2': ['white', 'white', 'white', 'yellow', 'yellow', 'bag', 'bag', 'bag'],
    'f3': ['fur', 'fur', 'fur', 'feathers', 'feathers', 'corn', 'corn', 'corn'],
    'calc1': [0.21, 0.76, 0.01, 0.04, 0.18, 0.89, 0.34, 0.87],
    'calc2': [1.67, 2.2, 1.12, 1.18, 0.73, 1.65, 2.45, 1.11],
    'calc3': [-0.34, -1.02, -0.22, -2.01, -1.21, -1.03, -0.45, -0.97]
})

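new_df = (
    df.set_index(['name', 'index', 'f1', 'f2', 'f3'])
    .unstack('index')              # move the 'index' values into a column level
    .sort_index(axis=1, level=1)   # group columns by index value (1, 2, 3)
)

# Flatten each (calc, index) pair, e.g. ('calc1', 1) -> 'calc1_1'
new_df.columns = new_df.columns.map(lambda s: '_'.join(map(str, s)))

# Turn name, f1, f2 and f3 back into ordinary columns
new_df = new_df.reset_index()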
