Pandas: DataFrame of lists to MultiIndex DataFrame

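I have a DataFrame whose cells are lists. Each list holds three values: the mean, the standard deviation, and the count of a larger dataset. I would like to turn those three values into a sub-level of the column index.

An example DataFrame: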

import numpy as np
import pandas as pd

np.random.seed(2)
d = {i: {j: [np.random.randint(10) for _ in range(3)] for j in ['x', 'y', 'z']} for i in ['a', 'b', 'c']}
pd.DataFrame.from_dict(d, orient='index')

which produces a frame like:

           x          y          z
a  [1, 4, 5]  [7, 4, 4]  [0, 6, 3]
b  [7, 1, 9]  [1, 3, 8]  [3, 6, 2]
c  [1, 6, 6]  [6, 5, 0]  [6, 5, 9]

I would like:

    x              y              z
    mean std count mean std count mean std count
a   1    4   5     7    4   4     0    6   3
b   7    1   9     1    3   8     3    6   2
c   1    6   6     6    5   0     6    5   9

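You can concatenate the inner lists of each row with np.concatenate, stack the rows with np.vstack, build the MultiIndex columns, and then construct a new DataFrame: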

import numpy as np
import pandas as pd

np.random.seed(2)
d = {
    i: {j: [np.random.randint(10) for _ in range(3)] for j in ["x", "y", "z"]}
    for i in ["a", "b", "c"]
}
df = pd.DataFrame.from_dict(d, orient="index")
df

           x          y          z
a  [8, 8, 6]  [2, 8, 7]  [2, 1, 5]
b  [4, 4, 5]  [7, 3, 6]  [4, 3, 7]
c  [6, 1, 3]  [5, 8, 4]  [6, 3, 9]

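# Flatten the three lists in each row into one row of nine values, then stack the rows.
data = np.vstack([np.concatenate(entry) for entry in df.to_numpy()])
# Pair every original column with the three statistic names.
columns = pd.MultiIndex.from_product([df.columns, ["mean", "std", "count"]])
pd.DataFrame(data, columns=columns, index=df.index)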


     x              y              z
  mean std count mean std count mean std count
a    8   8     6    2   8     7    2   1     5
b    4   4     5    7   3     6    4   3     7
c    6   1     3    5   8     4    6   3     9
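
As an alternative to the NumPy approach above, the same result can probably be reached with pandas alone. This is only a sketch, not the method from the answer: it expands each column of lists into its own three-column frame and lets pd.concat build the outer column level from the dict keys. The names stats and wide are made up for illustration, and the input frame is typed out by hand from the seeded example above.

```python
import pandas as pd

# Same values as the seeded example above, written out literally.
df = pd.DataFrame(
    {
        "x": [[8, 8, 6], [4, 4, 5], [6, 1, 3]],
        "y": [[2, 8, 7], [7, 3, 6], [5, 8, 4]],
        "z": [[2, 1, 5], [4, 3, 7], [6, 3, 9]],
    },
    index=["a", "b", "c"],
)

stats = ["mean", "std", "count"]  # assumed order of the values inside each list

# Turn every column of lists into a small frame with one column per statistic.
# Passing a dict to pd.concat with axis=1 uses the keys as the outer column level.
wide = pd.concat(
    {
        col: pd.DataFrame(df[col].tolist(), index=df.index, columns=stats)
        for col in df.columns
    },
    axis=1,
)

# Select one statistic across all outer columns.
means = wide.xs("mean", axis=1, level=1)
```

This version assumes every list has exactly three elements. If the lists vary in length, the DataFrame constructor will raise or pad with NaN, so the input would need checking first.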

Looks like you can do it this way. It helps if you add np.random.seed(2), or any other fixed number, so that the random data stays constant.
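
The point about seeding can be checked with a small sketch. The helper name draw is hypothetical. It reseeds NumPy's legacy global generator before drawing, so two calls with the same seed should return the same numbers.

```python
import numpy as np


def draw(seed, n=9):
    """Hypothetical helper: reseed the global generator, then draw n integers in [0, 10)."""
    np.random.seed(seed)
    return [int(np.random.randint(10)) for _ in range(n)]


first = draw(2)
second = draw(2)

# The same seed reproduces the same sequence, which makes a posted example repeatable.
same = first == second
```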