Python: flattening a one-to-one mapping in a MultiIndex DataFrame

I have the following data structure:

from collections import OrderedDict
import pandas as pd

d = OrderedDict([
    ((5, 3, 1), {'y1': 1}),
    ((5, 3, 2), {'y2': 2}),
    ((5, 4, 1), {'y1': 10}),
    ((5, 4, 2), {'y2': 20}),

    ((6, 3, 1), {'y1': 100}),
    ((6, 3, 2), {'y2': 200}),
    ((6, 4, 1), {'y1': 1000}),
    ((6, 4, 2), {'y2': 2000}),
])

df = pd.DataFrame(
    list(d.values()),  # materialize the dict views as lists
    index=pd.MultiIndex.from_tuples(list(d.keys()), names=['x3', 'x2', 'x1']),
)
The table looks like this:

            y1    y2
x3 x2 x1            
5  3  1      1   NaN
      2    NaN     2
   4  1     10   NaN
      2    NaN    20
6  3  1    100   NaN
      2    NaN   200
   4  1   1000   NaN
      2    NaN  2000
As you can see, there is a one-to-one mapping between x1 and the columns (x1=1: y1, x1=2: y2). I want to flatten it into:

         y1    y2
x3 x2            
5  3      1     2
   4     10    20
6  3    100   200
   4   1000  2000
How can I do this?

Edit: or the other way around, from the same table to:

             y
x3 x2 x1            
5  3  1      1
      2      2
   4  1     10
      2     20
6  3  1    100
      2    200
   4  1   1000
      2   2000
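# Move x1 into the columns: (y1, 1), (y1, 2), (y2, 1), (y2, 2)
df2 = df.unstack()
df2.columns = range(4)
# Columns 1 and 2 are the all-NaN combinations (y1, 2) and (y2, 1)
df3 = df2.drop([1, 2], axis=1)
df3.columns = ["Y1", "Y2"]
df3

You can use stack to remove the NaN values, since it creates a Series, then drop the third level with reset_index, and finally reshape with unstack:

print(df.stack().reset_index(level=2, drop=True).unstack(2))
           y1      y2
x3 x2                
5  3      1.0     2.0
   4     10.0    20.0
6  3    100.0   200.0
   4   1000.0  2000.0

If you need to convert to int, add astype:

print(df.stack().reset_index(level=2, drop=True).unstack(2).astype(int))
         y1    y2
x3 x2            
5  3      1     2
   4     10    20
6  3    100   200
   4   1000  2000

Edit:

print(df.stack().reset_index(level=3, drop=True).to_frame('y').astype(int))
             y
x3 x2 x1      
5  3  1      1
      2      2
   4  1     10
      2     20
6  3  1    100
      2    200
   4  1   1000
      2   2000

This fits my needs, thanks. Maybe you also know a way to do the reverse (see my edit).
I found the answer myself: df.stack().reset_index(level=3, drop=True).to_frame('y')
Super, I also added a conversion to int.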
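
For reference, both directions can be put together in one self-contained script. This is only a sketch of the stack-based approach discussed in this thread, not a definitive implementation. The explicit dropna() call and the variable names stacked, flat and single are additions for illustration; dropna() is there on the assumption that some pandas versions may keep the NaN entries after stack().

```python
from collections import OrderedDict

import pandas as pd

d = OrderedDict([
    ((5, 3, 1), {'y1': 1}),
    ((5, 3, 2), {'y2': 2}),
    ((5, 4, 1), {'y1': 10}),
    ((5, 4, 2), {'y2': 20}),
    ((6, 3, 1), {'y1': 100}),
    ((6, 3, 2), {'y2': 200}),
    ((6, 4, 1), {'y1': 1000}),
    ((6, 4, 2), {'y2': 2000}),
])

df = pd.DataFrame(
    list(d.values()),
    index=pd.MultiIndex.from_tuples(list(d.keys()), names=['x3', 'x2', 'x1']),
)

# Long form: one value per (x3, x2, x1, column label).
# dropna() is an assumption added here so that NaN cells are removed
# regardless of how the installed pandas version treats them in stack().
stacked = df.stack().dropna()

# Flatten: drop x1, then move the column labels back into the columns.
flat = stacked.reset_index(level='x1', drop=True).unstack().astype(int)

# Other direction: drop the column-label level instead and keep x1.
single = stacked.reset_index(level=-1, drop=True).to_frame('y').astype(int)

print(flat)
print(single)
```

Dropping x1 before unstack() only works because each (x3, x2) pair has exactly one value per column; with duplicates, unstack() would raise an error about duplicate index entries.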