Python: merging different DataFrames together when the indexes may not always be the same

Tags: python, pandas, dataframe

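I have 11 different areas (P01, P02, ..., P11), and each area has some equipment identified by a code (INV 1-1, INV 1-2, ..., INV 8-4). The problem is that the number of pieces of equipment varies from area to area, so, for example, P01 has no code INV 6-4, but P02 does. Their codes will always be found in the index array, though.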

I have a DataFrame called allEquipAllAreas that holds a float value for each INV of each area. Here is an example:

P01-INV-1-1 P01-INV-1-2 P01-INV-1-3 P01-INV-1-4 P11-INV-7-2 P11-INV-7-3 P11-INV-7-4
   -0.52       1.89         1.61        1.59        2.02        1.29       -0.89
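I created a for loop that goes through all the areas and gets all the equipment related to each one. I would like to end up with a final DataFrame (heatMapInvdf) like the one below, but with the values from allEquipAllAreas in the corresponding cells instead of NaN: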

         P01  P02  P03  P04  P05  P06  P07  P08  P09  P10  P11
INV 1-1  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
INV 1-2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
INV 1-3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
  ...                             ...
INV 8-2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
INV 8-3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
INV 8-4  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
I have tried merging them but could not get what I want. This is what I have so far:

index = ['INV 1-1','INV 1-2','INV 1-3','INV 1-4','INV 2-1','INV 2-2','INV 2-3','INV 2-4',
 'INV 3-1','INV 3-2','INV 3-3','INV 3-4','INV 4-1','INV 4-2','INV 4-3','INV 4-4',
 'INV 5-1','INV 5-2','INV 5-3','INV 5-4','INV 6-1','INV 6-2','INV 6-3','INV 6-4',
 'INV 7-1','INV 7-2','INV 7-3','INV 7-4','INV 8-1','INV 8-2','INV 8-3','INV 8-4']
columns = ['P01','P02','P03','P04','P05','P06','P07','P08','P09','P10','P11']
heatMapInvdf = pd.DataFrame(index=index, columns=columns)
for area in areas:
    equipInArea = allEquipAllAreas.loc[:,allEquipAllAreas.columns.str.contains('P'+area+'-')]
    equipInArea = equipInArea.reindex(sorted(equipInArea.columns), axis=1).T
    equipInArea.index = equipInArea.index.str.replace(r'P'+area+'-', '')
    heatMapInvdf.merge(equipInArea,how='inner',right_index=True,left_index=True)

Any help is much appreciated.

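You have everything you need in the source DataFrame. Reshape it systematically:

  • transpose it
  • index it with a MultiIndex built by splitting the original column names
  • unstack() to get the structure you want
  • droplevel() to clean up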
          P01   P11
INV-1-1 -0.52   NaN
INV-1-2  1.89   NaN
INV-1-3  1.61   NaN
INV-1-4  1.59   NaN
INV-7-2   NaN  2.02
INV-7-3   NaN  1.29
INV-7-4   NaN -0.89
I tried:
heatMapInvdf.loc[equipInArea.index]['P01'] = equipInArea.values
But when I print heatMapInvdf, the values are still NaN! Thanks a lot for your help, Rob!
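A likely reason the assignment in the comment above has no effect is chained indexing: .loc[...] with a list of labels returns a new object, so the following ['P01'] = ... writes into that temporary copy rather than into heatMapInvdf. The sketch below uses small made-up frames (the names heat and equip and their contents are assumptions for illustration, not the asker's data) to contrast the chained form with a single .loc[rows, column] assignment.

```python
import warnings

import pandas as pd

# Hypothetical miniature stand-ins for heatMapInvdf and one area's equipment
heat = pd.DataFrame(index=["INV 1-1", "INV 1-2", "INV 1-3"],
                    columns=["P01", "P02"], dtype=float)
equip = pd.Series([-0.52, 1.89], index=["INV 1-1", "INV 1-2"])

# Chained form: the assignment lands on a temporary copy and is lost
chained = heat.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    chained.loc[equip.index]["P01"] = equip.to_numpy()

# Single .loc call with row labels and column label: writes into the frame
fixed = heat.copy()
fixed.loc[equip.index, "P01"] = equip.to_numpy()

print(chained["P01"].isna().all())  # the chained version is still all NaN
print(fixed)
```

With the single-call form, rows that the area does not have (here INV 1-3) simply stay NaN.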
import io
import numpy as np
import pandas as pd

df = pd.read_csv(io.StringIO("""P01-INV-1-1 P01-INV-1-2 P01-INV-1-3 P01-INV-1-4 P11-INV-7-2 P11-INV-7-3 P11-INV-7-4
   -0.52       1.89         1.61        1.59        2.02        1.29       -0.89"""), sep=r"\s+")

heatMapInvdf = (
    # transpose to get the primary shape that is wanted
    df.T
    # index by a MultiIndex built from the split column names (area, code)
    .set_index(pd.MultiIndex.from_arrays(np.array([c.split("-", 1) for c in df.columns]).T))
    # unstack the P0n part of the index into columns
    .unstack(0)
    # remove the transient level from the column index
    .droplevel(0, axis=1)
)
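
The reshaped result above only contains the areas and codes present in the sample, and its row labels read INV-1-1 rather than INV 1-1. A minimal sketch of one way to get to the full 32 x 11 layout from the question, assuming the codes always follow the INV a-b pattern for a in 1..8 and b in 1..4 and the areas run from P01 to P11 (the variable names raw, wide, and heat are illustrative):

```python
import pandas as pd

# Sample row from the question, built directly instead of parsed from text
raw = pd.DataFrame(
    [[-0.52, 1.89, 1.61, 1.59, 2.02, 1.29, -0.89]],
    columns=["P01-INV-1-1", "P01-INV-1-2", "P01-INV-1-3", "P01-INV-1-4",
             "P11-INV-7-2", "P11-INV-7-3", "P11-INV-7-4"],
)

# Take the single row as a Series and split each label into (area, code)
s = raw.iloc[0].copy()
s.index = pd.MultiIndex.from_tuples([tuple(c.split("-", 1)) for c in s.index])

# Move the area level into the columns: rows are codes, columns are areas
wide = s.unstack(0)

# Match the question's row labels ("INV 1-1" instead of "INV-1-1")
wide.index = wide.index.str.replace("INV-", "INV ", regex=False)

# Assumed full layout: every code for every area, missing cells become NaN
index = [f"INV {a}-{b}" for a in range(1, 9) for b in range(1, 5)]
columns = [f"P{n:02d}" for n in range(1, 12)]
heat = wide.reindex(index=index, columns=columns)

print(heat.shape)
print(heat.loc[["INV 1-1", "INV 7-4"], ["P01", "P11"]])
```

reindex keeps the values that exist and fills every other code and area combination with NaN, which fits a heatmap where some areas lack some equipment.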