Python: bringing data from a matrix table into rows grouped by date

python pandas list dataframe

I have a dataset that looks like this. The values are the profits for 10/11, 11/11 and 12/11:

Item    10/11    11/11    12/11
A       30       12       10
B       10       5        15
C       5        25       10
D       15       10       18

And another DataFrame:

Date     Item     A.unit   B.Unit   C.Unit   D.Unit
10/11    A,D      5        0        0        12
11/11    A,B,C    10       10       5        0
12/11    A        20       0        0        0

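The units sold in table 2 can be any value. Now I want to add the planned profit columns for A, B, C and D from table 1, so the output should look like this: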

Date     Item     A.unit   A.Profit   B.Unit   B.Profit   C.Unit   C.Profit   D.Unit   D.Profit
10/11    A,D      5        30         0        10         0        5          12       15
11/11    A,B,C    10       12         10       5          5        25         0        10
12/11    A        20       10         0        15         0        10         0        18

Can anyone help me combine the data from these two tables into the final table?

Source data

newdf = pd.concat([df1.transpose(), df2], axis=1)

Processing the data

If Item in the first DataFrame (df1) is not the index and Date in the second is not the index either, then the solution is:

print (df1.index)
RangeIndex(start=0, stop=4, step=1)

print (df2.index)
RangeIndex(start=0, stop=3, step=1)

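Create the index from Item, transpose, and add the suffix; then merge, and finally sort the columns from the third one onward by the part of each name before the dot: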
df11 = df1.set_index('Item').T.add_suffix('.Profit')
df = df2.merge(df11, left_on='Date', right_index=True)

cols = sorted(df.columns[2:], key=lambda x: x.split('.')[0])
df = df[df.columns[:2].tolist() + cols]
print (df)
    Date   Item  A.unit  A.Profit  B.Unit  B.Profit  C.Unit  C.Profit  D.Unit  \
0  10/11    A,D       5        30       0        10       0         5      12   
1  11/11  A,B,C      10        12      10         5       5        25       0   
2  12/11      A      20        10       0        15       0        10       0   

   D.Profit  
0        15  
1        10  
2        18  
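
For anyone who wants to run the steps above end to end, here is a minimal sketch that rebuilds the two sample tables from the question and applies the same set_index / transpose / add_suffix / merge / sort sequence. The variable names profit, merged, value_cols and result are illustrative choices, not names from the original answer.

```python
import pandas as pd

# Sample data copied from the question: profit per item (rows) and date (columns).
df1 = pd.DataFrame({
    "Item": ["A", "B", "C", "D"],
    "10/11": [30, 10, 5, 15],
    "11/11": [12, 5, 25, 10],
    "12/11": [10, 15, 10, 18],
})

# Units sold per date; column names keep the casing used in the question.
df2 = pd.DataFrame({
    "Date": ["10/11", "11/11", "12/11"],
    "Item": ["A,D", "A,B,C", "A"],
    "A.unit": [5, 10, 20],
    "B.Unit": [0, 10, 0],
    "C.Unit": [0, 5, 0],
    "D.Unit": [12, 0, 0],
})

# Transpose so dates become the index and items become "<item>.Profit" columns.
profit = df1.set_index("Item").T.add_suffix(".Profit")

# Match each Date value in df2 with the profit row for that date.
merged = df2.merge(profit, left_on="Date", right_index=True)

# Keep Date and Item first, then group the rest by the text before the dot.
# sorted() is stable, so each unit column stays ahead of its profit column.
value_cols = sorted(merged.columns[2:], key=lambda name: name.split(".")[0])
result = merged[merged.columns[:2].tolist() + value_cols]
print(result)
```

The sort key only looks at the item letter, so the interleaving of unit and profit columns relies on the unit columns coming first in the merged frame.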

If the first column of each DataFrame is already the index:
print (df1.index)
Index(['A', 'B', 'C', 'D'], dtype='object', name='Item')

print (df2.index)
Index(['10/11', '11/11', '12/11'], dtype='object', name='Date')

df11 = df1.T.add_suffix('.Profit')
df = df2.merge(df11, left_index=True, right_index=True).reset_index()

cols = sorted(df.columns[2:], key=lambda x: x.split('.')[0])
df = df[df.columns[:2].tolist() + cols]
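
The index-based variant can be tried out the same way. The sketch below assumes the sample data from the question with Item and Date already set as indexes. One addition that is not in the original answer: rename_axis("Date") is applied to the transposed profit table so that both indexes carry the same name and reset_index() reliably yields a Date column. The names profit, merged, value_cols and result are again illustrative.

```python
import pandas as pd

# Sample data from the question, with Item and Date already used as indexes.
df1 = pd.DataFrame(
    {"10/11": [30, 10, 5, 15], "11/11": [12, 5, 25, 10], "12/11": [10, 15, 10, 18]},
    index=pd.Index(["A", "B", "C", "D"], name="Item"),
)
df2 = pd.DataFrame(
    {
        "Item": ["A,D", "A,B,C", "A"],
        "A.unit": [5, 10, 20],
        "B.Unit": [0, 10, 0],
        "C.Unit": [0, 5, 0],
        "D.Unit": [12, 0, 0],
    },
    index=pd.Index(["10/11", "11/11", "12/11"], name="Date"),
)

# Dates become the row index after transposing; name it "Date" (an added
# precaution, not part of the original answer) so both indexes agree.
profit = df1.T.add_suffix(".Profit").rename_axis("Date")

# Align the two frames on their indexes, then turn Date back into a column.
merged = df2.merge(profit, left_index=True, right_index=True).reset_index()

# Same column ordering step as before: Date and Item first, rest by item letter.
value_cols = sorted(merged.columns[2:], key=lambda name: name.split(".")[0])
result = merged[merged.columns[:2].tolist() + value_cols]
print(result)
```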

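Creating a new column named a.price and copying the values from the first row of the items after transposing might be one approach.

I get this error when running it: unsupported operand type(s) for +: 'int' and 'str'.

I have updated the code. It looks like your column names are of type int, so I put str(c) in. Can you check again?

Hi @Alexey, I still don't get the correct values, only blanks. The table has 156 columns in total, and the columns are 101.profit, 102.profit, ...

@Singh Sonu, hi, I just added the source data preparation to the code so that you can test it and compare the differences that cause the errors.

@Alexey, I am importing the data from two Excel files, so in this case the dd1 and dd2 statements do not apply.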
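
The int-versus-str error from the comments can be reproduced and avoided with a small sketch like the one below. It assumes a hypothetical profit table whose item labels are integers such as 101 and 102 (the comments mention columns like 101.profit); the data values are made up for illustration only.

```python
import pandas as pd

# Hypothetical profit table with integer item labels (values are illustrative).
df1 = pd.DataFrame({"Item": [101, 102], "10/11": [30, 10], "11/11": [12, 5]})

# After transposing, the column labels are the integers 101 and 102.
profit = df1.set_index("Item").T

# c + ".Profit" would raise TypeError for an int label, so convert with str(c) first.
profit.columns = [str(c) + ".Profit" for c in profit.columns]
print(profit.columns.tolist())
```

With string labels in place, the merge and column-sorting steps from the answer can be applied unchanged, since the sort key splits each name on the dot.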