Python: Melting a DataFrame with two value variables

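I have a DataFrame of inventory and purchases across multiple stores and regions. I am trying to stack the DataFrame with melt, but I need two value columns, Inventory and Purchases, and I can't figure out how to do that. The DataFrame looks like this: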

Region   |   Store   |  Inventory_Item_1   |  Inventory_Item_2  |  Purchase_Item_1  |  Purchase_Item_2
------------------------------------------------------------------------------------------------------       
 North         A             15                    20                 5                     6
 North         B             20                    25                 7                     8
 North         C             18                    22                 6                     10
 South         D             10                    15                 9                     7
 South         E             12                    12                 10                    8

I am trying to convert the DataFrame to the following format:

  Region   |   Store   |      Item              |  Inventory   |   Purchases      
 -----------------------------------------------------------------------------
   North        A         Inventory_Item_1             15             5
   North        A         Inventory_Item_2             20             6
   North        B         Inventory_Item_1             20             7
   North        B         Inventory_Item_2             25             8    
   North        C         Inventory_Item_1             18             6
   North        C         Inventory_Item_2             22             10
   South        D         Inventory_Item_1             10             9
   South        D         Inventory_Item_2             15             7
   South        E         Inventory_Item_1             12             10
   South        E         Inventory_Item_2             12             8
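This is what I wrote, but I don't know how to create the columns for inventory and purchases. Note that my full DataFrame is much larger (50+ regions, 140+ stores, 15+ items).

Any help or suggestions would be greatly appreciated.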

You can get there with the following steps:

# please always provide minimal working code - we as helpers and answerers 
# otherwise have to invest extra time to generate beginning working code
# and that is unfair - we already spend enough time to solve the problem:
import pandas as pd

df = pd.DataFrame([
["North","A",15,20,5,6],
["North","B",20,25,7,8],
["North","C",18,22,6,10],
["South","D",10,15,9,7],
["South","E",12,12,10,8]], columns=["Region","Store","Inventory_Item_1","Inventory_Item_2","Purchase_Item_1","Purchase_Item_2"])

# melt the dataframe completely first
df_final = pd.melt(df, id_vars=['Region', 'Store'], value_vars=['Inventory_Item_1', 'Inventory_Item_2', 'Purchase_Item_1', 'Purchase_Item_2'])

# extract inventory and purchase sub data frames
# they have in common the "variable" column (the item number!)
# so let it look exactly the same in both data frames by removing
# unnecessary parts
# .copy() makes each subset an independent frame, so the assignments
# below do not trigger pandas' SettingWithCopyWarning
df_inventory = df_final.loc[df_final["variable"].str.startswith("Inventory"), :].copy()
df_inventory["variable"] = df_inventory["variable"].str.replace("Inventory_", "", regex=False)
df_purchase = df_final.loc[df_final["variable"].str.startswith("Purchase"), :].copy()
df_purchase["variable"] = df_purchase["variable"].str.replace("Purchase_", "", regex=False)

# deepcopy the data frames (just to keep old results so that you can inspect them)
df_purchase_ = df_purchase.copy()
df_inventory_ = df_inventory.copy()

# rename the columns to prepare for merging
df_inventory_.columns = ["Region", "Store", "variable", "Inventory"]
df_purchase_.columns = ["Region", "Store", "variable", "Purchase"]

# merge by the three common columns
df_final_1 = pd.merge(df_inventory_, df_purchase_, how="left", on=["Region", "Store", "variable"])

# sort by the three common columns
df_final_1.sort_values(by=["Region", "Store", "variable"], axis=0)

This returns:

  Region Store variable  Inventory  Purchase
0  North     A   Item_1         15         5
5  North     A   Item_2         20         6
1  North     B   Item_1         20         7
6  North     B   Item_2         25         8
2  North     C   Item_1         18         6
7  North     C   Item_2         22        10
3  South     D   Item_1         10         9
8  South     D   Item_2         15         7
4  South     E   Item_1         12        10
9  South     E   Item_2         12         8
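
Since the full DataFrame has 15+ items, listing every column in value_vars by hand gets tedious. The following is a minimal sketch of the same melt, split and merge idea without hard-coded column names. It assumes every non-id column follows the pattern Kind_Item (for example Inventory_Item_1); the names id_cols, long_df, kind and item are illustrative and do not come from the answer above, and the sample is a reduced version of the question's data.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["North", "A", 15, 20, 5, 6],
        ["North", "B", 20, 25, 7, 8],
        ["South", "D", 10, 15, 9, 7],
    ],
    columns=["Region", "Store", "Inventory_Item_1", "Inventory_Item_2",
             "Purchase_Item_1", "Purchase_Item_2"],
)

id_cols = ["Region", "Store"]

# With value_vars omitted, melt uses every column that is not in id_vars.
long_df = df.melt(id_vars=id_cols)

# Split "Inventory_Item_1" at the first underscore only:
# kind = "Inventory", item = "Item_1".
parts = long_df["variable"].str.split("_", n=1, expand=True)
long_df["kind"] = parts[0]
long_df["item"] = parts[1]

# Build one frame per kind and merge them on the shared key columns.
keys = id_cols + ["item"]
inventory = (
    long_df.loc[long_df["kind"] == "Inventory", keys + ["value"]]
    .rename(columns={"value": "Inventory"})
)
purchase = (
    long_df.loc[long_df["kind"] == "Purchase", keys + ["value"]]
    .rename(columns={"value": "Purchase"})
)
result = (
    inventory.merge(purchase, on=keys, how="left")
    .sort_values(keys)
    .reset_index(drop=True)
)
print(result)
```

The left merge keeps every inventory row even if a matching purchase row is missing, which mirrors the how="left" choice in the answer.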

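I would do this using hierarchical indexes on both the rows and the columns.

For the rows, you can quite easily call set_index(['Region', 'Store']).

You have to be a little tricky with the columns, though. Since you need access to the non-index columns that remain after setting the index on Region and Store, you need to pipe the result to a custom function that builds the desired tuples and creates a named multi-level column index.

After that, you can stack the columns into the row index and, optionally, reset the full row index to make everything a normal column again.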

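import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'North', 'South', 'South'],
    'Store': ['A', 'B', 'C', 'D', 'E'],
    'Inventory_Item_1': [15, 20, 18, 10, 12],
    'Inventory_Item_2': [20, 25, 22, 15, 12],
    'Purchase_Item_1': [5, 7, 6, 9, 10],
    'Purchase_Item_2': [6, 8, 10, 7, 8]
})

output = (
    df.set_index(['Region', 'Store'])
      # split each column name at the first underscore into a two-level MultiIndex
      .pipe(lambda df:
          df.set_axis(df.columns.str.split('_', n=1, expand=True), axis='columns')
      )
      .rename_axis(['Status', 'Product'], axis='columns')
      # move the Product level from the columns into the row index
      .stack(level='Product')
      .reset_index()
)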

This gives me:

Region Store Product  Inventory  Purchase
 North     A  Item_1         15         5
 North     A  Item_2         20         6
 North     B  Item_1         20         7
 North     B  Item_2         25         8
 North     C  Item_1         18         6
 North     C  Item_2         22        10
 South     D  Item_1         10         9
 South     D  Item_2         15         7
 South     E  Item_1         12        10
 South     E  Item_2         12         8
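
To make the column step more concrete, here is a small sketch that shows only what the split produces, before any stacking happens. The variable names columns, multi and tuples are illustrative; the column labels are the ones from the question.

```python
import pandas as pd

columns = pd.Index([
    "Inventory_Item_1", "Inventory_Item_2",
    "Purchase_Item_1", "Purchase_Item_2",
])

# n=1 splits only at the first underscore, so "Item_1" stays intact.
# On an Index, expand=True returns a MultiIndex instead of lists.
multi = columns.str.split("_", n=1, expand=True)

# Each entry of the MultiIndex is a (Status, Product) tuple.
tuples = list(multi)
for status, product in tuples:
    print(status, product)
```

The first level (Inventory or Purchase) is what remains as column headers after stacking, and the second level (Item_1, Item_2) is what moves into the rows.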

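You can use the function from [...]; at the moment, you have to install the latest development version from [...]:

It works by passing a regular expression containing groups to the names_pattern parameter. The ".value" in names_to ensures that Inventory and Purchase are kept as column headers, while the other group (Item_1, Item_2) is collated into a new column, item.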
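
The function call itself is not preserved in this text, so the following is not that library's API. It is only a plain-Python sketch of the idea described above: a regular expression with two groups separates the part of each column name that becomes a header from the part that becomes a row value. The pattern, the single sample row (store A from the question) and all variable names are assumptions made for illustration.

```python
import re

# Assumed pattern: group 1 is the future column header,
# group 2 is the item label that moves into the rows.
names_pattern = r"(Inventory|Purchase)_(.+)"

# One wide row (store A from the question's data).
wide_row = {
    "Inventory_Item_1": 15,
    "Inventory_Item_2": 20,
    "Purchase_Item_1": 5,
    "Purchase_Item_2": 6,
}

long_rows = {}
for column, value in wide_row.items():
    header, item = re.fullmatch(names_pattern, column).groups()
    # One output row per item; the first group names the value column.
    long_rows.setdefault(item, {"item": item})[header] = value

result = list(long_rows.values())
for row in result:
    print(row)
```

Each wide row therefore turns into one long row per item, with Inventory and Purchase side by side.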

So your intent is to lose the purchase information? (The information about Purchase_Item_1 and Purchase_Item_2 is lost.)

@Gwang-JinKim Purchase_Item_1 and Purchase_Item_2 are just the purchases of items 1 and 2. That data is in the Purchases column.

That is actually the point: the item should not be named "Inventory_Item_1" and so on, but just "Item_1", "Item_2", and so on. Otherwise it is very confusing. See my solution.