Python: transposing unstructured rows in pandas

python pandas

I have a dataset like this:

category                 UK             US           Germany  
sales                    100000        48000        36000 
budget                   50000         20000        14000
n_employees              300           123          134  
diversified              1             0            1   
sustainability_score     22.8          38.9         34.5
e_commerce               37000         7000         11000   
budget                   25000         10000        10000
n_employees              18            22           7  
traffic                  150 mil       38 mil       12500 
subsidy                  33000         26000        23000  
budget                   14000         6000         6000
own_marketing            0             0            1

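In the dataset, the sales variable corresponds to the sales of the headquarters. e_commerce is the sales of e_commerce, and the budget row after e_commerce is actually the budget of the company's e_commerce department. The same applies to subsidy: the subsidy variable corresponds to the sales of the subsidy, and the budget row after subsidy is the budget of the subsidy.

I would like to transform the dataset so that every variable is prefixed with its country and department (taking the UK as an example: UK_main_sales, UK_e_commerce_budget, and so on). I tried to assign the variables to departments by tracking the budget variable, since it always comes right after the department row, but I did not succeed.

The full list of variables for the UK should look like this: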
UK_main_sales
UK_main_budget
UK_main_n_employees
UK_main_diversified
UK_main_sustainability_score 
UK_e_commerce (we could also add sales but I think it is simpler without sales)
UK_e_commerce_budget
UK_e_commerce_n_employees
UK_e_commerce_traffic
UK_subsidy
UK_subsidy_budget
UK_subsidy_own_marketing

Any ideas?

I think you need:
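# Boolean mask marking the rows that start a new department block
mask = df['category'].isin(['subsidy', 'e_commerce'])

# Build a prefix for every row:
# - where(mask) keeps only the department names and sets other rows to NaN
# - ffill() carries each department name down to the rows below it
# - fillna('main') labels the rows before the first department
# - mask(mask).fillna('') blanks the prefix on the department rows themselves
# Then join prefix and original name; strip('_') drops the leading '_' on department rows
df['category'] = (df['category'].where(mask).ffill().fillna('main').mask(mask).fillna('')
                   + '_' + df['category']).str.strip('_')

# Reshape to a single row: unstack gives a Series indexed by (country, category)
df = df.set_index('category').unstack().to_frame().T
# Flatten the MultiIndex columns into names such as UK_main_sales
df.columns = df.columns.map('_'.join)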
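
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same steps. The names rows, prefix, wide and uk_columns are only illustrative, and the values are kept as strings on the assumption that the traffic row ("150 mil") should not be parsed as a number.

```python
import pandas as pd

# Sample data copied from the question. Values stay as strings because
# the traffic row mixes plain numbers with text such as "150 mil".
rows = [
    ("sales", "100000", "48000", "36000"),
    ("budget", "50000", "20000", "14000"),
    ("n_employees", "300", "123", "134"),
    ("diversified", "1", "0", "1"),
    ("sustainability_score", "22.8", "38.9", "34.5"),
    ("e_commerce", "37000", "7000", "11000"),
    ("budget", "25000", "10000", "10000"),
    ("n_employees", "18", "22", "7"),
    ("traffic", "150 mil", "38 mil", "12500"),
    ("subsidy", "33000", "26000", "23000"),
    ("budget", "14000", "6000", "6000"),
    ("own_marketing", "0", "0", "1"),
]
df = pd.DataFrame(rows, columns=["category", "UK", "US", "Germany"])

# Rows that open a new department block
mask = df["category"].isin(["subsidy", "e_commerce"])

# Department prefix per row: 'main' before the first department,
# empty on the department rows themselves
prefix = df["category"].where(mask).ffill().fillna("main").mask(mask).fillna("")
df["category"] = (prefix + "_" + df["category"]).str.strip("_")

# One wide row with columns named <country>_<department>_<variable>
wide = df.set_index("category").unstack().to_frame().T
wide.columns = wide.columns.map("_".join)

uk_columns = [c for c in wide.columns if c.startswith("UK_")]
print(uk_columns)
```

The result is a single-row frame with 36 columns (12 variables for each of the 3 countries), and the UK columns correspond to the list in the question.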


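Thank you very much. How can I add "main" to the first variables, and subsidy and e_commerce to the others?

Oops, give me a second.

Do you want to change only the names of the budget rows?

Oh no, all of them.

@edyvedy13 - I seem to be lost (sorry). What is the expected output for more columns (maybe the next 6 columns)?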
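
On the question of prefixing every row: the asker's original idea of tracking the budget rows can also be sketched without hard-coding the department names. This is only an illustration under two assumptions that hold for the sample data but are not confirmed beyond it: every department row is immediately followed by a budget row, and budget never appears anywhere else. The helper label_departments and its first_label parameter are hypothetical names.

```python
import pandas as pd

def label_departments(categories: pd.Series, first_label: str = "main") -> pd.Series:
    """Prefix each variable with its department, found by tracking 'budget' rows.

    Assumption: a row is a department header exactly when the next row is 'budget'.
    """
    is_header = categories.shift(-1).eq("budget")
    # Running block number: 1 for the first department, 2 for the second, ...
    block = is_header.cumsum()
    # Department name carried down from the most recent header row
    dept = categories.where(is_header).ffill()

    # Default: <department>_<variable>; later header rows keep their own name
    labelled = (dept + "_" + categories).where(~is_header, categories)
    # First block (headquarters): every row, header included, gets first_label
    in_first_block = block.le(1)
    return labelled.where(~in_first_block, first_label + "_" + categories)

categories = pd.Series([
    "sales", "budget", "n_employees", "diversified", "sustainability_score",
    "e_commerce", "budget", "n_employees", "traffic",
    "subsidy", "budget", "own_marketing",
])
labels = label_departments(categories)
print(labels.tolist())
```

The returned labels can be assigned back to df['category'] before the set_index / unstack step from the answer, in place of the isin-based mask.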