
Python: pandas wide to long with an additional dictionary


My DataFrame looks like this:

>df
ds           A  B  C
01/01/2010   4  2  1
02/01/2010   2  9  3
03/01/2010   1  3  0
where A and B belong to category 1 and C belongs to category 2.

I want to convert it to:

ds           Category  Company  Value
01/01/2010      1         A      4
01/01/2010      1         B      2
01/01/2010      2         C      1
and so on, so that I can print it later.
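
To make the snippets in the answers reproducible, the example frame can be rebuilt as follows. This is only a minimal sketch: the values are copied from the question, and the dates are assumed to be day-first strings, since the answers parse them with format='%d/%m/%Y'.

```python
import pandas as pd

# Wide-format frame from the question: one column per company.
# Dates are kept as strings here (assumed day/month/year).
df = pd.DataFrame({
    "ds": ["01/01/2010", "02/01/2010", "03/01/2010"],
    "A": [4, 2, 1],
    "B": [2, 9, 3],
    "C": [1, 3, 0],
})
print(df)
```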

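Use DataFrame.melt to reshape the frame from wide to long, then add the category column with np.where.

If more than two categories are possible, create a dictionary that lists the companies in each category, invert it, and build the new column with Series.map.

Or reshape with set_index and stack instead of melt.

Another answer: we can use pd.melt followed by np.where.

The code for these approaches is listed after the comments, in the same order.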


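Thank you very much. One follow-up question: if there are more than two categories (say 10 companies and 4 categories), how do I put in multiple np.where arguments?
@Martan - added a solution that uses a dictionary of category lists.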
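
For the follow-up case in the comment (10 companies, 4 categories), the dictionary approach might look roughly like the sketch below. The company names, the grouping, and the variable names are all hypothetical and only serve to illustrate the inversion step; they are not from the original thread.

```python
import pandas as pd

# Hypothetical mapping: 4 categories covering 10 companies (illustrative names only).
category_to_companies = {
    1: ["A", "B", "C"],
    2: ["D", "E"],
    3: ["F", "G", "H"],
    4: ["I", "J"],
}

# Invert to company -> category so it can be passed to Series.map.
company_to_category = {
    company: category
    for category, companies in category_to_companies.items()
    for company in companies
}

# Small long-format sample; "K" is deliberately missing from the mapping.
long_df = pd.DataFrame({"Company": ["A", "E", "H", "J", "K"]})
long_df["Category"] = long_df["Company"].map(company_to_category)

# Companies absent from the dictionary come back as NaN, so gaps are easy to find.
unmapped = long_df.loc[long_df["Category"].isna(), "Company"].tolist()
print(unmapped)
```

One lookup through the inverted dictionary replaces a chain of nested np.where calls, and adding a category only means adding a dictionary entry.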
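import numpy as np
import pandas as pd

# First answer: melt to long format, then map each company to its category.
# df is the wide frame from the question.
df['ds'] = pd.to_datetime(df['ds'], format='%d/%m/%Y')

# wide to long: one row per (ds, Company) pair, values land in a 'value' column
df = df.melt('ds', var_name='Company')
d = {1: ['A', 'B'], 2: ['C']}
# swap keys and values in the dict
# http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}

df['Category'] = df['Company'].map(d1)

# alternative 1 (two categories only)
# df['Category'] = np.where(df['Company'] == 'C', 2, 1)
# alternative 2 (two categories only)
# df['Category'] = np.where(df['Company'].isin(['A','B']), 1, 2)
df = df.sort_values(['ds', 'Company']).reset_index(drop=True)
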
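# Alternative: reshape with set_index + stack (start again from the original wide df).
df['ds'] = pd.to_datetime(df['ds'], format='%d/%m/%Y')

# stack moves the company columns into an index level; reset_index flattens it again
df = df.set_index('ds').stack().rename_axis(('ds', 'Company')).reset_index(name='value')
df['Category'] = np.where(df['Company'] == 'C', 2, 1)
print(df)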
          ds Company  value  Category
0 2010-01-01       A      4         1
1 2010-01-01       B      2         1
2 2010-01-01       C      1         2
3 2010-01-02       A      2         1
4 2010-01-02       B      9         1
5 2010-01-02       C      3         2
6 2010-01-03       A      1         1
7 2010-01-03       B      3         1
8 2010-01-03       C      0         2
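
The melt and stack variants above are expected to produce the same long frame. A minimal check, assuming the sample data from the question (the names wide, via_melt and via_stack are made up for this sketch), could be:

```python
import pandas as pd

# Sample data from the question, with dates parsed as day/month/year.
wide = pd.DataFrame({
    "ds": ["01/01/2010", "02/01/2010", "03/01/2010"],
    "A": [4, 2, 1],
    "B": [2, 9, 3],
    "C": [1, 3, 0],
})
wide["ds"] = pd.to_datetime(wide["ds"], format="%d/%m/%Y")

# Variant 1: melt, then sort so the rows are ordered by date and company.
via_melt = (
    wide.melt("ds", var_name="Company")
        .sort_values(["ds", "Company"])
        .reset_index(drop=True)
)

# Variant 2: set_index + stack, which already yields rows in date/company order.
via_stack = (
    wide.set_index("ds")
        .stack()
        .rename_axis(["ds", "Company"])
        .reset_index(name="value")
)

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(via_melt, via_stack)
```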
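# Second answer: pd.melt followed by np.where (df is the original wide frame).
df2 = pd.melt(df, id_vars=['ds'], value_vars=['A', 'B', 'C'])

# pd.melt names the melted column 'variable' by default; A and B are category 1, C is category 2
df2['Category'] = np.where(df2['variable'].isin(['A', 'B']), 1, 2)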
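
Putting the pieces together, one way to arrive at the exact layout asked for in the question (ds, Category, Company, Value) is sketched below. It assumes the sample data from the question; the names wide, long_df and company_to_category are illustrative, not from the answers.

```python
import pandas as pd

# Sample data from the question.
wide = pd.DataFrame({
    "ds": ["01/01/2010", "02/01/2010", "03/01/2010"],
    "A": [4, 2, 1],
    "B": [2, 9, 3],
    "C": [1, 3, 0],
})
# Parse the dates so that sorting is chronological rather than alphabetical.
wide["ds"] = pd.to_datetime(wide["ds"], format="%d/%m/%Y")

# Company -> category lookup (A and B are category 1, C is category 2).
company_to_category = {"A": 1, "B": 1, "C": 2}

# Wide to long, naming the new columns as in the requested output.
long_df = wide.melt(id_vars="ds", var_name="Company", value_name="Value")
long_df["Category"] = long_df["Company"].map(company_to_category)

# Reorder the columns and sort rows by date, then company.
long_df = (
    long_df[["ds", "Category", "Company", "Value"]]
    .sort_values(["ds", "Company"])
    .reset_index(drop=True)
)
print(long_df)
```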