Creating a one-hot-style encoding in Python using values from a different column
I have a dataframe df of cafeteria food options:
Meal Food Cooked Percentage
Breakfast Yes Attended 1.00
Breakfast Apple No .25
Breakfast Oatmeal Yes .55
Breakfast Skipped Not .20
Lunch Yes Attended 1.00
Lunch Salad No .42
Lunch Pizza Yes .48
Lunch Skipped Not .10
I have tried pd.get_dummies(), but that only gives a binary (0/1) encoding of the Cooked category column. I would like to transform my dataset into:
Meal Food Cooked Percentage No Yes Not
Breakfast Yes Attended 1.00 .25 .55 .20
Breakfast Apple No .25 .25 .55 .20
Breakfast Oatmeal Yes .55 .25 .55 .20
Breakfast Skipped Not .20 .25 .55 .20
Lunch Yes Attended 1.00 .42 .48 .10
Lunch Salad No .42 .42 .48 .10
Lunch Pizza Yes .48 .42 .48 .10
Lunch Skipped Not .10 .42 .48 .10
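For reference, the binary encoding mentioned above can be reproduced with a small sketch. The dataframe is rebuilt by hand from the table in the question, and the variable name dummies is only illustrative. The cast to int is there because newer pandas versions return booleans from get_dummies.

```python
import pandas as pd

# Rebuild the example data from the question by hand.
df = pd.DataFrame({
    "Meal": ["Breakfast"] * 4 + ["Lunch"] * 4,
    "Food": ["Yes", "Apple", "Oatmeal", "Skipped", "Yes", "Salad", "Pizza", "Skipped"],
    "Cooked": ["Attended", "No", "Yes", "Not", "Attended", "No", "Yes", "Not"],
    "Percentage": [1.00, 0.25, 0.55, 0.20, 1.00, 0.42, 0.48, 0.10],
})

# get_dummies only marks which category each row belongs to (0/1 indicators);
# it does not carry the Percentage values into the new columns.
dummies = pd.get_dummies(df["Cooked"]).astype(int)
print(dummies)
```

Each row has exactly one indicator set, which is why this does not give the per-meal percentages wanted above.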
So I am trying to turn the values of one column into new columns, based on the values in a second column.
Let's try pivot, then reindex + join:
s = df.pivot(index='Meal', columns='Cooked', values='Percentage').reindex(df.Meal)
s.index = df.index
df = df.join(s)
df
Out[124]:
Meal Food Cooked Percentage Attended No Not Yes
0 Breakfast Yes Attended 1.00 1.0 0.25 0.2 0.55
1 Breakfast Apple No 0.25 1.0 0.25 0.2 0.55
2 Breakfast Oatmeal Yes 0.55 1.0 0.25 0.2 0.55
3 Breakfast Skipped Not 0.20 1.0 0.25 0.2 0.55
4 Lunch Yes Attended 1.00 1.0 0.42 0.1 0.48
5 Lunch Salad No 0.42 1.0 0.42 0.1 0.48
6 Lunch Pizza Yes 0.48 1.0 0.42 0.1 0.48
7 Lunch Skipped Not 0.10 1.0 0.42 0.1 0.48
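A self-contained variant of the same idea, as a sketch rather than the answer's exact code. The dataframe is rebuilt by hand from the question, pivot is called with keyword arguments (positional arguments to DataFrame.pivot are no longer accepted as of pandas 2.0), and two choices are made here that are not in the original answer: the Attended column is dropped to match the requested output, and join(on="Meal") replaces the reindex step. The names wide and result are illustrative.

```python
import pandas as pd

# Rebuild the example data from the question by hand.
df = pd.DataFrame({
    "Meal": ["Breakfast"] * 4 + ["Lunch"] * 4,
    "Food": ["Yes", "Apple", "Oatmeal", "Skipped", "Yes", "Salad", "Pizza", "Skipped"],
    "Cooked": ["Attended", "No", "Yes", "Not", "Attended", "No", "Yes", "Not"],
    "Percentage": [1.00, 0.25, 0.55, 0.20, 1.00, 0.42, 0.48, 0.10],
})

# One row per Meal, one column per Cooked value, filled with Percentage.
wide = df.pivot(index="Meal", columns="Cooked", values="Percentage")

# Keep only the columns requested in the question, in that order.
wide = wide[["No", "Yes", "Not"]]

# Match each original row to its meal's row in the wide table.
result = df.join(wide, on="Meal")
print(result)
```

Note that pivot raises an error if a (Meal, Cooked) pair occurs more than once. In that case pivot_table with an explicit aggregation function is the usual alternative.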
Is the last row Lunch or Breakfast?