Python pandas: split a column into multiple columns with unique values

Suppose I have the following DataFrame:

   A
0  Me
1  Myself
2  and
3  Irene
4  Me, Myself, and Irene

It needs to be converted to:

   Me  Myself  and  Irene
0  1   0       0    0
1  0   1       0    0
2  0   0       1    0
3  0   0       0    1
4  1   1       1    1

Any suggestions are welcome.

You can use get_dummies followed by reindex over all possible categories:

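import pandas as pd

# One frame per input file; each may be missing some categories
df1 = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene']})
df2 = pd.DataFrame({'A': ['Me', 'Myself', 'and']})
df3 = pd.DataFrame({'A': ['Me', 'Myself', 'or', 'Irene']})

# Union of the values seen in any frame, in order of first appearance
all_categories = pd.concat([df1.A, df2.A, df3.A]).unique()
print(all_categories)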
['Me' 'Myself' 'and' 'Irene' 'or']

# dtype=int keeps the 0/1 output shown here (pandas 2.0+ returns booleans by default)
df1 = pd.get_dummies(df1.A, dtype=int).reindex(columns=all_categories, fill_value=0)
print(df1)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    1      0   0
3   0       0    0      1   0

df2 = pd.get_dummies(df2.A, dtype=int).reindex(columns=all_categories, fill_value=0)
print(df2)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    1      0   0

df3 = pd.get_dummies(df3.A, dtype=int).reindex(columns=all_categories, fill_value=0)
print(df3)
   Me  Myself  and  Irene  or
0   1       0    0      0   0
1   0       1    0      0   0
2   0       0    0      0   1
3   0       0    0      1   0
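
As a possible alternative to reindexing after the fact, the column can be cast to a categorical dtype that declares every category up front; get_dummies then emits a column for each declared category, even when a frame contains no row for it. This is only a sketch: the frames df_a and df_b and the helper encode are hypothetical names introduced for illustration, and dtype=int is used so the result prints as 0/1 regardless of pandas version.

```python
import pandas as pd

# Hypothetical per-file frames; the second one has no 'Irene' row.
df_a = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene']})
df_b = pd.DataFrame({'A': ['Me', 'Myself', 'or']})

# Collect every value seen in any frame, in order of first appearance.
all_categories = pd.concat([df_a['A'], df_b['A']]).unique()
cat_type = pd.CategoricalDtype(categories=all_categories)

def encode(frame):
    # Casting to the shared categorical dtype makes get_dummies create one
    # column per declared category, including ones absent from this frame.
    return pd.get_dummies(frame['A'].astype(cat_type), dtype=int)

dummies_b = encode(df_b)
print(dummies_b)
```

Compared with reindex, this keeps the list of allowed categories attached to the data itself, which may be convenient when the same frames are encoded more than once.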

Why not just use get_dummies?

df = pd.get_dummies(df['A'])

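It should work perfectly.

No, it doesn't. Example: when processing multiple files, get_dummies only picks up the values present in a single file. If one file has no Irene, Irene will not appear in its dummies. But I need the Irene column from the other files! See what I mean?

Your example is unclear. Please consider what happens if you have a row "Me, Myself, and Irene", where the comma is the separator.

@FAOFFEX - then str.get_dummies is the better choice:

df = pd.DataFrame({'A': ['Me, Myself, and Irene']})
print(df.A.str.get_dummies(','))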
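
For the frame in the question, where one row holds a comma-separated list and the others hold single values, one way to reach the requested output is to normalise the separators first and then call Series.str.get_dummies. The sketch below assumes that no individual value contains whitespace or a comma, and it restores the column order from the question by hand, because str.get_dummies returns its columns sorted.

```python
import pandas as pd

df = pd.DataFrame({'A': ['Me', 'Myself', 'and', 'Irene', 'Me, Myself, and Irene']})

# Turn every run of commas and/or whitespace into '|', which is the
# default separator that Series.str.get_dummies splits on.
tokens = df['A'].str.replace(r'[,\s]+', '|', regex=True)

# One indicator column per distinct token; reindex puts the columns
# back in the order used in the question.
result = tokens.str.get_dummies().reindex(
    columns=['Me', 'Myself', 'and', 'Irene'], fill_value=0
)
print(result)
```

When several files are involved, the same reindex call can take the full list of categories gathered across all files, so columns missing from one file are filled with 0.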