Python 3.x: Feature engineering with Python

python-3.x pandas feature-extraction

I have a pandas dataset in which one column looks like this:

         Genre
        ------------
         Documentary
         Documentary
         Comedy|Mystery|Thriller
         Animation|Comedy|Family
         Documentary
         Documentary|Family
         Action|Adventure|Fantasy|Sci-Fi
         Crime|Drama|Mystery
         Action|Crime|Mystery|Thriller

How can I create one column per genre name, filled with 1 if the row contains that genre and 0 otherwise?

Expected output (a DataFrame):

  Documentary  Comedy  Mystery  Thriller  Animation  Family  ...
            1       0        0         0          0       0
            1       0        0         0          0       0
            0       1        1         1          0       0

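and so on.

I tried converting the column to a list first and then splitting it, but that is not the Pythonic way.

Can we use the apply function, or some other efficient technique, to do this?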

Use str.get_dummies, which does this directly:

df1 = df.Genre.str.get_dummies('|')

Out[385]:
   Action  Adventure  Animation  Comedy  Crime  Documentary  Drama  Family  \
0       0          0          0       0      0            1      0       0
1       0          0          0       0      0            1      0       0
2       0          0          0       1      0            0      0       0
3       0          0          1       1      0            0      0       1
4       0          0          0       0      0            1      0       0
5       0          0          0       0      0            1      0       1
6       1          1          0       0      0            0      0       0
7       0          0          0       0      1            0      1       0
8       1          0          0       0      1            0      0       0

   Fantasy  Mystery  Sci-Fi  Thriller
0        0        0       0         0
1        0        0       0         0
2        0        1       0         1
3        0        0       0         0
4        0        0       0         0
5        0        0       0         0
6        1        0       1         0
7        0        1       0         0
8        0        1       0         1
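For anyone who wants to reproduce the output above end to end, here is a minimal sketch. It assumes the column is named Genre, as in the question; the sample rows are copied from the question, and the names dummies and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Genre": [
        "Documentary",
        "Documentary",
        "Comedy|Mystery|Thriller",
        "Animation|Comedy|Family",
        "Documentary",
        "Documentary|Family",
        "Action|Adventure|Fantasy|Sci-Fi",
        "Crime|Drama|Mystery",
        "Action|Crime|Mystery|Thriller",
    ]
})

# One 0/1 indicator column per genre, split on "|".
# The resulting columns are sorted alphabetically.
dummies = df["Genre"].str.get_dummies(sep="|")

# Optional: keep the original column next to the indicators.
result = df.join(dummies)

print(dummies)
```

Note that the columns come back in alphabetical order rather than in the order shown in the question; if the order matters, the frame can be reindexed with a list of column names.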


get_dummies? Great! Are there any books, references, or websites that cover pandas and NumPy data structures in depth? Thanks in advance.

data['genres'].str.split('|').explode() raises an error: AttributeError: 'Series' object has no attribute 'explode'.

You need pandas version 0.25.1.
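The split-and-explode route mentioned in the comments can also produce the indicator columns. The following is only a sketch of one way to finish it, not necessarily what the commenter had in mind: it assumes a pandas version that has Series.explode, uses a shortened sample of the question's data, and the names exploded and alt are illustrative.

```python
import pandas as pd

# Shortened sample based on the question's data.
df = pd.DataFrame({
    "Genre": ["Documentary", "Comedy|Mystery|Thriller", "Documentary|Family"]
})

# One row per (original row, genre) pair; the original index is repeated.
exploded = df["Genre"].str.split("|").explode()

# Indicator columns per exploded row, then collapse back to one row per
# original index. max() keeps the value at 1 even if a genre repeats in a row.
alt = pd.get_dummies(exploded, dtype=int).groupby(level=0).max()

print(alt)
```

For this particular task str.get_dummies is shorter; the explode form is mainly useful when the exploded rows are needed for something else as well.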
